The Harness class is a testing class for NLP models. It evaluates the performance of a given NLP model on a given task, dataset and test configuration. Users are able to generate test cases, save and re-use them, create reports and augment datasets based on test results.
# Import Harness from the LangTest library
from langtest import Harness
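The workflow described above (generate test cases, run them, build a report) can be sketched as a small helper. This is a minimal sketch, not the library's own code: it assumes the harness object exposes methods named generate(), run() and report(), and both `run_workflow` and `StubHarness` are hypothetical names introduced here so the example runs without LangTest installed.

```python
def run_workflow(harness):
    """Drive a harness-like object through generate -> run -> report."""
    # Assumption: the object exposes generate(), run() and report().
    harness.generate()       # create the test cases
    harness.run()            # execute them against the model
    return harness.report()  # summarise the results


class StubHarness:
    """Hypothetical stand-in that only records which methods were called."""

    def __init__(self):
        self.calls = []

    def generate(self):
        self.calls.append("generate")
        return self

    def run(self):
        self.calls.append("run")
        return self

    def report(self):
        self.calls.append("report")
        return "report placeholder"


stub = StubHarness()
result = run_workflow(stub)
```

With a real Harness instance in place of the stub, the same helper would return whatever report object the library produces.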
Here is a list of the different parameters that can be passed to Harness:
|task||Task for which the model is to be evaluated (for example 'text-classification', 'question-answering', 'ner')|
|model||Specifies the model(s) to be evaluated, as either a dictionary or a list of dictionaries. Each dictionary should contain the following keys:
• model (mandatory): A PipelineModel, a path to a saved model, or a pretrained pipeline/model from a hub.
• hub (mandatory): The hub (library) used in the back end to load the model, either from a public model hub or from a path.|
|data||The data to be used for evaluation, given as a dictionary that describes the data source. It accepts the following keys:
• data_source (mandatory): The source of the data.
• subset (optional): The subset of the data.
• feature_column (optional): The column containing the features.
• target_column (optional): The column containing the target labels.
• split (optional): The data split to be used.
• source (optional): Set to 'huggingface' when loading a Hugging Face dataset.|
|config||Path to the YAML file containing the configuration of the tests to be performed|
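To make the shape of these arguments concrete, the sketch below builds one possible set of values and checks the mandatory keys listed in the table. The model name, file names and the `check_harness_args` helper are assumptions chosen for illustration; they are not part of LangTest, and the final call is left as a comment so the block runs without the library installed.

```python
# Mandatory keys, as listed in the parameter table.
MODEL_REQUIRED = {"model", "hub"}
DATA_REQUIRED = {"data_source"}


def check_harness_args(model, data):
    """Hypothetical pre-flight check: return a list of missing-key messages."""
    # `model` may be a single dictionary or a list of dictionaries.
    models = model if isinstance(model, list) else [model]
    problems = []
    for i, entry in enumerate(models):
        missing = MODEL_REQUIRED - set(entry)
        if missing:
            problems.append(f"model[{i}] is missing {sorted(missing)}")
    missing = DATA_REQUIRED - set(data)
    if missing:
        problems.append(f"data is missing {sorted(missing)}")
    return problems


# Placeholder values (assumptions): swap in your own model, data and config.
harness_kwargs = {
    "task": "ner",
    "model": {"model": "dslim/bert-base-NER", "hub": "huggingface"},
    "data": {"data_source": "sample.conll"},
    "config": "config.yml",
}

problems = check_harness_args(harness_kwargs["model"], harness_kwargs["data"])

# With LangTest installed, the arguments would then be passed on:
# from langtest import Harness
# harness = Harness(**harness_kwargs)
```

An empty `problems` list means every mandatory key from the table is present; the check says nothing about whether the model or data source actually exists.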