Test Harness


Harness Class

The Harness class is a testing class for NLP models. It evaluates the performance of a given NLP model for a given task, dataset, and test configuration. Users can generate test cases, save and reuse them, create reports, and augment datasets based on test results.

# Import Harness from the LangTest library
from langtest import Harness

The following parameters can be passed to the Harness class:


Parameter: Description

task: Task for which the model is to be evaluated (‘text-classification’, ‘question-answering’, ‘ner’).

model: The model(s) to be evaluated, provided as either a dictionary or a list of dictionaries. Each dictionary should contain the following keys:
• model (mandatory): A PipelineModel, a path to a saved model, or a pretrained pipeline/model from a hub.
• hub (mandatory): The hub (library) used in the back end to load the model, either from a public model hub or from a path.

data: The data to be used for evaluation, provided as a dictionary that allows for different data sources. It should include the following keys:
• data_source (mandatory): The source of the data.
• subset (optional): The subset of the data.
• feature_column (optional): The column containing the features.
• target_column (optional): The column containing the target labels.
• split (optional): The data split to be used.
• source (optional): Set to ‘huggingface’ when loading a Hugging Face dataset.

config: Path to the YAML file that configures the tests to be performed.
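
The parameters above can be assembled as plain Python dictionaries before they are handed to Harness. The following is a minimal sketch rather than part of the library: the validate_harness_args helper is hypothetical, and the model name, data file, and config path are placeholder values chosen only for illustration. It checks only the mandatory keys listed above.

```python
# Sketch: assemble and sanity-check the Harness arguments described above.
# validate_harness_args is a hypothetical helper, not part of LangTest;
# the model name and file paths are illustrative placeholders.

MODEL_MANDATORY_KEYS = {"model", "hub"}
DATA_MANDATORY_KEYS = {"data_source"}


def validate_harness_args(task, model, data, config):
    """Check the mandatory keys and return keyword arguments for Harness."""
    # `model` may be a single dictionary or a list of dictionaries.
    entries = model if isinstance(model, list) else [model]
    for entry in entries:
        missing = MODEL_MANDATORY_KEYS - set(entry)
        if missing:
            raise ValueError(f"model entry is missing keys: {sorted(missing)}")

    missing = DATA_MANDATORY_KEYS - set(data)
    if missing:
        raise ValueError(f"data is missing keys: {sorted(missing)}")

    return {"task": task, "model": model, "data": data, "config": config}


harness_kwargs = validate_harness_args(
    task="ner",
    model={"model": "dslim/bert-base-NER", "hub": "huggingface"},
    data={"data_source": "sample.conll"},
    config="config.yml",
)

# With LangTest installed and the referenced files in place, the call
# would then look like this:
# from langtest import Harness
# harness = Harness(**harness_kwargs)
```

Keeping the arguments in a dictionary makes it easy to reuse the same task and data with a different model entry, or to pass a list of model dictionaries instead of a single one.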