Test Harness

Harness Class

The Harness class is the entry point for testing NLP models. It evaluates a model's performance on a given task, dataset, and test configuration. With it, users can generate test cases, save and reuse them, create reports, and augment datasets based on the test results.

# Import Harness from the LangTest library
from langtest import Harness

The Harness class accepts the following parameters:

Parameters

Parameter	Description
task	Task for which the model is to be evaluated ('text-classification', 'question-answering', 'ner').
model	Specifies the model(s) to be evaluated, provided as either a dictionary or a list of dictionaries. Each dictionary should contain the following keys:
	• model (mandatory): a PipelineModel, a path to a saved model, or a pretrained pipeline/model from a hub.
	• hub (mandatory): the hub (library) used as the back end to load the model, either from a public model hub or from a path.
data	The data to be used for evaluation, provided as a dictionary so that different data sources and options can be described. It should include the following keys:
	• data_source (mandatory): the source of the data.
	• subset (optional): the subset of the data.
	• feature_column (optional): the column containing the features.
	• target_column (optional): the column containing the target labels.
	• split (optional): the data split to be used.
	• source (optional): set to 'huggingface' when loading a Hugging Face dataset.
config	Path to the YAML file that configures the tests to be performed.
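
As a rough illustration of how these four parameters fit together, the sketch below assembles them into keyword arguments and checks the mandatory keys listed in the table. The helper names (missing_keys, build_harness_kwargs) are hypothetical and not part of LangTest, and the model name, hub value, data file, and config path are placeholder assumptions chosen only for the example. The final Harness call is left commented out because it needs LangTest installed and the placeholder files to exist.

```python
# Hypothetical helpers (not part of LangTest): they only check the mandatory
# keys described in the parameter table before the arguments reach Harness.
REQUIRED_MODEL_KEYS = {"model", "hub"}
REQUIRED_DATA_KEYS = {"data_source"}


def missing_keys(spec, required):
    """Return the sorted list of mandatory keys absent from a spec dict."""
    return sorted(required - set(spec))


def build_harness_kwargs(task, model, data, config):
    """Validate the model and data dictionaries and bundle the arguments."""
    # The model parameter may be a single dictionary or a list of dictionaries.
    models = model if isinstance(model, list) else [model]
    for spec in models:
        missing = missing_keys(spec, REQUIRED_MODEL_KEYS)
        if missing:
            raise ValueError(f"model spec is missing keys: {missing}")

    missing = missing_keys(data, REQUIRED_DATA_KEYS)
    if missing:
        raise ValueError(f"data spec is missing keys: {missing}")

    return {"task": task, "model": model, "data": data, "config": config}


# Placeholder values, assumed for illustration only.
kwargs = build_harness_kwargs(
    task="ner",
    model={"model": "en_core_web_sm", "hub": "spacy"},
    data={"data_source": "sample.conll"},
    config="config.yml",
)

# With LangTest installed and the placeholder files present, the arguments
# could then be passed on:
# from langtest import Harness
# harness = Harness(**kwargs)
```

Checking the dictionaries up front is a design choice for this sketch: a missing mandatory key is reported before any model or dataset is loaded.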