LangTest | Deliver Safe & Effective Models

Simple

Generate & run over 60 test types on the most popular NLP frameworks & tasks with 1 line of code

Comprehensive

Test all aspects of model quality - robustness, bias, fairness, representation and accuracy - before going to production

100% Open Source

The full code base is open under the Apache 2.0 license, designed for easy extension and AI community collaboration

Fully Integrated Workflow

Get Started

  
# Using PyPI
pip install langtest

In a Few Lines of Code

John Snow Labs, Hugging Face, OpenAI, spaCy

                      
John Snow Labs

!pip install langtest[johnsnowlabs]

from langtest import Harness

# Create a Harness object
h = Harness(task='ner', model={'model': 'ner.dl', 'hub':'johnsnowlabs'})

# Generate, run and get a report on your test cases
h.generate().run().report()

                      
Hugging Face

!pip install langtest[transformers]

from langtest import Harness

# Create a Harness object
h = Harness(task='ner', model={'model': 'dslim/bert-base-NER', 'hub':'huggingface'})

# Generate, run and get a report on your test cases
h.generate().run().report()

                      
OpenAI

!pip install langtest[openai]

import os

from langtest import Harness

# Set your OpenAI API key
os.environ['OPENAI_API_KEY'] = ''

# Create a Harness object
h = Harness(task="question-answering",
            model={"model": "gpt-3.5-turbo-instruct", "hub": "openai"},
            data={"data_source": "BoolQ", "split": "test-tiny"})

# Generate, run and get a report on your test cases
h.generate().run().report()

                      
spaCy

!pip install langtest[spacy]

from langtest import Harness

# Create a Harness object
h = Harness(task='ner', model={'model': 'en_core_web_sm', 'hub':'spacy'})

# Generate, run and get a report on your test cases
h.generate().run().report()

60+ Out-Of-The-Box Test Types

Robustness

"This movie was beyond horrible" -> NEGATIVE

"This mvie wsa beyond hroieble" -> NEUTRAL

Fairness

Coverage

"She's a massive fan of football" -> football: SPORT

"She's a massive fan of cricket" -> cricket: ANIMAL

Age Bias

"An old man with Parkinson's" -> Parkinson's: DISEASE

"A young man with Parkinson's" -> Parkinson's: OTHER

Origin Bias

"The company's CEO is British" -> NEUTRAL

"The company's CEO is Syrian" -> NEGATIVE

Ethnicity Bias

"Jonas Smith is flying tomorrow" -> NEUTRAL

"Abdul Karim is flying tomorrow" -> NEGATIVE

Accuracy

Gender Representation

Data Leakage
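The robustness example above, where a typo-ridden review flips from NEGATIVE to NEUTRAL, can be illustrated with a small self-contained sketch. This is not LangTest's implementation: `add_typo`, `toy_sentiment`, and `robustness_pass_rate` are hypothetical names, and the "model" is a keyword lookup chosen only so the script runs without any dependencies.

```python
import random


def add_typo(text: str, rng: random.Random) -> str:
    """Swap two adjacent interior letters of one randomly chosen word (hypothetical helper)."""
    words = text.split(" ")
    # Only words with at least 4 characters have an interior pair to swap.
    candidates = [i for i, w in enumerate(words) if len(w) >= 4]
    if not candidates:
        return text
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(1, len(w) - 2)  # first and last characters stay in place
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def toy_sentiment(text: str) -> str:
    """Stand-in model: a keyword lookup, so a typo in the keyword changes the label."""
    return "NEGATIVE" if "horrible" in text.lower() else "NEUTRAL"


def robustness_pass_rate(model, sentences, perturb, rng) -> float:
    """Fraction of sentences whose prediction is unchanged after perturbation."""
    unchanged = [model(s) == model(perturb(s, rng)) for s in sentences]
    return sum(unchanged) / len(unchanged)


sentences = ["This movie was beyond horrible", "The plot was fine"]
rate = robustness_pass_rate(toy_sentiment, sentences, add_typo, random.Random(0))
print(f"robustness pass rate: {rate:.2f}")
```

The same pattern (perturb the input, compare predictions, aggregate into a pass rate) applies to the bias examples if the perturbation swaps a name or nationality instead of letters.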

Auto-Generate Test Cases

                
h.generate().run().report()

Category	Test Type	Pass Rate	Minimum Pass Rate
Robustness	Add Typos	0.50	0.65
Bias	Ethnicity	0.85	0.75
Representation	Gender	0.80	0.75
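To make the report columns concrete, here is a minimal sketch of how a pass/fail verdict could follow from the two rates in the table above. It assumes a test type passes when its pass rate is at least the minimum pass rate; the row layout and the `with_verdict` helper are illustrative, not LangTest's internal format.

```python
# Rows copied from the report table above; the dict keys are illustrative.
report = [
    {"category": "Robustness", "test_type": "Add Typos", "pass_rate": 0.50, "minimum_pass_rate": 0.65},
    {"category": "Bias", "test_type": "Ethnicity", "pass_rate": 0.85, "minimum_pass_rate": 0.75},
    {"category": "Representation", "test_type": "Gender", "pass_rate": 0.80, "minimum_pass_rate": 0.75},
]


def with_verdict(rows):
    """Return copies of the rows with a boolean 'pass' field added.

    Assumption: a test type passes when pass_rate >= minimum_pass_rate.
    """
    return [{**row, "pass": row["pass_rate"] >= row["minimum_pass_rate"]} for row in rows]


for row in with_verdict(report):
    verdict = "PASS" if row["pass"] else "FAIL"
    print(f'{row["category"]:<15} {row["test_type"]:<10} {verdict}')
```

Under that assumption, only the Add Typos row fails, because 0.50 is below its 0.65 minimum.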

Auto-Correct Models with Data Augmentation

                
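# Generate augmented training data based on the test results
h.augment(training_data=data, save_data_path='augmented_data')
# Retrain the model on the augmented data
new_model = nlp.load('model_name').fit('augmented_data')
# Re-run the saved test suite against the retrained model
Harness.load(save_dir='testsuite', model=new_model).run()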

Before

Category	Test Type	Pass
Robustness	Add Typos
Bias	Ethnicity
Representation	Gender

After

Category	Test Type	Pass
Robustness	Add Typos
Bias	Ethnicity
Representation	Gender

Integrate Testing into CI/CD or MLOps

                
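from langtest import Harness
from metaflow import FlowSpec, step


class DataScienceWorkFlow(FlowSpec):
    @step
    def train(self): ...

    @step
    def run_tests(self):
        # Reload the saved test suite and run it against the newly trained model
        harness = Harness.load(model=self.model, save_dir='testsuite')
        self.report = harness.run().report()

    @step
    def deploy(self):
        # Deploy only when the report clears the configured threshold
        if self.report['score'] > self.threshold: ...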
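Outside a workflow framework, the same gate can be expressed as a process exit code, which most CI systems treat as the pass/fail signal for a job. The sketch below is an assumption-laden illustration: `ci_exit_code` is a hypothetical helper, and it expects each report row to be a dict with a `test_type` name and a boolean `pass` field, which may not match the exact shape of a LangTest report.

```python
def ci_exit_code(rows) -> int:
    """Return 0 when every test type passed and 1 otherwise (shell convention).

    Assumption: each row is a dict with 'test_type' and a boolean 'pass'.
    """
    failed = [row["test_type"] for row in rows if not row["pass"]]
    for name in failed:
        print(f"FAILED: {name}")
    return 1 if failed else 0


# Example rows, invented for illustration only.
example_rows = [
    {"test_type": "Add Typos", "pass": False},
    {"test_type": "Ethnicity", "pass": True},
]
code = ci_exit_code(example_rows)
print(f"exit code: {code}")
```

In a CI script, the final line would typically be `sys.exit(ci_exit_code(rows))` so that a failing test type blocks the deploy stage.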

              

Get Started Now