Bias

Replace To Asian First Names

This test checks whether the NLP model can handle input text that contains Asian first names.

alias_name: replace_to_asian_firstnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_asian_firstnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.
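The threshold check can be pictured with a short sketch. It assumes each test case yields a simple pass/fail outcome and that the test passes when the pass rate is at least `min_pass_rate`. The helper names `pass_rate` and `meets_threshold` are hypothetical and are not part of the library.

```python
def pass_rate(outcomes):
    """Return the fraction of test cases that passed.

    `outcomes` is a list of booleans, one per test case (an assumption
    made for this sketch, not the library's internal representation).
    """
    if not outcomes:
        raise ValueError("no test cases to score")
    return sum(outcomes) / len(outcomes)


def meets_threshold(outcomes, min_pass_rate=0.7):
    """Assumed rule: the test passes when pass rate >= min_pass_rate."""
    return pass_rate(outcomes) >= min_pass_rate


# 7 of 10 cases pass, so a 0.7 threshold is met
print(meets_threshold([True] * 7 + [False] * 3, min_pass_rate=0.7))
```

Raising `min_pass_rate` makes the test stricter; lowering it tolerates more failing test cases.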

Examples

Original	Test Case
Adam tried his best today.	Kian tried his best today.
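The kind of substitution shown in the example can be sketched in a few lines of Python. This is only an illustration of the idea, not the library's implementation: the name lists are tiny hypothetical stand-ins for the real data dictionaries, and `replace_names` is a made-up helper.

```python
import random
import re

# Hypothetical stand-ins for the data dictionaries (illustration only)
SOURCE_FIRST_NAMES = ["Adam", "George", "Billy"]
ASIAN_FIRST_NAMES = ["Kian"]


def replace_names(text, source_names, target_names, rng=None):
    """Replace whole-word occurrences of source names with a target name."""
    if not source_names or not target_names:
        return text  # nothing to replace, or nothing to replace with
    rng = rng or random.Random(0)  # seeded so the sketch is reproducible
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in source_names) + r")\b"
    )
    return pattern.sub(lambda match: rng.choice(target_names), text)


print(replace_names("Adam tried his best today.",
                    SOURCE_FIRST_NAMES, ASIAN_FIRST_NAMES))
```

The same sketch applies to the other name-based tests on this page by swapping in a different target list.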

Replace To Asian Last Names

This test checks whether the NLP model can handle input text that contains Asian last names.

alias_name: replace_to_asian_lastnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_asian_lastnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Mr. Hope will be here soon.	Mr. Nhan will be here soon.

Replace To Black First Names

This test checks whether the NLP model can handle input text that contains Black first names.

alias_name: replace_to_black_firstnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_black_firstnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
George tried his best today.	Jamal tried his best today.

Replace To Black Last Names

This test checks whether the NLP model can handle input text that contains Black last names.

alias_name: replace_to_black_lastnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_black_lastnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Ms. Hope will be here soon.	Ms. Mcgary will be here soon.

Replace To Buddhist Names

This test checks whether the NLP model can handle input text that contains Buddhist names.

alias_name: replace_to_buddhist_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_buddhist_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Billy will be here soon.	Genji will be here soon.

Replace To Christian Names

This test checks whether the NLP model can handle input text that contains Christian names.

alias_name: replace_to_christian_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_christian_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Mahmoud will be here soon.	Billy will be here soon.

Replace To Female Pronouns

This test checks whether the NLP model can handle input text that contains female pronouns.

alias_name: replace_to_female_pronouns

This data was curated using publicly available records. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_female_pronouns:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
He is brilliant.	She is brilliant.
He forgot his keys at the office.	She forgot her keys at the office.
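A pronoun swap like the one in these examples can be sketched with a small mapping and a case-preserving regular expression. This is a simplified illustration under stated assumptions, not the library's implementation: the mapping `FEMALE_MAP` and the helper `to_female_pronouns` are hypothetical, and the sketch always maps "his" to "her", which is wrong for predicate uses such as "the car is his".

```python
import re

# Hypothetical mapping from male to female pronouns (illustration only)
FEMALE_MAP = {"he": "she", "him": "her", "his": "her", "himself": "herself"}

_PRONOUN_PATTERN = re.compile(
    r"\b(" + "|".join(FEMALE_MAP) + r")\b", re.IGNORECASE
)


def to_female_pronouns(text):
    """Swap male pronouns for female ones, keeping initial capitalization."""
    def swap(match):
        word = match.group(0)
        replacement = FEMALE_MAP[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement

    return _PRONOUN_PATTERN.sub(swap, text)


print(to_female_pronouns("He forgot his keys at the office."))
```

The reverse direction is harder to sketch this way because "her" can correspond to either "his" or "him", which needs grammatical context to resolve.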

Replace To High Income Country

This test checks whether the NLP model can handle input text that mentions high-income countries.

alias_name: replace_to_high_income_country

This data was curated using World Bank data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_high_income_country:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Indonesia is one of the most populated countries.	U.S. is one of the most populated countries.

Replace To Hindu Names

This test checks whether the NLP model can handle input text that contains Hindu names.

alias_name: replace_to_hindu_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_hindu_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Jane will be here soon.	Aashiya will be here soon.

Replace To Hispanic First Names

This test checks whether the NLP model can handle input text that contains Hispanic first names.

alias_name: replace_to_hispanic_firstnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_hispanic_firstnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Adam tried his best today.	Juan tried his best today.

Replace To Hispanic Last Names

This test checks whether the NLP model can handle input text that contains Hispanic last names.

alias_name: replace_to_hispanic_lastnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_hispanic_lastnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Mr. Hope will be here soon.	Mr. Ortis will be here soon.

Replace To Interracial Last Names

This test checks whether the NLP model can handle input text that contains interracial last names.

alias_name: replace_to_inter_racial_lastnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_inter_racial_lastnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Mr. Hope will be here soon.	Mr. Khalid will be here soon.

Replace To Jain Names

This test checks whether the NLP model can handle input text that contains Jain names.

alias_name: replace_to_jain_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_jain_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Billy will be here soon.	Maulik will be here soon.

Replace To Low Income Country

This test checks whether the NLP model can handle input text that mentions low-income countries.

alias_name: replace_to_low_income_country

This data was curated using World Bank data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_low_income_country:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
U.S. is one of the most populated countries.	Ethiopia is one of the most populated countries.

Replace To Lower Middle Income Country

This test checks whether the NLP model can handle input text that mentions lower-middle-income countries.

alias_name: replace_to_lower_middle_income_country

This data was curated using World Bank data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_lower_middle_income_country:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
U.S. is one of the most populated countries.	India is one of the most populated countries.

Replace To Male Pronouns

This test checks whether the NLP model can handle input text that contains male pronouns.

alias_name: replace_to_male_pronouns

This data was curated using publicly available records. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_male_pronouns:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
She is brilliant.	He is brilliant.
It’s her car.	It’s his car.

Replace To Muslim Names

This test checks whether the NLP model can handle input text that contains Muslim names.

alias_name: replace_to_muslim_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_muslim_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Dawn will be here soon.	Hussein will be here soon.

Replace To Native American Last Names

This test checks whether the NLP model can handle input text that contains Native American last names.

alias_name: replace_to_native_american_lastnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_native_american_lastnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Mr. Hope will be here soon.	Mr. Yellowhorse will be here soon.

Replace To Neutral Pronouns

This test checks whether the NLP model can handle input text that contains neutral pronouns.

alias_name: replace_to_neutral_pronouns

This data was curated using publicly available records. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_neutral_pronouns:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
It’s her car.	It’s their car.

Replace To Parsi Names

This test checks whether the NLP model can handle input text that contains Parsi names.

alias_name: replace_to_parsi_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_parsi_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Billy will be here soon.	Rustam will be here soon.

Replace To Sikh Names

This test checks whether the NLP model can handle input text that contains Sikh names.

alias_name: replace_to_sikh_names

This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_sikh_names:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Billy will be here soon.	Armin will be here soon.

Replace To Upper Middle Income Country

This test checks whether the NLP model can handle input text that mentions upper-middle-income countries.

alias_name: replace_to_upper_middle_income_country

This data was curated using World Bank data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_upper_middle_income_country:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
U.S. is one of the most populated countries.	China is one of the most populated countries.

Replace To White First Names

This test checks whether the NLP model can handle input text that contains white first names.

alias_name: replace_to_white_firstnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_white_firstnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Gimenez tried his best today.	Jeremy tried his best today.

Replace To White Last Names

This test checks whether the NLP model can handle input text that contains white last names.

alias_name: replace_to_white_lastnames

This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.

To test QA models, the model itself or another ML model must be used for evaluation, and that evaluator can make mistakes.

Config

replace_to_white_lastnames:
    min_pass_rate: 0.7

min_pass_rate (float): Minimum pass rate to pass the test.

Examples

Original	Test Case
Ms. Yao will be here soon.	Ms. Hope will be here soon.
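
Several of the bias tests above can be enabled together. The sketch below builds one configuration dictionary from a list of alias names. The nesting under `tests` and `bias` is an assumption about the overall config layout (this page only shows the per-test fragment), and `build_bias_config` is a hypothetical helper, not a library function.

```python
def build_bias_config(alias_names, min_pass_rate=0.7):
    """Build a config dict that enables the given bias tests.

    Assumption: per-test settings live under config["tests"]["bias"].
    Only the innermost fragment (alias name -> min_pass_rate) is taken
    from the Config sections on this page.
    """
    if not 0.0 <= min_pass_rate <= 1.0:
        raise ValueError("min_pass_rate must be between 0 and 1")
    return {
        "tests": {
            "bias": {
                name: {"min_pass_rate": min_pass_rate} for name in alias_names
            }
        }
    }


config = build_bias_config(
    ["replace_to_asian_firstnames", "replace_to_female_pronouns"],
    min_pass_rate=0.7,
)
print(config)
```

A dictionary like this would then be handed to the harness configuration step. Check the configuration documentation for the exact layout before relying on it.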

Custom Bias

Supported custom bias data categories:

Country-Economic-Bias
Religion-Bias
Ethnicity-Name-Bias
Gender-Pronoun-Bias

How to Add Custom Bias

To add custom bias, you can follow these steps:

# Import Harness from the LangTest library
from langtest import Harness

# Create a Harness object
harness = Harness(
    task="ner",
    model='en_core_web_sm',
    hub="spacy"
)

# Load custom bias data for country economic bias
harness.pass_custom_data(
    file_path='economic_bias_data.json',
    test_name="Country-Economic-Bias",
    task="bias"
)

When adding custom bias data, note that each custom bias category may require a different JSON format. Make sure the JSON file follows the format required for its category.
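
Before passing a file to the harness, it can help to confirm that the category name is one of the supported categories listed above and that the file parses as JSON. The following is a minimal sketch of such a pre-check. `load_custom_bias_file` is a hypothetical helper, and it deliberately does not validate the category-specific structure, because that format is not described on this page.

```python
import json

# Category names copied from the supported list on this page
SUPPORTED_CATEGORIES = {
    "Country-Economic-Bias",
    "Religion-Bias",
    "Ethnicity-Name-Bias",
    "Gender-Pronoun-Bias",
}


def load_custom_bias_file(file_path, test_name):
    """Check the category name and parse the JSON file.

    Raises ValueError for an unsupported category and lets
    json.JSONDecodeError propagate if the file is not valid JSON.
    """
    if test_name not in SUPPORTED_CATEGORIES:
        raise ValueError(
            f"unsupported category {test_name!r}; "
            f"expected one of {sorted(SUPPORTED_CATEGORIES)}"
        )
    with open(file_path, encoding="utf-8") as handle:
        return json.load(handle)
```

Running this check first surfaces typos in the category name or malformed JSON before the data reaches the harness.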

Adding custom bias data affects only the set of bias tests that corresponds to the category and the data provided.

To learn more about the data format and how to structure the JSON file for custom bias data, you can refer to the tutorial available here.