Min Country Economic Representation Count
This test checks whether each country economic level is represented by at least a minimum number of samples in the data.
alias_name: min_country_economic_representation_count
This data was curated using World Bank data. To apply this test appropriately in other contexts, please adapt the data dictionaries.
Config
min_country_economic_representation_count:
  min_count:
    high_income: 50
    low_income: 50
- min_count (int): Minimum count to pass the test.
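As a rough illustration of the pass/fail rule, the sketch below compares per-category sample counts against the configured `min_count` thresholds. The function name `check_min_count` and the observed counts are hypothetical and not part of LangTest; the sketch also assumes a category passes when its count is at least the threshold.

```python
from typing import Dict


def check_min_count(observed: Dict[str, int], min_count: Dict[str, int]) -> Dict[str, bool]:
    """Return, per configured category, whether its observed count meets the threshold.

    Categories missing from `observed` are treated as having zero samples.
    Assumption: a count equal to the threshold counts as a pass.
    """
    return {
        category: observed.get(category, 0) >= threshold
        for category, threshold in min_count.items()
    }


# Hypothetical counts of samples per economic level found in a dataset.
observed_counts = {"high_income": 72, "low_income": 31}
# Thresholds taken from the config shown above.
thresholds = {"high_income": 50, "low_income": 50}

results = check_min_count(observed_counts, thresholds)
print(results)
```

With these hypothetical counts, `high_income` passes and `low_income` fails.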
Min Country Economic Representation Proportion
This test checks whether each country economic level accounts for at least a minimum proportion of the samples in the data.
alias_name: min_country_economic_representation_proportion
This data was curated using World Bank data. To apply this test appropriately in other contexts, please adapt the data dictionaries.
Config
min_country_economic_representation_proportion:
  min_proportion:
    high_income: 0.6
    low_income: 0.1
- min_proportion (float): Minimum proportion to pass the test.
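The proportion variant can be sketched in the same way: each category's count is divided by the total number of samples before it is compared with `min_proportion`. The helper `check_min_proportion`, the sample counts, and the `lower_middle_income` key are illustrative assumptions, not LangTest internals.

```python
from typing import Dict


def check_min_proportion(counts: Dict[str, int], min_proportion: Dict[str, float]) -> Dict[str, bool]:
    """Return, per configured category, whether its share of all samples meets the threshold."""
    total = sum(counts.values())
    results = {}
    for category, threshold in min_proportion.items():
        # Guard against an empty dataset to avoid division by zero.
        proportion = counts.get(category, 0) / total if total else 0.0
        results[category] = proportion >= threshold
    return results


# Hypothetical sample counts per economic level (200 samples in total).
counts = {"high_income": 130, "low_income": 15, "lower_middle_income": 55}
# Thresholds taken from the config shown above.
thresholds = {"high_income": 0.6, "low_income": 0.1}

proportion_results = check_min_proportion(counts, thresholds)
print(proportion_results)
```

Here `high_income` makes up 65% of the samples and passes, while `low_income` makes up 7.5% and fails.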
Min Ethnicity Representation Count
This test checks whether each ethnicity is represented by at least a minimum number of samples in the data.
alias_name: min_ethnicity_name_representation_count
This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.
Config
min_ethnicity_name_representation_count:
  min_count:
    white: 50
    black: 10
    asian: 40
    hispanic: 30
- min_count (int): Minimum count to pass the test.
Min Ethnicity Representation Proportion
This test checks whether each ethnicity accounts for at least a minimum proportion of the samples in the data.
alias_name: min_ethnicity_name_representation_proportion
This data was curated using 2021 US census survey data. To apply this test appropriately in other contexts, please adapt the data dictionaries.
Config
min_ethnicity_name_representation_proportion:
  min_proportion:
    white: 0.20
    black: 0.36
- min_proportion (float): Minimum proportion to pass the test.
Min Gender Representation Count
This test checks whether each gender is represented by at least a minimum number of samples in the data.
alias_name: min_gender_representation_count
The underlying gender classifier is a rule-based classifier that outputs one of three categories: male, female, and neutral.
Config
min_gender_representation_count:
  min_count:
    male: 20
    female: 30
- min_count (int): Minimum count to pass the test.
Min Gender Representation Proportion
This test checks whether each gender accounts for at least a minimum proportion of the samples in the data.
alias_name: min_gender_representation_proportion
The underlying gender classifier is a rule-based classifier that outputs one of three categories: male, female, and neutral.
Config
min_gender_representation_proportion:
  min_proportion:
    male: 0.2
    female: 0.3
- min_proportion (float): Minimum proportion to pass the test.
Min Label Representation Count
This test checks whether each label is represented by at least a minimum number of samples in the data.
alias_name: min_label_representation_count
Config
min_label_representation_count:
  min_count:
    positive: 10
    negative: 10
- min_count (int): Minimum count to pass the test.
Min Label Representation Proportion
This test checks whether each label accounts for at least a minimum proportion of the samples in the data.
alias_name: min_label_representation_proportion
Config
min_label_representation_proportion:
  min_proportion:
    O: 0.2
    LOC: 0.2
    PER: 0.2
- min_proportion (float): Minimum proportion to pass the test.
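For NER data, label proportions could be derived from token-level tags along the following lines. This is only a sketch: the helper `label_proportions` and the tagged sentences are hypothetical, and it assumes proportions are computed over all tokens with plain labels (no `B-`/`I-` prefixes), which may differ from how LangTest counts labels internally.

```python
from collections import Counter
from typing import Dict, List


def label_proportions(tag_sequences: List[List[str]]) -> Dict[str, float]:
    """Compute each label's share of all tokens across the tagged sentences."""
    counts = Counter(tag for sequence in tag_sequences for tag in sequence)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hypothetical token-level tags for two sentences (10 tokens in total).
tagged_sentences = [
    ["PER", "O", "O", "LOC"],
    ["O", "O", "PER", "O", "LOC", "O"],
]
# Thresholds taken from the config shown above.
min_proportion = {"O": 0.2, "LOC": 0.2, "PER": 0.2}

proportions = label_proportions(tagged_sentences)
passed = {
    label: proportions.get(label, 0.0) >= threshold
    for label, threshold in min_proportion.items()
}
print(proportions)
print(passed)
```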
Min Religion Name Representation Count
This test checks whether each religion is represented by at least a minimum number of samples in the data.
alias_name: min_religion_name_representation_count
This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.
Config
min_religion_name_representation_count:
  min_count:
    christian: 10
    muslim: 5
    hindu: 8
    parsi: 40
    sikh: 10
- min_count (int): Minimum count to pass the test.
Min Religion Name Representation Proportion
This test checks whether each religion accounts for at least a minimum proportion of the samples in the data.
alias_name: min_religion_name_representation_proportion
This data was curated using Kidpaw. Please adapt the data dictionaries to fit your use-case.
Config
min_religion_name_representation_proportion:
  min_proportion:
    muslim: 0.2
    hindu: 0.2
- min_proportion (float): Minimum proportion to pass the test.
Custom Representation
Supported custom representation data categories:
- Country-Economic-Representation
- Religion-Representation
- Ethnicity-Representation
- Label-Representation (only ner)
How to Add Custom Representation
To add custom representation data, follow these steps:
# Import Harness from the LangTest library
from langtest import Harness

# Create a Harness object
harness = Harness(
    task="ner",
    model="en_core_web_sm",
    hub="spacy"
)

# Load custom representation data for ethnicity representation
harness.pass_custom_data(
    file_path="ethnicity_representation_data.json",
    test_name="Ethnicity-Representation",
    task="representation"
)
When adding custom representation data, note that each category may require a different JSON format. Make sure the JSON file follows the format required for its category.
Custom representation data affects only the representation tests that correspond to the category and data provided.
To learn more about the data format and how to structure the JSON file for each category, refer to the LangTest tutorial on custom representation data.
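Because a malformed file is easier to diagnose before it reaches the harness, one option is to parse the JSON file first. The sketch below only confirms that the file is well-formed JSON; it does not check the category-specific structure, which is not described here. The helper `load_json_file` is hypothetical and not part of LangTest.

```python
import json
from pathlib import Path
from typing import Any


def load_json_file(path: str) -> Any:
    """Parse a JSON file, raising ValueError that names the file if it is malformed."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"{path} is not valid JSON: {exc}") from exc
```

A call such as `load_json_file("ethnicity_representation_data.json")` could then be placed before `harness.pass_custom_data(...)`.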