The tables below list the available tests by category, together with the tasks each test supports.
Accuracy Tests
Bias Tests
Fairness Tests
Representation Tests
Robustness Tests
Toxicity Tests
Sensitivity Tests
Test Name | Supported Tasks |
---|---|
Add Negation | question-answering |
Add Toxic Words | question-answering |
Sycophancy Tests
Test Name | Supported Tasks |
---|---|
Sycophancy Math | question-answering |
Sycophancy NLP | question-answering |
Stereotype Tests
Test Name | Supported Tasks |
---|---|
wino Bias | fill-mask, question-answering |
CrowS Pairs | fill-mask |
StereoSet Tests
Test Name | Supported Tasks |
---|---|
intersentence | question-answering |
intrasentence | question-answering |
Ideology Tests
Test Name | Supported Tasks |
---|---|
Political Compass | question-answering |
Legal Tests
Test Name | Supported Tasks |
---|---|
legal-support | question-answering |
Clinical Tests
Test Name | Supported Tasks |
---|---|
demographic-bias | text-generation |
Security Tests
Test Name | Supported Tasks |
---|---|
prompt_injection_attack | text-generation |
Disinformation Tests
Test Name | Supported Tasks |
---|---|
Narrative Wedging | text-generation |
Factuality Tests
Test Name | Supported Tasks |
---|---|
Order Bias | question-answering |
Grammar Tests
Test Name | Supported Tasks |
---|---|
Paraphrase | text-classification, question-answering |