CommonSense Scenario Benchmark Datasets

LangTest supports the following benchmark datasets for commonsense scenarios. Each entry lists the task, the test categories available for that dataset, the source paper, and a Colab notebook.

| Dataset | Task | Category | Source | Colab |
| --- | --- | --- | --- | --- |
| CommonsenseQA | question-answering | robustness, accuracy, fairness | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Open In Colab |
| HellaSwag | question-answering | robustness, accuracy, fairness | HellaSwag: Can a Machine Really Finish Your Sentence? | Open In Colab |
| OpenBookQA | question-answering | robustness, accuracy, fairness | Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering | Open In Colab |
| PIQA | question-answering | robustness | PIQA: Reasoning about Physical Commonsense in Natural Language | Open In Colab |
| SIQA | question-answering | robustness, accuracy, fairness | SocialIQA: Commonsense Reasoning about Social Interactions | Open In Colab |
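
Because PIQA is listed with robustness only, it can help to check a dataset and category pairing before configuring a run. The following is a minimal sketch that transcribes the table above into a lookup. The helper names `datasets_supporting` and `build_data_config` are hypothetical, and the `data_source` key is an assumption about how a dataset is named in LangTest's data configuration; confirm the exact keys against the LangTest documentation.

```python
# Test categories per dataset, transcribed from the table above.
SUPPORTED_CATEGORIES = {
    "CommonsenseQA": {"robustness", "accuracy", "fairness"},
    "HellaSwag": {"robustness", "accuracy", "fairness"},
    "OpenBookQA": {"robustness", "accuracy", "fairness"},
    "PIQA": {"robustness"},
    "SIQA": {"robustness", "accuracy", "fairness"},
}

TASK = "question-answering"  # every dataset in the table uses this task


def datasets_supporting(category):
    """Return the dataset names (sorted) that list the given test category."""
    return sorted(
        name for name, cats in SUPPORTED_CATEGORIES.items() if category in cats
    )


def build_data_config(dataset, category):
    """Validate a dataset/category pair and return a data config dict.

    Hypothetical helper: the "data_source" key is an assumption about
    LangTest's data configuration, not a confirmed schema.
    """
    if dataset not in SUPPORTED_CATEGORIES:
        raise KeyError(f"Unknown dataset: {dataset}")
    if category not in SUPPORTED_CATEGORIES[dataset]:
        raise ValueError(f"{dataset} is not listed with the {category} category")
    return {"data_source": dataset}


if __name__ == "__main__":
    print(datasets_supporting("fairness"))
    print(build_data_config("PIQA", "robustness"))
```

The returned dictionary is intended to be passed, together with `TASK` and a model configuration, to whatever entry point your LangTest version provides; the Colab notebooks linked in the table show complete runs.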