LangTest supports a variety of benchmark datasets in the legal domain, listed in the table below, allowing you to assess the performance of your models on legal queries.
| Dataset | Task | Category | Source | Colab |
|---|---|---|---|---|
| Contracts | question-answering | robustness, accuracy, fairness | Answer yes/no questions about whether contractual clauses discuss particular issues. | |
| Consumer-Contracts | question-answering | robustness, accuracy, fairness | Answer yes/no questions about the rights and obligations created by clauses in terms-of-service agreements. | |
| Privacy-Policy | question-answering | robustness, accuracy, fairness | Given a question and a clause from a privacy policy, determine if the clause contains enough information to answer the question. | |
| FIQA | question-answering | robustness, accuracy, fairness | FIQA (Financial Opinion Mining and Question Answering) | |
| MultiLexSum | summarization | robustness, accuracy, fairness | Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities | |
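As a minimal sketch of how one of these benchmarks might be evaluated, the snippet below assembles the configuration LangTest's `Harness` expects for the `Contracts` dataset. The model name and hub (`gpt-3.5-turbo` on `openai`) are placeholders for illustration; substitute any model/hub combination LangTest supports.

```python
# Hedged sketch: configuration for evaluating a model on the "Contracts"
# benchmark with LangTest. Model and hub below are placeholder choices.
task = "question-answering"
model = {"model": "gpt-3.5-turbo", "hub": "openai"}  # placeholder model/hub
data = {"data_source": "Contracts"}                  # a dataset name from the table above

# With langtest installed (pip install langtest) and model credentials set,
# the evaluation would then be run along these lines:
#   from langtest import Harness
#   harness = Harness(task=task, model=model, data=data)
#   harness.generate().run().report()
```

The same pattern applies to the other datasets: swap `data_source` for, e.g., `"Privacy-Policy"`, and use `task="summarization"` for MultiLexSum.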