LangTest supports a variety of benchmark datasets in the legal domain, listed in the table below, allowing you to assess the performance of your models on legal queries.
Dataset | Task | Category | Source | Colab |
---|---|---|---|---|
Contracts | question-answering | robustness, accuracy, fairness | Answer yes/no questions about whether contractual clauses discuss particular issues. | |
Consumer-Contracts | question-answering | robustness, accuracy, fairness | Answer yes/no questions on the rights and obligations created by clauses in terms of services agreements. | |
Privacy-Policy | question-answering | robustness, accuracy, fairness | Given a question and a clause from a privacy policy, determine if the clause contains enough information to answer the question. | |
FIQA | question-answering | robustness, accuracy, fairness | FIQA (Financial Opinion Mining and Question Answering) | |
MultiLexSum | summarization | robustness, accuracy, fairness | Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities | |