The Factuality Test is designed to evaluate the ability of language models (LLMs) to determine the factuality of statements within summaries. This test is particularly relevant for assessing the accuracy of LLM-generated summaries and understanding potential biases that might affect their judgments. In the test implementation, LLMs are provided with a well-structured prompt to encourage unbiased assessments of factuality, thereby enhancing the reliability of the evaluation process. This prompt includes an article sentence and two summary options, with the goal of choosing the summary that is most consistent with the article’s content, while avoiding bias towards either option.
factuality: order_bias: min_pass_rate: 0.70
- min_pass_rate (float): Minimum pass rate to pass the test.
|the abc have reported that those who receive centrelink payments made up half of radio rental’s income last year.||those who receive centrelink payments made up half of radio rental’s income last year.||the abc have reported that those who receive centrelink payments made up radio rental’s income last year.||A||B||True|