NarrativeQA

Source: The NarrativeQA Reading Comprehension Challenge

The NarrativeQA dataset is a collection of stories and question-answer pairs designed to test reading comprehension, especially on long documents. The stories are drawn from books and movie scripts across various genres. Each story comes with multiple questions and answers that require understanding its plot, characters, and events. The dataset is challenging because the questions cannot be answered by simple keyword matching or span extraction; they require inference and reasoning over the whole story.
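To make the point about keyword matching concrete, here is a minimal Python sketch. The `NarrativeExample` record layout, the `answer_is_verbatim` helper, and the toy story are all assumptions made for illustration; they are not the dataset's actual schema or content.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NarrativeExample:
    """Hypothetical record layout: one story, one question, its reference answers."""
    story: str
    question: str
    answers: List[str]


def answer_is_verbatim(example: NarrativeExample) -> bool:
    """Return True if any reference answer appears word-for-word in the story."""
    story = example.story.lower()
    return any(answer.lower() in story for answer in example.answers)


# Toy example written for illustration; it is not taken from the dataset.
example = NarrativeExample(
    story=(
        "Mara hid the letter before her brother came home. "
        "That evening he searched every drawer and found nothing."
    ),
    question="Why did the brother find nothing?",
    answers=["Because Mara had hidden the letter"],
)

print(answer_is_verbatim(example))
```

For this toy example the check returns False: the answer links two sentences of the story and is phrased differently from both, so a verbatim lookup cannot recover it.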

The available splits are listed below.

Split	Details
test	Test set from the NarrativeQA dataset, containing 3000 stories and their corresponding questions.
test-tiny	Truncated version of the NarrativeQA dataset, containing 50 stories and their corresponding questions.