13 May 2017 | Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
TriviaQA is a large-scale reading comprehension dataset built with distant supervision, containing over 650,000 question-answer-evidence triples. It includes 95,000 question-answer pairs authored by trivia enthusiasts, each paired with an average of six supporting evidence documents drawn from Wikipedia and the web. Because the questions were written naturally and independently of the evidence, the dataset features complex, compositional questions, substantial syntactic and lexical variability between questions and evidence, and frequent need for multi-sentence reasoning. This makes TriviaQA more challenging than datasets such as SQuAD: models must handle large amounts of text from varied sources and perform inference across multiple sentences.

The release contains both the noisy, automatically gathered evidence and a clean, human-verified subset, and is available for training and evaluating new reading comprehension models. Two baseline methods, a feature-based classifier and a state-of-the-art neural network, reach roughly 23% and 40% accuracy respectively, far below human performance of 79.7%, leaving a large gap for future work. An analysis of the data highlights the main sources of difficulty: multi-sentence reasoning, lexical and syntactic variability, and the presence of distractor entities. TriviaQA is a valuable resource for future research in reading comprehension and related tasks.
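The two mechanics mentioned above can be made concrete with a small sketch: the distant-supervision step keeps a retrieved document as evidence only if it contains some alias of the trivia answer, and evaluation scores a prediction as correct if it matches any accepted alias after normalization. This is a minimal illustration, not the authors' released code; the function names and the exact normalization rules (lowercasing, stripping punctuation and articles) are assumptions for the sake of the example.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def has_distant_answer(document: str, answer_aliases: list[str]) -> bool:
    """Distant-supervision filter: keep the document as evidence only if
    some alias of the answer occurs in its text."""
    doc = normalize(document)
    return any(normalize(alias) in doc for alias in answer_aliases)


def exact_match(prediction: str, answer_aliases: list[str]) -> bool:
    """A prediction is correct if it equals any accepted alias after normalization."""
    pred = normalize(prediction)
    return any(pred == normalize(alias) for alias in answer_aliases)


if __name__ == "__main__":
    aliases = ["The Beatles", "Beatles"]
    doc = "The Beatles were an English rock band formed in Liverpool in 1960."
    print(has_distant_answer(doc, aliases))    # True -> keep as evidence document
    print(exact_match("the beatles", aliases))  # True -> counts as a correct answer
```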