15 Mar 2019 | Alon Talmor*, Jonathan Herzig*, Nicholas Lourie, Jonathan Berant
The paper introduces COMMONSENSEQA, a new dataset for commonsense question answering that tests the ability of models to reason with commonsense knowledge. The dataset is built by extracting target concepts from CONCEPTNET and asking crowd workers to write multiple-choice questions that require prior knowledge to answer. The questions are designed to capture complex semantics and commonsense reasoning beyond simple associations. The authors collected 12,247 questions through this process and evaluated a range of baselines, including pre-trained and fine-tuned language models. The best baseline, a fine-tuned BERT-Large model, achieved 56% accuracy, far below human performance (89%). The paper also analyzes the dataset's unique properties and the types of commonsense skills required to answer its questions, with the goal of facilitating future research on incorporating commonsense knowledge into natural language processing systems.
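To make the question format and the multiple-choice evaluation setup concrete, below is a minimal sketch of loading one CommonsenseQA example and scoring its five answer candidates with a BERT multiple-choice head, in the spirit of the BERT-Large baseline. The Hugging Face dataset name ("commonsense_qa"), its field names, and the model checkpoint are assumptions for illustration, not details taken from the paper; reaching the reported 56% accuracy would require fine-tuning on the training split.

```python
# Minimal sketch, assuming the Hugging Face "commonsense_qa" dataset and its field
# names ("question", "choices", "answerKey"); this is not the authors' original code.
import torch
from datasets import load_dataset
from transformers import BertTokenizer, BertForMultipleChoice

dataset = load_dataset("commonsense_qa", split="validation")
example = dataset[0]

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
# Untuned weights shown here; fine-tuning on the training set is what the baseline uses.
model = BertForMultipleChoice.from_pretrained("bert-large-uncased")
model.eval()

# Each of the five candidate answers is paired with the question; the model scores
# every (question, answer) pair and the highest-scoring pair is the prediction.
question = example["question"]
choices = example["choices"]["text"]
encoding = tokenizer(
    [question] * len(choices),
    choices,
    return_tensors="pt",
    padding=True,
    truncation=True,
)
# BertForMultipleChoice expects shape (batch, num_choices, seq_len).
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, num_choices)
predicted = choices[logits.argmax(dim=-1).item()]
print(question, "->", predicted, "| gold:", example["answerKey"])
```

The same scoring loop, run over all validation questions after fine-tuning, yields the accuracy numbers the paper compares against the 89% human baseline.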