16 Apr 2019 | Dheeru Dua*, Yizhong Wang*, Pradeep Dasigi*, Gabriel Stanovsky*, Sameer Singh*, and Matt Gardner*
DROP is a new English reading comprehension benchmark designed to push the field towards more comprehensive paragraph understanding. The benchmark consists of 96,567 questions and requires systems to perform discrete reasoning over the content of paragraphs, such as addition, counting, or sorting. These operations demand a more thorough understanding of paragraph content than previous datasets required. The best existing systems achieve only 32.7% F1 on the benchmark's generalized accuracy metric, while expert human performance is 96.4%. The authors also present a new model that combines reading comprehension methods with simple numerical reasoning, achieving 47.0% F1. The dataset is constructed through crowdsourcing, with passages drawn from Wikipedia and questions created adversarially to increase difficulty. The paper includes an analysis of the dataset's properties and a discussion of related work, highlighting the challenges in building effective semantic parsers for DROP.
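To make the three kinds of discrete operation named above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the passage is invented for this example and is not taken from DROP, the helper names (`extract_yardages`, `count_events`) are hypothetical, and the regular-expression extraction stands in for the reading step that an actual system would have to learn. It does not reproduce the authors' model.

```python
import re

# Hypothetical passage written for this sketch; it is NOT from the DROP dataset.
PASSAGE = (
    "The home team scored a 12-yard touchdown in the first quarter, "
    "a 35-yard field goal in the second, and a 7-yard touchdown in the fourth."
)


def extract_yardages(text):
    """Return every integer that directly precedes '-yard' in the text."""
    return [int(n) for n in re.findall(r"(\d+)-yard", text)]


def count_events(text, keyword):
    """Count whole-word occurrences of a keyword such as 'touchdown'."""
    return len(re.findall(rf"\b{re.escape(keyword)}\b", text))


yardages = extract_yardages(PASSAGE)

# Addition: "How many total yards were covered by the scoring plays?"
total_yards = sum(yardages)

# Counting: "How many touchdowns were scored?"
touchdowns = count_events(PASSAGE, "touchdown")

# Sorting: "Which scoring play was the longest?" (take the first after sorting)
longest_first = sorted(yardages, reverse=True)

print(total_yards, touchdowns, longest_first)
```

Each question here needs several pieces of the passage to be located and then combined. That combination step is what separates this style of question from one answered by copying a single span.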