29 Mar 2019 | Siva Reddy*, Danqi Chen*, Christopher D. Manning
CoQA is a novel dataset designed to evaluate conversational question answering systems. The dataset contains 127,000 questions with answers, derived from 8,000 conversations about text passages from seven diverse domains. Each conversation turn consists of a question, an answer, and a rationale (a text span from the passage). The questions are conversational and the answers are free-form text, highlighting the challenges of coreference and pragmatic reasoning. The best-performing system achieves an F1 score of 65.4%, well behind human performance (88.8%), which leaves ample room for improvement. CoQA aims to address the limitations of existing reading comprehension datasets by incorporating conversational elements, natural answer formats, and a diverse set of domains. The dataset is available at <https://stanfordnlp.github.io/coqa>.
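
To make the turn structure and the F1 metric concrete, here is a minimal Python sketch. It assumes that F1 means token-overlap F1 between a predicted answer and a reference answer, as is common in reading comprehension evaluation; the official CoQA evaluation script may normalize text differently and compare against multiple references. The `Turn` class, the `token_f1` function, and the example passage are hypothetical illustrations, not part of the dataset's released code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversation turn (hypothetical container, not the dataset's schema)."""
    question: str   # conversational question, may rely on earlier turns
    answer: str     # free-form answer text
    rationale: str  # text span copied from the passage that supports the answer


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 with simple lowercase/whitespace tokenization (an assumption)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Illustrative passage and turn, invented for this sketch.
passage = "Once upon a time there was a little white kitten named Cotton."
turn = Turn(
    question="What color was she?",
    answer="white",
    rationale="a little white kitten named Cotton",
)

# The rationale is a span of the passage, while the answer is free-form.
assert turn.rationale in passage

exact = token_f1("white", turn.answer)             # identical answers
partial = token_f1("a white kitten", turn.answer)  # one of three tokens matches
```

With this tokenization, an exact match scores 1.0 and the partial prediction scores 0.5 (precision 1/3, recall 1). A dataset-level score would average such per-question values.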