20 Apr 2020 | Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, Iñigo Casanueva, Stefan Ultes, Osman Ramadan and Milica Gašić
The paper introduces MultiWOZ, a large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modeling. MultiWOZ consists of 10,438 dialogues across 7 domains, making it significantly larger than previous annotated task-oriented corpora. The dataset is fully labeled with dialogue belief states and actions, and the collection process is crowd-sourced without professional annotators. The paper details the data collection procedure, data structure, and analysis, and reports benchmark results for belief tracking, dialogue act generation, and response generation. These results demonstrate the usability of MultiWOZ and set a baseline for future research. The dataset and baseline models are available online.