WIZARD OF WIKIPEDIA: KNOWLEDGE-POWERED CONVERSATIONAL AGENTS


21 Feb 2019 | Emily Dinan*, Stephen Roller*, Kurt Shuster*, Angela Fan, Michael Auli, Jason Weston
This paper presents a new benchmark for open-domain dialogue systems that incorporate knowledge from Wikipedia. The authors propose a dataset of human-human conversations in which one participant (the wizard) has access to Wikipedia knowledge to inform their responses. The dataset is built using crowd-sourced workers, with 1,365 diverse topics linked to Wikipedia articles. Each topic is associated with a set of knowledge sentences from Wikipedia, which the wizard can use to craft responses. The dataset includes 22,311 dialogues with 201,999 turns, divided into training, validation, and test sets.

The authors design two types of models: retrieval models that select responses from a set of candidate utterances, and generative models that produce responses word-by-word. Both are based on Transformer architectures, which are known for their effectiveness in sequence tasks. The retrieval models use a knowledge retrieval system to find relevant Wikipedia sentences, while the generative models use a knowledge attention mechanism to focus on the most relevant information.

The authors evaluate their models with both automatic metrics and human evaluations. The results show that the models can conduct knowledgeable conversations with humans, outperforming baselines such as standard Memory Networks and Transformers. The new benchmark, publicly available in ParlAI, aims to encourage further research in this area of dialogue systems. The paper also discusses related work, including previous efforts in knowledge-based dialogue systems and question-answering tasks. The authors conclude that their work represents a significant step forward in the development of knowledge-powered conversational agents.
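To make the knowledge attention idea concrete, below is a minimal PyTorch sketch of scoring candidate Wikipedia sentences against the dialogue context with dot-product attention, in the spirit of the Transformer Memory Network described above. This is not the authors' code: the class name `KnowledgeAttention`, the mean-pooling choice, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of Transformer-based knowledge selection (assumptions:
# mean-pooled sentence encodings, dot-product attention, toy dimensions).
import torch
import torch.nn as nn

class KnowledgeAttention(nn.Module):
    """Scores candidate knowledge sentences against the dialogue context
    using dot-product attention over Transformer-encoded vectors."""

    def __init__(self, vocab_size: int, d_model: int = 256, nhead: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def encode(self, tokens: torch.Tensor) -> torch.Tensor:
        # Encode token sequences and mean-pool each into a single vector.
        return self.encoder(self.embed(tokens)).mean(dim=1)

    def forward(self, context: torch.Tensor, knowledge: torch.Tensor):
        # context:   (1, Lc) token ids for the dialogue history
        # knowledge: (K, Lk) token ids, one row per candidate sentence
        ctx_vec = self.encode(context)             # (1, d_model)
        know_vecs = self.encode(knowledge)         # (K, d_model)
        scores = know_vecs @ ctx_vec.squeeze(0)    # (K,) dot products
        weights = torch.softmax(scores, dim=0)
        # Hard selection (argmax) suits a two-stage pipeline; the soft
        # `weights` could instead condition a generative decoder end-to-end.
        return torch.argmax(weights), weights

# Toy usage with random token ids and untrained weights.
model = KnowledgeAttention(vocab_size=1000)
context = torch.randint(0, 1000, (1, 12))
knowledge = torch.randint(0, 1000, (5, 20))
best, weights = model(context, knowledge)
print("selected knowledge sentence index:", best.item())
```

In a retrieval model, the selected sentence would be concatenated with the dialogue context to score candidate responses; in a generative model, the decoder would attend over it while producing the response word-by-word.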