Personalizing Dialogue Agents: I have a dog, do you have pets too?

25 Sep 2018 | Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, Jason Weston
This paper introduces PERSONA-CHAT, a dialogue dataset of 162,064 utterances over 10,907 dialogs. It was collected by randomly pairing crowdworkers and asking each to chat naturally while playing the part of a provided persona. The dataset was created to support research on improving chit-chat dialogue agents by endowing them with a configurable, persistent persona. The personas were themselves written by crowdworkers, and each consists of multiple sentences of textual description.
The paper presents several models for next-utterance prediction, including ranking models and generative models. Ranking models produce a next utterance by treating any utterance in the training set as a possible candidate reply, while generative models generate novel sentences by conditioning on the dialogue history. Models that have access to their own personas are scored as more consistent by annotators, although not more engaging. On the other hand, models trained on PERSONA-CHAT are more engaging than models trained on dialogue from other resources.
The paper also presents experiments on profile prediction, showing that a human speaker's profile can be predicted from their dialogue utterances with high accuracy, and that prediction accuracy improves as the dialogue length increases. The paper concludes that PERSONA-CHAT is a useful resource for training components of future dialogue systems, because each participant in its dialogues plays the part of an assigned persona. The dataset also includes paraphrased versions of the personas, which cannot be trivially matched, making it a valuable resource for training agents that can ask questions about users' profiles, remember the answers, and use them naturally in conversation.
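To make the ranking idea in the summary concrete, here is a minimal sketch of a persona-conditioned ranker. It is not one of the paper's models: the bag-of-words cosine scoring, the function names (`bag_of_words`, `cosine`, `rank_candidates`), and the toy persona and candidate sentences are all assumptions chosen for illustration. It only shows the general shape of the approach, in which a query built from the persona and dialogue history is scored against candidate replies that stand in for training-set utterances.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(persona, history, candidates):
    """Return candidates sorted from best to worst match.

    The query is the persona sentences plus the dialogue history
    (both lists of strings). The candidates are a hypothetical
    stand-in for utterances drawn from a training set.
    """
    query = bag_of_words(" ".join(persona + history))
    scored = [(cosine(query, bag_of_words(c)), c) for c in candidates]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in ranked]


if __name__ == "__main__":
    # Toy data, invented for this example (not taken from the dataset).
    persona = ["i have a dog", "i like to hike"]
    history = ["do you have any pets ?"]
    candidates = [
        "i work in a bank",
        "yes , i have a dog named rex",
        "the weather is nice today",
    ]
    print(rank_candidates(persona, history, candidates)[0])
    # prints: yes , i have a dog named rex
```

Dropping `persona` from the query (passing an empty list) gives a rough feel for why access to a persona can change which reply is selected, although a learned model would score candidates very differently from raw word overlap.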