Deep Reinforcement Learning for Dialogue Generation


November 1-5, 2016 | Jiwei Li¹, Will Monroe¹, Alan Ritter², Michel Galley³, Jianfeng Gao³ and Dan Jurafsky¹
This paper presents a deep reinforcement learning (RL) approach to dialogue generation that aims to improve the long-term success of conversational agents. Traditional neural models for dialogue generation, such as sequence-to-sequence (SEQ2SEQ) models, often fail to produce engaging and diverse responses because they maximize the probability of the next utterance without considering the long-term impact of their choices. To address this, the authors integrate RL into dialogue generation to model future rewards and encourage more interactive, sustained conversations.

The proposed model simulates dialogues between two virtual agents and uses policy gradient methods to optimize three conversational properties: ease of answering, informativity, and coherence. Built on an encoder-decoder architecture, it defines simple heuristic approximations of rewards that characterize a good conversation: a response should be easy to answer, contribute new information relative to the agent's previous turns (semantic information flow), and remain semantically coherent with the dialogue history. Training with policy gradients lets the model optimize these long-term rewards rather than only the immediate likelihood of the next utterance.
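As a rough illustration of how such heuristic rewards might be computed, the sketch below scores a candidate response against the three criteria described above. The functions seq2seq_logprob, backward_logprob, and encode stand in for forward and backward sequence-to-sequence scorers and a sentence encoder; these names, the dull-response list, and the default weights are illustrative assumptions based on the paper's description, not the authors' code.

```python
import numpy as np

# Illustrative set of "dull" generic replies; the paper uses a small manually
# curated list of such responses.
DULL_RESPONSES = [
    "i don't know what you are talking about.",
    "i have no idea.",
    "i don't know.",
]

def ease_of_answering(response, seq2seq_logprob):
    """Penalize responses that are likely to be answered with a dull reply.

    seq2seq_logprob(target, source) is assumed to return the length-normalized
    log-probability of generating `target` given `source` under a pretrained
    SEQ2SEQ model.
    """
    scores = [seq2seq_logprob(dull, response) for dull in DULL_RESPONSES]
    # Higher reward when dull continuations are unlikely.
    return -float(np.mean(scores))

def information_flow(prev_turn, new_turn, encode):
    """Reward new information: consecutive turns by the same agent should not
    be semantically identical. `encode` maps a sentence to a dense vector."""
    h_prev, h_new = encode(prev_turn), encode(new_turn)
    cos = float(np.dot(h_prev, h_new) /
                (np.linalg.norm(h_prev) * np.linalg.norm(h_new) + 1e-8))
    # Clip to avoid taking the log of a non-positive similarity.
    return -np.log(max(cos, 1e-8))

def semantic_coherence(history, response, seq2seq_logprob, backward_logprob):
    """Mutual-information-style score: the response should be predictable from
    the history, and the history predictable from the response."""
    return seq2seq_logprob(response, history) + backward_logprob(history, response)

def total_reward(history, prev_turn, response,
                 seq2seq_logprob, backward_logprob, encode,
                 weights=(0.25, 0.25, 0.5)):
    """Weighted combination of the three heuristic rewards; treat the weights
    as a tunable assumption."""
    r1 = ease_of_answering(response, seq2seq_logprob)
    r2 = information_flow(prev_turn, response, encode)
    r3 = semantic_coherence(history, response, seq2seq_logprob, backward_logprob)
    return weights[0] * r1 + weights[1] * r2 + weights[2] * r3
```

During training, a scalar reward of this kind is fed into a REINFORCE-style policy gradient update, so the expected future reward of a whole simulated dialogue, rather than the next-utterance likelihood, drives learning.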
The model is evaluated on three metrics: dialogue length, diversity, and human judgment. Results show that the RL model generates more interactive responses and sustains conversations longer than standard SEQ2SEQ models, and that it outperforms a mutual information-based baseline in both the diversity and the quality of its responses, producing higher-quality multi-turn dialogues in human evaluation. The paper also discusses the challenges of applying RL to dialogue generation, including the difficulty of defining appropriate reward functions and the computational cost of exploring a large action space; to make training tractable, the authors use a curriculum learning strategy that gradually increases the complexity of the dialogue simulations. Overall, the results demonstrate that the proposed RL model produces more engaging and sustained conversations, making it a promising approach for future dialogue systems.
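The automatic metrics need very little machinery. The sketch below computes a distinct-n style diversity score (unique n-grams divided by total generated tokens) and a simulated-dialogue length that stops when an agent emits a dull reply or starts repeating itself; the overlap threshold and dull-response set are illustrative assumptions rather than the paper's exact settings.

```python
def distinct_n(responses, n):
    """Diversity metric: number of unique n-grams divided by the total number
    of generated tokens across all responses."""
    ngrams, total_tokens = set(), 0
    for resp in responses:
        tokens = resp.split()
        total_tokens += len(tokens)
        ngrams.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(ngrams) / max(total_tokens, 1)

DULL_RESPONSES = {"i don't know.", "i have no idea."}

def dialogue_length(turns, overlap_threshold=0.8):
    """Count simulated turns until one agent produces a dull response or two
    consecutive turns by the same agent overlap heavily (word-overlap ratio)."""
    for i, turn in enumerate(turns):
        if turn.strip().lower() in DULL_RESPONSES:
            return i + 1
        if i >= 2:  # compare with the same agent's previous turn
            prev, cur = set(turns[i - 2].split()), set(turn.split())
            if cur and len(prev & cur) / len(cur) > overlap_threshold:
                return i + 1
    return len(turns)

# Example usage on a toy simulated dialogue:
if __name__ == "__main__":
    dialogue = [
        "how old are you?",
        "i'm 16, why are you asking?",
        "i thought you were older.",
        "i don't know.",
    ]
    print("distinct-1:", distinct_n(dialogue, 1))
    print("distinct-2:", distinct_n(dialogue, 2))
    print("length:", dialogue_length(dialogue))
```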