A Diversity-Promoting Objective Function for Neural Conversation Models


10 Jun 2016 | Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, Bill Dolan
This paper addresses the issue of neural conversation models generating safe and commonplace responses, such as "I don't know," regardless of the input. The traditional objective function, which maximizes the likelihood of the output given the input, is shown to be ill-suited to response generation. Instead, the authors propose Maximum Mutual Information (MMI) as the objective function. Experimental results on two conversational datasets (Twitter and OpenSubtitles) demonstrate that MMI models produce more diverse, interesting, and appropriate responses, yielding significant improvements in BLEU scores and human evaluations. The paper also discusses practical strategies for implementing MMI in sequence-to-sequence models and highlights the benefits of MMI over the standard likelihood objective.
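
For reference, the contrast between the objectives can be sketched as follows, where S is the source message, T the candidate response, and λ a weighting hyperparameter; the exact parameterization here follows the commonly cited form of the paper and may differ in minor details from the authors' final implementation.

Standard likelihood decoding selects the most probable response given the input:

    \hat{T} = \arg\max_{T} \, \log p(T \mid S)

The MMI anti-language-model variant (MMI-antiLM) subtracts a scaled language-model term, penalizing responses that are generically probable regardless of the input:

    \hat{T} = \arg\max_{T} \, \{ \log p(T \mid S) - \lambda \log p(T) \}

The bidirectional variant (MMI-bidi), typically applied by reranking an n-best list from the forward model, also rewards responses from which the source is recoverable:

    \hat{T} = \arg\max_{T} \, \{ (1 - \lambda) \log p(T \mid S) + \lambda \log p(S \mid T) \}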