24 Sep 2017 | Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter and Dan Jurafsky
This paper proposes using adversarial training for open-domain dialogue generation, inspired by the Turing test: the system is trained to produce sequences that are indistinguishable from human-generated dialogues. The task is cast as a reinforcement learning problem in which a generative model and a discriminator are jointly trained. The generative model produces response sequences, while the discriminator distinguishes between human- and machine-generated dialogues. The discriminator's outputs serve as rewards for the generative model, pushing it to generate more human-like dialogues. Additionally, the paper introduces adversarial evaluation, which uses success in fooling the adversary as a dialogue evaluation metric. Experimental results demonstrate that the adversarially trained system generates higher-quality responses than previous baselines. The paper also discusses potential pitfalls of adversarial evaluation and strategies to avoid them.
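To make the training scheme concrete, below is a minimal sketch of the adversarial reinforcement-learning loop the abstract describes: a generator samples a response, a discriminator scores (history, response) pairs as human vs. machine, and the discriminator's probability of "human" is used as a REINFORCE reward for the generator. This is an illustrative toy in PyTorch, not the authors' implementation; the GRU architectures, dimensions, and the `adversarial_step` helper are assumptions, and refinements from the paper such as per-step rewards for partial sequences are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, MAX_LEN = 1000, 32, 64, 10  # toy sizes, not the paper's

class Generator(nn.Module):
    """Encodes a dialogue history and samples a response token by token."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.enc = nn.GRU(EMB, HID, batch_first=True)
        self.dec = nn.GRUCell(EMB, HID)
        self.out = nn.Linear(HID, VOCAB)

    def sample(self, history):
        # history: (batch, src_len) token ids
        _, h = self.enc(self.emb(history))
        h = h.squeeze(0)
        tok = torch.zeros(history.size(0), dtype=torch.long)  # <bos> = id 0
        tokens, logps = [], []
        for _ in range(MAX_LEN):
            h = self.dec(self.emb(tok), h)
            dist = torch.distributions.Categorical(logits=self.out(h))
            tok = dist.sample()
            tokens.append(tok)
            logps.append(dist.log_prob(tok))
        return torch.stack(tokens, 1), torch.stack(logps, 1)

class Discriminator(nn.Module):
    """Scores a (history, response) pair: probability the response is human."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.cls = nn.Linear(HID, 1)

    def forward(self, history, response):
        pair = torch.cat([history, response], dim=1)
        _, h = self.rnn(self.emb(pair))
        return torch.sigmoid(self.cls(h.squeeze(0))).squeeze(-1)

gen, disc = Generator(), Discriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)

def adversarial_step(history, human_response):
    # 1) Discriminator update: human pairs labeled 1, machine pairs labeled 0.
    fake, _ = gen.sample(history)
    d_loss = F.binary_cross_entropy(disc(history, human_response),
                                    torch.ones(history.size(0)))
    d_loss = d_loss + F.binary_cross_entropy(disc(history, fake),
                                             torch.zeros(history.size(0)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Generator update via REINFORCE: the discriminator's probability of
    #    "human" on the sampled response is the reward.
    fake, logps = gen.sample(history)
    reward = disc(history, fake).detach()          # no gradient through D
    g_loss = -(logps.sum(dim=1) * reward).mean()   # policy-gradient loss
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

# Toy usage: random token ids stand in for a real dialogue corpus.
hist = torch.randint(1, VOCAB, (4, 8))
resp = torch.randint(1, VOCAB, (4, MAX_LEN))
print(adversarial_step(hist, resp))

One design point worth noting: the reward is detached from the discriminator so that generator gradients flow only through the sampled log-probabilities, which is what makes this a policy-gradient update rather than an end-to-end differentiable GAN over discrete tokens.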