6 May 2016 | Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba
The paper addresses exposure bias and the lack of sequence-level optimization in text generation models: they are trained to predict the next word given the previous ground-truth words and some context, yet at test time they must generate the entire sequence from scratch. To address this, the authors propose a training algorithm called Mixed Incremental Cross-Entropy Reinforce (MIXER), which directly optimizes the metrics used at test time, such as BLEU or ROUGE. MIXER reduces exposure bias by feeding the model its own predictions during training and combines REINFORCE with the cross-entropy loss. The method is evaluated on three tasks: text summarization, machine translation, and image captioning, where it outperforms several strong baselines, including greedy generation and beam search. MIXER is also significantly faster than beam search, making it a competitive and efficient approach for text generation.
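To make the mixed objective concrete, here is a minimal sketch of the MIXER-style training schedule: cross-entropy with teacher forcing on the first part of each sequence, REINFORCE on the model's own samples for the rest, with the boundary annealed toward the start of the sequence over training. The toy decoder, the `xent_steps` schedule, and the token-overlap `reward_fn` (standing in for BLEU/ROUGE) are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of a MIXER-style loss, assuming a toy GRU decoder and a stand-in
# sequence-level reward. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, SEQ_LEN = 100, 64, 10

class ToyDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRUCell(HIDDEN, HIDDEN)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def step(self, token, h):
        h = self.rnn(self.embed(token), h)
        return self.out(h), h          # logits over vocab, new hidden state

def reward_fn(sampled, target):
    # Stand-in sequence-level reward: fraction of positions matching the
    # reference (the paper uses BLEU or ROUGE here).
    return (sampled == target).float().mean(dim=1)

def mixer_loss(model, target, xent_steps, baseline=0.5):
    """Cross-entropy on the first `xent_steps` tokens, REINFORCE on the rest."""
    batch = target.size(0)
    h = torch.zeros(batch, HIDDEN)
    token = torch.zeros(batch, dtype=torch.long)   # assumes <bos> has id 0
    xent, logps, sampled = 0.0, [], []
    for t in range(SEQ_LEN):
        logits, h = model.step(token, h)
        if t < xent_steps:
            # Teacher forcing + cross-entropy on ground-truth tokens.
            xent = xent + F.cross_entropy(logits, target[:, t])
            token = target[:, t]
        else:
            # Sample from the model itself; keep log-probs for REINFORCE.
            dist = torch.distributions.Categorical(logits=logits)
            token = dist.sample()
            logps.append(dist.log_prob(token))
            sampled.append(token)
    if sampled:
        sampled = torch.stack(sampled, dim=1)
        reward = reward_fn(sampled, target[:, xent_steps:])
        # REINFORCE: increase log-probs of samples whose reward beats the baseline.
        reinforce = -((reward - baseline) * torch.stack(logps, 0).sum(0)).mean()
    else:
        reinforce = 0.0
    return xent + reinforce

# Annealing schedule: start with pure cross-entropy, then hand an increasing
# suffix of the sequence over to REINFORCE as training progresses.
model = ToyDecoder()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
target = torch.randint(0, VOCAB, (4, SEQ_LEN))
for xent_steps in (SEQ_LEN, SEQ_LEN - 2, SEQ_LEN - 4, SEQ_LEN - 6):
    loss = mixer_loss(model, target, xent_steps)
    opt.zero_grad(); loss.backward(); opt.step()
```

The key design point the sketch tries to show is the curriculum: the model first learns under full supervision, then is gradually exposed to its own predictions while being scored by a sequence-level metric, which is what lets MIXER optimize test-time quality directly while avoiding the instability of training with REINFORCE from scratch.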