THE CURIOUS CASE OF NEURAL TEXT DEGENERATION

14 Feb 2020 | Ari Holtzman†‡, Jan Buys§†, Li Du†, Maxwell Forbes†‡, Yejin Choi†‡
The paper examines neural text degeneration: the tendency of maximization-based decoding strategies such as beam search to produce repetitive, incoherent text even when the underlying language model is strong. As a solution, the authors propose Nucleus Sampling, which samples only from the smallest set of most probable tokens whose cumulative probability exceeds a threshold, a dynamic "nucleus" that truncates the unreliable tail of the distribution. They compare decoding methods, including beam search, top-k sampling, and temperature-scaled sampling, against human-written text along dimensions such as likelihood, diversity, and repetition. Beam search and top-k sampling tend to produce degenerate text, whereas Nucleus Sampling yields text that is both high-quality and diverse. The paper also discusses the limitations of current decoding strategies and stresses the need to balance likelihood and diversity in open-ended generation. The authors conclude that Nucleus Sampling best captures the region of the distribution in which the model is confident, avoiding the repetition and incoherence seen with other methods, and is therefore the most effective decoding strategy for generating long-form text.
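To make the nucleus idea concrete, below is a minimal sketch of top-p (nucleus) sampling over a single next-token distribution. It assumes the distribution is given as a NumPy array of probabilities; the function name `nucleus_sample`, the threshold value, and the toy vocabulary are illustrative assumptions, not code from the paper or any particular library.

```python
import numpy as np

def nucleus_sample(probs, p=0.95, rng=None):
    """Sample a token index from the smallest set of tokens whose
    cumulative probability mass reaches p (the 'nucleus')."""
    rng = rng or np.random.default_rng()
    # Sort token probabilities in descending order.
    order = np.argsort(probs)[::-1]
    sorted_probs = probs[order]
    # Find the smallest prefix whose cumulative mass reaches p.
    cumulative = np.cumsum(sorted_probs)
    cutoff = np.searchsorted(cumulative, p) + 1
    nucleus = order[:cutoff]
    # Renormalize within the nucleus and sample from it.
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=nucleus_probs)

# Example: a toy distribution over a 5-token vocabulary.
probs = np.array([0.5, 0.3, 0.1, 0.07, 0.03])
token = nucleus_sample(probs, p=0.9)  # samples only among the first three tokens
```

In this sketch the threshold p controls how much of the tail is discarded: a small p keeps only the few highest-probability tokens (behaving closer to greedy decoding), while p near 1 approaches sampling from the full distribution.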