Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

23 Sep 2015 | Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer
This paper proposes a curriculum learning approach called Scheduled Sampling to improve the performance of recurrent neural networks (RNNs) on sequence prediction tasks such as machine translation and image captioning. During training the model is conditioned on the true previous token, but during inference it must condition on the token it generated itself; this discrepancy lets early mistakes accumulate along the generated sequence. Scheduled Sampling addresses the mismatch by gradually shifting training from the true previous token to the model's own prediction, so the model learns to correct its own mistakes.

Concretely, at each time step a coin flip decides whether to feed the true previous token or one sampled from the model, and the probability of using the true token decays over the course of training according to a schedule such as linear, exponential, or inverse sigmoid decay. The model is still trained to maximize the likelihood of the target sequence given the input, and at inference it generates tokens one at a time until it produces a special end-of-sequence token. Experiments on image captioning, constituency parsing, and speech recognition show that Scheduled Sampling improves performance.
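The per-time-step coin flip and the decay schedules are simple to implement. Below is a minimal Python sketch of the three decay options and the sampling decision; the function names and constants (`k_exp`, `k_sig`, `c`, the floor `eps`) are illustrative assumptions, not values from the paper.

```python
import math
import random

def true_token_probability(step, schedule="inverse_sigmoid",
                           k_exp=0.9999, k_sig=1000.0, c=1e-5, eps=0.02):
    """Return eps_i, the probability of feeding the ground-truth previous token
    at training step i. Constants here are illustrative, not from the paper."""
    if schedule == "linear":
        # linear decay: eps_i = max(eps, 1 - c * i)
        return max(eps, 1.0 - c * step)
    if schedule == "exponential":
        # exponential decay: eps_i = k^i with k < 1
        return k_exp ** step
    if schedule == "inverse_sigmoid":
        # inverse sigmoid decay: eps_i = k / (k + exp(i / k)) with k >= 1
        z = min(step / k_sig, 700.0)  # clamp to avoid math.exp overflow
        return k_sig / (k_sig + math.exp(z))
    raise ValueError(f"unknown schedule: {schedule}")

def pick_previous_token(true_prev, model_prev, step, schedule="inverse_sigmoid"):
    """Per-time-step coin flip: with probability eps_i feed the ground-truth token,
    otherwise feed the model's own previous prediction."""
    eps_i = true_token_probability(step, schedule)
    return true_prev if random.random() < eps_i else model_prev
```

Early in training eps_i stays close to 1 (essentially pure teacher forcing); as training progresses the model is increasingly fed its own predictions, which matches the regime it will face at inference.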
In image captioning, the model outperformed the baseline on multiple metrics, and in speech recognition it improved decoding frame error rates. The approach was also used in the 2015 MSCOCO image captioning challenge, where the model ranked first on the final leaderboard. By gradually exposing the model during training to the conditions it will face at inference, Scheduled Sampling yields more robust models that can recover from their own errors, which is particularly valuable in tasks where the model must generate token sequences, such as language modeling and image captioning. The paper concludes that Scheduled Sampling is a promising approach for improving RNN sequence prediction.
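At inference time the model is conditioned entirely on its own previous predictions. A minimal greedy decoding sketch of that loop, assuming a hypothetical `model.step(prev_token, state)` interface that returns a score per vocabulary entry and an updated state:

```python
def greedy_decode(model, start_token, eos_token, max_len=50):
    """Generate tokens one at a time, feeding back the model's own previous
    prediction, until the end-of-sequence token (or max_len) is reached.
    `model.step` is a hypothetical interface, not from the paper."""
    tokens = []
    prev, state = start_token, None
    for _ in range(max_len):
        scores, state = model.step(prev, state)  # scores over the vocabulary
        prev = max(range(len(scores)), key=scores.__getitem__)  # greedy argmax
        if prev == eos_token:
            break
        tokens.append(prev)
    return tokens
```

This is the regime that Scheduled Sampling prepares the model for: every token after the first is conditioned on a possibly imperfect prediction rather than on ground truth.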