Neural Machine Translation by Jointly Learning to Align and Translate

This paper introduces an approach to neural machine translation that extends the basic encoder-decoder architecture by letting the model automatically (soft-)search for the parts of the source sentence that are relevant to each target word it generates. Unlike conventional encoder-decoder models, which compress the entire source sentence into a single fixed-length vector, the proposed model encodes the source sentence into a sequence of vectors and adaptively selects a subset of these vectors during decoding. This approach is shown to achieve significantly better translation performance than the conventional encoder-decoder model, especially on longer sentences.
The proposed model, named RNNsearch, is evaluated on English-to-French translation using the WMT '14 dataset and achieves translation performance comparable to or better than that of the state-of-the-art phrase-based system. Qualitative analysis shows that the model finds linguistically plausible (soft-)alignments between source and target sentences, which is consistent with its ability to handle long sentences and to capture complex relationships between words.
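
The (soft-)search described above can be illustrated with a minimal NumPy sketch. It assumes an additive scoring function of the form v_a^T tanh(W_a s + U_a h_j), which is the form usually associated with this model but is not stated in this summary. The function name `additive_attention`, the parameter names, the dimensions, and the random weights are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def additive_attention(s_prev, H, W_a, U_a, v_a):
    """One soft-search step over the encoded source sentence.

    s_prev : (n,)    previous decoder state (assumed shape)
    H      : (T, m)  one encoder vector per source position
    W_a    : (d, n)  hypothetical projection of the decoder state
    U_a    : (d, m)  hypothetical projection of the encoder vectors
    v_a    : (d,)    hypothetical scoring vector
    """
    # Score every source position against the current decoder state.
    scores = np.tanh(s_prev @ W_a.T + H @ U_a.T) @ v_a  # shape (T,)
    # Normalize the scores into soft alignment weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted average of the encoder vectors.
    context = weights @ H  # shape (m,)
    return context, weights


# Toy usage with random values (illustrative sizes only).
rng = np.random.default_rng(0)
T, m, n, d = 5, 8, 6, 4
H = rng.normal(size=(T, m))
s_prev = rng.normal(size=n)
W_a = rng.normal(size=(d, n))
U_a = rng.normal(size=(d, m))
v_a = rng.normal(size=d)

context, weights = additive_attention(s_prev, H, W_a, U_a, v_a)
print(weights.shape, context.shape)
```

Because the weights are recomputed at every decoding step, the decoder reads a different mixture of source vectors for each target word instead of relying on one fixed-length summary, which is the property the summary credits for the gains on long sentences.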

NEURAL MACHINE TRANSLATION BY JOINTLY LEARNING TO ALIGN AND TRANSLATE

19 May 2016 | Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio