19 May 2016 | Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio
This paper introduces a novel approach to neural machine translation that improves on existing encoder-decoder models by letting the model automatically (soft-)search for the parts of the source sentence relevant to each target word during translation. Traditional encoder-decoder models encode the entire source sentence into a single fixed-length vector. The proposed model instead encodes the source sentence into a sequence of vectors and adaptively selects a subset of these vectors during decoding. This approach achieves significantly better translation performance than the conventional encoder-decoder model, especially for longer sentences.
The proposed model, named RNNsearch, is evaluated on the English-to-French translation task using the WMT '14 dataset and achieves performance comparable to or better than the state-of-the-art phrase-based system. Qualitative analysis shows that the model finds linguistically plausible (soft-)alignments between source and target sentences, which supports its effectiveness in handling long sentences and capturing complex relationships between words.
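
The (soft-)search over source vectors described above can be illustrated for a single decoding step. The following is a minimal sketch, not the authors' implementation: it assumes an additive scoring function of the form score_j = v · tanh(W_dec s + W_enc h_j), and the names `additive_attention`, `W_dec`, `W_enc`, `v`, and the toy values are all hypothetical and chosen only for illustration. The encoder vectors are weighted by a softmax over their scores and summed into a context vector, so the "selection" is soft rather than a hard choice.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def additive_attention(decoder_state, annotations, W_dec, W_enc, v):
    """Return (weights, context) for one decoding step.

    Assumed scoring form (an illustration, not taken from the summary):
        score_j = v . tanh(W_dec @ decoder_state + W_enc @ annotations[j])
    """
    projected_state = matvec(W_dec, decoder_state)
    scores = []
    for h in annotations:
        projected_h = matvec(W_enc, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Soft alignment: one non-negative weight per source position, summing to 1.
    weights = softmax(scores)

    # Context vector: weighted sum of the source vectors.
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations)) for k in range(dim)]
    return weights, context


# Toy inputs (hypothetical): three source positions, two-dimensional vectors.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
W_dec = [[1.0, 0.0], [0.0, 1.0]]
W_enc = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = additive_attention(decoder_state, annotations, W_dec, W_enc, v)
print(weights)
print(context)
```

Because the weights are recomputed at every decoding step, the context vector can draw on different source positions for each target word, which is why the source sentence does not have to be compressed into one fixed-length vector.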