3 Jun 2016 | Rico Sennrich and Barry Haddow and Alexandra Birch
This paper investigates the use of monolingual data in training Neural Machine Translation (NMT) models, aiming to improve fluency and performance. Unlike previous work that combines NMT with separately trained language models, the authors explore strategies to incorporate monolingual data without altering the neural network architecture. They propose two methods: pairing monolingual target sentences with dummy source sentences, and pairing them with synthetic source sentences obtained through back-translation. The results show significant improvements on the WMT 15 English→German task (+2.8–3.7 BLEU) and the IWSLT 14 Turkish→English task (+2.1–3.4 BLEU), achieving new state-of-the-art results. The authors also demonstrate that fine-tuning on in-domain monolingual and parallel data enhances performance on the IWSLT 15 English→German task. The paper highlights the effectiveness of synthetic parallel data for domain adaptation and the benefits of monolingual data in reducing overfitting and improving fluency.
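As a rough illustration of the two strategies summarized above, the sketch below shows how target-side monolingual data might be turned into additional training pairs, either with a dummy source sentence or with a back-translated one. This is a minimal sketch, not the authors' implementation: the `back_translate` helper and the `<null>` dummy token are illustrative assumptions standing in for a trained target→source model and whatever placeholder the training pipeline actually uses.

```python
# Sketch: building extra NMT training pairs from target-side monolingual data.
# Assumptions (not from the paper's code): `back_translate` stands in for any
# trained target->source translation model; "<null>" is an illustrative dummy token.

from typing import Callable, List, Tuple


def back_translate(target_sentence: str) -> str:
    """Placeholder for a target->source NMT model (hypothetical)."""
    # In practice this would decode the sentence with a reverse-direction model.
    return "<back-translated: " + target_sentence + ">"


def dummy_source_pairs(mono_target: List[str]) -> List[Tuple[str, str]]:
    """Pair each monolingual target sentence with a dummy source sentence."""
    return [("<null>", t) for t in mono_target]


def synthetic_source_pairs(mono_target: List[str],
                           reverse_model: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Pair each monolingual target sentence with a back-translated source."""
    return [(reverse_model(t), t) for t in mono_target]


if __name__ == "__main__":
    # Toy English->German setup: German is the target side, so monolingual
    # German sentences are back-translated into (synthetic) English sources.
    monolingual_german = ["Das ist ein Test.", "Maschinelle Übersetzung ist nützlich."]
    parallel = [("This is an example.", "Das ist ein Beispiel.")]

    # Mix genuine parallel data with synthetic pairs and train the NMT model as usual.
    training_data = (parallel
                     + dummy_source_pairs(monolingual_german)
                     + synthetic_source_pairs(monolingual_german, back_translate))
    for src, tgt in training_data:
        print(src, "=>", tgt)
```

In both variants the target side of the added pairs is genuine text, which is what lets the decoder benefit from monolingual data; only the source side is either empty (dummy) or machine-generated (back-translated).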