Improving Neural Machine Translation Models with Monolingual Data

3 Jun 2016 | Rico Sennrich and Barry Haddow and Alexandra Birch
This paper presents a method for improving Neural Machine Translation (NMT) by incorporating target-side monolingual data without altering the neural network architecture. The authors propose two strategies: pairing monolingual target sentences with dummy source sentences, and pairing them with synthetic source sentences generated by back-translation, i.e., translating the monolingual target text into the source language with a reverse translation model. The synthetic sentence pairs are treated as additional parallel training data and mixed with the real bitext.

The approach is evaluated on the WMT 15 English→German task, where it yields gains of +2.8–3.7 BLEU, and on the IWSLT 14 Turkish→English task, where it yields +2.1–3.4 BLEU. The method is also effective for domain adaptation: fine-tuning a trained model on in-domain monolingual and parallel data improves performance on the IWSLT 15 English→German task.

The study demonstrates that encoder-decoder NMT architectures can learn from monolingual data directly, without requiring a separately trained language model, and that synthetic data produced by back-translation is more effective than dummy source sentences. A comparison with phrase-based statistical machine translation (SMT) further shows that back-translated data benefits NMT in particular, especially in domain adaptation scenarios. Overall, the results highlight the value of monolingual data for improving fluency and reducing overfitting in NMT models.
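To make the back-translation procedure concrete, here is a minimal Python sketch of the data-augmentation step. It is not the authors' code: the translate_batch helper, the reverse_model handle, and the batch size are hypothetical stand-ins for whatever target→source translation system is available. The key idea from the paper is only the last function, which treats synthetic pairs as ordinary parallel data.

    # Sketch of back-translation data augmentation (hypothetical helpers).
    # `translate_batch(model, sentences)` is assumed to return one
    # source-language translation per input sentence; any target->source
    # ("reverse") translation model can play this role.

    def back_translate(monolingual_target, reverse_model, translate_batch,
                       batch_size=32):
        """Pair each monolingual target sentence with a synthetic source."""
        synthetic_pairs = []
        for i in range(0, len(monolingual_target), batch_size):
            batch = monolingual_target[i:i + batch_size]
            # Translate target-language text back into the source language.
            synthetic_sources = translate_batch(reverse_model, batch)
            synthetic_pairs.extend(zip(synthetic_sources, batch))
        return synthetic_pairs

    def build_training_set(parallel_pairs, synthetic_pairs):
        # The paper treats synthetic pairs as ordinary parallel data:
        # concatenate them with the real bitext and train as usual.
        return parallel_pairs + synthetic_pairs

Note the design choice this illustrates: the forward model never distinguishes real from synthetic pairs, which is why no change to the network architecture is needed.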