Adapting Large Language Models for Document-Level Machine Translation

9 Jun 2024 | Minghao Wu, Thuy-Trang Vu, Lizhen Qu, George Foster, Gholamreza Haffari
This study investigates the adaptation of large language models (LLMs) for document-level machine translation (DocMT). It examines the effectiveness of parameter-efficient fine-tuning (PEFT) and full fine-tuning (FFT) on moderately sized LLMs, evaluating performance across 18 translation tasks spanning nine language pairs with metrics such as sBLEU, dBLEU, and COMET.

The results show that specialized models can sometimes surpass GPT-4 in translation quality, but still suffer from off-target translation caused by error propagation during decoding. The paper provides an in-depth analysis of LLMs tailored for DocMT, covering translation errors, discourse phenomena, training strategies, the scaling law of parallel documents, evaluation on recent test sets, and zero-shot cross-lingual transfer.

Comparing fine-tuning approaches, the authors find that PEFT outperforms FFT overall, while FFT is more data-efficient; they also examine the impact of prompting strategies on translation performance. On recent test sets, LLM-based DocMT models generalize better to out-of-domain text than conventional DocMT models. The study further shows that base LLMs are better suited than instruction-tuned LLMs for task-specific supervised fine-tuning, outperforming them in zero-shot cross-lingual transfer.

The findings highlight the strengths and limitations of LLM-based DocMT models and provide a foundation for future research: while LLMs show promise for DocMT, challenges such as off-target translation remain, and further work is needed to improve their performance.
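To make the PEFT-vs-FFT contrast concrete, the sketch below compares trainable-parameter counts for one weight matrix under full fine-tuning and under LoRA, a common PEFT method. The dimensions (a 4096x4096 projection, rank 8) are hypothetical illustrations, not the paper's actual configuration:

```python
def full_ft_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when fully fine-tuning one weight matrix."""
    return d_in * d_out

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter W + A @ B,
    where A is (d_in x rank) and B is (rank x d_out); W stays frozen."""
    return d_in * rank + rank * d_out

# Hypothetical 4096x4096 projection with LoRA rank 8.
full = full_ft_params(4096, 4096)   # 16,777,216 parameters
lora = lora_params(4096, 4096, 8)   # 65,536 parameters
print(f"LoRA trains {lora / full:.2%} of the full matrix")  # prints "LoRA trains 0.39% of the full matrix"
```

Training well under 1% of the weights is what makes PEFT attractive on moderately sized LLMs, though, as the study notes, FFT can extract more from limited parallel data.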