This paper explores the use of large language models (LLMs) to improve machine translation (MT) quality through post-editing, leveraging external feedback derived from Multidimensional Quality Metrics (MQM) annotations. The authors investigate two main strategies: prompting and fine-tuning. Prompting involves varying the nature of the feedback provided to the LLMs, while fine-tuning enhances the LLMs' ability to exploit this feedback. Experiments on Chinese-English, English-German, and English-Russian data show that prompting LLMs to post-edit MT output improves translation quality as measured by TER, BLEU, and COMET scores. Fine-tuning yields further gains on these metrics and demonstrates that fine-grained feedback is more effective than generic or score-based feedback. The study also reveals that fine-tuning helps LLMs integrate fine-grained feedback more effectively, leading to more natural translations. The results suggest that post-editing MT output can be achieved effectively with smaller, open-source LLMs, opening avenues for further research in diverse settings and reducing reliance on expensive human-annotated data.
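
As a rough illustration of the fine-grained prompting strategy described above, the sketch below shows one possible way to turn MQM-style error annotations into a post-editing prompt. The data fields, prompt wording, and example sentences are assumptions for demonstration only, not the authors' actual templates.

```python
# Illustrative sketch (not the paper's exact prompt): converting MQM-style
# error annotations into fine-grained feedback for a post-editing prompt.
# Field names and wording here are assumptions for demonstration.

from dataclasses import dataclass

@dataclass
class MQMError:
    span: str        # erroneous span in the MT output
    category: str    # e.g. "accuracy/mistranslation", "fluency/grammar"
    severity: str    # e.g. "major", "minor"

def build_postedit_prompt(source: str, mt_output: str, errors: list[MQMError]) -> str:
    """Compose a post-editing prompt that lists each annotated error."""
    feedback_lines = [f'- "{e.span}": {e.category} ({e.severity})' for e in errors]
    feedback = "\n".join(feedback_lines) if feedback_lines else "- none"
    return (
        "Improve the translation by fixing the listed errors.\n"
        f"Source: {source}\n"
        f"Translation: {mt_output}\n"
        f"Errors:\n{feedback}\n"
        "Post-edited translation:"
    )

if __name__ == "__main__":
    prompt = build_postedit_prompt(
        source="Er hat den Bericht gestern eingereicht.",
        mt_output="He has submitted the report yesterday.",
        errors=[MQMError("has submitted", "fluency/grammar", "minor")],
    )
    print(prompt)  # this string would then be sent to the LLM for post-editing
```

Generic or score-based feedback, as contrasted in the paper, would replace the itemized error list with a single instruction or an overall quality score.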