This paper explores how to guide large language models (LLMs) to post-edit machine translation (MT) using error annotations. Using LLaMA-2 models, the study evaluates prompting strategies with varying levels of feedback: generic, score-based, and fine-grained feedback derived from Multidimensional Quality Metrics (MQM) annotations. Prompting LLMs to post-edit MT improves automatic metrics such as TER, BLEU, and COMET, and fine-tuning the models on fine-grained feedback further improves translation quality in both automatic and human evaluations. Comparing feedback granularities, fine-grained error annotations yield the strongest results, and human evaluation confirms that fine-tuned models produce more natural translations.

The findings indicate that effective MT post-editing does not require the largest proprietary LLMs: smaller open-source models, particularly when fine-tuned with error annotations, can resolve annotated errors and improve overall quality and naturalness. The results also highlight the value of external feedback for error resolution and the need for automated systems that can generate high-quality error annotations. The authors note limitations, including evaluation in a limited range of settings and the difficulty of generating such high-quality feedback automatically.
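To make the three feedback granularities concrete, the sketch below shows how post-editing prompts might be assembled from a source sentence, an MT output, and optional MQM-style annotations. The template wording, the `build_post_edit_prompt` function, and the `MQMError` structure are illustrative assumptions, not the paper's actual prompt format.

```python
# Illustrative sketch (assumed field names and wording, not the paper's exact templates).
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MQMError:
    span: str        # erroneous span in the MT output
    category: str    # e.g. "accuracy/mistranslation"
    severity: str    # e.g. "major" or "minor"


def build_post_edit_prompt(
    source: str,
    mt_output: str,
    feedback_level: str = "fine-grained",
    mqm_score: Optional[float] = None,
    errors: Optional[List[MQMError]] = None,
) -> str:
    """Compose a post-editing prompt with generic, score-based, or fine-grained feedback."""
    prompt = (
        "Improve the following machine translation of the source sentence.\n"
        f"Source: {source}\n"
        f"MT output: {mt_output}\n"
    )
    if feedback_level == "generic":
        prompt += "The translation may contain errors. Please post-edit it.\n"
    elif feedback_level == "score-based" and mqm_score is not None:
        prompt += f"The translation received an MQM quality score of {mqm_score}. Please post-edit it.\n"
    elif feedback_level == "fine-grained" and errors:
        prompt += "The translation contains the following annotated errors:\n"
        for e in errors:
            prompt += f'- "{e.span}" ({e.severity} {e.category})\n'
        prompt += "Please post-edit the translation to fix these errors.\n"
    prompt += "Improved translation:"
    return prompt


if __name__ == "__main__":
    # Hypothetical example: a mistranslated word flagged by an MQM annotation.
    errors = [MQMError(span="bank", category="accuracy/mistranslation", severity="major")]
    print(build_post_edit_prompt(
        source="Er saß am Ufer des Flusses.",
        mt_output="He sat at the bank of the river.",
        feedback_level="fine-grained",
        errors=errors,
    ))
```

In this sketch, the same prompt builder covers all three conditions; only the feedback portion changes, which mirrors the study's comparison of how much error information the model needs to post-edit effectively.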