Rectifier: Code Translation with Corrector via LLMs

July 2024 | Xin Yin (Zhejiang University, China), Chao Ni* (Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Blockchain and Data Security Research Institute, China), Tien N. Nguyen (University of Texas at Dallas, USA), Shaohua Wang (Central University of Finance and Economics, China), Xiaohu Yang (Zhejiang University, China)
The paper "Rectifier: Code Translation with Corrector via LLMs" addresses the challenges of code translation with large language models (LLMs). Traditional approaches relied on handcrafted translation rules, which are error-prone and time-consuming to build. Recent advances in deep learning have made LLMs the dominant approach to code translation, yet their translations still contain several types of errors: compilation errors, runtime errors, functional errors (wrong output), and non-terminating execution.

The authors propose Rectifier, a micro and universal corrector model designed to repair these errors. Rectifier learns from the errors produced by existing LLMs and can therefore be applied to the output of any LLM. It is fine-tuned from the CodeT5+ 220M model, which requires significantly fewer computational resources than larger LLMs such as Llama-2 13B.

Experiments on the CodeNet and AVATAR datasets, covering translation among C++, Java, and Python, demonstrate the effectiveness and robustness of Rectifier: it repairs a substantial number of errors produced by different LLMs, underscoring its universal, LLM-agnostic nature. The paper also discusses the limitations of LLMs in handling certain error types, such as logic errors and model-specific errors, and presents case studies illustrating Rectifier's strengths and weaknesses.
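To make the workflow concrete, below is a minimal sketch (not the authors' released code) of what an error-correction pass built on CodeT5+ 220M could look like, using the Hugging Face transformers API. The base checkpoint `Salesforce/codet5p-220m` is the model the paper fine-tunes; the `rectify` helper and the prompt layout pairing the buggy translation with error feedback are illustrative assumptions.

```python
# Sketch of a Rectifier-style correction pass: feed an LLM-produced
# translation plus its compiler/runtime feedback to a small seq2seq
# corrector and decode a repaired version. Assumes a fine-tuned
# checkpoint; here we load the base CodeT5+ 220M for illustration.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "Salesforce/codet5p-220m"  # base model fine-tuned in the paper
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

def rectify(buggy_translation: str, error_message: str) -> str:
    """Repair an erroneous translation given error feedback.

    The input format (code followed by an error comment) is an assumed
    prompt layout, mirroring the paper's idea of learning from the
    errors that existing LLMs make.
    """
    prompt = f"{buggy_translation}\n# Error:\n{error_message}"
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=512)
    outputs = model.generate(**inputs, max_length=512, num_beams=5)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

In this framing, the corrector would be trained on pairs of (erroneous translation plus error feedback, corrected translation) harvested from the outputs of multiple existing LLMs, which is what makes the approach universal and LLM-agnostic rather than tied to any single translation model.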