24 Jun 2024 | Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li
The paper introduces UniCoder, a framework for scaling code large language models (LLMs) on code generation and code translation tasks. Its key contribution is the universal code (UniCode), an intermediate representation that mixes programming-language conventions with natural language descriptions. UniCode bridges the gap between abstract algorithmic steps and executable code, and because it is not tied to any single language's syntax, it applies across multiple programming languages.
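As a rough illustration (not taken from the paper, whose exact UniCode format may differ), such an intermediate might read like structured pseudocode that the model then lowers into any concrete target language:

```python
# Hypothetical example of a UniCode-style intermediate for the task
# "return the maximum element of a list". The format below is an
# assumption used only to illustrate the idea of a representation sitting
# between natural language and executable code.
UNICODE_PLAN = """
FUNCTION find_max(numbers):
    SET best TO the first element of numbers
    FOR EACH x IN the remaining elements of numbers:
        IF x > best THEN SET best TO x
    RETURN best
"""

# The same plan lowered into one concrete target language (here Python);
# lowering it into C++ or Java would follow identical algorithmic steps.
def find_max(numbers):
    best = numbers[0]
    for x in numbers[1:]:
        if x > best:
            best = x
    return best
```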
The authors collect an instruction dataset, UniCODER-INSTRUCT, consisting of natural-language questions, code solutions, and the corresponding UniCode. The dataset is used to fine-tune the UniCoder model with multi-task learning objectives: question-to-answer generation, question-to-UniCode generation, UniCode-to-answer translation, and universal-code-of-thought (UoT) reasoning. Experiments show that UniCoder significantly outperforms previous prompting-based methods, demonstrating the value of UniCode as an intermediate representation for code generation and translation.
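A minimal sketch of how such multi-task instruction tuning could be wired up, assuming each (question, UniCode, answer) triple is rendered into (prompt, target) pairs for the four objectives; the templates and field names below are assumptions, not the paper's actual implementation:

```python
# Illustrative only: render one UniCODER-INSTRUCT-style triple into
# (prompt, target) pairs for the four training objectives named above.
def build_training_pairs(question, unicode_plan, answer):
    return [
        # question -> answer (direct code generation)
        (f"Question:\n{question}\nAnswer:", answer),
        # question -> UniCode (plan generation)
        (f"Question:\n{question}\nUniCode:", unicode_plan),
        # UniCode -> answer (lowering the plan to concrete code)
        (f"UniCode:\n{unicode_plan}\nAnswer:", answer),
        # universal-code-of-thought: produce the plan, then the code
        (f"Question:\n{question}\nThink in UniCode, then answer:",
         f"{unicode_plan}\n{answer}"),
    ]

# All pairs can then be trained with the standard next-token cross-entropy
# loss on the target, so the four objectives share one model and optimizer.
```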
The paper also details how UniCODER-INSTRUCT is constructed from existing instruction datasets and raw code snippets, and evaluates UniCoder on benchmarks such as HumanEval, MBPP, and MultiPL-E. The evaluation metric is Pass@k, the fraction of problems for which at least one of k generated solutions passes all test cases. UniCoder consistently achieves state-of-the-art performance across these benchmarks, confirming the effectiveness of the proposed method.
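For reference, Pass@k is typically computed with the unbiased estimator introduced alongside HumanEval (Chen et al., 2021): generate n >= k samples per problem, count the c correct ones, and average 1 - C(n-c, k) / C(n, k) over problems. A small sketch:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased Pass@k estimator for one problem: the probability that at
    least one of k samples drawn from n total samples (c of them correct)
    passes the tests, i.e. 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so every size-k subset
        # must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples per problem, 37 of them correct.
print(pass_at_k(200, 37, 1))   # 0.185 (equals c / n for k = 1)
print(pass_at_k(200, 37, 10))  # ~0.877
```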