April 2024 | Yangruibo Ding, Marcus J. Min, Gail Kaiser, and Baishakhi Ray (Columbia University, USA)
**CYCLE: Learning to Self-Refine the Code Generation**
Pre-trained code language models (code LMs) have shown promising performance in code generation, but their self-refinement capability is often overlooked. This paper introduces CYCLE, a framework designed to enhance code LMs' ability to self-refine faulty generations based on execution feedback. The study reveals that existing code LMs struggle to effectively self-refine, which can lead to significant debugging challenges for developers.
**Key Contributions:**
1. **CYCLE Framework:** CYCLE is designed to teach code LMs to self-refine by jointly attending to three information sources: natural language problem descriptions, incorrect code, and execution feedback.
2. **Data Collection:** An automated approach distills faulty generations and their execution feedback from pre-trained code LMs, constructing datasets for self-refinement training.
3. **Training Strategy:** A specialized training strategy is designed to effectively learn code refinement, including the use of a Past Generation Mask (PGM) to prevent the model from copying faulty code.
4. **Iterative Self-Refinement:** CYCLE implements an iterative workflow that automatically generates code, refines it based on execution feedback, and repeats the process until the code passes all tests or reaches a maximum refinement step.
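The iterative workflow described in item 4 can be sketched as a simple loop. This is a minimal illustration, not CYCLE's implementation: `generate`, `refine`, and `run_tests` are hypothetical callables standing in for the model's generation pass, its refinement pass (which, per the paper, conditions jointly on the problem description, the faulty code, and the execution feedback), and the test harness.

```python
from typing import Callable, List, Tuple

def self_refine(
    generate: Callable[[str], str],
    refine: Callable[[str, str, str], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    problem: str,
    max_steps: int = 3,
) -> Tuple[str, bool]:
    """Generate code, then iteratively refine it on execution feedback
    until the tests pass or the refinement budget is exhausted."""
    code = generate(problem)
    for _ in range(max_steps):
        passed, feedback = run_tests(code)
        if passed:
            return code, True
        # Refinement attends to three sources at once: the natural-language
        # problem description, the incorrect code, and the execution feedback.
        code = refine(problem, code, feedback)
    passed, _ = run_tests(code)
    return code, passed
```

The `max_steps` budget corresponds to the paper's "maximum refinement step"; each iteration re-executes the tests so a successful fix exits early.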
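The Past Generation Mask from item 3 discourages the model from trivially copying its earlier faulty output during refinement training. One plausible realization, sketched here as an assumption rather than the paper's exact mechanism, is to randomly hide a fraction of the past-generation token positions in an otherwise causal attention mask (`True` = may attend):

```python
import random

def past_generation_mask(seq_len, past_span, drop_prob=0.5, rng=None):
    """Build a boolean attention mask that randomly hides tokens of the
    previous faulty generation, so the refiner cannot rely on verbatim
    copying. `past_span` is the (start, end) index range of the
    past-generation tokens in the flattened input sequence.

    Hedged sketch only; see the CYCLE paper for the actual PGM design.
    """
    rng = rng or random.Random(0)
    start, end = past_span
    # Decide independently, per past-generation token, whether to hide it.
    dropped = [rng.random() < drop_prob for _ in range(start, end)]
    mask = []
    for i in range(seq_len):
        row = [j <= i for j in range(seq_len)]  # standard causal mask
        for k, hide in enumerate(dropped):
            if hide:
                row[start + k] = False  # no query may attend to this token
        mask.append(row)
    return mask
```

With `drop_prob=0.0` this reduces to an ordinary causal mask; higher values force the model to reconstruct fixes from the problem description and execution feedback rather than from the faulty code itself.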
**Evaluation:**
- **Benchmarks:** CYCLE is evaluated on three popular code generation benchmarks: HumanEval, MBPP-Sanitized, and APPS.
- **Model Variants:** Four variants of CYCLE are trained with different parameter sizes (350M, 1B, 2B, 3B).
- **Results:** CYCLE consistently boosts code generation performance, by up to 63.5%, across all benchmarks and model sizes, and its self-refinement outperforms that of code LMs with 3× more parameters.
**Conclusion:**
CYCLE effectively enhances code LMs' self-refinement capability, improving both one-time generation and iterative refinement. The framework's design and performance are analyzed in detail, providing insights into the importance of self-refinement in code generation and suggesting future directions for further research.