FT2Ra: A Fine-Tuning-Inspired Approach to Retrieval-Augmented Code Completion

September 16–20, 2024 | Qi Guo, Xiaohong Li, Xiaofei Xie, Shangqing Liu, Ze Tang, Ruitao Feng, Junjie Wang, Jidong Ge, Lei Bu
This paper proposes FT2Ra, a retrieval-augmented method for code completion that mimics the effects of fine-tuning without requiring actual fine-tuning. The method leverages the concept of Δlogits, the difference between the logits the model predicts and the logits corresponding to the actual target values, to enhance the model's predictions. FT2Ra iteratively retrieves information from an external database and updates it, allowing prediction accuracy to improve continuously.

The paper evaluates FT2Ra on both token-level and line-level code completion tasks. On token-level completion, FT2Ra achieves a 4.29% accuracy improvement over the best baseline method on UniXcoder. On line-level completion, it achieves a significant increase in Exact Match (EM), demonstrating its effectiveness on the more challenging task. Even without actual fine-tuning, FT2Ra is competitive with models that have been genuinely fine-tuned.

The paper also discusses the theoretical basis for FT2Ra, highlighting the importance of Δlogits in improving model predictions. The method operates through an iterative retrieval cycle that progressively updates the external database, refining the quality of retrieved information and continuously improving prediction accuracy.
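The summary describes this mechanism only at a high level. The snippet below is a minimal sketch of the general idea of augmenting a base model's logits with retrieved Δlogit vectors, assuming a toy datastore of (context embedding, Δlogits) pairs; the helper names (`retrieve_neighbors`, `augment_logits`), the datastore layout, and the `step_size` parameter are illustrative assumptions, not FT2Ra's actual implementation.

```python
import numpy as np

def retrieve_neighbors(datastore, query_embedding, k=8):
    """Return the k stored (distance, delta_logits) pairs closest to the query.

    `datastore` is assumed to be a list of (context_embedding, delta_logits)
    pairs built from an external code corpus; this layout is illustrative only.
    """
    distances = np.array([np.linalg.norm(emb - query_embedding) for emb, _ in datastore])
    order = np.argsort(distances)[:k]
    return [(distances[i], datastore[i][1]) for i in order]

def augment_logits(base_logits, neighbors, step_size=1.0):
    """Nudge the base logits by the average retrieved delta-logit vector.

    This loosely mirrors a fine-tuning step: the stored deltas encode how far
    the base model's logits were from ranking the ground-truth token first.
    """
    deltas = np.stack([delta for _, delta in neighbors])
    return base_logits + step_size * deltas.mean(axis=0)

# Toy usage with a 5-token vocabulary and 4-dimensional context embeddings.
rng = np.random.default_rng(0)
datastore = [(rng.normal(size=4), rng.normal(size=5)) for _ in range(100)]
query_embedding = rng.normal(size=4)
base_logits = rng.normal(size=5)

neighbors = retrieve_neighbors(datastore, query_embedding, k=8)
adjusted = augment_logits(base_logits, neighbors)
print("predicted token id:", int(np.argmax(adjusted)))
```

The snippet shows a single retrieval-and-adjust step; in the iterative cycle described above, retrieval and datastore updates would be repeated over several rounds so that later rounds draw on refreshed information.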
The paper compares FT2Ra with several state-of-the-art retrieval-based methods, including kNN-LM, kNM-LM, BM25, and ReACC. The results show that FT2Ra significantly outperforms these methods in both token-level and line-level code completion tasks. Additionally, the paper evaluates the impact of different weighting strategies and the number of neighbors on FT2Ra's performance, finding that the method is relatively insensitive to the number of neighbors and that a distance-based weighting strategy performs best. Overall, the paper demonstrates that FT2Ra is an effective and efficient method for retrieval-augmented code completion, capable of achieving performance comparable to or better than fine-tuned models without the need for actual fine-tuning.
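The exact weighting function is not given in this summary. One common way to realize a distance-based scheme is a softmax over negative distances, contrasted below with uniform weighting; the function names and the `temperature` parameter are assumptions for illustration, not definitions from the paper.

```python
import numpy as np

def uniform_weights(distances):
    """Every retrieved neighbor contributes equally."""
    return np.full(len(distances), 1.0 / len(distances))

def distance_weights(distances, temperature=1.0):
    """Closer neighbors contribute more: softmax over negative distances.

    The softmax form and the temperature are assumptions; the summary only
    states that distance-based weighting outperformed the alternatives.
    """
    scores = -np.asarray(distances) / temperature
    scores -= scores.max()  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

def combine_deltas(deltas, weights):
    """Weighted sum of the retrieved delta-logit vectors."""
    return (np.asarray(weights)[:, None] * np.stack(deltas)).sum(axis=0)

# Example: three neighbors at increasing distance from the query context.
distances = [0.2, 0.9, 1.5]
deltas = [np.array([2.0, -1.0, 0.0]),
          np.array([0.5, 0.5, -0.5]),
          np.array([-1.0, 2.0, 0.0])]

print("uniform  :", combine_deltas(deltas, uniform_weights(distances)))
print("distance :", combine_deltas(deltas, distance_weights(distances)))
```

With distance-based weights, the nearest neighbor dominates the combined Δlogit update, which matches the reported finding that weighting by distance works better than treating all retrieved neighbors equally.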