2024 | Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma
REPOFORMER: Selective Retrieval for Repository-Level Code Completion
Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a large proportion of the retrieved contexts proving unhelpful or harmful to code language models (code LMs). In this paper, we propose a selective RAG framework that avoids retrieval when it is unnecessary. To power this framework, we design a self-supervised learning approach that enables a code LM to accurately self-evaluate whether retrieval can improve its output quality and to robustly leverage potentially noisy retrieved contexts. Using this LM as both the selective RAG policy and the generation model, our framework achieves state-of-the-art repository-level code completion performance on diverse benchmarks including RepoEval, CrossCodeEval, and CrossCodeLongEval, a new long-form code completion benchmark. Meanwhile, our analyses show that selective retrieval yields up to a 70% inference speedup in the online serving setting without harming performance. We further demonstrate that our framework accommodates different generation models, retrievers, and programming languages. These advancements position our framework as an important step towards more accurate and efficient repository-level code completion.
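The abstract's self-supervised learning idea, teaching the LM to judge whether retrieval improves its output quality, can be grounded in a simple labeling recipe: generate a completion with and without retrieved context, score both against the ground truth, and label the example by whether retrieval helped. The sketch below is illustrative only (the function names and the exact scoring recipe are assumptions, not the paper's actual implementation), using character-level edit similarity as a stand-in quality metric:

```python
# Illustrative sketch of self-supervised "does retrieval help?" labeling.
# The helper names and margin-based rule are assumptions for illustration,
# not the paper's exact recipe.
from difflib import SequenceMatcher


def edit_similarity(hyp: str, ref: str) -> float:
    """Character-level similarity in [0, 1], a common proxy for edit similarity."""
    return SequenceMatcher(None, hyp, ref).ratio()


def retrieval_helps_label(gen_plain: str, gen_rag: str, reference: str,
                          margin: float = 0.0) -> bool:
    """Label an example True when the retrieval-augmented generation is
    closer to the reference than the plain generation by more than `margin`."""
    gain = edit_similarity(gen_rag, reference) - edit_similarity(gen_plain, reference)
    return gain > margin
```

Labels produced this way could then supervise the LM's self-evaluation of retrieval need, with the margin controlling how conservative the "retrieve" label is.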
The paper introduces REPOFORMER, a code LM fine-tuned for robust code completion with self-triggered retrieval augmentation. REPOFORMER reflects three core principles: performance-oriented self-evaluation, robustness to retrieved contexts, and generalizability. The framework uses a self-supervised learning approach to enable the code LM to accurately self-evaluate the need for retrieval and to robustly complete code with optional retrieval augmentation. It achieves strong performance on various repository-level code completion tasks, outperforming an always-retrieve baseline with the same-sized StarCoderBase by more than 3 absolute points of edit similarity across multiple tasks. The 3B REPOFORMER performs on par with always retrieving using the 16B StarCoder, and the 16B REPOFORMER achieves state-of-the-art performance across all tasks. The framework also allows up to a 70% inference speedup without harming accuracy. The paper further demonstrates that REPOFORMER can accelerate RAG with larger black-box LMs as a plug-and-play selective RAG policy, improving performance while reducing the latency of line and API completion to 75%. Comprehensive analyses of REPOFORMER's generalization ability show that it makes precise retrieval abstention decisions, is robust to retrieved contexts, and performs well when tested in other languages or with other retrievers. The paper will release its implementation and the CrossCodeLongEval benchmark at https://repoformer.github.io/.
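The selective RAG serving loop described above, in which a single LM acts as both the retrieval policy and the generator, can be sketched as follows. This is a minimal illustration under stated assumptions: the method names (`retrieval_need_prob`, `retrieve`, `generate`) and the probability-threshold framing are hypothetical stand-ins, not the paper's actual API; the toy classes exist only to make the sketch runnable.

```python
# Minimal sketch of a selective RAG serving loop: the LM first
# self-evaluates whether retrieval would help, and the (potentially
# slow, potentially noisy) retriever is invoked only when it would.
# All class and method names here are illustrative assumptions.

def selective_rag_complete(lm, retriever, prompt: str,
                           threshold: float = 0.5) -> str:
    """Complete `prompt`, retrieving cross-file context only when the
    LM's self-evaluated probability of benefit exceeds `threshold`."""
    if lm.retrieval_need_prob(prompt) >= threshold:
        context = retriever.retrieve(prompt)   # cross-file context, may be noisy
        return lm.generate(context + prompt)   # retrieval-augmented completion
    return lm.generate(prompt)                 # skip retrieval entirely


class ToyLM:
    """Toy stand-in for a code LM with a self-evaluation head."""
    def __init__(self, need_prob: float):
        self.need_prob = need_prob

    def retrieval_need_prob(self, prompt: str) -> float:
        return self.need_prob

    def generate(self, prompt: str) -> str:
        return f"completion({prompt})"


class ToyRetriever:
    """Toy stand-in for a cross-file retriever."""
    def retrieve(self, prompt: str) -> str:
        return "ctx|"
```

Because the retrieval call is skipped whenever the self-evaluation falls below the threshold, latency savings scale with the fraction of abstentions, which is the mechanism behind the reported speedups.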