April 20, 2024 | Yichen Li, Yun Peng, Yintong Huo, Michael R. Lyu
This paper proposes IDECoder, a framework that enhances large language model (LLM)-based coding tools by integrating native static contexts from Integrated Development Environments (IDEs). Current LLMs are trained on in-file contexts and struggle with repository-level code completion because of limited context length and missing cross-file information. IDEs provide accurate, real-time cross-file information, such as class hierarchies, function signatures, and variable types, that can significantly improve LLM performance on code completion tasks. IDECoder leverages these IDE-native static contexts for cross-context construction and uses IDE diagnosis results for self-refinement. The paper discusses the challenges of repository-level code completion, including cross-file context identification and fusion, and presents a methodology for addressing them. The proposed framework has three key phases: cross-file context identification, cross-file context fusion, and linting-based code refinement. In preliminary experiments, IDECoder consistently outperforms existing methods on code completion tasks, which suggests that the synergy between IDEs and LLMs is a promising direction for future exploration. The paper concludes that integrating IDEs with LLMs can lead to more powerful and natural coding tools, and that the core idea of extending cross-file contexts with IDE-provided static information can be generalized to other code-related tasks.
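The three phases named in the summary (cross-file context identification, cross-file context fusion, and linting-based code refinement) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Symbol` record, the function names, the name-matching heuristic, and the comment-style prompt format are all hypothetical, and the model and linter are passed in as plain callables standing in for an LLM and for IDE diagnostics.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Symbol:
    """Hypothetical record for one piece of IDE-provided static context."""
    name: str
    signature: str
    source_file: str


def identify_context(unfinished_code: str, index: Dict[str, Symbol]) -> List[Symbol]:
    """Phase 1 (assumed heuristic): keep indexed symbols that the code mentions."""
    names = set(re.findall(r"[A-Za-z_]\w*", unfinished_code))
    return [index[name] for name in sorted(names) if name in index]


def fuse_context(unfinished_code: str, symbols: List[Symbol]) -> str:
    """Phase 2 (assumed format): prepend each signature as a comment line."""
    header = "\n".join(f"# {s.source_file}: {s.signature}" for s in symbols)
    return f"{header}\n{unfinished_code}" if header else unfinished_code


def complete_with_refinement(
    prompt: str,
    model: Callable[[str], str],
    linter: Callable[[str], List[str]],
    max_rounds: int = 2,
) -> str:
    """Phase 3: re-query the model with diagnostics until the linter is clean."""
    completion = model(prompt)
    for _ in range(max_rounds):
        diagnostics = linter(completion)
        if not diagnostics:
            break
        feedback = "\n".join(f"# diagnostic: {d}" for d in diagnostics)
        # Show the model its previous attempt together with the reported issues.
        completion = model(f"{prompt}\n{completion}\n{feedback}")
    return completion
```

In use, `index` would be populated from whatever symbol information the IDE exposes, and `linter` would wrap its diagnostics; the bounded `max_rounds` keeps the refinement loop from running indefinitely when the model cannot clear a diagnostic.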