April 20, 2024 | Yichen Li, Yun Peng, Yintong Huo, Michael R. Lyu
The paper "Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context" by Yichen Li, Yun Peng, Yintong Huo, and Michael R. Lyu explores how Integrated Development Environments (IDEs) can be integrated with Large Language Models (LLMs) to improve repository-level code completion. The authors argue that current LLMs, while effective at completing code within a single source file, struggle with the cross-file information that large software projects require. They propose IDECoder, a framework that leverages IDE-native static contexts and diagnostic results to enhance LLMs' performance in repository-level code completion.
IDECoder addresses the challenges of cross-file context identification and fusion by using static information that the IDE already maintains, such as abstract syntax trees, symbol tables, and reference indexes. It identifies relevant cross-file contexts, organizes them into a structured format, and uses a chain-of-thought methodology to model the information sequentially. The framework also incorporates linting feedback to refine the generated code and improve its quality and correctness.
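The general idea can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's standard `ast` module as a stand-in for the IDE's symbol table and diagnostics, and every name in it (`build_symbol_table`, `collect_cross_file_context`, `build_prompt`, `diagnose`, the toy `repo` dictionary) is hypothetical. It resolves the symbols a file imports from other files in the repository, places their signatures in a structured prompt, and runs a lint-style check on a candidate completion.

```python
import ast


def build_symbol_table(source: str) -> dict[str, str]:
    """Map each top-level function or class name to a one-line signature."""
    table = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            signature = f"def {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                signature += f" -> {ast.unparse(node.returns)}"
            table[node.name] = signature
        elif isinstance(node, ast.ClassDef):
            table[node.name] = f"class {node.name}"
    return table


def collect_cross_file_context(prefix: str, repo: dict[str, str]) -> list[str]:
    """Resolve `from <module> import <name>` statements against the repository.

    `prefix` is the parseable code above the cursor; `repo` maps a module
    name to its source text (an assumed, simplified repository model).
    """
    context = []
    for node in ast.walk(ast.parse(prefix)):
        if isinstance(node, ast.ImportFrom) and node.module in repo:
            table = build_symbol_table(repo[node.module])
            for alias in node.names:
                if alias.name in table:
                    context.append(f"{node.module}.py: {table[alias.name]}")
    return context


def build_prompt(prefix: str, context: list[str]) -> str:
    """Place the resolved cross-file context ahead of the unfinished code."""
    lines = ["# Cross-file context (resolved from imports):"]
    lines += [f"#   {entry}" for entry in context]
    lines += ["# Complete the code below.", prefix]
    return "\n".join(lines)


def diagnose(code: str, repo: dict[str, str]) -> list[str]:
    """Lint-style check: report syntax errors and unresolved repository imports."""
    try:
        tree = ast.parse(code)
    except SyntaxError as err:
        return [f"syntax error on line {err.lineno}: {err.msg}"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module in repo:
            table = build_symbol_table(repo[node.module])
            for alias in node.names:
                if alias.name not in table:
                    problems.append(
                        f"'{alias.name}' is not defined in module '{node.module}'"
                    )
    return problems


# Toy repository and unfinished file, both made up for illustration.
repo = {
    "pricing": (
        "def apply_discount(price: float, rate: float = 0.1) -> float:\n"
        "    return price * (1 - rate)\n"
    )
}
prefix = "from pricing import apply_discount\n\ncart_total = 120.0\n"

print(build_prompt(prefix, collect_cross_file_context(prefix, repo)))

# A hard-coded candidate stands in for an LLM completion; any diagnostics
# would be appended to the next prompt so the model can revise its output.
candidate = prefix + "total = apply_discount(cart_total)\n"
print(diagnose(candidate, repo))
```

In a real IDE integration the symbol table and diagnostics would come from the language service rather than from re-parsing source text, which is the main point of the paper's argument: the IDE already has this information and the completion tool can reuse it.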
Preliminary experiments using Pylance, a Python language service extension, show that IDECoder outperforms baseline methods in code completion tasks, highlighting its potential for future improvements. The authors plan to develop a more mature version of IDECoder, support user-defined LLMs, and extend its capabilities to broader code-related tasks such as automated program repair and debugging.