30 May 2024 | John Yang, Carlos Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
The paper introduces SWE-agent, a system designed to enable language model (LM) agents to autonomously perform software engineering tasks. SWE-agent features a custom agent-computer interface (ACI) that enhances the LM's ability to create and edit code files, navigate repositories, and execute tests. The ACI is tailored to leverage the strengths of LMs while mitigating their weaknesses, such as limited visual understanding and difficulty managing context. The system is evaluated on the SWE-bench dataset, achieving a pass@1 rate of 12.5%, significantly outperforming previous non-interactive LM systems. The paper also discusses the design principles of the ACI, including simplicity, efficiency, informative feedback, and error recovery mechanisms. Through ablation studies, the authors demonstrate how different ACI components affect LM performance, highlighting the importance of well-designed interfaces in improving agent behavior and performance.
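To make the ACI design principles more concrete, the following is a minimal illustrative sketch in Python and not SWE-agent's actual implementation. The class name `WindowedEditor`, its methods, the window size, and the feedback messages are all hypothetical. It shows one way an interface could keep commands simple, return concise feedback, and help the agent recover from errors by rejecting edits that would leave a Python file syntactically invalid.

```python
import ast


class WindowedEditor:
    """Hypothetical agent-facing editor: shows a small window of a file and guards edits."""

    def __init__(self, source: str, window: int = 5):
        self.lines = source.splitlines()
        self.window = window  # number of lines shown at once (assumed value)
        self.top = 0          # index of the first visible line

    def view(self) -> str:
        """Return the visible window with line numbers and a short status header."""
        end = min(self.top + self.window, len(self.lines))
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        return "\n".join([header, *body])

    def scroll_down(self) -> str:
        """Advance the window by one window length, clamped to the end of the file."""
        max_top = max(len(self.lines) - self.window, 0)
        self.top = min(self.top + self.window, max_top)
        return self.view()

    def edit(self, start: int, end: int, replacement: str) -> str:
        """Replace 1-indexed inclusive lines start..end, refusing edits that break syntax."""
        if not (1 <= start <= end <= len(self.lines)):
            return f"Edit rejected: range {start}-{end} is outside the file."
        candidate = self.lines[: start - 1] + replacement.splitlines() + self.lines[end:]
        try:
            # Guardrail: only accept the edit if the whole file still parses.
            ast.parse("\n".join(candidate))
        except SyntaxError as exc:
            return (
                f"Edit rejected: syntax error at line {exc.lineno}: {exc.msg}. "
                "File unchanged."
            )
        self.lines = candidate
        # Keep the window start inside the (possibly shorter) file.
        self.top = min(self.top, max(len(self.lines) - 1, 0))
        return "Edit applied.\n" + self.view()


if __name__ == "__main__":
    editor = WindowedEditor("def add(a, b):\n    return a - b\n")
    print(editor.view())
    print(editor.edit(2, 2, "    return a +"))    # rejected, file unchanged
    print(editor.edit(2, 2, "    return a + b"))  # applied
```

In this sketch a rejected edit leaves the file untouched and tells the agent why, so a faulty action does not compound into later steps. This is one possible reading of the "informative feedback" and "error recovery" principles, chosen here for illustration only.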