25 Mar 2024 | Islem Bouzenia, Premkumar Devanbu, Michael Pradel
RepairAgent is an autonomous, large language model (LLM)-based agent for program repair. Unlike existing deep learning-based approaches, which rely on fixed prompts or fixed feedback loops, RepairAgent treats the LLM as an autonomous agent that plans and executes actions to fix bugs by invoking suitable tools. The agent freely interleaves gathering information about the bug, gathering repair ingredients, and validating fixes, deciding which tool to invoke next based on the gathered information and on feedback from previous fix attempts. Key contributions include a set of tools useful for program repair, a dynamically updated prompt format that lets the LLM interact with these tools, and a finite state machine that guides the agent's tool invocations. On the Defects4J dataset, RepairAgent successfully repairs 164 bugs, including 39 not fixed by prior techniques, at an average cost of 270,000 tokens per bug, which translates to 14 cents per bug under current pricing. This work introduces the first autonomous, LLM-based agent for program repair, paving the way for future agent-based techniques in software engineering.
RepairAgent combines a dynamic prompt format, a set of tools for interacting with the code base, and a middleware that orchestrates communication between the LLM and the tools. The agent iteratively queries the LLM, using the tools to gather information, suggest fixes, and validate them. The evaluation on Defects4J shows that RepairAgent fixes bugs of varying complexity, including multi-line and multi-file bugs; its ability to autonomously retrieve suitable repair ingredients and to edit an arbitrary number of lines and files contributes to this effectiveness. In terms of cost, the median token consumption per bug is approximately 270,000 tokens, equating to around 14 cents.
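To make the finite-state-machine idea concrete, here is a minimal sketch of an FSM that constrains which tools the agent may invoke in each state. All state names, tool names, and transitions below are hypothetical, chosen for illustration; they are not RepairAgent's actual ones.

```python
# Each state restricts the set of tools the LLM may request next
# (hypothetical names; RepairAgent's real states and tools differ).
ALLOWED_TOOLS = {
    "understand_bug": {"read_range", "get_test_failure", "search_code"},
    "collect_ingredients": {"search_code", "extract_similar_functions"},
    "try_fix": {"write_fix", "run_tests"},
}

# Transitions fire on specific tool invocations; otherwise the state is kept.
TRANSITIONS = {
    ("understand_bug", "search_code"): "collect_ingredients",
    ("collect_ingredients", "extract_similar_functions"): "try_fix",
    ("try_fix", "run_tests"): "understand_bug",  # failed fix: gather more info
}

def step(state: str, requested_tool: str) -> str:
    """Validate a tool request against the FSM and return the next state."""
    if requested_tool not in ALLOWED_TOOLS[state]:
        # A middleware would reject the request and ask the LLM to
        # pick one of the tools allowed in the current state.
        raise ValueError(f"{requested_tool!r} not allowed in state {state!r}")
    return TRANSITIONS.get((state, requested_tool), state)

state = "understand_bug"
state = step(state, "search_code")  # moves to "collect_ingredients"
```

The point of the FSM is not to script the repair, but to prune clearly invalid actions (e.g., submitting a patch before any bug information has been gathered) while leaving the LLM free to choose among the allowed tools.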
The agent's tools fall into four categories: reading and extracting code, searching and generating code, testing and patching, and control. The middleware parses and refines the LLM's output and executes the agent's actions in an isolated environment. Compared to existing repair techniques, RepairAgent achieves a higher success rate in fixing bugs, particularly on Defects4J v2. The work also addresses potential threats to validity, including data leakage and missing test cases, and highlights the influence of fault localization and of the non-deterministic outputs of LLMs. Overall, RepairAgent demonstrates the effectiveness of an autonomous, LLM-based agent for program repair, offering a new state of the art in the field.
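The middleware's core job of parsing the model's output and dispatching it to a tool can be sketched as follows. The JSON reply format, tool names, and return values here are assumptions made for illustration, not RepairAgent's actual protocol.

```python
import json

# Hypothetical stand-ins for two tools from the "reading" and "testing"
# categories; real tools would touch the code base in a sandbox.
def read_code(path, start, end):
    return f"<lines {start}-{end} of {path}>"

def run_tests():
    return "1 failing test: testDateParsing"

TOOLS = {"read_code": read_code, "run_tests": run_tests}

def dispatch(llm_reply: str) -> str:
    """Parse the model's JSON tool call, validate it, and execute the tool.

    The returned observation would be appended to the dynamic prompt
    for the next iteration of the agent loop.
    """
    call = json.loads(llm_reply)
    tool = TOOLS.get(call["tool"])
    if tool is None:
        # Refinement step: feed the error back so the LLM can retry.
        return f"Unknown tool {call['tool']!r}; choose one of {sorted(TOOLS)}"
    return tool(*call.get("args", []))

observation = dispatch('{"tool": "read_code", "args": ["Foo.java", 10, 20]}')
```

Routing every action through such a dispatcher is what lets the agent run in an isolated environment: the LLM never executes anything directly, it only requests tool invocations that the middleware validates and carries out.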