Executable Code Actions Elicit Better LLM Agents

2024 | Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji
This paper introduces CodeAct, a framework that uses executable Python code to consolidate the actions of Large Language Model (LLM) agents into a unified action space. CodeAct is integrated with a Python interpreter, so an agent can execute code actions and, over multiple turns, dynamically revise prior actions or emit new ones based on the observations it receives. Extensive experiments with 17 LLMs on the API-Bank benchmark and a newly curated benchmark, M³ToolEval, show that CodeAct outperforms existing alternatives, achieving up to 20% higher success rates. The authors also collect an instruction-tuning dataset, CodeActInstruct, consisting of 7,000 multi-turn interactions using CodeAct, and fine-tune an LLM agent, CodeActAgent, from LLaMA-2 and Mistral-7B. CodeActAgent can perform sophisticated tasks using existing Python packages and self-debug through multi-turn interactions. The paper discusses the benefits of CodeAct, including its ability to leverage existing software packages, handle complex tasks, and improve LLM agents' performance on agent-oriented tasks without compromising their general capabilities.
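The interaction loop described above (emit a code action, execute it, observe the result, revise) can be illustrated with a minimal sketch. This is not the authors' implementation: `execute_action`, `scripted_policy`, and `run_episode` are hypothetical names, and the scripted policy is only a stand-in for an LLM so that the example runs without a model. The code is executed with plain `exec` and no sandboxing, which is acceptable only for illustration.

```python
import contextlib
import io
import traceback


def execute_action(code: str, namespace: dict) -> str:
    """Run one code action and return its printed output (or error) as the observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # unsandboxed: for illustration only
    except Exception:
        # Keep only the final traceback line, e.g. "NameError: name 'x' is not defined"
        buffer.write(traceback.format_exc().strip().splitlines()[-1])
    return buffer.getvalue().strip()


def scripted_policy(history: list) -> str:
    """Hypothetical stand-in for an LLM: chooses the next code action from the history."""
    if not history:
        # First attempt contains a bug: `count` is never defined.
        return "values = [3, 4, 5]\nprint(sum(values) / count)"
    last_observation = history[-1][1]
    if "NameError" in last_observation:
        # Self-debugging step: react to the error seen in the observation.
        return "count = len(values)\nprint(sum(values) / count)"
    return ""  # an empty action signals that the agent is done


def run_episode(policy, max_turns: int = 5) -> list:
    """Alternate between code actions and observations for up to max_turns turns."""
    namespace = {}  # shared across turns so variables persist between actions
    history = []
    for _ in range(max_turns):
        action = policy(history)
        if not action:
            break
        observation = execute_action(action, namespace)
        history.append((action, observation))
    return history


if __name__ == "__main__":
    for turn, (action, observation) in enumerate(run_episode(scripted_policy), start=1):
        print(f"--- turn {turn} ---\n{action}\n=> {observation}")
```

In this sketch the first action fails with a `NameError`, the error text is returned as the observation, and the second action repairs the mistake while reusing the `values` variable that persisted in the shared namespace. A real agent would replace `scripted_policy` with a model call and run the code in an isolated environment.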