Ask-before-Plan: Proactive Language Agents for Real-World Planning

18 Jun 2024 | Xuan Zhang, Yang Deng, Zifeng Ren, See-Kiong Ng, Tat-Seng Chua
This paper introduces a new task, Proactive Agent Planning, which requires language agents to predict clarification needs based on the user-agent conversation and agent-environment interaction, invoke external tools to collect valid information, and generate a plan that fulfills the user's demands. To study this practical problem, we establish a new benchmark dataset, Ask-before-Plan. To tackle the deficiency of LLMs in proactive planning, we propose a novel multi-agent framework, Clarification-Execution-Planning (CEP), which consists of three agents specialized in clarification, execution, and planning. We introduce a trajectory tuning scheme for the clarification agent and the static execution agent, as well as a memory recollection mechanism for the dynamic execution agent. Extensive evaluations and comprehensive analyses conducted on the Ask-before-Plan dataset validate the effectiveness of the proposed framework.

The CEP framework comprises three agents. The clarification agent is responsible for detecting uncertainty in user instructions and asking clarifying questions to uncover the user's underlying intentions. The execution agent leverages various tools to interact with the environment, gathering the information the clarification agent needs. The planning agent produces the final plan by aggregating the information gathered during the clarification process to accomplish the initial user instruction.

To overcome the limitations of simply prompting LLMs to ask clarifying questions or to perform complex tool learning, we devise trajectory tuning to fine-tune the clarification and execution agents. Furthermore, we employ self-reflection to improve the reasoning process of the execution agent. However, redundant self-reflection in multi-turn conversations can increase inference time and introduce noise into the context. To this end, we propose the memory recollection mechanism, which optimizes memory utility for the execution agent in long-context reasoning.

Our contributions are as follows:
(1) We introduce the new and practical problem of Proactive Agent Planning to study the challenges LLM-powered language agents face when handling unclear user instructions.
(2) We propose a novel multi-agent framework, CEP, consisting of clarification, execution, and planning agents, to address the underlying challenges of the Proactive Agent Planning problem.
(3) We construct the first dataset for studying Proactive Agent Planning, namely Ask-before-Plan. Extensive evaluations and comprehensive analyses in diverse settings validate the effectiveness of the proposed CEP framework.
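The clarification-execution-planning loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: the slot names, the rule-based clarification check, the mock tool, and the simple result cache standing in for memory recollection are all hypothetical simplifications, chosen only to show how the three agents hand off to one another and how cached tool trajectories keep repeated calls from re-entering the context.

```python
from dataclasses import dataclass, field

@dataclass
class ClarificationAgent:
    # Hypothetical stand-in: a rule-based check for under-specified slots.
    # In the paper this agent is a fine-tuned LLM, not a rule system.
    required_slots: tuple = ("destination", "dates", "budget")

    def next_question(self, instruction: dict):
        for slot in self.required_slots:
            if instruction.get(slot) is None:
                return f"Could you specify your {slot}?"
        return None  # instruction is clear; no clarification needed

@dataclass
class ExecutionAgent:
    # Keeps a memory of past tool trajectories (name, arg, result).
    memory: list = field(default_factory=list)

    def call_tool(self, name: str, arg: str) -> str:
        # Memory-recollection-style caching: reuse a stored result instead of
        # re-executing, so repeated calls add no redundant context.
        for past_name, past_arg, past_result in self.memory:
            if (past_name, past_arg) == (name, arg):
                return past_result
        result = f"{name}({arg}) -> ok"  # mock environment response
        self.memory.append((name, arg, result))
        return result

class PlanningAgent:
    # Aggregates tool results into a final plan for the user instruction.
    def plan(self, instruction: dict, tool_results: list) -> dict:
        steps = [f"Step {i + 1}: use {r}" for i, r in enumerate(tool_results)]
        return {"instruction": instruction, "steps": steps}

def cep_pipeline(instruction: dict, user_answers: dict) -> dict:
    clar, exe, planner = ClarificationAgent(), ExecutionAgent(), PlanningAgent()

    # Clarification loop: ask until no required slot is missing.
    while (q := clar.next_question(instruction)) is not None:
        slot = q.split()[-1].rstrip("?")
        instruction[slot] = user_answers[slot]  # simulated user reply

    # Execution: the duplicate call below is served from memory, not re-run.
    results = [
        exe.call_tool("search", instruction["destination"]),
        exe.call_tool("search", instruction["destination"]),
    ]
    return planner.plan(instruction, results)
```

A usage example: starting from an instruction with a missing destination, the pipeline first fills the slot via clarification, then executes tools and plans, with the second identical tool call answered from memory rather than re-executed.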