AutoGPT+P: Affordance-based Task Planning using Large Language Models

23 Jul 2024 | Timo Birr, Christoph Pohl, Abdelrahman Younes and Tamim Asfour
The paper introduces *AutoGPT+P*, a system that combines affordance-based scene representation with a planning system to address the limitations of Large Language Models (LLMs) in task planning. Affordances, which represent the action possibilities of an agent on objects and the environment, enable symbolic planning with arbitrary objects. *AutoGPT+P* leverages this representation to derive and execute plans for tasks specified in natural language, handling incomplete information by exploring the scene, suggesting alternatives, or providing partial plans. The system uses an Object Affordance Mapping (OAM) generated using *ChatGPT* to combine object detection with affordance information.
The core planning tool extends existing work by automatically correcting semantic and syntactic errors, achieving a 98% success rate on the *SayCan* instruction set. Evaluations on a newly created dataset with 150 scenarios show a 79% success rate, demonstrating the system's effectiveness in complex tasks with missing objects. The paper also discusses related work on affordances and LLMs in planning, and provides a detailed overview of the *AutoGPT+P* architecture, including its feedback loop and alternative suggestion process.
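To make the idea of an Object Affordance Mapping more concrete, the following is a minimal Python sketch of how detected objects might be linked to affordances and how a substitute for a missing object might be ranked. Everything in it is an assumption made for illustration: the object names, the affordance labels, the `OAM` dictionary, and the functions `objects_with` and `suggest_alternatives` are hypothetical and do not come from the paper. In particular, the affordance-overlap (Jaccard) ranking is a simple stand-in heuristic, not the alternative suggestion process the authors describe.

```python
# Hypothetical object-affordance mapping (OAM). The object classes and
# affordance labels are illustrative assumptions, not taken from the paper.
OAM = {
    "glass": {"grasp", "contain", "drink-from", "pour"},
    "cup": {"grasp", "contain", "drink-from", "pour"},
    "bowl": {"grasp", "contain", "pour"},
    "knife": {"grasp", "cut"},
    "sponge": {"grasp", "wipe"},
}


def objects_with(affordances, scene, oam=OAM):
    """Return the scene objects that offer all requested affordances, sorted by name."""
    return sorted(obj for obj in scene if affordances <= oam.get(obj, set()))


def suggest_alternatives(missing, scene, oam=OAM):
    """Rank scene objects by affordance overlap with a missing object.

    Uses Jaccard similarity as a stand-in heuristic; objects that share
    no affordance with the missing object are left out.
    """
    wanted = oam.get(missing, set())
    scored = []
    for obj in scene:
        have = oam.get(obj, set())
        union = wanted | have
        score = len(wanted & have) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, obj))
    # Highest score first; ties are broken alphabetically.
    return [obj for _, obj in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


if __name__ == "__main__":
    scene = ["cup", "bowl", "knife"]          # objects reported by a detector
    print(objects_with({"contain"}, scene))   # objects usable as containers
    print(suggest_alternatives("glass", scene))  # substitutes for a missing glass
```

In this sketch a symbolic planner would only need the affordance sets, not the object classes themselves, which is why a representation of this kind can in principle extend to objects the planner has never seen by name.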