20 Feb 2024 | Marta Skreta*, Zihan Zhou*, Jia Lin Yuan*, Kourosh Darvish, Alán Aspuru-Guzik, and Animesh Garg
REPLAN is a framework for robotic replanning with perception and language models. It lets robots adapt to unforeseen obstacles while completing long-horizon, open-ended tasks. The framework combines high-level planning, low-level reward generation, and perception-based feedback so that the robot can adjust its actions in real time. A Vision-Language Model (VLM) supplies feedback about object states, which both planning and control depend on. The framework also includes a Reasoning and Control (RC) benchmark of eight tasks for evaluating its performance. REPLAN outperforms baseline models in task completion, with a 4× improvement in success rate. The system is designed to run without human intervention: large language models (LLMs) and VLMs generate plans, verify them, and revise them as needed. One real-world example is placing an apple in a bowl, which requires removing an obstacle such as a lemon. The system still struggles with some complex cases, such as identifying and resolving errors in object recognition or reward generation. Overall, REPLAN shows promise for multi-stage, long-horizon tasks that demand accuracy and adaptability.
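The generate, verify, execute, and adjust cycle described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `ReplanLoop`, its callables (`plan`, `verify`, `execute`, `perceive`), and the toy apple-and-lemon world are all hypothetical stand-ins for the LLM planner, the plan verifier, the reward-driven controller, and the VLM feedback, respectively.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReplanLoop:
    """Hypothetical plan/verify/execute/replan loop (illustration only)."""
    plan: Callable[[str, List[str]], List[str]]  # stand-in for the high-level LLM planner
    verify: Callable[[List[str]], bool]          # stand-in for the plan verifier
    execute: Callable[[str], bool]               # stand-in for reward generation + control
    perceive: Callable[[str], str]               # stand-in for VLM feedback on object states
    max_replans: int = 3
    history: List[str] = field(default_factory=list)

    def run(self, task: str) -> bool:
        for _ in range(self.max_replans + 1):
            steps = self.plan(task, self.history)
            if not self.verify(steps):
                self.history.append("plan rejected by verifier")
                continue
            for step in steps:
                if not self.execute(step):
                    # Ask the perception stand-in why the step failed, then replan.
                    self.history.append(self.perceive(step))
                    break
            else:
                return True  # every step succeeded
        return False


def demo():
    # Toy world: a lemon occupies the bowl, so the apple cannot be placed yet.
    world = {"lemon_in_bowl": True, "apple_in_bowl": False}

    def plan(task, history):
        if any("lemon" in note for note in history):
            return ["remove lemon from bowl", "place apple in bowl"]
        return ["place apple in bowl"]

    def verify(steps):
        return len(steps) > 0

    def execute(step):
        if step == "remove lemon from bowl":
            world["lemon_in_bowl"] = False
            return True
        if step == "place apple in bowl" and not world["lemon_in_bowl"]:
            world["apple_in_bowl"] = True
            return True
        return False

    def perceive(step):
        return "lemon is blocking the bowl" if world["lemon_in_bowl"] else "unknown failure"

    loop = ReplanLoop(plan, verify, execute, perceive)
    ok = loop.run("place the apple in the bowl")
    return ok, loop.history, world
```

In this toy run the first plan fails, the perception stand-in reports the blocking lemon, and the second plan removes it before placing the apple. A real system would replace each callable with model calls and a physical or simulated controller.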