Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks
This paper introduces Plan-Seq-Learn (PSL), a modular approach that combines large language models (LLMs) with reinforcement learning (RL) to solve long-horizon robotics tasks. PSL uses motion planning to bridge the gap between abstract language and learned low-level control: an LLM first generates a high-level plan, a motion planner then moves the robot to a target pose computed for each plan stage, and an RL policy learns the contact-rich interaction from that pose. By decomposing tasks into alternating contact-free motion and contact-rich interaction, and by using a curriculum learning strategy that tracks the language model plan stage by stage, PSL learns low-level control strategies online without requiring a pre-defined skill library.

Evaluated on four simulated environment suites (Meta-World, Obstructed Suite, Kitchen, and Robosuite), PSL achieves state-of-the-art results on over 25 challenging robotics tasks with up to 10 stages, outperforming language-based, classical, and end-to-end approaches, and solving long-horizon, vision-based control tasks from raw visual input at success rates above 85%. Because the learned policies rely on local observations, the method is also well suited to sim-to-real transfer, and it remains robust to noisy pose estimates and stage termination conditions. Overall, PSL is a promising approach for solving long-horizon robotics tasks by combining language models, motion planning, and RL.
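The control flow described above (Plan: LLM decomposition, Seq: motion planning to a target pose, Learn: RL for the contact-rich interaction, with stage termination conditions advancing the plan) can be illustrated with a minimal sketch. This is not the authors' implementation; the helper names (`llm_plan`, `estimate_target_pose`, `motion_plan_to`, `stage_terminated`) and the environment interface are hypothetical placeholders chosen for clarity.

```python
# Minimal, illustrative sketch of the Plan-Seq-Learn loop.
# All helpers (llm_plan, estimate_target_pose, motion_plan_to, stage_terminated)
# are hypothetical placeholders, not the paper's actual API.

def run_psl_episode(env, task_description, rl_policy, max_steps_per_stage=200):
    """Execute one episode by tracking the LLM plan stage by stage."""
    # Plan: the LLM decomposes the task into an ordered list of stages,
    # each naming a target region/object and a stage termination condition.
    plan = llm_plan(task_description)

    obs = env.reset()
    info = {}
    for stage in plan:
        # Seq: estimate a target robot pose near the stage's region of interest
        # and reach it with a collision-free motion planner (contact-free motion).
        target_pose = estimate_target_pose(obs, stage["target"])
        obs = motion_plan_to(env, target_pose)

        # Learn: hand control to the RL policy for the contact-rich interaction,
        # conditioning on local (e.g. wrist-camera and proprioceptive) observations.
        for _ in range(max_steps_per_stage):
            action = rl_policy(obs, stage)
            obs, reward, done, info = env.step(action)
            if stage_terminated(obs, stage) or done:
                break  # advance the curriculum to the next plan stage
    return info.get("success", False)
```

The design choice this sketch highlights is that only the contact-rich portion of each stage is learned with RL, while contact-free motion is delegated to the planner, keeping RL exploration short-horizon and local to the region the plan specifies.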