18 May 2024 | Nishanth Kumar, Tom Silver, Willie McClinton, Linfeng Zhao, Stephen Proulx, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Jennifer Barry
This paper presents a method for robots to learn parameter policies for parameterized skills through active learning. The robot is equipped with a library of parameterized skills, an AI planner for sequencing these skills, and a prior distribution for selecting skill parameters. The goal is to rapidly specialize these parameters to the specific objects, goals, and constraints in the environment. The robot autonomously plans, practices, and learns without environment resets, improving its ability to solve long-horizon tasks.
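The setup described above — a library of parameterized skills, each paired with a prior distribution over its continuous parameters — can be sketched roughly as follows. This is an illustrative simplification, not the paper's implementation: the skill names, the Gaussian prior, and the `sample_parameters` interface are all assumptions.

```python
from dataclasses import dataclass
import random

@dataclass
class ParameterizedSkill:
    """A skill with a continuous parameter drawn from a prior distribution."""
    name: str
    # Prior over the skill's continuous parameter (e.g. a grasp offset),
    # modeled here as a simple Gaussian purely for illustration.
    prior_mean: float
    prior_std: float

    def sample_parameters(self, rng: random.Random) -> float:
        """Draw skill parameters from the prior; learning specializes this."""
        return rng.gauss(self.prior_mean, self.prior_std)

# A toy skill library; in the paper's setting an AI planner sequences
# skills like these to solve long-horizon tasks.
library = [
    ParameterizedSkill("pick", prior_mean=0.0, prior_std=0.05),
    ParameterizedSkill("place", prior_mean=0.0, prior_std=0.10),
]

rng = random.Random(0)
params = {skill.name: skill.sample_parameters(rng) for skill in library}
```

Learning then amounts to replacing each skill's broad prior with a parameter policy specialized to the objects and goals actually encountered.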
The key idea is to estimate the competence of each skill, extrapolate how much competence could improve through practice, and situate the skill in the task distribution through competence-aware planning. This approach is implemented in a fully autonomous system where the robot repeatedly plans, practices, and learns. The robot selects skills to practice based on the predicted improvement in task success, rather than focusing on the most incompetent skills.
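The estimate–extrapolate–situate loop above can be sketched in miniature. Everything here is a hedged simplification, not the paper's exact formulation: competence is estimated as a Beta-posterior success rate, extrapolation optimistically assumes a few successful practice attempts, and situating weights the predicted gain by how often the planner uses each skill.

```python
def estimate_competence(successes: int, attempts: int) -> float:
    # Beta(1, 1) prior -> posterior mean of the success probability.
    return (successes + 1) / (attempts + 2)

def extrapolate_competence(successes: int, attempts: int, extra: int) -> float:
    # Optimistic extrapolation (an assumption, not the paper's model):
    # suppose the next `extra` practice attempts all succeed.
    return (successes + extra + 1) / (attempts + extra + 2)

def situate_scores(skill_stats: dict, usage_freq: dict, extra: int = 5) -> dict:
    """Score each skill by predicted improvement in task success.

    skill_stats: skill name -> (successes, attempts) from practice so far.
    usage_freq:  skill name -> how often the planner uses the skill
                 across the task distribution (the "situate" step).
    """
    scores = {}
    for name, (succ, att) in skill_stats.items():
        gain = (extrapolate_competence(succ, att, extra)
                - estimate_competence(succ, att))
        scores[name] = usage_freq.get(name, 0.0) * gain
    return scores

# Toy numbers: "open_door" is the least competent skill, but "pick" is
# used far more often, so practicing "pick" helps task success more.
stats = {"pick": (2, 10), "open_door": (0, 2), "place": (9, 10)}
freq = {"pick": 0.5, "open_door": 0.1, "place": 0.4}
scores = situate_scores(stats, freq)
best = max(scores, key=scores.get)  # -> "pick"
```

The toy numbers illustrate the point made above: the robot does not simply practice its most incompetent skill (`open_door`), but the skill whose improvement most raises expected task success.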
The method is evaluated in both simulated and real-world environments. In simulated environments, the robot learns effective parameter policies more sample-efficiently than several baselines. In real-world environments, the robot demonstrates the ability to handle noise from perception and control and improves its ability to solve long-horizon mobile-manipulation tasks after a few hours of autonomous practice.
The paper also discusses related work in reinforcement learning and parameter-policy learning, highlighting the importance of active learning in complex, long-horizon tasks. The approach is compared against baselines that prioritize competence gradients, skill diversity, or task relevance, and the results show that it outperforms them in both sample efficiency and task success.
The paper concludes that the proposed method, called Estimate, Extrapolate & Situate (EES), is a promising approach for robots to learn parameter policies for parameterized skills in complex, long-horizon tasks. The method is particularly effective in environments with high noise and uncertainty, and it enables the robot to adapt to new tasks and environments through continuous learning.