WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts

10 Jun 2024 | Chong Zhang, Wenli Xiao, Tairan He, Guanya Shi
WoCoCo is a framework for learning whole-body humanoid control with sequential contacts. It decomposes each task into separate contact stages, which enables simple and general policy learning through task-agnostic reward and sim-to-real designs; only one or two task-related terms need to be specified per task.

The framework addresses three main challenges: sparse contacts, long-horizon exploration, and sim-to-real transfer. To facilitate exploration and learning, it combines dense contact rewards, stage count rewards, and curiosity rewards. For sim-to-real transfer, it includes a pipeline with domain randomization and regularization rewards.

WoCoCo was tested on four challenging real-world tasks involving diverse contact sequences: versatile parkour jumping, box loco-manipulation, dynamic clap-and-tap dancing, and cliffside climbing. It was also applied to a loco-manipulation task on a 22-DoF dinosaur robot. The stated contributions are a general framework for RL-based whole-body humanoid control under sequential contact plans, a demonstration that the framework handles diverse tasks, and validation of the learned policies in the real world.
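To make the reward structure more concrete, here is a minimal Python sketch of how a task split into contact stages might combine a dense contact term, a stage count term, and a curiosity term. It is an illustration only and not the authors' formulation: the names `ContactPlan`, `contact_match_fraction`, and `step_reward`, the end-effector labels, the weights, and the count-based curiosity bonus are all assumptions introduced here.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

# Maps a (hypothetical) end-effector name to its contact state.
Contacts = Dict[str, bool]


@dataclass
class ContactPlan:
    """A task expressed as an ordered list of desired contact stages."""
    stages: List[Contacts]
    stage_idx: int = 0  # index of the stage currently being pursued

    def done(self) -> bool:
        return self.stage_idx >= len(self.stages)


def contact_match_fraction(desired: Contacts, actual: Contacts) -> float:
    """Dense term: fraction of end-effectors whose contact state is correct."""
    matches = sum(actual.get(name, False) == want for name, want in desired.items())
    return matches / len(desired)


def step_reward(
    plan: ContactPlan,
    actual: Contacts,
    visit_counts: Dict[str, int],
    state_key: str,
    w_contact: float = 1.0,    # assumed weight
    w_stage: float = 1.0,      # assumed weight
    w_curiosity: float = 0.1,  # assumed weight
) -> float:
    """Combine the three terms for one control step (illustrative only)."""
    if plan.done():
        return 0.0

    desired = plan.stages[plan.stage_idx]
    r_contact = contact_match_fraction(desired, actual)

    # Advance to the next stage once every desired contact is satisfied.
    if r_contact >= 1.0:
        plan.stage_idx += 1

    # Stage count term: grows with the number of stages completed so far.
    r_stage = float(plan.stage_idx)

    # Curiosity term: a simple count-based bonus that decays with revisits.
    visit_counts[state_key] = visit_counts.get(state_key, 0) + 1
    r_curiosity = 1.0 / math.sqrt(visit_counts[state_key])

    return w_contact * r_contact + w_stage * r_stage + w_curiosity * r_curiosity


# Usage: a two-stage plan (stand on the left foot, then on both feet).
plan = ContactPlan(stages=[
    {"left_foot": True, "right_foot": False},
    {"left_foot": True, "right_foot": True},
])
counts: Dict[str, int] = {}
r1 = step_reward(plan, {"left_foot": True, "right_foot": True}, counts, "s0")
r2 = step_reward(plan, {"left_foot": True, "right_foot": False}, counts, "s0")
```

In this sketch the dense term gives partial credit before a stage is fully achieved, the stage count term rewards progress along the sequence, and the decaying bonus encourages visiting new states. Only the list of stages is task-specific; the reward terms themselves do not change between tasks.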