Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

26 Aug 2024 | Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath
This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. The authors develop a general control solution that can be used for a range of dynamic bipedal skills, including walking, running, and jumping. The proposed RL-based controller incorporates a novel dual-history architecture, which utilizes both long-term and short-term input/output (I/O) history of the robot. This architecture, when trained through an end-to-end RL approach, consistently outperforms other methods across various skills in both simulation and real-world experiments. The study also explores the adaptivity and robustness of the proposed RL system, demonstrating its ability to adapt to both time-invariant and time-variant changes in the robot's dynamics. Additionally, task randomization is introduced as a key source of robustness, enhancing task generalization and disturbance compliance. The resulting control policies are successfully deployed on Cassie, a torque-controlled human-sized bipedal robot, enabling a wide range of locomotion skills, including robust standing, versatile walking, fast running, and diverse jumping abilities. The paper provides extensive real-world validation and demonstrations, showcasing the effectiveness of the proposed framework in achieving complex and agile maneuvers.
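The dual-history idea described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the authors' implementation): a rolling buffer of the robot's I/O (observation–action) steps from which the policy reads both a short recent window, fed directly to the policy network, and a long window that would be compressed by a separate encoder in the actual system. All class and parameter names here are assumptions for illustration.

```python
from collections import deque

class DualHistoryBuffer:
    """Hypothetical sketch of a dual-history input for an RL policy:
    a short window of recent robot I/O steps plus a longer window
    that an encoder (e.g. a 1-D CNN in the paper's setup) could
    compress into a latent context vector."""

    def __init__(self, short_len=4, long_len=100, io_dim=3):
        self.short_len = short_len
        self.long_len = long_len
        self.io_dim = io_dim
        self.history = deque(maxlen=long_len)  # rolling I/O history

    def push(self, io_step):
        # io_step: one concatenated observation+action vector
        self.history.append(list(io_step))

    def _window(self, n):
        # Most recent n steps, zero-padded at the front early in an episode
        steps = list(self.history)[-n:]
        pad = [[0.0] * self.io_dim] * (n - len(steps))
        return pad + steps

    def policy_input(self):
        # Short history: concatenated directly into the policy input.
        # Long history: in the paper this is processed by a learned
        # encoder; here we simply return the raw window.
        return self._window(self.short_len), self._window(self.long_len)

# Example: after 3 control steps, the short window holds the last 2
# steps while the long window is zero-padded to its full length.
buf = DualHistoryBuffer(short_len=2, long_len=5, io_dim=1)
for t in range(3):
    buf.push([float(t)])
short, long_hist = buf.policy_input()
```

The separation matters because, as the paper argues, the long I/O history lets the policy infer slowly varying dynamics changes (adaptivity), while the short history supplies the fast feedback needed for dynamic skills.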