2020 | JOONHO LEE¹·*, JEMIN HWANGBO¹·²·†, LORENZ WELLHAUSEN¹, VLADLEN KOLTUN³, AND MARCO HUTTER¹
This paper presents a robust controller for legged locomotion in challenging natural environments. The controller is a neural network that processes proprioceptive signals and is trained with model-free reinforcement learning in simulation. It transfers zero-shot from simulation to the real world and generalizes to conditions never encountered during training, including deformable terrain such as mud and snow, dynamic footholds such as rubble, and overground impediments such as thick vegetation and flowing water. It relies only on proprioceptive measurements from joint encoders and an inertial measurement unit (IMU), which are the most reliable sensors on legged machines. Training follows a privileged learning approach: a teacher policy with access to privileged information (terrain and contact data) is trained first and then guides the learning of a purely proprioceptive student controller. The controller was deployed on two generations of ANYmal quadruped robots across a variety of natural environments, including steep slopes, creeks, mud, thick vegetation, snow-covered hills, and damp forests, and performed well in the DARPA Subterranean Challenge Urban Circuit. In controlled indoor experiments it remained stable on loose debris, under substantial model mismatch such as a 10 kg payload, and under foot slippage on slippery terrain, while outperforming a state-of-the-art baseline in locomotion speed and energy efficiency. Its performance was further evaluated on step traversal, obstacle handling, and heading accuracy.
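The privileged teacher-student scheme described above can be sketched as supervised distillation: a student that sees only proprioception is fit to reproduce the actions of a teacher that also sees terrain and contact information. The dimensions, the linear models, and the least-squares fit below are illustrative stand-ins, not the paper's architecture; the actual system trains the teacher with reinforcement learning and uses neural networks over a history of proprioceptive states.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): the teacher observes
# proprioception plus privileged terrain/contact features; the student
# observes proprioception only.
PROPRIO_DIM, PRIV_DIM, ACT_DIM = 48, 16, 12

# Stand-in for an already-trained teacher: any fixed function of both inputs.
W_teacher = rng.normal(0.0, 0.1, (PROPRIO_DIM + PRIV_DIM, ACT_DIM))

def teacher_policy(proprio, privileged):
    x = np.concatenate([proprio, privileged], axis=-1)
    return np.tanh(x @ W_teacher)

# Collect rollout data: proprioceptive observations, privileged features,
# and the teacher's actions as imitation targets.
obs_proprio = rng.normal(size=(1000, PROPRIO_DIM))
obs_priv = rng.normal(size=(1000, PRIV_DIM))
targets = teacher_policy(obs_proprio, obs_priv)

# Distill: fit the student to the teacher's actions by least squares
# (behavior cloning with a squared-error imitation loss).
W_student, *_ = np.linalg.lstsq(obs_proprio, targets, rcond=None)

def student_policy(proprio):
    # Deploys with proprioception only -- no terrain or contact data needed.
    return proprio @ W_student

imitation_mse = np.mean((student_policy(obs_proprio) - targets) ** 2)
```

The residual `imitation_mse` stays nonzero because the privileged information is unavailable to the student; in the paper's setting the student compensates by inferring terrain properties from a history of proprioceptive states.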
The controller's robustness was further demonstrated in the DARPA Subterranean Challenge, where it successfully navigated a steep staircase. Together, the zero-shot simulation-to-reality transfer and the reliable performance across such diverse terrains and adversarial conditions mark a significant advancement in legged robotics and highlight the potential of model-free reinforcement learning, combined with privileged teacher-student training, for real-world legged locomotion.
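As a concrete illustration of a purely proprioceptive policy input, the sketch below assembles a single observation vector from joint-encoder and IMU readings. The signal set, ordering, and dimensions are assumptions chosen for illustration, not the paper's exact layout; the actual controller additionally consumes a history of such states.

```python
import numpy as np

def build_observation(q, dq, gravity_b, omega_b, prev_action, command):
    """Concatenate proprioceptive signals into one flat policy input.

    q, dq        -- 12 joint positions and velocities (joint encoders)
    gravity_b    -- gravity direction in the body frame (from the IMU)
    omega_b      -- body angular velocity (from the IMU)
    prev_action  -- the previous 12-dimensional action
    command      -- desired base velocity (forward, lateral, yaw rate)
    """
    parts = [q, dq, gravity_b, omega_b, prev_action, command]
    return np.concatenate(
        [np.asarray(p, dtype=np.float32).ravel() for p in parts]
    )

obs = build_observation(
    q=np.zeros(12), dq=np.zeros(12),
    gravity_b=np.array([0.0, 0.0, -1.0]),  # upright pose
    omega_b=np.zeros(3),
    prev_action=np.zeros(12),
    command=np.array([0.5, 0.0, 0.0]),     # walk forward at 0.5 m/s
)
# obs.shape -> (45,)
```

Nothing in this vector depends on cameras, lidar, or contact sensors, which is what allows the controller to keep working when exteroceptive sensing is degraded by mud, snow, dust, or vegetation.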