2020 | JOONHO LEE1,*, JEMIN HWANGBO1,2,†, LORENZ WELLHAUSEN1, VLADLEN KOLTUN3, AND MARCO HUTTER1
This paper presents a robust controller for quadrupedal locomotion in challenging natural environments, addressing the limitations of conventional controllers that are complex and fragile. The controller is trained using reinforcement learning in simulation and incorporates proprioceptive feedback, enabling zero-shot generalization from simulated to real-world terrains. Key contributions include a temporal convolutional network (TCN) architecture that processes a sequence of proprioceptive signals, a two-stage training process involving a teacher policy with privileged information and a student policy that learns only from proprioceptive data, and an adaptive terrain curriculum that synthesizes terrains based on the controller's performance. The controller has been successfully deployed on ANYmal robots, demonstrating robust performance on various challenging terrains, including mud, snow, rubble, thick vegetation, and water. The results highlight the controller's ability to handle deformable and dynamic environments, outperforming existing controllers in terms of speed, energy efficiency, and robustness. The methodology opens new frontiers for legged robotics, enabling the deployment of robots in environments beyond the reach of wheeled and tracked machines.
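The TCN mentioned above processes a fixed-length history of proprioceptive signals with causal (past-only) convolutions, so the controller can infer terrain properties from how the robot's own measurements evolve over time. The following is a minimal sketch of that idea only, not the paper's actual network: the layer sizes, dilation schedule, and channel counts here are illustrative assumptions, and a stand-in for joint-state channels replaces real sensor data.

```python
import numpy as np

def causal_conv1d(x, w, dilation=1):
    """Causal dilated 1-D convolution: the output at time t depends only
    on inputs at times <= t. x: (T, C_in), w: (K, C_in, C_out)."""
    T, c_in = x.shape
    K, _, c_out = w.shape
    pad = (K - 1) * dilation
    xp = np.vstack([np.zeros((pad, c_in)), x])  # left-pad so the filter never sees the future
    y = np.zeros((T, c_out))
    for t in range(T):
        for k in range(K):
            y[t] += xp[t + pad - k * dilation] @ w[K - 1 - k]
    return y

# Toy proprioceptive history: 50 timesteps, 8 channels (stand-in for joint
# positions/velocities); real inputs and sizes would differ.
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 8))

# Two dilated layers (dilations 1 and 2) widen the temporal receptive field.
h = np.tanh(causal_conv1d(x, rng.standard_normal((3, 8, 16)) * 0.1, dilation=1))
z = np.tanh(causal_conv1d(h, rng.standard_normal((3, 16, 16)) * 0.1, dilation=2))

latent = z[-1]       # last timestep summarizes the recent history
print(latent.shape)  # (16,)
```

The key property is causality: perturbing a future input leaves all earlier outputs unchanged, which is what lets the same network run online on streaming sensor data.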
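The two-stage teacher-student scheme can be illustrated with a generic behavior-cloning sketch. This is not the paper's training pipeline: here both policies are plain linear maps, the "teacher" is a fixed random map standing in for an RL-trained policy with access to privileged simulator state (e.g. terrain and contact information), and the student is fit by least squares on proprioception alone to imitate the teacher's actions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d_prop, d_priv, d_act = 2000, 12, 6, 4

# Stand-ins for proprioceptive signals and for privileged simulator state
# that the real robot cannot measure directly.
prop = rng.standard_normal((N, d_prop))
priv = rng.standard_normal((N, d_priv))

# "Teacher": sees both proprioception and privileged state (a random
# linear map here, standing in for a trained teacher policy).
W_teacher = rng.standard_normal((d_prop + d_priv, d_act)) * 0.3
teacher_actions = np.hstack([prop, priv]) @ W_teacher

# "Student": fit on proprioception alone to imitate the teacher
# (least-squares behavior cloning).
W_student, *_ = np.linalg.lstsq(prop, teacher_actions, rcond=None)
student_actions = prop @ W_student

mse = float(np.mean((student_actions - teacher_actions) ** 2))
print(round(mse, 3))  # residual error caused by the unobserved privileged state
```

The residual imitation error comes entirely from the privileged information the student cannot observe; in the paper's setting, the student compensates by inferring that missing state from the temporal history of its proprioceptive inputs.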