15 Jun 2024 | Ziwen Zhuang, Shenzhe Yao, Hang Zhao
This paper presents a vision-based, end-to-end, whole-body-control parkour policy for humanoid robots that performs multiple parkour skills without any motion prior. The policy enables the robot to jump onto 0.42m platforms, leap over hurdles and 0.8m gaps, run at 1.8m/s in the wild, and walk robustly across varied terrains. Tested in both indoor and outdoor environments, the policy autonomously selects parkour skills while following joystick rotation commands, and the framework transfers readily to humanoid mobile manipulation tasks.

The policy is trained on fractal noise terrain, which lets the robot learn locomotion skills without motion references or reward terms that explicitly encourage foot raising. The training objective is simple enough to train multiple agile locomotion skills in a unified manner, yet the resulting policy deploys to the real humanoid with zero-shot sim-to-real transfer. Training starts from a pre-trained plane locomotion policy, so the robot keeps responding to turning commands even when the locomotion command directs it along a straight track. The policy is then distilled using DAgger with 4-GPU acceleration into a vision-based parkour policy that runs on the real humanoid with only onboard computation, sensing, and power.

The paper also discusses the challenges of agile humanoid locomotion: packing diverse skills into a single locomotion network, the need for egocentric perception, and the latency of processing proprioception and exteroception. It concludes that the proposed method is effective for humanoid parkour learning, though further research is needed to generalize to unseen terrains.
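To make the fractal-noise-terrain idea concrete, here is a minimal sketch of one common way to generate such a height field: summed octaves of interpolated value noise, where each octave doubles the spatial frequency and shrinks the amplitude. This is an illustration, not the paper's generator; the resolution, octave count, and amplitude scale below are assumptions.

```python
import numpy as np

def fractal_noise_heightfield(size=256, octaves=4, base_res=4,
                              amplitude=0.05, persistence=0.5, seed=0):
    """Sum of bilinearly interpolated value-noise octaves.

    Returns a (size, size) height map. `amplitude` sets the height scale
    of the coarsest octave; each finer octave is scaled by `persistence`.
    All parameters here are illustrative, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    height = np.zeros((size, size))
    res, amp = base_res, amplitude
    for _ in range(octaves):
        # Random values on a coarse lattice, upsampled to the full grid.
        lattice = rng.uniform(-1.0, 1.0, (res + 1, res + 1))
        xs = np.linspace(0, res, size, endpoint=False)
        x0 = xs.astype(int)
        t = xs - x0
        # Bilinear interpolation: first along rows, then along columns.
        row = lattice[x0] * (1 - t)[:, None] + lattice[x0 + 1] * t[:, None]
        col = row[:, x0] * (1 - t)[None, :] + row[:, x0 + 1] * t[None, :]
        height += amp * col
        res *= 2
        amp *= persistence
    return height

terrain = fractal_noise_heightfield()
print(terrain.shape, terrain.min(), terrain.max())
```

Because the noise perturbs the ground everywhere, flat-footed shuffling gaits fail on their own, which is why no explicit foot-raising reward is needed.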
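The summary does not spell out the "simple training objective." A common minimal form for such policies is a velocity-tracking reward plus a small effort penalty, with no motion-reference or foot-raising terms; the function below is a hedged illustration of that shape, and its names and coefficients are assumptions rather than the paper's reward.

```python
import torch

def locomotion_reward(lin_vel, cmd_vel, torques, sigma=0.25, torque_coef=1e-4):
    """Illustrative minimal objective (assumed form, not the paper's):
    track the commanded planar velocity, lightly penalize actuation
    effort. Terrain difficulty alone shapes foot clearance and gait."""
    tracking = torch.exp(-torch.sum((lin_vel - cmd_vel) ** 2, dim=-1) / sigma)
    effort = torque_coef * torch.sum(torques ** 2, dim=-1)
    return tracking - effort
```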
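The DAgger distillation step can be sketched as follows: roll out the student (vision) policy so the training data covers the states the student actually visits, then regress its actions onto labels from the already-trained state-based teacher. The network sizes, the 19-dimensional action placeholder, and the `env_step` stub are hypothetical; the real pipeline parallelizes this across 4 GPUs (e.g., via `DistributedDataParallel`), which is omitted here for brevity.

```python
import torch
import torch.nn as nn

# Placeholder networks: the teacher consumes privileged simulator state,
# the student consumes onboard observations. Sizes are illustrative.
teacher = nn.Sequential(nn.Linear(64, 128), nn.ELU(), nn.Linear(128, 19)).eval()
student = nn.Sequential(nn.Linear(48, 128), nn.ELU(), nn.Linear(128, 19))
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def env_step(action):
    """Stub environment: returns (teacher_obs, student_obs)."""
    return torch.randn(1, 64), torch.randn(1, 48)

teacher_obs, student_obs = env_step(torch.zeros(1, 19))
for step in range(1000):
    # Act with the student so the state distribution matches deployment;
    # this on-policy data collection is the core of DAgger.
    with torch.no_grad():
        action = student(student_obs)
    teacher_obs, student_obs = env_step(action)

    # Query the teacher for the action label at the visited state,
    # then regress the student onto it.
    with torch.no_grad():
        target = teacher(teacher_obs)
    loss = nn.functional.mse_loss(student(student_obs), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Executing the student's own actions, rather than replaying teacher trajectories, is what lets the distilled vision policy recover from its own perception-induced mistakes at deployment time.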