The paper introduces Agile But Safe (ABS), a learning-based control framework that enables agile yet collision-free locomotion for quadrupedal robots in cluttered environments. ABS features a dual-policy setup: an agile policy that executes fast motor skills amidst obstacles and a recovery policy that prevents failures. The switch between the two policies is governed by a learned control-theoretic reach-avoid (RA) value network, which also serves as an objective function guiding the recovery policy. The agile policy, the RA value network, the recovery policy, and an exteroception representation network are all trained in simulation and can be deployed directly in the real world with onboard sensing and computation, enabling high-speed, collision-free navigation in confined indoor and outdoor spaces with both static and dynamic obstacles.
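The switching rule admits a compact description. Below is a minimal sketch of the dual-policy switch, assuming a sign convention in which negative RA values certify safety and a zero threshold triggers the fallback; the module names and signatures are illustrative, not the paper's API.

```python
import torch

def select_action(obs, agile_policy, recovery_policy, ra_value_net,
                  threshold=0.0):
    """Dual-policy switch: keep the agile policy in control while the
    learned RA value predicts the rollout is safe; otherwise hand control
    to the recovery policy. Assumes V(s) < threshold means "can reach the
    goal without collision" (an assumed sign convention)."""
    with torch.no_grad():
        v = ra_value_net(obs)
    if v.item() < threshold:
        return agile_policy(obs)
    # The recovery policy tracks a twist command; per the summary, the RA
    # value also serves as the objective used to pick that command (elided).
    return recovery_policy(obs)
```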
The key contributions of ABS include:
1. A perceptive agile policy, trained with novel methods, for obstacle avoidance during high-speed locomotion.
2. A novel control-theoretic, data-driven method for RA value estimation conditioned on the learned agile policy (see the Bellman-target sketch after this list).
3. A dual-policy setup where an agile policy and a recovery policy collaborate for high-speed collision-free locomotion, with RA values governing the policy switch and guiding the recovery policy.
4. An exteroception representation network that predicts low-dimensional obstacle information, enabling generalizable collision avoidance (also sketched after this list).
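For contribution 2, a common data-driven recipe in reach-avoid reinforcement learning fits the value network to a discounted reach-avoid Bellman target over transitions collected by rolling out the frozen agile policy in simulation. The sketch below is under assumed conventions: l_goal is a goal margin that turns negative once the goal is reached, zeta_avoid is a collision margin that turns positive on collision, and V < 0 certifies reach-avoid success; the paper's exact margin definitions may differ.

```python
import torch

def ra_bellman_target(v_next, l_goal, zeta_avoid, gamma=0.999):
    """Discounted reach-avoid Bellman target for a *fixed* policy (here,
    the learned agile policy). All arguments are tensors over a batch of
    simulated transitions; sign conventions are assumptions (see lead-in)."""
    # Undiscounted backup: fail now, or reach now / propagate the next value.
    backup = torch.maximum(zeta_avoid, torch.minimum(l_goal, v_next))
    return (1.0 - gamma) * torch.maximum(l_goal, zeta_avoid) + gamma * backup
```

The RA value network is then regressed toward this target, e.g., with a mean-squared error over the collected transitions.

For contribution 4, simulation provides ground-truth obstacle geometry, so the exteroception network can be trained with plain supervised regression from onboard depth to a low-dimensional obstacle code. The architecture below is hypothetical; the summary only states that the predicted representation is low-dimensional (distances along a small fan of rays are one plausible choice).

```python
import torch.nn as nn

class ObstacleEncoder(nn.Module):
    """Hypothetical encoder: depth image -> low-dimensional obstacle code
    (e.g., num_rays distances), trained on labels available in simulation."""
    def __init__(self, num_rays=11):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_rays),
        )

    def forward(self, depth):  # depth: (B, 1, H, W)
        return self.net(depth)
```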
The paper demonstrates the effectiveness of ABS through experiments in both simulation and real-world settings, showing superior safety and state-of-the-art agility amidst obstacles.