The paper introduces Agile But Safe (ABS), a learning-based control framework that enables agile yet collision-free locomotion for quadrupedal robots in cluttered environments. ABS features a dual-policy setup: an agile policy that executes fast motor skills amidst obstacles and a recovery policy that prevents failures. The switch between the two policies is governed by a learned control-theoretic reach-avoid (RA) value network, which also serves as an objective function guiding the recovery policy. The agile policy, the RA value network, the recovery policy, and an exteroception representation network are all trained in simulation and can be deployed directly in the real world with onboard sensing and computation, enabling high-speed, collision-free navigation in confined indoor and outdoor spaces with both static and dynamic obstacles.
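The switching rule admits a compact description. Below is a minimal sketch of the dual-policy switch, assuming a sign convention in which negative RA values certify safety and a zero threshold triggers the fallback; the module names and signatures are illustrative, not the paper's API.

```python
import torch

def select_action(obs, agile_policy, recovery_policy, ra_value_net,
                  threshold=0.0):
    """Dual-policy switch: keep the agile policy in control while the
    learned RA value predicts the rollout is safe; otherwise hand control
    to the recovery policy. Assumes V(s) < threshold means "can reach the
    goal without collision" (an assumed sign convention)."""
    with torch.no_grad():
        v = ra_value_net(obs)
    if v.item() < threshold:
        return agile_policy(obs)
    # The recovery policy tracks a twist command; per the summary, the RA
    # value also serves as the objective used to pick that command (elided).
    return recovery_policy(obs)
```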
The key contributions of ABS include:
1. A perceptive agile policy, trained with novel methods, for obstacle avoidance during high-speed locomotion.
2. A novel control-theoretic, data-driven method for RA value estimation conditioned on the learned agile policy (see the Bellman-target sketch after this list).
3. A dual-policy setup where an agile policy and a recovery policy collaborate for high-speed collision-free locomotion, with RA values governing the policy switch and guiding the recovery policy.
4. An exteroception representation network that predicts low-dimensional obstacle information, enabling generalizable collision avoidance (also sketched after this list).
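For contribution 2, a common data-driven recipe in reach-avoid reinforcement learning fits the value network to a discounted reach-avoid Bellman target over transitions collected by rolling out the frozen agile policy in simulation. The sketch below is under assumed conventions: l_goal is a goal margin that turns negative once the goal is reached, zeta_avoid is a collision margin that turns positive on collision, and V < 0 certifies reach-avoid success; the paper's exact margin definitions may differ.

```python
import torch

def ra_bellman_target(v_next, l_goal, zeta_avoid, gamma=0.999):
    """Discounted reach-avoid Bellman target for a *fixed* policy (here,
    the learned agile policy). All arguments are tensors over a batch of
    simulated transitions; sign conventions are assumptions (see lead-in)."""
    # Undiscounted backup: fail now, or reach now / propagate the next value.
    backup = torch.maximum(zeta_avoid, torch.minimum(l_goal, v_next))
    return (1.0 - gamma) * torch.maximum(l_goal, zeta_avoid) + gamma * backup
```

The RA value network is then regressed toward this target, e.g., with a mean-squared error over the collected transitions.

For contribution 4, simulation provides ground-truth obstacle geometry, so the exteroception network can be trained with plain supervised regression from onboard depth to a low-dimensional obstacle code. The architecture below is hypothetical; the summary only states that the predicted representation is low-dimensional (distances along a small fan of rays are one plausible choice).

```python
import torch.nn as nn

class ObstacleEncoder(nn.Module):
    """Hypothetical encoder: depth image -> low-dimensional obstacle code
    (e.g., num_rays distances), trained on labels available in simulation."""
    def __init__(self, num_rays=11):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_rays),
        )

    def forward(self, depth):  # depth: (B, 1, H, W)
        return self.net(depth)
```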
The paper demonstrates the effectiveness of ABS through experiments in both simulation and real-world settings, showing superior safety and state-of-the-art agility amidst obstacles.