Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight

2024 | Jiaxu Xing, Angel Romero, Leonard Bauersfeld, Davide Scaramuzza
This paper presents a novel approach combining Reinforcement Learning (RL) and Imitation Learning (IL) for vision-based agile quadrotor flight, specifically autonomous drone racing. Learning visuomotor policies for agile flight is difficult: high-dimensional visual inputs make policy exploration inefficient, and the task demands precise, low-latency control. The proposed framework has three phases: (1) train a teacher policy with RL using privileged state information, (2) distill it into a vision-based student policy via IL, and (3) adaptively fine-tune the student with RL. The framework thereby combines the sample efficiency of IL with the performance of RL.

The approach is validated on three different race tracks, both in simulation and in the real world, where it navigates a quadrotor through a race course using only visual information. It outperforms existing IL baselines in success rate, gate-passing error, and lap time, and succeeds in scenarios where RL from scratch fails. The policy is also robust to unknown disturbances, and the use of an asymmetric critic in the third phase further improves performance and robustness.

Because the method relies on no task-specific adaptations, it is generalizable to other robotic platforms and tasks. The paper also discusses limitations, chiefly the need for improved perception modules to handle more out-of-distribution cases.
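The three-phase pipeline can be illustrated with a deliberately tiny sketch. Everything here is an assumption for illustration, not the paper's implementation: the teacher and student are linear policies, the "vision" observation is a fixed projection of the privileged state, distillation is plain least-squares behavior cloning, and fine-tuning is a single gradient step on a surrogate quadratic objective rather than full RL with an asymmetric critic.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, OBS_DIM, ACT_DIM = 8, 4, 2

# --- Phase 1 (stub): a teacher with privileged state access. ---
# In the paper this policy would be trained with model-free RL; here we
# simply fix a linear mapping from full state to action.
W_teacher = rng.normal(size=(ACT_DIM, STATE_DIM))

def teacher_policy(state):
    return W_teacher @ state

# The student only sees a partial observation (standing in for vision
# features), modeled as a fixed linear projection of the state.
P = rng.normal(size=(OBS_DIM, STATE_DIM))

def observe(state):
    return P @ state

# --- Phase 2: distill the teacher into a student via behavior cloning. ---
# Minimize ||W_student @ obs - teacher_action||^2 over sampled states.
states = rng.normal(size=(1000, STATE_DIM))
obs = states @ P.T                       # (N, OBS_DIM) student inputs
targets = states @ W_teacher.T           # (N, ACT_DIM) teacher actions
W_student, *_ = np.linalg.lstsq(obs, targets, rcond=None)
W_student = W_student.T                  # (ACT_DIM, OBS_DIM)

# --- Phase 3 (sketch): fine-tune the student against an objective. ---
# One gradient step on a surrogate squared-error loss stands in for the
# adaptive RL fine-tuning stage.
lr = 1e-3
s = rng.normal(size=STATE_DIM)
o, t = observe(s), teacher_policy(s)
grad = 2 * np.outer(W_student @ o - t, o)  # gradient of ||W_student@o - t||^2
W_student -= lr * grad

imitation_mse = np.mean((obs @ W_student.T - targets) ** 2)
print(f"imitation MSE after distillation: {imitation_mse:.4f}")
```

The residual MSE is nonzero because the student's observation is a lossy projection of the privileged state, which is exactly why a fine-tuning phase on the student's own inputs is useful after distillation.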
The work is supported by grants from the European Union and the European Research Council.