12 Nov 2024 | Jiaxu Xing, Angel Romero, Leonard Bauersfeld, Davide Scaramuzza
The paper "Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight" addresses the challenges of learning visuomotor policies for agile quadrotor flight, in particular the inefficiency of policy exploration under high-dimensional visual inputs and the need for precise, low-latency control. The authors propose an approach that combines the strengths of Reinforcement Learning (RL) and Imitation Learning (IL) to improve sample efficiency and performance in autonomous drone racing. The framework has three phases: training a teacher policy with RL using privileged state information, distilling it into a vision-based student policy via IL, and adaptively fine-tuning the student with RL.

The method is evaluated in both simulated and real-world scenarios, where it outperforms existing IL methods in performance and robustness, and succeeds in settings where RL from scratch fails. The policy navigates a sequence of gates from visual inputs, such as gate corners or RGB images, achieving faster lap times and tighter trajectories. The paper also discusses limitations and future directions, emphasizing the need for improved perception modules to handle more out-of-distribution cases.
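The three-phase structure can be sketched with a self-contained toy example. Everything below is illustrative, not the paper's actual method: linear policies stand in for neural networks, a hill-climbing random search stands in for the RL algorithm, `expert_action` is a made-up control law the reward encodes, and `visual_obs` is a hypothetical fixed re-encoding of the state standing in for features extracted from images (e.g. gate corners).

```python
import random

random.seed(0)

# Deterministic evaluation grid of toy 2-D states.
GRID = [(x / 5.0, y / 5.0) for x in range(-5, 6) for y in range(-5, 6)]

def expert_action(s):
    # Hypothetical "ideal" control law that the toy reward encodes.
    return s[0] + 0.5 * s[1]

def privileged_obs(s):
    # Teacher sees the full state (the privileged information).
    return list(s)

def visual_obs(s):
    # Hypothetical stand-in for vision-derived features: a fixed
    # linear re-encoding of the state, so the student must relearn weights.
    return [s[0] + s[1], s[1]]

def act(policy, obs):
    # Linear policy: action = w . obs
    return sum(w * o for w, o in zip(policy, obs))

def episode_return(policy, obs_fn):
    # Reward: negative mean squared deviation from the expert action.
    return -sum((act(policy, obs_fn(s)) - expert_action(s)) ** 2
                for s in GRID) / len(GRID)

def hill_climb_rl(policy, obs_fn, iters=400, step=0.1):
    # Stand-in for RL (phases 1 and 3): random-search policy improvement.
    best = episode_return(policy, obs_fn)
    for _ in range(iters):
        cand = [w + random.gauss(0, step) for w in policy]
        r = episode_return(cand, obs_fn)
        if r > best:
            policy, best = cand, r
    return policy, best

# Phase 1: train the teacher with RL on privileged observations.
teacher, teacher_ret = hill_climb_rl([0.0, 0.0], privileged_obs)

# Phase 2: distill the teacher into a student that only sees visual
# features, by regressing teacher actions with plain SGD (the IL step).
student = [0.0, 0.0]
for _ in range(200):
    for s in GRID:
        x, y = visual_obs(s), act(teacher, privileged_obs(s))
        err = act(student, x) - y
        student = [w - 0.05 * err * o for w, o in zip(student, x)]
distilled_ret = episode_return(student, visual_obs)

# Phase 3: fine-tune the distilled student with RL on visual observations.
student, finetuned_ret = hill_climb_rl(student, visual_obs, iters=200)

print(round(teacher_ret, 4), round(distilled_ret, 4), round(finetuned_ret, 4))
```

The design choice the sketch illustrates: phase 3 starts RL from the distilled weights rather than from scratch, so exploration begins near a working policy and fine-tuning can only keep or improve the distilled student's return.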