2024 | Yongfeng Yin, Zhetao Wang, Lili Zheng, Qingran Su, Yang Guo
This paper addresses the challenge of autonomous UAV navigation in low-altitude, complex environments by proposing GARTD3 (Guide Attention Recurrent TD3), a deep reinforcement learning algorithm. The authors formulate UAV navigation in 3D environments as a Markov decision process (MDP) and introduce adaptive control so the UAV can adjust its flight altitude and velocity. To enhance obstacle avoidance, they propose a guide attention method that shifts the UAV's decision focus between the navigation and obstacle-avoidance tasks as the environment changes. Additionally, a novel velocity-constrained loss function improves the UAV's velocity control. GARTD3 is evaluated in a 3D simulation environment and outperforms state-of-the-art reinforcement learning algorithms: average reward increases by 9.35%, the navigation task success rate rises by 14%, and the collision rate drops by 14%. The paper also discusses future research directions, including improving the model's adaptability to different environments and integrating computer vision algorithms for enhanced navigation in dynamic environments.
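The summary does not give the exact form of the paper's velocity-constrained loss. As a minimal sketch, assuming the constraint is added as a hinge penalty on commanded speed on top of the standard TD3 actor objective (maximizing the critic's Q-value), the PyTorch snippet below illustrates the idea. The Actor and Critic networks, the speed limit v_max, and the penalty weight lam are illustrative placeholders, not the authors' definitions.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a state observation to a continuous action (e.g., a velocity command)."""
    def __init__(self, state_dim: int, action_dim: int, max_action: float):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.max_action * self.net(state)

class Critic(nn.Module):
    """Estimates Q(s, a) for a state-action pair."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

def actor_loss_with_velocity_constraint(actor, critic, states,
                                        v_max: float = 2.0, lam: float = 0.1):
    """TD3-style actor loss (-Q) plus a hinge penalty on commanded speed.

    Speeds above v_max incur a quadratic penalty weighted by lam; both
    hyperparameters are illustrative assumptions, not values from the paper.
    """
    actions = actor(states)
    q_loss = -critic(states, actions).mean()   # maximize Q via gradient ascent
    speed = actions.norm(dim=-1)               # commanded speed per sample
    penalty = torch.clamp(speed - v_max, min=0.0).pow(2).mean()
    return q_loss + lam * penalty

# Usage example with arbitrary dimensions:
actor = Actor(state_dim=12, action_dim=3, max_action=3.0)
critic = Critic(state_dim=12, action_dim=3)
states = torch.randn(64, 12)
loss = actor_loss_with_velocity_constraint(actor, critic, states)
loss.backward()
```

A soft penalty of this kind keeps the actor update differentiable end to end, so the speed limit shapes the policy gradient directly rather than clipping actions after the fact; the paper's actual constraint may differ in form.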