Multi-Scale Fusion Uncrewed Aerial Vehicle Detection Based on RT-DETR

14 April 2024 | Minling Zhu and En Kong
This paper addresses the challenges of UAV target detection by proposing the Gathering Cascaded Dilated DETR (GCD-DETR) model, which builds on RT-DETR to improve both the accuracy and the efficiency of UAV detection. The main innovations are:

1. **Dilated Re-param Block**: This block combines a large-kernel convolution with parallel small-kernel convolutions, which strengthens feature extraction and improves UAV detection accuracy.
2. **Gather-and-Distribute Mechanism**: This mechanism strengthens multi-scale feature fusion, so the model makes fuller use of the feature information from the backbone network and detects targets more reliably.
3. **Cascaded Group Attention**: This attention mechanism divides the attention heads in different ways, which reduces computational cost and increases attention diversity, improving the model's ability to handle complex scenes.

The proposed model is validated through experiments on multiple UAV datasets. GCD-DETR achieves an accuracy of 0.956 and 0.978 on two UAV datasets, respectively, which is 2% and 1.1% higher than the original RT-DETR model. Inference speed also improves by 10 frames per second, so the model balances accuracy and speed.

The paper also covers the background and motivation of UAV detection, reviews existing methods, and explains the GCD-DETR network structure and its key components in detail. The experimental results show gains in precision, recall, and detection accuracy, making the model a promising solution for UAV target detection.
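To make the idea of combining a large-kernel convolution with parallel small-kernel convolutions more concrete, here is a minimal sketch in plain Python. It assumes the block follows the usual structural re-parameterization pattern, in which a parallel dilated small-kernel branch can be folded into the large kernel after training; the summary above does not give these details, so this is an illustration rather than the authors' implementation. It uses 1D signals instead of 2D feature maps for brevity, and the function names (`conv1d_same`, `dilate_kernel`, `merge_kernels`) and all numeric values are hypothetical.

```python
def conv1d_same(signal, kernel, dilation=1):
    """Cross-correlate signal with an odd-length kernel, zero-padding to keep the length."""
    centre = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - centre) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def dilate_kernel(kernel, dilation):
    """Insert (dilation - 1) zeros between taps to get an equivalent non-dilated kernel."""
    dense = [0.0] * ((len(kernel) - 1) * dilation + 1)
    for j, w in enumerate(kernel):
        dense[j * dilation] = w
    return dense


def merge_kernels(large, small, dilation):
    """Fold a dilated small kernel into a large kernel, centre-aligned (both odd-length)."""
    dense = dilate_kernel(small, dilation)
    assert len(dense) <= len(large), "small branch must fit inside the large kernel"
    offset = (len(large) - len(dense)) // 2
    merged = list(large)
    for j, w in enumerate(dense):
        merged[offset + j] += w
    return merged


# Hypothetical values chosen only for illustration.
signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, 2.5, 1.5]
large_kernel = [0.1, -0.2, 0.3, 0.4, 0.3, -0.2, 0.1]  # 7 taps
small_kernel = [0.5, -1.0, 0.5]                        # 3 taps, applied with dilation 2

# Training-time view: two parallel branches whose outputs are summed.
two_branch = [
    a + b
    for a, b in zip(
        conv1d_same(signal, large_kernel),
        conv1d_same(signal, small_kernel, dilation=2),
    )
]

# Inference-time view: one convolution with the merged kernel.
single_branch = conv1d_same(signal, merge_kernels(large_kernel, small_kernel, 2))
```

Because convolution is linear, the two-branch output and the single merged-kernel output agree up to floating-point rounding. This is why such a block can capture features at several receptive-field sizes during training without paying for extra branches at inference time.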