Received 25 January 2024; Received in revised form 8 April 2024; Accepted 9 April 2024 | Wenyuan Xu, Chuang Cui, Yongcheng Ji*, Xiang Li, Shuai Li
This paper presents an enhanced YOLOv8 model, named YOLOv8-MPEB, for small-target detection in UAV images. The proposed algorithm addresses four challenges of UAV aerial imagery: large variations in target scale, small target sizes, complex scenes, and variable external factors. Key contributions include:
1. **Network Architecture**: The backbone network is replaced with the lightweight MobileNetV3 to reduce model parameters and computational complexity while maintaining inference speed.
2. **Feature Extraction**: A dedicated small target detection layer is designed to optimize feature extraction for multi-scale targets.
3. **Attention Mechanism**: The Efficient Multi-Scale Attention (EMA) mechanism is integrated into the Convolution to Feature (C2F) module to enhance the extraction of vital features and suppress superfluous ones.
4. **Bidirectional Feature Pyramid Network (BiFPN)**: Used in the Neck segment to improve detection accuracy and generalization, especially under scale variation and in complex scenes.
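To make the BiFPN contribution more concrete, the weighted fusion performed at each BiFPN node can be sketched as below. This is a minimal pure-Python illustration of the "fast normalized fusion" rule from the original BiFPN design, not the authors' implementation. The function name `fast_normalized_fusion`, the toy feature vectors, and the `eps` value are assumptions made for illustration. A real network applies this to same-resolution feature maps, with the weights learned during training.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors using non-negative, normalized weights.

    Hypothetical helper: in a real BiFPN the inputs are resized feature maps
    and the weights are learnable parameters.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU keeps every weight non-negative so the normalization stays stable.
    clipped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when every weight is clipped to zero.
    denom = eps + sum(clipped)

    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / denom
        for i in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0, 3.0]  # toy stand-in for a high-resolution feature
    deep = [3.0, 2.0, 1.0]     # toy stand-in for an upsampled deep feature
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the output is close to the element-wise average of the inputs. A negative weight is clipped to zero, so that input is ignored.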
The algorithm is evaluated on a custom-made helmet and reflective clothing dataset, achieving a mean Average Precision (mAP) of 91.9% with 7.39 million parameters and a model size of 14.5 MB. Compared to standard YOLOv8 models, the proposed algorithm improves average accuracy by 2.2 percentage points, reduces model parameters by 34%, and diminishes model size by 32%. It outperforms other prevalent detection algorithms in terms of accuracy and speed.
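The reported figures imply approximate baseline values, which the short calculation below recovers. It assumes that the 34% and 32% reductions are relative to the baseline YOLOv8 model and that the 2.2-point gain is an absolute difference in mAP. Because the published percentages are rounded, the results are estimates rather than values stated in the paper. The helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(reduced_value: float, relative_reduction: float) -> float:
    """Return the baseline value that yields `reduced_value` after the given cut."""
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return reduced_value / (1.0 - relative_reduction)


# Figures reported for YOLOv8-MPEB.
PARAMS_M = 7.39   # parameters, in millions
SIZE_MB = 14.5    # model size, in MB
MAP_PCT = 91.9    # mAP, in percent

if __name__ == "__main__":
    baseline_params = implied_baseline(PARAMS_M, 0.34)
    baseline_size = implied_baseline(SIZE_MB, 0.32)
    baseline_map = MAP_PCT - 2.2  # percentage points, not a relative change

    print(f"implied baseline parameters: {baseline_params:.1f} M")
    print(f"implied baseline model size: {baseline_size:.1f} MB")
    print(f"implied baseline mAP: {baseline_map:.1f} %")
```

Under these assumptions the baseline comes out at roughly 11.2 million parameters, 21.3 MB, and 89.7% mAP.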
The paper also includes ablation experiments and comparative studies to validate the effectiveness of the proposed algorithm. The results demonstrate that the YOLOv8-MPEB algorithm effectively reduces leakage and false detection in UAV images, making it suitable for real-time and accurate target detection in complex scenarios.