This paper presents SEB-YOLO, an improved version of the YOLOv5 model designed for small target detection in remote sensing images. SEB-YOLO incorporates several enhancements to address the challenges posed by small targets and complex backgrounds in such images. These enhancements include:
1. **Space-to-Depth (SPD) Convolution**: This module rebuilds the backbone network with a space-to-depth layer followed by a non-strided convolution, which reduces the loss of fine-grained features during downsampling while retaining global features.
2. **ECSPP Module**: This module combines Spatial Pyramid Pooling Cross Stage Partial Conv (SPPCSPC) with the Efficient Channel Attention Network (ECA-Net) to enhance feature extraction at different scales.
3. **Bidirectional Feature Pyramid Network (Bi-FPN)**: This network uses bidirectional cross-scale connections and weighted feature fusion to improve the detection of small targets.
4. **Decoupled Head**: This head separates classification and regression tasks, improving model convergence and detection performance.
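
The space-to-depth step named in item 1 can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `space_to_depth`, the nested-list `[C][H][W]` layout, and the channel ordering are assumptions made for illustration, and the non-strided convolution that would follow is omitted. The point it shows is that downsampling by rearrangement moves every pixel into a channel instead of discarding it.

```python
def space_to_depth(x, scale=2):
    """Rearrange a [C][H][W] nested list into [C*scale*scale][H//scale][W//scale].

    Hypothetical helper for illustration only. No value is dropped: each
    input pixel is moved into one of the scale*scale output sub-maps.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    # One sub-map per (row offset, column offset) pair.
    # The channel order chosen here is an arbitrary assumption.
    for dy in range(scale):
        for dx in range(scale):
            for c in range(channels):
                out.append([row[dx::scale] for row in x[c][dy::scale]])
    return out


# A single-channel 4x4 map holding the values 0..15.
feature = [[[r * 4 + c for c in range(4)] for r in range(4)]]
packed = space_to_depth(feature)
print(len(packed), len(packed[0]), len(packed[0][0]))  # channels, height, width
```

In contrast, a stride-2 convolution or pooling layer would read the same 4x4 map but emit only a 2x2 summary per channel, which is where small-target detail tends to be lost.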
Experimental results on the NWPU VHR-10 and RSOD datasets show that the proposed SEB-YOLO model achieves higher mAP values (93.5% and 93.9%, respectively) compared to the original YOLOv5 model, demonstrating its effectiveness in detecting small targets in complex remote sensing images. The improvements in feature extraction, fusion, and detection head contribute to better accuracy and efficiency in small target detection.
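
The weighted feature fusion attributed to Bi-FPN above can be sketched as follows. The sketch uses the fast normalized fusion rule from the original Bi-FPN formulation, in which each input is scaled by a non-negative learnable weight divided by the sum of all weights plus a small epsilon. Whether SEB-YOLO uses exactly this variant is an assumption, and the function name `fast_normalized_fusion`, the flat-list feature layout, and the example weights are hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with normalized non-negative weights.

    Hypothetical helper for illustration only; features are flat lists of floats.
    """
    # Clamp weights at zero (ReLU) so the normalized coefficients stay non-negative.
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped) + eps  # eps guards against division by zero
    coeffs = [w / total for w in clipped]
    fused = [
        sum(c * f[i] for c, f in zip(coeffs, features))
        for i in range(len(features[0]))
    ]
    return fused, coeffs


# Two feature vectors standing in for a shallow and a deep pyramid level
# that have already been resized to the same shape.
p_shallow = [1.0, 2.0, 3.0]
p_deep = [3.0, 2.0, 1.0]
fused, coeffs = fast_normalized_fusion([p_shallow, p_deep], weights=[3.0, 1.0])
print(coeffs)
print(fused)
```

With the larger weight on the shallow level, the fused output stays closer to the higher-resolution features, which is the behaviour a small-target detector would want the network to be able to learn.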