ISOD: improved small object detection based on extended scale feature pyramid network

28 March 2024 | Ping Ma¹, Xinyi He¹, Yiyang Chen², Yuan Liu¹
The paper presents ISOD, an improved small object detection network built on an extended scale feature pyramid network (ESFPN), aimed at detecting small objects on construction sites. ISOD integrates an efficient channel attention (ECA) mechanism into the backbone network for feature extraction and pairs it with ESFPN, which simplifies computation and adds high-resolution pyramid layers to improve the detection of small targets. The ECA module dynamically assigns weights to channel features, selectively retaining the information useful for object recognition. ESFPN consists of two components: a fusion feature transfer (FFT) module and an improved spatial pyramid pooling-fast (ISPPF) structure. The FFT module extracts precise details from reliable regions while preventing noise transmission, and the ISPPF structure improves processing speed through efficient pyramid pooling.
The paper also introduces the Reflective Vest Scene Dataset (RVSD), a large-scale dataset with 6,036 images and 13,488 annotation instances. In comparative experiments on RVSD and the Tsinghua-Tencent 100K dataset, ISOD achieves mAP@0.5–0.95 of 0.425 and 0.635, respectively, outperforming YOLOv5 and the state-of-the-art YOLOv7 model. The main contributions are threefold: the integration of ECA and ESFPN for improved small object detection, the development of ESFPN with FFT and ISPPF for enhanced feature representation and processing speed, and the creation of the large-scale RVSD dataset for small object detection.
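To make the channel-weighting idea concrete, the following is a minimal sketch of an ECA-style gate in plain Python. It follows the general ECA recipe (global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid, then rescaling) and is not the authors' implementation. The function name `eca`, the nested-list tensor layout, and the kernel values are assumptions chosen for illustration; in a trained network the kernel weights are learned.

```python
import math


def eca(feature_map, kernel):
    """ECA-style channel reweighting (illustrative sketch).

    feature_map: list of C channels, each an H x W list of lists of floats.
    kernel: odd-length list of 1D convolution weights (hypothetical values;
            learned during training in a real network).
    Returns the rescaled feature map and the per-channel weights.
    """
    channels = len(feature_map)

    # 1. Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # 2. 1D convolution across neighbouring channels, with zero padding.
    half = len(kernel) // 2
    mixed = []
    for c in range(channels):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - half
            if 0 <= idx < channels:
                acc += w * pooled[idx]
        mixed.append(acc)

    # 3. Sigmoid turns each mixed descriptor into a weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4. Rescale every value in a channel by that channel's weight.
    scaled = [
        [[v * weights[c] for v in row] for row in ch]
        for c, ch in enumerate(feature_map)
    ]
    return scaled, weights
```

Called on a three-channel map with an identity kernel such as `[0.0, 1.0, 0.0]`, the gate gives channels with larger mean activation a larger weight, which is the "selectively retaining information" behaviour described above. A non-trivial kernel additionally lets each channel's weight depend on its neighbours.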
The paper is structured as follows: Section 2 provides an overview of related works, Section 3 presents the model structure of ISOD, Section 4 describes comparative and ablation experiments, and Section 5 concludes the paper and outlines future research directions.