ISOD: improved small object detection based on extended scale feature pyramid network

Accepted: 16 February 2024 / Published online: 28 March 2024 | Ping Ma, Xinyi He, Yiyang Chen, Yuan Liu
The paper introduces an improved small object detection (ISOD) network designed to enhance the accuracy and speed of object detection at intelligent construction sites. ISOD combines an efficient channel attention (ECA) mechanism with an extended scale feature pyramid network (ESFPN). The ECA module dynamically assigns weights to channel features, which strengthens object recognition and classification. The ESFPN consists of a fusion feature transfer (FFT) module and an improved spatial pyramid pooling-fast (ISPPF) structure, which together improve small object detection by extracting precise details and enriching regional features.
ISOD outperforms the state-of-the-art YOLOv7 model on the Reflective Vest Scene Dataset (RVSD) and the Tsinghua-Tencent 100K dataset, achieving 0.425 and 0.635 mAP@0.5–0.95, respectively. The main contributions of the paper are the integration of ECA and ESFPN, the development of a large-scale reflective vest scene dataset, and the superior performance of ISOD on both datasets.
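To make the idea of dynamically weighting channel features more concrete, the following pure-Python sketch follows the general ECA recipe: global average pooling per channel, a small 1D convolution across neighbouring channels, and a sigmoid gate. It is only an illustration under stated assumptions. The function names (`eca_weights`, `apply_eca`), the fixed kernel values, and the toy feature map are hypothetical; in a real network the kernel is learned and the module operates on tensors inside the detector, and the paper's exact implementation may differ.

```python
import math


def eca_weights(feature_map, kernel):
    """Return one attention weight per channel.

    feature_map: list of C channels, each an H x W list of lists of floats.
    kernel: odd-length list of 1D convolution weights (assumed fixed here;
            learned during training in a real ECA module).
    """
    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2) 1D convolution across neighbouring channels, zero-padded so the
    #    output has the same number of channels as the input.
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    conv = [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(pooled))
    ]
    # 3) Sigmoid gate maps each response to a weight in (0, 1).
    return [1.0 / (1.0 + math.exp(-v)) for v in conv]


def apply_eca(feature_map, kernel):
    """Rescale every channel by its attention weight."""
    weights = eca_weights(feature_map, kernel)
    return [
        [[w * x for x in row] for row in ch]
        for ch, w in zip(feature_map, weights)
    ]


# Toy input (hypothetical): 3 channels of size 2 x 2.
fmap = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
weights = eca_weights(fmap, [0.0, 1.0, 0.0])  # identity kernel for clarity
reweighted = apply_eca(fmap, [0.0, 1.0, 0.0])
```

With the identity kernel, each channel's weight depends only on its own mean activation, so the channel with the strongest response receives the largest weight. A wider kernel such as `[1.0, 1.0, 1.0]` lets each weight also depend on adjacent channels, which is the local cross-channel interaction that ECA relies on.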