29 Jul 2024 | Chun-Lin Ji, Tao Yu, Peng Gao, Fei Wang, Ru-Yue Yuan
YOLO-TLA is an efficient and lightweight small object detection model based on YOLOv5. It introduces a tiny-object detection layer in the neck network to improve performance on small objects, incorporates a global attention mechanism into the backbone network to strengthen feature extraction and highlight object attributes, and integrates the C3CrossCovn module into the backbone to reduce computational cost and parameter count. On the MS COCO validation dataset, YOLO-TLA achieves a 4.6% gain in mAP@0.5 and a 4% gain in mAP@0.5:0.95 over the baseline while keeping a compact size of 9.49M parameters. Extending the same design to YOLOv5m, YOLO-TLAM improves mAP@0.5 and mAP@0.5:0.95 by 1.7% and 1.9%, respectively, with 27.53M parameters in total. Comparisons with other state-of-the-art models show that YOLO-TLA delivers higher accuracy with fewer parameters and lower computational demands, making it well suited for deployment on resource-constrained systems.
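The abstract does not spell out the C3CrossCovn module's internals, but the stated goal (fewer parameters and less computation in the backbone) matches the familiar cross-convolution trick of factorizing a k×k convolution into a 1×k followed by a k×1 convolution inside a C3-style block. The sketch below illustrates that general idea in PyTorch; the class names (ConvBNSiLU, CrossConv, C3CrossConv) and exact channel/expansion choices are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed structure, not the paper's code): a C3-style block
# whose bottlenecks use factorized "cross" convolutions to cut parameters/FLOPs.
import torch
import torch.nn as nn


class ConvBNSiLU(nn.Module):
    """Conv -> BatchNorm -> SiLU, the standard building block in YOLOv5-style models."""
    def __init__(self, c_in, c_out, k=1, s=1, p=None):
        super().__init__()
        if p is None:
            p = k // 2 if isinstance(k, int) else tuple(x // 2 for x in k)
        self.conv = nn.Conv2d(c_in, c_out, k, s, p, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class CrossConv(nn.Module):
    """Factorized convolution: a 1xk conv followed by a kx1 conv.

    Versus a single kxk conv, weights drop from k*k*c_in*c_out to
    k*(c_in*c_hidden + c_hidden*c_out).
    """
    def __init__(self, c_in, c_out, k=3, e=1.0, shortcut=True):
        super().__init__()
        c_hidden = int(c_out * e)
        self.cv1 = ConvBNSiLU(c_in, c_hidden, k=(1, k))
        self.cv2 = ConvBNSiLU(c_hidden, c_out, k=(k, 1))
        self.add = shortcut and c_in == c_out

    def forward(self, x):
        y = self.cv2(self.cv1(x))
        return x + y if self.add else y


class C3CrossConv(nn.Module):
    """C3-style block with its bottlenecks replaced by CrossConv units."""
    def __init__(self, c_in, c_out, n=1, e=0.5):
        super().__init__()
        c_hidden = int(c_out * e)
        self.cv1 = ConvBNSiLU(c_in, c_hidden, k=1)
        self.cv2 = ConvBNSiLU(c_in, c_hidden, k=1)
        self.m = nn.Sequential(*(CrossConv(c_hidden, c_hidden, k=3) for _ in range(n)))
        self.cv3 = ConvBNSiLU(2 * c_hidden, c_out, k=1)

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)
    block = C3CrossConv(64, 64, n=2)
    print(block(x).shape)  # torch.Size([1, 64, 80, 80])
```

As a rough check of the intended saving, one 3×3 bottleneck conv at 32 channels costs 3·3·32·32 = 9,216 weights, while the factorized pair costs 3·32·32 + 3·32·32 = 6,144, and the gap widens as the kernel size grows.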