4 January 2024 | Qiuli Liu, Haixiong Ye, Shiming Wang and Zhe Xu
This paper presents YOLOv8-CB, an improved lightweight multi-scale pedestrian detection algorithm for vehicle-mounted cameras. It targets three problems that existing detectors face in complex scenes such as intersections: high computational load, complex models, and suboptimal detection accuracy for small targets and highly occluded pedestrians. YOLOv8-CB introduces a lightweight cascade fusion network (CFNet) and a CBAM attention module to enrich the semantic and location information carried by multi-scale features, and it adds a bidirectional weighted feature fusion path (BiFPN) to improve detection performance. Compared with YOLOv8n, YOLOv8-CB achieves a 2.4% increase in accuracy, a 6.45% reduction in model parameters, and a 6.74% reduction in computational load. Inference takes 10.8 ms per image, which makes the algorithm suitable for dense pedestrian detection in urban streets and intersections. The algorithm is validated through extensive experiments on the WiderPerson dataset and compared with other popular object detection algorithms, where it shows better accuracy and efficiency.
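The weighted fusion idea behind a BiFPN path can be illustrated with a minimal sketch. The abstract does not say which fusion rule YOLOv8-CB uses, so the code below assumes the "fast normalized fusion" rule from the original BiFPN design (EfficientDet): each input feature gets a learnable scalar weight, the weights are passed through ReLU so they stay non-negative, and they are then normalized by their sum plus a small epsilon. The function name `fast_normalized_fusion`, the flat-list feature representation, and the example values are hypothetical and chosen only for illustration; a real implementation would operate on multi-channel tensors of matching resolution.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors using normalized, non-negative weights.

    Assumption: this follows the fast normalized fusion rule from the
    original BiFPN design, not necessarily the exact rule in YOLOv8-CB.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same size")

    # ReLU keeps every (learnable) weight non-negative.
    relu_w = [max(0.0, w) for w in weights]

    # Normalize so the weights sum to slightly less than 1; eps avoids
    # division by zero when every weight has been clipped to 0.
    total = sum(relu_w) + eps
    norm_w = [w / total for w in relu_w]

    # Weighted element-wise sum of the input features.
    fused = [
        sum(w * f[i] for w, f in zip(norm_w, features))
        for i in range(length)
    ]
    return fused, norm_w


if __name__ == "__main__":
    # Two toy "feature maps" flattened to vectors (hypothetical values).
    shallow = [1.0, 2.0]
    deep = [3.0, 4.0]
    fused, norm_w = fast_normalized_fusion([shallow, deep], [1.0, 3.0])
    print(norm_w, fused)
```

With weights of 1.0 and 3.0, the second input contributes roughly three times as much as the first, so the fused values sit closer to the second feature. In a trained network these weights are learned, which lets the model decide how much each scale contributes at every fusion node.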