22 March 2024 | Yang Sun, Yuhang Zhang, Haiyang Wang, Jianhua Guo, Jiushuai Zheng, Haonan Ning
The paper "SES-YOLOv8n: automatic driving object detection algorithm based on improved YOLOv8" addresses the challenges of balancing real-time detection and high accuracy in autonomous driving systems. The authors propose an enhanced object detection network, SES-YOLOv8n, which builds on the YOLOv8n model. Key improvements include:
1. **SPPCSPC Module**: Replacing the SPPF module with SPPCSPC to better integrate global and local information, enhancing the model's ability to capture features at different scales (see the SPPCSPC sketch after this list).
2. **EMA Attention Mechanism**: Introducing the efficient multi-scale attention (EMA) module into the C2f blocks of the backbone network to improve feature representation and detection accuracy.
3. **SPD-Conv Module**: Replacing part of the standard convolution modules with SPD-Conv to retain more fine-grained feature information during downsampling, improving the network's accuracy and learning ability (see the SPD-Conv sketch after this list).
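
For readers unfamiliar with these blocks, the following PyTorch sketch shows the general structure of an SPPCSPC block as popularized by YOLOv7. It is a minimal illustration under assumed defaults (channel ratio `e = 0.5`, pooling kernels `(5, 9, 13)`), not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class ConvBNSiLU(nn.Module):
    """Basic conv + batch norm + SiLU block used throughout YOLO-style models."""

    def __init__(self, c_in, c_out, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, stride=1, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class SPPCSPC(nn.Module):
    """CSP-style spatial pyramid pooling (sketch): one branch runs parallel
    max-pools with different kernel sizes to gather multi-scale (global)
    context, the other branch keeps a shortcut (local) path; both are fused
    at the end."""

    def __init__(self, c_in, c_out, e=0.5, k=(5, 9, 13)):
        super().__init__()
        c_hid = int(2 * c_out * e)  # hidden channels
        self.cv1 = ConvBNSiLU(c_in, c_hid, 1)
        self.cv2 = ConvBNSiLU(c_in, c_hid, 1)   # shortcut / local branch
        self.cv3 = ConvBNSiLU(c_hid, c_hid, 3)
        self.cv4 = ConvBNSiLU(c_hid, c_hid, 1)
        # Stride-1 max-pools with padding keep the spatial size unchanged.
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k)
        self.cv5 = ConvBNSiLU(4 * c_hid, c_hid, 1)
        self.cv6 = ConvBNSiLU(c_hid, c_hid, 3)
        self.cv7 = ConvBNSiLU(2 * c_hid, c_out, 1)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        # Concatenate the un-pooled features with the three pooling scales.
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.pools], dim=1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))
```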
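
Similarly, the SPD-Conv idea can be sketched as a space-to-depth rearrangement followed by a non-strided convolution, in the spirit of the original SPD-Conv paper. The scale factor of 2 and the 3x3 kernel are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn


class SPDConv(nn.Module):
    """Space-to-depth followed by a non-strided convolution (scale = 2, sketch).

    Downsampling is done by rearranging 2x2 spatial blocks into the channel
    dimension instead of striding, so no pixel information is discarded;
    a stride-1 3x3 convolution then mixes the expanded channels.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # After space-to-depth the channel count is multiplied by 4 (scale ** 2).
        self.conv = nn.Conv2d(4 * in_channels, out_channels, kernel_size=3,
                              stride=1, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Space-to-depth: stack the four interleaved sub-maps along channels,
        # (B, C, H, W) -> (B, 4C, H/2, W/2); assumes even H and W.
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                       x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
        return self.act(self.bn(self.conv(x)))
```

With scale 2, a (B, C, H, W) input becomes (B, 4C, H/2, W/2) after the rearrangement, so spatial resolution is halved without discarding pixels, unlike a stride-2 convolution or pooling layer.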
Experimental results on the KITTI and BDD100K datasets show that the improved model achieves a mean average precision of 92.7% and 41.9%, respectively, 3.4% and 5.0% higher than the baseline YOLOv8n model. The model processes images in real time while maintaining high detection accuracy, making it suitable for general autonomous driving scenarios.