3 March 2024 | Na Ma, Yaxin Su, Lexin Yang, Zhongtao Li and Hongwen Yan
This study proposes YOLOv8-HD, a lightweight real-time wheat seed detection model based on the YOLOv8 framework, to improve detection accuracy and speed in wheat seed counting. The model targets seed adhesion, occlusion, and impurities, all of which reduce detection accuracy. The YOLOv8 detection head is redesigned with shared convolutional layers to reduce parameters and improve runtime speed, and the Vision Transformer with Deformable Attention mechanism is integrated into the C2f module of the backbone network to enhance feature extraction and detection accuracy. The model was evaluated on a dataset covering five scenarios: dispersed without impurities, dispersed with impurities, aggregated without impurities, aggregated with impurities, and stacked. YOLOv8-HD achieves a mean average precision (mAP) of 77.6% in stacked scenes with impurities, 9.1% higher than YOLOv8, and an average mAP of 99.3% across all scenes, 16.8% higher than YOLOv8. The model size is 6.35 MB, approximately 4/5 that of YOLOv8, its GFLOPs decrease by 16%, and its inference time of 2.86 ms (on GPU) is lower than that of YOLOv8. Compared with other mainstream networks, including YOLOv7-tiny, YOLOv8-HD performs better in mAP, model size, and runtime speed, which makes it suitable for deployment on embedded platforms and provides technical support for the development of seed counting instruments. The model's performance is further validated on a global wheat ear dataset, where it achieves a higher mAP and lower GFLOPs than the original YOLOv8.
The study concludes that YOLOv8-HD is an effective solution for wheat seed detection and counting, with potential for broader applications in agricultural technology.
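
The parameter saving from sharing convolutional layers across detection scales can be illustrated with a simple count. The sketch below is not the authors' head design: the branch layout (two 3x3 convolutions followed by a 1x1 prediction layer), the per-scale 1x1 alignment layers, and all channel widths are hypothetical values chosen only to show why one shared branch needs fewer weights than three separate ones.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a single k x k convolution."""
    return c_in * c_out * k * k + c_out


def branch_params(c_in, c_mid, c_out):
    """One prediction branch: two 3x3 convs, then a 1x1 output conv (assumed layout)."""
    return (
        conv_params(c_in, c_mid, 3)
        + conv_params(c_mid, c_mid, 3)
        + conv_params(c_mid, c_out, 1)
    )


def separate_head_params(channels, c_mid, c_out):
    """Every feature scale owns its own full branch."""
    return sum(branch_params(c, c_mid, c_out) for c in channels)


def shared_head_params(channels, c_mid, c_out):
    """A 1x1 conv per scale aligns widths; one branch is then reused by all scales."""
    align = sum(conv_params(c, c_mid, 1) for c in channels)
    return align + branch_params(c_mid, c_mid, c_out)


# Hypothetical channel widths for three feature scales; not taken from the paper.
channels = (64, 128, 256)
c_mid, c_out = 64, 65

separate = separate_head_params(channels, c_mid, c_out)
shared = shared_head_params(channels, c_mid, c_out)
print(f"separate branches: {separate} parameters")
print(f"shared branch:     {shared} parameters")
print(f"shared / separate: {shared / separate:.2f}")
```

The ratio printed here depends entirely on the assumed widths and says nothing about the 6.35 MB figure reported for the full model, which also includes the backbone and neck.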