14 March 2024 | Ajantha Vijayakumar, Subramaniyaswamy Vairavasundaram
This paper provides a comprehensive review of the YOLO (You Only Look Once) family of object detection models, from YOLOv1 to YOLOv8. YOLO is a real-time object detection system that has become widely used because it combines high accuracy with high speed. The paper begins by explaining performance metrics, post-processing methods, dataset availability, and common object detection techniques. It then examines the architectural design of each YOLO version, highlighting improvements such as the transition from anchor boxes to bounding boxes, network model design, loss calculation, model scaling, labeling methods, and aggregation techniques. The review also discusses applications of the YOLO versions in domains such as autonomous vehicles, surveillance systems, smart cities, and healthcare. The paper aims to give a detailed understanding of the evolution and capabilities of YOLO, making it a useful resource for researchers and practitioners in computer vision and deep learning.