14 March 2024 | Ajantha Vijayakumar¹ · Subramaniyaswamy Vairavasundaram¹
This paper provides a comprehensive review of YOLO-based object detection models, focusing on their performance, architecture, and applications. YOLO (You Only Look Once) is a single-stage object detection model that has gained significant popularity due to its high accuracy and fast inference speed. Since its introduction in 2015, YOLO has evolved through several versions, with the latest being YOLOv8, released in January 2023. The paper discusses the performance metrics used in object detection, post-processing methods, dataset availability, and common detection techniques. It then provides an in-depth analysis of the architectural design of each YOLO version, highlighting their contributions to various applications.
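The post-processing step most commonly associated with YOLO-style detectors is non-maximum suppression (NMS), which removes duplicate detections of the same object by keeping only the highest-scoring box among heavily overlapping candidates. The sketch below is a minimal, illustrative implementation of greedy NMS using Intersection-over-Union (IoU); the function names and the 0.5 overlap threshold are assumptions for demonstration, not taken from the paper.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, each in (x1, y1, x2, y2) form."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring box, drop overlapping ones."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        overlaps = iou(boxes[i], boxes[rest])
        order = rest[overlaps <= iou_thresh]  # discard boxes overlapping the kept one
    return keep
```

For example, given two near-duplicate boxes and one distant box, NMS keeps the higher-scoring duplicate and the distant box while suppressing the other.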
YOLO has become a central object detection model, particularly in real-time environments, due to its impressive accuracy and speed. Object detection is a vital component of computer vision research, enabling the identification of object categories and the localization of objects within an image. Over the years, there have been notable advancements in object detection algorithms, especially those using deep convolutional neural networks (CNNs), leading to a shift away from traditional methods such as Viola-Jones, HOG (Histogram of Oriented Gradients), and DPM (Deformable Part Models).
The paper discusses the differences between single-stage and two-stage object detection algorithms. Single-stage detectors predict object locations and classes in a single forward pass over the image, while two-stage detectors first generate region proposals and then classify and refine them, trading inference speed for a more complex pipeline. YOLO, with its one-shot detection approach, significantly improved the speed of object detection by dividing the image into a grid and making predictions within each grid cell.
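The grid-based prediction scheme described above can be sketched as a decoding step over the network's output tensor. The snippet below follows the original YOLOv1-style layout, where each of the S x S cells predicts B boxes of (x, y, w, h, confidence) plus C shared class scores; the exact tensor layout, parameter names, and threshold here are illustrative assumptions, not a definitive specification from the paper.

```python
import numpy as np

def decode_grid(pred, img_size=448, S=7, B=2, C=20, conf_thresh=0.25):
    """Decode a YOLOv1-style S x S x (B*5 + C) prediction tensor into boxes.

    Assumed layout per cell: B blocks of (x, y, w, h, conf) followed by C
    class scores. (x, y) are offsets within the cell; (w, h) are fractions
    of the whole image. Returns (x1, y1, x2, y2, score, class) tuples.
    """
    cell = img_size / S
    detections = []
    for row in range(S):
        for col in range(S):
            class_probs = pred[row, col, B * 5:]
            cls = int(np.argmax(class_probs))  # one class prediction per cell
            for b in range(B):
                x, y, w, h, conf = pred[row, col, b * 5:b * 5 + 5]
                score = conf * class_probs[cls]  # class-conditional confidence
                if score < conf_thresh:
                    continue
                cx = (col + x) * cell            # box centre in pixels
                cy = (row + y) * cell
                bw, bh = w * img_size, h * img_size
                detections.append((cx - bw / 2, cy - bh / 2,
                                   cx + bw / 2, cy + bh / 2,
                                   float(score), cls))
    return detections
```

Because every cell is decoded in one pass over a fixed-size tensor, inference cost is independent of the number of objects in the scene, which is the source of YOLO's speed advantage over proposal-based two-stage detectors.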
The YOLO family has been continuously improved, with each version addressing detection issues and enhancing performance. This review provides a detailed analysis of all YOLO versions from YOLOv1 to YOLOv8, highlighting the main improvements, modifications, and innovations in each version. The paper also discusses how YOLO versions are applied in various domains, such as autonomous vehicles, surveillance systems, and healthcare, showcasing their versatility and effectiveness in addressing diverse visual recognition challenges.