22 Feb 2021 | Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao
Scaled-YOLOv4 is a model scaling approach that extends the YOLOv4 object detector by adjusting network depth, width, input resolution, and structure. It lets YOLOv4 scale both up and down while keeping a strong accuracy/speed trade-off across devices. The YOLOv4-large model achieves 55.5% AP (73.4% AP50) on the MS COCO dataset at roughly 16 FPS on a Tesla V100 GPU, while YOLOv4-tiny achieves 22.0% AP (42.0% AP50) at 443 FPS on an RTX 2080 Ti GPU; with TensorRT (batch size 4, FP16), YOLOv4-tiny reaches 1774 FPS.
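To make the depth/width/resolution knobs concrete, below is a minimal sketch, assuming a PyTorch-style configuration convention, of how width and depth multipliers can be applied to a base stage list. The helper names (`scale_channels`, `scale_depth`), the base configuration, and the multiplier values are illustrative, not the paper's exact scaling rules.

```python
import math

def scale_channels(base_channels, width_multiple, divisor=8):
    """Scale a layer's channel count by the width multiplier,
    rounding up to a hardware-friendly multiple of `divisor`."""
    return max(divisor, int(math.ceil(base_channels * width_multiple / divisor)) * divisor)

def scale_depth(base_repeats, depth_multiple):
    """Scale the number of repeated blocks in a stage by the depth multiplier."""
    return max(1, round(base_repeats * depth_multiple))

# Hypothetical base configuration: (out_channels, repeated_blocks) per backbone stage.
base_stages = [(64, 1), (128, 3), (256, 15), (512, 15), (1024, 7)]

# Illustrative multipliers: values < 1.0 scale the model down, > 1.0 scale it up.
width_multiple, depth_multiple = 1.25, 1.33

scaled_stages = [
    (scale_channels(c, width_multiple), scale_depth(n, depth_multiple))
    for c, n in base_stages
]
print(scaled_stages)
```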
The paper introduces scaled-YOLOv4, which systematically balances computation cost and memory bandwidth when scaling small models down, and optimizes network depth, width, and stage structure when scaling large models up for high-end GPUs. It analyzes how the scaling factors interact and uses that analysis to choose the best speed/accuracy trade-off. The design also includes CSP-ized models, in which Cross Stage Partial connections cut redundant computation and improve inference speed. YOLOv4-tiny targets low-end and embedded devices, while YOLOv4-large targets cloud and high-end GPUs, so the family runs in real time across a range of hardware.
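As a rough illustration of the CSP idea (split the feature map into two channel groups, run only one group through the heavy computation, then merge), here is a minimal PyTorch sketch; the class name `CSPBlock` and the specific layer choices are hypothetical and do not reproduce the authors' exact CSP-ized blocks.

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Sketch of a Cross Stage Partial block: the input is projected into two
    channel halves, only one half passes through the repeated 3x3 convolutions,
    and the halves are concatenated and fused at the end."""

    def __init__(self, channels, num_bottlenecks=2):
        super().__init__()
        half = channels // 2
        self.split_a = nn.Conv2d(channels, half, kernel_size=1, bias=False)  # heavy path
        self.split_b = nn.Conv2d(channels, half, kernel_size=1, bias=False)  # cheap path
        self.blocks = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(half, half, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(half),
                nn.SiLU(),
            )
            for _ in range(num_bottlenecks)
        ])
        self.merge = nn.Conv2d(2 * half, channels, kernel_size=1, bias=False)

    def forward(self, x):
        a = self.blocks(self.split_a(x))   # half the channels do the expensive work
        b = self.split_b(x)                # the other half bypasses it
        return self.merge(torch.cat([a, b], dim=1))

# Roughly half the channels skip the 3x3 convolutions, which is where the savings come from.
x = torch.randn(1, 64, 32, 32)
print(CSPBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```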
Experiments show that scaled-YOLOv4 outperforms other state-of-the-art models in terms of accuracy and speed. The YOLOv4-large model achieves 56.0% AP (73.3% AP50) with test-time augmentation, and YOLOv4-tiny achieves real-time performance on embedded systems. The paper also demonstrates that scaled-YOLOv4 can act as a "once-for-all" model, capable of adapting to different input resolutions and performing well on various devices. The results confirm that scaled-YOLOv4 achieves the highest accuracy and speed on the COCO dataset, making it a powerful tool for real-time object detection.
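Because the detector is fully convolutional, the same weights can be evaluated at several input resolutions. The sketch below, with the hypothetical helper `run_at_resolution` and a stand-in model, illustrates that usage pattern rather than the paper's exact once-for-all evaluation protocol; larger inputs trade speed for accuracy.

```python
import torch

def run_at_resolution(model, image, size):
    """Resize the input to the requested square resolution and run the same
    weights without retraining. `model` is any fully convolutional detector;
    the names here are illustrative."""
    resized = torch.nn.functional.interpolate(
        image, size=(size, size), mode="bilinear", align_corners=False
    )
    with torch.no_grad():
        return model(resized)

# Stand-in for a detector, used only to make the example self-contained.
model = torch.nn.Conv2d(3, 255, kernel_size=3, padding=1)
image = torch.randn(1, 3, 640, 640)
for size in (416, 512, 640):
    out = run_at_resolution(model, image, size)
    print(size, tuple(out.shape))
```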