22 Feb 2021 | Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao
Scaled-YOLOv4 is a model scaling approach that extends the YOLOv4 object detector by adjusting network depth, width, input resolution, and structure. It lets YOLOv4 scale both up and down while keeping a strong accuracy/speed trade-off across devices. The YOLOv4-large model achieves 55.5% AP (73.4% AP50) on the MS COCO dataset at roughly 16 FPS on a Tesla V100 GPU, while YOLOv4-tiny achieves 22.0% AP (42.0% AP50) at 443 FPS on an RTX 2080 Ti GPU; with TensorRT (batch size 4, FP16), YOLOv4-tiny reaches 1774 FPS.
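To make the depth/width/resolution knobs concrete, below is a minimal sketch, assuming a PyTorch-style configuration convention, of how width and depth multipliers can be applied to a base stage list. The helper names (`scale_channels`, `scale_depth`), the base configuration, and the multiplier values are illustrative, not the paper's exact scaling rules.

```python
import math

def scale_channels(base_channels, width_multiple, divisor=8):
    """Scale a layer's channel count by the width multiplier,
    rounding up to a hardware-friendly multiple of `divisor`."""
    return max(divisor, int(math.ceil(base_channels * width_multiple / divisor)) * divisor)

def scale_depth(base_repeats, depth_multiple):
    """Scale the number of repeated blocks in a stage by the depth multiplier."""
    return max(1, round(base_repeats * depth_multiple))

# Hypothetical base configuration: (out_channels, repeated_blocks) per backbone stage.
base_stages = [(64, 1), (128, 3), (256, 15), (512, 15), (1024, 7)]

# Illustrative multipliers: values < 1.0 scale the model down, > 1.0 scale it up.
width_multiple, depth_multiple = 1.25, 1.33

scaled_stages = [
    (scale_channels(c, width_multiple), scale_depth(n, depth_multiple))
    for c, n in base_stages
]
print(scaled_stages)
```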
The paper introduces scaled-YOLOv4, which systematically balances computation cost and memory bandwidth when scaling small models down, and optimizes network depth, width, and stage structure when scaling large models up for high-end GPUs. It analyzes how the scaling factors interact and uses that analysis to choose the best speed/accuracy trade-off. The design also includes CSP-ized models, in which Cross Stage Partial connections cut redundant computation and improve inference speed. YOLOv4-tiny targets low-end and embedded devices, while YOLOv4-large targets cloud and high-end GPUs, so the family runs in real time across a range of hardware.
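As a rough illustration of the CSP idea (split the feature map into two channel groups, run only one group through the heavy computation, then merge), here is a minimal PyTorch sketch; the class name `CSPBlock` and the specific layer choices are hypothetical and do not reproduce the authors' exact CSP-ized blocks.

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Sketch of a Cross Stage Partial block: the input is projected into two
    channel halves, only one half passes through the repeated 3x3 convolutions,
    and the halves are concatenated and fused at the end."""

    def __init__(self, channels, num_bottlenecks=2):
        super().__init__()
        half = channels // 2
        self.split_a = nn.Conv2d(channels, half, kernel_size=1, bias=False)  # heavy path
        self.split_b = nn.Conv2d(channels, half, kernel_size=1, bias=False)  # cheap path
        self.blocks = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(half, half, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(half),
                nn.SiLU(),
            )
            for _ in range(num_bottlenecks)
        ])
        self.merge = nn.Conv2d(2 * half, channels, kernel_size=1, bias=False)

    def forward(self, x):
        a = self.blocks(self.split_a(x))   # half the channels do the expensive work
        b = self.split_b(x)                # the other half bypasses it
        return self.merge(torch.cat([a, b], dim=1))

# Roughly half the channels skip the 3x3 convolutions, which is where the savings come from.
x = torch.randn(1, 64, 32, 32)
print(CSPBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```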
Experiments show that scaled-YOLOv4 outperforms other state-of-the-art models in terms of accuracy and speed. The YOLOv4-large model achieves 56.0% AP (73.3% AP50) with test-time augmentation, and YOLOv4-tiny achieves real-time performance on embedded systems. The paper also demonstrates that scaled-YOLOv4 can act as a "once-for-all" model, capable of adapting to different input resolutions and performing well on various devices. The results confirm that scaled-YOLOv4 achieves the highest accuracy and speed on the COCO dataset, making it a powerful tool for real-time object detection.
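Because the detector is fully convolutional, the same weights can be evaluated at several input resolutions. The sketch below, with the hypothetical helper `run_at_resolution` and a stand-in model, illustrates that usage pattern rather than the paper's exact once-for-all evaluation protocol; larger inputs trade speed for accuracy.

```python
import torch

def run_at_resolution(model, image, size):
    """Resize the input to the requested square resolution and run the same
    weights without retraining. `model` is any fully convolutional detector;
    the names here are illustrative."""
    resized = torch.nn.functional.interpolate(
        image, size=(size, size), mode="bilinear", align_corners=False
    )
    with torch.no_grad():
        return model(resized)

# Stand-in for a detector, used only to make the example self-contained.
model = torch.nn.Conv2d(3, 255, kernel_size=3, padding=1)
image = torch.randn(1, 3, 640, 640)
for size in (416, 512, 640):
    out = run_at_resolution(model, image, size)
    print(size, tuple(out.shape))
```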