November 28, 2019 | Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, Jun-Wei Hsieh
The paper proposes the Cross Stage Partial Network (CSPNet), a backbone design intended to strengthen the learning capability of Convolutional Neural Networks (CNNs) while reducing computational cost and maintaining or improving accuracy. CSPNet addresses the issue of redundant gradient information in network optimization by integrating feature maps from the beginning and the end of a network stage. This approach reduces computation by 20% on the ImageNet dataset and significantly outperforms state-of-the-art methods on the MS COCO object detection dataset in terms of AP50. CSPNet is easy to implement and can be applied to architectures based on ResNet, ResNeXt, and DenseNet. It also reduces memory usage and computational bottlenecks, making it suitable for deployment on CPUs and mobile GPUs. The paper further introduces the Exact Fusion Model (EFM), which improves feature aggregation and reduces memory bandwidth requirements. Experiments show that CSPNet achieves high accuracy and inference speed on both GPU and CPU platforms, outperforming existing methods in computational efficiency and performance. Together, CSPNet and EFM significantly enhance the learning ability of CNNs, enabling efficient and accurate object detection on mobile and edge computing devices.
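The idea of integrating feature maps from the beginning and the end of a stage can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the names `cross_stage_partial`, `conv_macs`, `double`, and `bypass_ratio`, the 50/50 channel split, and the layer sizes are all assumptions made for illustration, and the transition layers of a real CSP stage are omitted. The sketch splits the channels of a feature map, sends only one part through the stage, concatenates both parts at the end, and counts multiply-accumulates for a single convolution on the full versus the partial channel set.

```python
from typing import Callable, List

FeatureMap = List[List[float]]  # channels, each flattened to a list of values


def cross_stage_partial(
    x: FeatureMap,
    stage: Callable[[FeatureMap], FeatureMap],
    bypass_ratio: float = 0.5,  # assumed split ratio, purely illustrative
) -> FeatureMap:
    """Send only part of the channels through `stage`, then merge both parts."""
    n_bypass = int(len(x) * bypass_ratio)
    bypass, partial = x[:n_bypass], x[n_bypass:]
    # The bypassed channels skip the stage and rejoin its output at the end.
    return bypass + stage(partial)


def conv_macs(c_in: int, c_out: int, height: int, width: int, kernel: int = 3) -> int:
    """Multiply-accumulates of one dense k x k convolution (stride 1, same padding)."""
    return c_in * c_out * kernel * kernel * height * width


def double(channels: FeatureMap) -> FeatureMap:
    # Hypothetical stand-in for the real stage (e.g. a dense or residual block).
    return [[2.0 * v for v in ch] for ch in channels]


x = [[1.0], [2.0], [3.0], [4.0]]        # four one-pixel channels
y = cross_stage_partial(x, double)       # first two bypass, last two are doubled

full = conv_macs(64, 64, 56, 56)         # one layer applied to all 64 channels
half = conv_macs(32, 32, 56, 56)         # same layer on a 32-channel partial path
```

In this toy count, halving the channels cuts that single layer's multiply-accumulates to a quarter. Because the transition layers and the rest of the network are not counted, the number is not a reproduction of the 20% reduction reported for ImageNet.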