This paper introduces a channel pruning method for accelerating very deep convolutional neural networks (CNNs). The method uses an iterative two-step algorithm: LASSO regression selects which channels to prune, and least squares reconstruction re-fits the remaining filters to minimize reconstruction error. Applied to single layers or to the whole network, it reduces accumulated error and remains compatible with a range of architectures. The pruned VGG-16 model achieves a 5× speed-up with only a 0.3% increase in error, and modern networks such as ResNet and Xception are accelerated 2× with 1.4% and 1.0% accuracy loss, respectively. Because the approach operates at inference time, it avoids retraining and is efficient on both CPU and GPU, outperforming previous state-of-the-art approaches in both speed and accuracy.

The paper also discusses how the method is adapted to multi-branch networks such as ResNet and Xception, and evaluates its performance on ImageNet, CIFAR-10, and PASCAL VOC. The results show significant speed-ups with minimal accuracy loss, at lower cost than training-based approaches. The method is generalizable and can be combined with other techniques, such as tensor factorization and spatial factorization, for further gains. The paper concludes that the proposed method effectively accelerates deep CNNs while maintaining accuracy, and that integrating it into training time is a direction for further research.
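The two-step algorithm described above can be sketched as follows: given sampled input volumes and the layer's original responses, LASSO regression assigns a coefficient to each input channel and drives the least informative ones to zero, and least squares then re-fits the remaining filters to reproduce the original outputs. Below is a minimal NumPy/scikit-learn sketch of these two steps under simplifying assumptions; the function names (`select_channels`, `reconstruct_weights`) and the shortcut of keeping the channels with the largest coefficients, rather than sweeping the LASSO penalty until the target channel count is reached, are illustrative and not the paper's reference implementation.

```python
# A minimal sketch of the two-step channel pruning idea, assuming the layer has
# already been unrolled into per-channel response matrices. Names and shapes
# here are illustrative assumptions, not the paper's released code.
import numpy as np
from sklearn.linear_model import Lasso

def select_channels(channel_responses, original_output, n_keep, lam=1e-4):
    """Step 1: LASSO-based channel selection.

    channel_responses: list of c arrays, each (N, n) -- the contribution of
        input channel i to the layer's output over N sampled positions.
    original_output:   (N, n) array of the unpruned layer's responses.
    Returns indices of the channels to keep.
    """
    # Each channel's flattened contribution becomes one regressor column.
    A = np.stack([r.reshape(-1) for r in channel_responses], axis=1)  # (N*n, c)
    y = original_output.reshape(-1)                                   # (N*n,)
    lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    lasso.fit(A, y)
    beta = lasso.coef_
    # Simplification: keep the n_keep channels with the largest |beta|;
    # channels whose coefficient is exactly zero are dropped regardless.
    keep = np.argsort(-np.abs(beta))[:n_keep]
    return np.sort(keep[np.abs(beta[keep]) > 0])

def reconstruct_weights(pruned_inputs, original_output):
    """Step 2: least-squares reconstruction of the remaining filters.

    pruned_inputs:   (N, c_keep * kh * kw) unrolled inputs from kept channels.
    original_output: (N, n) original responses.
    Returns new weights of shape (n, c_keep * kh * kw) minimizing the
    Frobenius-norm reconstruction error.
    """
    W_new, *_ = np.linalg.lstsq(pruned_inputs, original_output, rcond=None)
    return W_new.T
```

In the paper, these two steps alternate per layer (fixing the channel coefficients while solving for weights, and vice versa); the sketch above shows only a single pass of each step.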