10 Mar 2017 | Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf
The paper "Pruning Filters for Efficient ConvNets" by Hao Li addresses the challenge of reducing the computational and storage costs of Convolutional Neural Networks (CNNs) without compromising accuracy. Traditional methods, such as magnitude-based pruning of weights, often remove a significant number of parameters from fully connected layers, which may not effectively reduce computation costs in convolutional layers due to irregular sparsity. The authors propose a novel approach that prunes filters from CNNs, which are identified as having a small effect on output accuracy. By removing entire filters along with their connecting feature maps, the computation costs are significantly reduced. This method does not introduce sparse connectivity patterns, making it compatible with existing efficient BLAS libraries for dense matrix multiplications.
The paper demonstrates that this simple filter pruning technique reduces inference cost by up to 34% for VGG-16 and up to 38% for ResNet-110 on the CIFAR-10 dataset while regaining close to the original accuracy through retraining. The authors also introduce a one-shot prune-and-retrain strategy that prunes filters across multiple layers at once, saving retraining time, which is particularly beneficial for deep networks. They analyze the sensitivity of each layer to pruning and use that analysis to set per-layer pruning rates. The method is validated on VGG-16 and ResNet-56/110 (CIFAR-10) and ResNet-34 (ImageNet), showing significant reductions in FLOPs with minimal accuracy loss. The paper concludes by discussing the advantages of the approach over other pruning methods and its potential for guiding more efficient network architectures.
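To gauge how aggressively each layer can be pruned, the paper prunes layers one at a time at increasing ratios and measures the accuracy drop without retraining. The sketch below illustrates such a sensitivity analysis under stated assumptions: `evaluate` is a placeholder for a user-supplied validation routine, and filters are zeroed out rather than physically removed, which is sufficient for measuring sensitivity though not for realizing actual speedups.

```python
import copy
import torch
import torch.nn as nn

def layer_sensitivity(model: nn.Module, evaluate, ratios=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Prune each conv layer independently at several ratios (no retraining)
    and record the accuracy drop, mirroring the paper's sensitivity analysis.

    `evaluate(model)` is assumed to return validation accuracy.
    """
    baseline = evaluate(model)
    sensitivity = {}
    for name, module in model.named_modules():
        if not isinstance(module, nn.Conv2d):
            continue
        drops = []
        for ratio in ratios:
            trial = copy.deepcopy(model)                  # prune one layer at a time
            conv = dict(trial.named_modules())[name]
            num_prune = int(ratio * conv.out_channels)
            l1 = conv.weight.abs().sum(dim=(1, 2, 3))     # smallest-L1 filters go first
            drop_idx = torch.argsort(l1)[:num_prune]
            with torch.no_grad():
                conv.weight[drop_idx] = 0.0               # masking approximates removal
                if conv.bias is not None:
                    conv.bias[drop_idx] = 0.0
            drops.append(baseline - evaluate(trial))
        sensitivity[name] = drops
    return sensitivity
```

Layers whose accuracy falls steeply at small ratios are treated as sensitive and pruned lightly or skipped, while robust layers can be pruned more aggressively before the single retraining pass.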