PRUNING FILTERS FOR EFFICIENT CONVNETS

10 Mar 2017 | Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf
This paper presents a method for pruning filters in convolutional neural networks (CNNs) to reduce computation costs without sacrificing accuracy. The approach removes filters that have minimal impact on the output, which cuts the number of matrix multiplications and therefore the computational load. Unlike weight pruning, which introduces sparse connectivity patterns and requires sparse convolution libraries, filter pruning removes entire filters together with their corresponding feature maps, leaving a dense network that existing BLAS libraries can process efficiently. Applied to VGG-16 and ResNet-110 on CIFAR-10, the method reduces inference cost by up to 34% and 38%, respectively, while retraining restores accuracy to close to the original level.

The paper motivates filter pruning as a way to lower the computational cost of CNNs, especially for applications with limited computational resources. It compares filter pruning with weight pruning and activation-based feature-map pruning, showing that filter pruning reduces computation cost more effectively without introducing sparsity. Filters are ranked by their $ \ell_{1} $-norm, the sum of the absolute values of a filter's weights; the filters with the smallest $ \ell_{1} $-norms are pruned first, and the network is then retrained to restore accuracy.

The paper also analyzes how sensitive each layer is to pruning, finding that some layers tolerate aggressive pruning while others degrade quickly and should be pruned conservatively or left intact. To avoid lengthy iterative pruning, it proposes a one-shot prune-and-retrain strategy, which saves considerable time for deep networks. The results show that even ResNets, which already have fewer parameters and lower inference costs than AlexNet or VGGNet, admit substantial FLOP reductions without significant accuracy loss. Experiments on VGG-16, ResNet-56/110, and ResNet-34 demonstrate the approach across different network designs, and the paper concludes that filter pruning is a promising way to reduce the computational cost of CNNs while maintaining accuracy.
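The core ranking step can be illustrated with a short sketch. The code below is a minimal, hypothetical PyTorch example, not the authors' released code: it computes the $ \ell_{1} $-norm of every filter in a convolutional layer, keeps the filters with the largest norms, and rebuilds a smaller dense layer. The helper name `prune_conv_by_l1` and the pruning ratio are illustrative assumptions.

```python
import torch
import torch.nn as nn

def prune_conv_by_l1(conv, prune_ratio=0.3):
    """Return a smaller Conv2d keeping the filters with the largest L1-norms,
    plus the indices of the kept filters (needed to slice the input channels
    of the following layer and its batch-norm parameters)."""
    weight = conv.weight.data                    # shape: (out_ch, in_ch, k, k)
    l1_norms = weight.abs().sum(dim=(1, 2, 3))   # one L1-norm per filter
    n_keep = weight.size(0) - int(prune_ratio * weight.size(0))
    keep_idx = torch.argsort(l1_norms, descending=True)[:n_keep].sort().values

    new_conv = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    new_conv.weight.data = weight[keep_idx].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep_idx].clone()
    return new_conv, keep_idx
```

Because removing a filter also removes its output feature map, the kernels in the next convolutional layer that operate on that map are removed as well, which is where the additional savings in the following layer come from; the returned `keep_idx` is what would drive that second slicing step.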
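The layer-sensitivity analysis and the one-shot prune-then-retrain strategy can be sketched in the same spirit. The example below is an assumption-laden illustration rather than the paper's exact procedure: to stay shape-safe it zeroes the lowest-$ \ell_{1} $ filters instead of physically removing them, and the `evaluate` callable stands in for an ordinary validation loop that is not shown.

```python
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def mask_smallest_filters(conv, prune_ratio):
    """Zero the filters with the smallest L1-norms (a shape-safe proxy for removal)."""
    l1 = conv.weight.abs().sum(dim=(1, 2, 3))
    n_prune = int(prune_ratio * conv.weight.size(0))
    drop = torch.argsort(l1)[:n_prune]
    conv.weight[drop] = 0
    if conv.bias is not None:
        conv.bias[drop] = 0

def sensitivity_scan(model, evaluate, val_loader, ratios=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Prune each conv layer in isolation (no retraining) and record validation accuracy."""
    results = {}
    conv_names = [n for n, m in model.named_modules() if isinstance(m, nn.Conv2d)]
    for name in conv_names:
        for r in ratios:
            trial = copy.deepcopy(model)           # prune one layer of a fresh copy
            mask_smallest_filters(dict(trial.named_modules())[name], r)
            results[(name, r)] = evaluate(trial, val_loader)
    return results

def one_shot_prune(model, chosen_ratios):
    """Apply the per-layer ratios chosen from the scan in a single pass;
    sensitive layers get a ratio of 0 (left intact). Retrain afterwards."""
    modules = dict(model.named_modules())
    for name, ratio in chosen_ratios.items():
        mask_smallest_filters(modules[name], ratio)
    return model
```

After the one-shot pass, the pruned network is retrained for a fraction of the original training schedule to recover the lost accuracy, which is the time-saving alternative to pruning and retraining layer by layer.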