The paper "Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures" by Hengyuan Hu introduces a method called Network Trimming to optimize deep neural networks by pruning unimportant neurons. The authors observe that a significant portion of neurons in large networks have mostly zero activations, which are redundant and can be removed without affecting overall accuracy. The pruning process involves identifying these neurons based on their activation statistics (Average Percentage of Zeros, APoZ) and then retraining the network using the weights before pruning as initialization. This iterative process is repeated until the desired level of compression is achieved. Experiments on LeNet and VGG-16 networks demonstrate that Network Trimming can achieve high compression ratios (2-3x) while maintaining or improving accuracy. The method is particularly effective in reducing the number of parameters in fully connected layers and convolutional layers, leading to more efficient and accurate neural networks. The paper also discusses the necessity of proper weight initialization during the pruning and retraining process and compares the effectiveness of Network Trimming with other pruning methods, highlighting its advantages in terms of computational efficiency and performance.The paper "Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures" by Hengyuan Hu introduces a method called Network Trimming to optimize deep neural networks by pruning unimportant neurons. The authors observe that a significant portion of neurons in large networks have mostly zero activations, which are redundant and can be removed without affecting overall accuracy. The pruning process involves identifying these neurons based on their activation statistics (Average Percentage of Zeros, APoZ) and then retraining the network using the weights before pruning as initialization. This iterative process is repeated until the desired level of compression is achieved. Experiments on LeNet and VGG-16 networks demonstrate that Network Trimming can achieve high compression ratios (2-3x) while maintaining or improving accuracy. The method is particularly effective in reducing the number of parameters in fully connected layers and convolutional layers, leading to more efficient and accurate neural networks. The paper also discusses the necessity of proper weight initialization during the pruning and retraining process and compares the effectiveness of Network Trimming with other pruning methods, highlighting its advantages in terms of computational efficiency and performance.