Learning both Weights and Connections for Efficient Neural Networks

30 Oct 2015 | Song Han, Jeff Pool, John Tran, William J. Dally
This paper presents a method that reduces the storage and computation required by neural networks by an order of magnitude, without affecting accuracy, by learning only the important connections. Redundant connections are pruned in a three-step process: first, the network is trained to learn which connections are important; next, unimportant connections are pruned; finally, the network is retrained to fine-tune the weights of the remaining connections. This approach reduces the number of parameters in AlexNet by 9×, from 61 million to 6.7 million, with no loss of accuracy. On VGG-16, the parameter count drops by 13×, from 138 million to 10.3 million, again with no loss of accuracy.
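The train-prune-retrain loop can be sketched in a few lines. The snippet below is a minimal illustration in Python/NumPy, not the authors' implementation: it assumes a per-layer magnitude threshold set to a quality parameter times the layer's weight standard deviation, zeroes connections below it, and returns a binary mask so pruned weights stay at zero during retraining. The `train` and `retrain_with_mask` calls in the comments are hypothetical placeholders for an ordinary training loop.

```python
import numpy as np

def prune_by_magnitude(weights, quality=1.0):
    """Zero out connections whose magnitude falls below a per-layer threshold.

    The threshold is assumed here to be `quality` times the standard deviation
    of the layer's weights. Returns the pruned weights and a binary mask
    marking the surviving connections.
    """
    threshold = quality * np.std(weights)
    mask = (np.abs(weights) > threshold).astype(weights.dtype)
    return weights * mask, mask

# Three-step pipeline (sketch, using hypothetical helpers):
#   dense_weights  = train(model)                             # 1. learn importance
#   pruned, mask   = prune_by_magnitude(dense_weights)        # 2. prune small weights
#   sparse_weights = retrain_with_mask(model, pruned, mask)   # 3. retrain survivors
# During retraining, gradients are multiplied by `mask` so that pruned
# connections remain at zero.
```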
Neural networks are computationally and memory-intensive, which makes them difficult to deploy on embedded systems. Conventional training also fixes the architecture before training begins, so training cannot improve the connectivity itself. The proposed method addresses both problems: it learns which connections are important during an initial training phase, prunes the unimportant ones, and then retrains the network to adjust the weights of the surviving connections, all while preserving the original accuracy. Because it learns both the weights and the connections, the approach is loosely analogous to how synapses are created and then pruned in the mammalian brain.

Several practical details make pruning work well. The choice of regularization during training and retraining influences which connections survive and how well accuracy recovers after pruning. The dropout ratio must be reduced for retraining because pruning shrinks the model's capacity (a sketch of this adjustment follows below). The connections that survive pruning retain the weights learned in the initial training phase rather than being re-initialized, which preserves the features the dense network already discovered. Finally, pruning and retraining can be repeated iteratively, with each round removing further connections and reducing network complexity beyond what a single aggressive pruning pass achieves.

Experiments show that the method reduces the number of parameters in AlexNet by 9× and in VGG-16 by 13× without affecting accuracy. It also lowers the computational requirements of these networks, making them more suitable for deployment on mobile devices. Overall, the results demonstrate that the method cuts both the storage and the computational cost of neural networks while maintaining their accuracy.
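For the dropout adjustment, the paper scales the retraining dropout ratio by the square root of the fraction of connections that survive pruning, since the number of possible co-adaptations a unit participates in shrinks with the connection count. The helper below is a minimal sketch of that scaling; the function and variable names are illustrative, not taken from the paper's code.

```python
import math

def adjusted_dropout_ratio(d_original, c_original, c_retained):
    """Scale the dropout ratio for retraining a pruned layer.

    Pruning lowers model capacity, so retraining uses a smaller dropout
    ratio, scaled by sqrt(retained connections / original connections).
    """
    return d_original * math.sqrt(c_retained / c_original)

# Example: a fully connected layer that keeps 9% of its connections
# retrains with roughly 0.5 * sqrt(0.09) = 0.15 dropout instead of 0.5.
print(adjusted_dropout_ratio(0.5, 1_000_000, 90_000))  # ~0.15
```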