30 Oct 2015 | Song Han, Jeff Pool, John Tran, William J. Dally
The paper "Learning Both Weights and Connections for Efficient Neural Networks" by Song Han et al. addresses the computational and memory-intensive nature of neural networks, particularly in embedded systems. The authors propose a method to reduce the storage and computation required by neural networks by an order of magnitude without compromising accuracy. This is achieved through a three-step process: training to learn which connections are important, pruning the unimportant connections, and retraining the network to fine-tune the weights of the remaining connections. On the ImageNet dataset, AlexNet's parameters were reduced by 9× (from 61 million to 6.7 million), and VGG-16's parameters were reduced by 13× (from 138 million to 10.3 million), with no loss of accuracy.

The method leverages the sparsity of connections, similar to how synapses are pruned in mammalian brains, to optimize network performance. The paper also discusses related work, including various techniques for reducing network complexity and overfitting, and provides experimental results on different datasets and models, demonstrating the effectiveness of the proposed method.
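The train–prune–retrain loop can be sketched with simple magnitude-based pruning, which is the criterion the paper uses: weights below a threshold are removed, and the resulting mask is reapplied during retraining so pruned connections stay dead. This is a minimal NumPy illustration under assumed names (`prune_by_magnitude`, the sparsity-based threshold), not the authors' implementation:

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude.

    Returns the pruned weights and a boolean mask of surviving connections.
    """
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    # Threshold = k-th smallest magnitude; everything below it is pruned.
    threshold = np.partition(flat, k)[k] if k > 0 else 0.0
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def retrain_step(weights, grad, mask, lr=0.01):
    """One fine-tuning step: update only the surviving connections.

    Reapplying the mask after each update keeps pruned weights at zero,
    so the sparsity pattern learned in the pruning step is preserved.
    """
    return (weights - lr * grad) * mask

# Example: prune half of a small weight vector, then fine-tune the rest.
w = np.arange(1.0, 11.0)            # magnitudes 1..10
pruned, mask = prune_by_magnitude(w, 0.5)
grad = np.ones_like(w)              # stand-in gradient
updated = retrain_step(pruned, grad, mask)
```

In practice this cycle is iterated: pruning a little, retraining to recover accuracy, and repeating, which the paper reports compresses better than a single aggressive pruning pass.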