This paper introduces a network compression method called Dynamic Network Surgery (DNS), which reduces the complexity of deep neural networks (DNNs) by pruning connections on the fly. Unlike previous methods that prune greedily and permanently, DNS incorporates connection splicing, which recovers connections that turn out to be important after pruning, so incorrect pruning decisions can be undone and the network structure is maintained and revised continually during training. Experiments show that DNS compresses the parameters of LeNet-5 and AlexNet by 108× and 17.7×, respectively, without significant loss in accuracy, outperforming prior pruning methods such as Han et al.'s in both compression rate and learning efficiency. The paper also discusses the implementation of DNS, including the formulation of the optimization problem and the parameter importance function, and validates the approach experimentally.
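To make the prune-and-splice idea concrete, below is a minimal sketch of the surgery loop on a single linear layer. It assumes, as in the paper, that weight magnitude |w| serves as the importance measure and that two thresholds a < b govern pruning and splicing; the toy least-squares task, variable names, and hyperparameters are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))     # real-valued weights, never discarded
T = np.ones_like(W)             # binary mask: 1 = active, 0 = pruned
a, b = 0.05, 0.10               # prune below a, splice back above b (assumed values)
x = rng.normal(size=(8, 32))    # toy inputs
y = rng.normal(size=(4, 32))    # toy targets

for step in range(200):
    # Forward pass uses only the masked (active) weights.
    y_hat = (W * T) @ x
    # Gradient of a squared loss w.r.t. the masked weights.
    g = 2.0 * (y_hat - y) @ x.T / x.shape[1]
    # Key detail of DNS: ALL entries of W are updated (no multiplication
    # by T), so pruned weights keep learning and can be spliced back in
    # if their magnitude recovers.
    W -= 0.01 * g
    # Surgery step: re-derive the mask from current weight magnitudes.
    T[np.abs(W) < a] = 0.0      # pruning: importance dropped below a
    T[np.abs(W) > b] = 1.0      # splicing: importance rose above b

print(f"sparsity: {1 - T.mean():.2%}")
```

The interval between a and b acts as a hysteresis band: weights inside it keep their current pruned or active status, which prevents connections from oscillating in and out of the network at every step.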