This paper challenges the common belief that training a large, over-parameterized model is necessary for obtaining an efficient network through pruning. The authors find that for structured pruning methods, training the pruned model from scratch with random initialization can match or exceed the performance of fine-tuning the inherited weights. For pruning methods with automatically discovered target architectures, the pruned architecture itself matters more than the preserved weights; the value of such methods may lie in identifying efficient structures, effectively performing implicit architecture search, rather than selecting "important" weights. This suggests that structured pruning can be useful as an architecture search paradigm. Comparing their results with the "Lottery Ticket Hypothesis," the authors find that the "winning ticket" initialization brings no improvement over random initialization in their settings. For unstructured pruning, training from scratch achieves comparable accuracy on smaller datasets but falls short on large-scale ImageNet. The authors therefore advocate more careful baseline evaluations in future research on pruning methods.
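The central comparison can be illustrated with a minimal sketch. Here, structured pruning is approximated by ranking a layer's filters by L1 norm (one common criterion; the specific layer shapes and the 50% pruning ratio are illustrative assumptions, not the paper's exact setup). The conventional pipeline would fine-tune the surviving weights; the paper's finding is that keeping only the resulting architecture and re-initializing randomly works as well or better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical trained conv layer: 8 filters of shape (in_channels=3, 3, 3).
trained_filters = rng.normal(size=(8, 3, 3, 3))

# Structured (channel) pruning: rank filters by L1 norm, keep the top half.
l1_norms = np.abs(trained_filters).reshape(8, -1).sum(axis=1)
keep = np.argsort(l1_norms)[-4:]           # indices of the 4 strongest filters
pruned_filters = trained_filters[keep]     # conventional pipeline: fine-tune these

# The paper's finding: what matters is the pruned *architecture* (4 channels),
# so the inherited weights can be discarded and replaced by a fresh random init.
scratch_filters = rng.normal(size=pruned_filters.shape)

print(pruned_filters.shape)   # same architecture either way
print(scratch_filters.shape)
```

Both variants define the same smaller network; the experiments in the paper compare training them, and find the randomly initialized one is not at a disadvantage for structured pruning.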