This paper introduces deep networks with stochastic depth, a training procedure that trains short networks during learning while retaining the performance of deep networks at test time. During training, each mini-batch randomly drops a subset of layers, bypassing them with the identity function, which both shortens training time and lowers test error. Applied to residual networks, the method makes it practical to scale architectures beyond 1200 layers while remaining effective. Stochastic depth complements existing techniques such as dropout and batch normalization, acting as a regularizer that improves generalization. Experiments show substantial reductions in training time and test error on CIFAR-10, CIFAR-100, SVHN, and ImageNet. The method also allows extremely deep networks (e.g., 1202 layers) to be trained without overfitting, reaching a 4.91% test error on CIFAR-10. Results indicate the method is stable across hyperparameter settings, including a linear decay rule for the per-layer survival probabilities. The approach is simple and practical, and it has the potential to become a valuable tool for training very deep models.
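To make the mechanism concrete, here is a minimal PyTorch sketch of one residual block under stochastic depth, assuming the paper's linear decay rule p_l = 1 - (l/L)(1 - p_L) with p_L = 0.5. The block internals (a simple conv-BN-ReLU stack) and all names such as `StochasticDepthBlock` are illustrative stand-ins, not the authors' implementation.

```python
import torch
import torch.nn as nn


def survival_prob(l: int, L: int, p_L: float = 0.5) -> float:
    """Linear decay rule: p_l = 1 - (l / L) * (1 - p_L)."""
    return 1.0 - (l / L) * (1.0 - p_L)


class StochasticDepthBlock(nn.Module):
    """A residual block that is randomly bypassed during training.

    Illustrative sketch: the residual branch is a simplified
    stand-in for a real ResNet block.
    """

    def __init__(self, channels: int, p_survive: float):
        super().__init__()
        self.p_survive = p_survive
        self.residual = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Bernoulli gate: with probability 1 - p_survive the whole
            # residual branch is skipped and the block is the identity.
            if torch.rand(1).item() < self.p_survive:
                return torch.relu(x + self.residual(x))
            return x
        # At test time every block is active, and the residual branch
        # is scaled by its survival probability.
        return torch.relu(x + self.p_survive * self.residual(x))


# Usage: assign each of L blocks its survival probability via linear decay.
L = 54
blocks = nn.Sequential(
    *[StochasticDepthBlock(16, survival_prob(l, L)) for l in range(1, L + 1)]
)
```

Under this linear decay with p_L = 0.5, the expected number of active blocks per mini-batch is roughly 3L/4, which is consistent with the training-time savings the summary mentions.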