25 Sep 2014 | Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
Deeply-Supervised Nets (DSN) is a method that enhances deep learning by directly supervising both the hidden layers and the output layer, improving classification performance. In addition to the objective at the output layer, the method introduces a "companion objective" at each hidden layer to enforce discriminative and transparent feature learning. This approach addresses issues such as reduced transparency of intermediate features, training difficulty due to exploding or vanishing gradients, and the need for extensive hyperparameter tuning. DSN is trained with stochastic gradient descent and supports both SVM (squared hinge) and Softmax objectives for classification. The method is effective for training deep neural networks, especially with small datasets, and shows significant performance improvements on benchmark datasets including MNIST, CIFAR-10, CIFAR-100, and SVHN, where it outperforms existing methods and achieves state-of-the-art error rates. The formulation also accommodates complementary techniques such as model averaging, DropConnect, and Maxout. Theoretical analysis indicates that DSN improves convergence rates and generalization. The method is efficient to train and does not require complex engineering tricks. Direct supervision of the hidden layers leads to faster convergence and more discriminative features, making DSN a promising approach in deep learning.
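To make the companion-objective idea concrete, below is a minimal sketch (not the authors' implementation) of a small convolutional network with an auxiliary classifier attached to each hidden block, trained with a weighted sum of the output loss and the companion losses. The class and function names, the use of global average pooling to feed the companion classifiers, and the fixed companion weight `alpha` are all illustrative assumptions; the paper's main experiments use a squared hinge (SVM) objective and decay the companion weights during training, while cross-entropy here stands in for the Softmax variant the abstract mentions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedNet(nn.Module):
    """Illustrative conv net with a companion classifier per hidden block (names assumed)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        # Companion classifiers attached directly to the hidden-layer features.
        self.aux1 = nn.Linear(32, num_classes)
        self.aux2 = nn.Linear(64, num_classes)
        # Final output classifier.
        self.out = nn.Linear(64, num_classes)

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        # Global average pooling before each linear classifier (a simplifying assumption).
        p1 = F.adaptive_avg_pool2d(h1, 1).flatten(1)
        p2 = F.adaptive_avg_pool2d(h2, 1).flatten(1)
        return self.out(p2), self.aux1(p1), self.aux2(p2)

def dsn_loss(outputs, target, alpha=0.3):
    """Output objective plus weighted companion objectives (softmax/cross-entropy variant)."""
    final_logits, *companion_logits = outputs
    loss = F.cross_entropy(final_logits, target)
    for logits in companion_logits:
        loss = loss + alpha * F.cross_entropy(logits, target)
    return loss

# Usage sketch: one SGD step on a dummy batch.
model = DeeplySupervisedNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = dsn_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

At test time only the final classifier's prediction is used; the companion classifiers act purely as training-time regularizers that push each hidden layer toward discriminative features.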