25 Sep 2014 | Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
The paper introduces Deeply-Supervised Nets (DSN), a method that makes the features learned in the hidden layers of deep convolutional neural networks (CNNs) more transparent and more discriminative. In addition to the overall objective at the output layer, DSN attaches a "companion objective" to each hidden layer, so that the hidden layers are supervised directly during training. The aim is to improve classification performance by ensuring that the features learned at each layer are highly discriminative. The paper covers the motivation for DSN and its formulation, and provides a stochastic gradient descent (SGD) analysis to justify its effectiveness. Experiments on the MNIST, CIFAR-10, CIFAR-100, and SVHN benchmarks show significant improvements over existing methods in classification error and convergence speed, especially when training data is limited. The authors also highlight the robustness of DSN to hyperparameter choices and the intuitive nature of the learned features.
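
The idea of a companion objective can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each hidden layer has its own small classifier producing class scores, and it combines the output-layer loss with a weighted sum of the per-layer companion losses. The function names, the `alphas` weights, and the use of softmax cross-entropy are assumptions made for illustration; the summary above does not specify the loss or the weighting scheme.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def deeply_supervised_loss(output_logits, companion_logits, label, alphas):
    """Output-layer loss plus one weighted companion loss per hidden layer.

    companion_logits: one list of class scores per supervised hidden layer
                      (hypothetical auxiliary classifiers).
    alphas:           one weight per companion loss (hypothetical values).
    """
    if len(companion_logits) != len(alphas):
        raise ValueError("need one weight per companion classifier")
    output_loss = cross_entropy(output_logits, label)
    companion_loss = sum(
        alpha * cross_entropy(logits, label)
        for alpha, logits in zip(alphas, companion_logits)
    )
    return output_loss + companion_loss


# Example: a 3-class problem with two supervised hidden layers.
loss = deeply_supervised_loss(
    output_logits=[2.0, 0.5, -1.0],
    companion_logits=[[0.3, 0.2, 0.1], [1.0, 0.0, -0.5]],
    label=0,
    alphas=[0.3, 0.3],
)
print(f"combined loss: {loss:.4f}")
```

Setting every entry of `alphas` to zero recovers the ordinary output-only objective, which makes the companion terms easy to switch off when comparing against a baseline.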