Semi-Supervised Learning with Ladder Networks


24 Nov 2015 | Antti Rasmus, Harri Valpola, Mikko Honkala, Mathias Berglund, Tapani Raiko
This paper introduces a semi-supervised learning method that combines supervised and unsupervised learning in deep neural networks. The model is trained to minimize the sum of supervised and unsupervised cost functions by backpropagation, without requiring layer-wise pre-training. Building on the Ladder network proposed by Valpola (2015), the method extends it by integrating supervision. The resulting model achieves state-of-the-art performance in semi-supervised MNIST and CIFAR-10 classification, as well as in permutation-invariant MNIST classification with all labels.

The key properties of the approach are compatibility with standard supervised training, scalability resulting from local learning, and computational efficiency. Skip connections and layer-wise unsupervised targets turn autoencoders into hierarchical latent variable models, which are well suited to semi-supervised learning. The model is implemented with a fully connected MLP as the encoder and a decoder that performs the unsupervised reconstruction. The decoder is designed so that it can optimally denoise Gaussian latent variables: at each layer, the denoising function computes a weighted sum of the corrupted activation and a prior derived from the layer above. All parameters are trained with backpropagation to minimize the total cost, which includes both the supervised and the unsupervised terms.

Experiments on MNIST and CIFAR-10 show that the method outperforms previous semi-supervised results. The full Ladder network is also compared with the simpler Γ-model, which retains a denoising cost only at the topmost layer; the full Ladder network performs better, especially on the permutation-invariant MNIST task. Tests with convolutional networks show that even a single convolutional layer at the bottom improves results over fully connected networks, suggesting that combining the generalization ability of convolutional networks with the efficient unsupervised learning of the full Ladder network could yield even better performance.
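To make the denoising step concrete, the sketch below shows a per-layer denoising function of the kind described above: the reconstruction is a weighted sum of the corrupted activation and a prior computed from the top-down signal, with ten learnable per-unit parameters. This is a minimal NumPy illustration assuming the published "vanilla" combinator form; the layer width, batch size, and parameter initialization are illustrative, not the authors' exact implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gaussian_denoiser(z_tilde, u, a):
    """Per-layer denoising function g(z_tilde, u).

    z_tilde : corrupted encoder activation for this layer, shape (batch, width)
    u       : top-down signal from the layer above, same shape
    a       : dict of ten per-unit parameter vectors a1..a10, each shape (width,)

    The prior mu(u) and the gate v(u) depend only on the top-down signal, so the
    output is a weighted sum of the corrupted input and the prior, which can
    represent the optimal denoising of a Gaussian latent variable.
    """
    mu = a['a1'] * sigmoid(a['a2'] * u + a['a3']) + a['a4'] * u + a['a5']
    v  = a['a6'] * sigmoid(a['a7'] * u + a['a8']) + a['a9'] * u + a['a10']
    return (z_tilde - mu) * v + mu

# Illustrative usage with a hypothetical layer width of 4; in practice the
# a1..a10 vectors are learned per unit by backpropagation.
width = 4
rng = np.random.default_rng(0)
a = {f'a{i}': 0.1 * rng.standard_normal(width) for i in range(1, 11)}
z_tilde = rng.standard_normal((8, width))   # corrupted activations
u = rng.standard_normal((8, width))         # top-down signal
z_hat = gaussian_denoiser(z_tilde, u, a)    # reconstruction, shape (8, 4)
```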
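The distinction between the full Ladder network and the Γ-model comes down to which layer-wise denoising costs are active. The hedged sketch below assumes the decoder reconstructions and the clean encoder activations have already been computed; the function names and the placeholder values are hypothetical. Setting all but the topmost weight to zero gives a Γ-model-style cost, while nonzero weights on every layer give the full Ladder cost.

```python
import numpy as np

def ladder_total_cost(log_probs, labels, z_hats, z_cleans, lambdas):
    """Total cost = supervised cross-entropy + weighted layer-wise denoising costs.

    log_probs : log class probabilities from the corrupted encoder for the
                labeled examples only, shape (n_labeled, n_classes)
    labels    : integer class labels, shape (n_labeled,)
    z_hats    : list of decoder reconstructions, one array per layer
    z_cleans  : list of clean encoder activations, same shapes as z_hats
    lambdas   : per-layer weights; only the topmost nonzero -> Gamma-model-style,
                all nonzero -> full Ladder network
    """
    # Supervised part: average negative log-likelihood of the true labels.
    supervised = -np.mean(log_probs[np.arange(len(labels)), labels])

    # Unsupervised part: squared reconstruction error per layer, normalized by
    # layer width, averaged over the batch, and weighted by lambda.
    denoising = 0.0
    for lam, z_hat, z in zip(lambdas, z_hats, z_cleans):
        width = z.shape[1]
        denoising += lam * np.mean(np.sum((z_hat - z) ** 2, axis=1)) / width
    return supervised + denoising

# Minimal usage with random placeholder activations (two layers, widths 4 and 3).
rng = np.random.default_rng(0)
log_p = np.log(np.full((5, 3), 1.0 / 3.0))          # uniform class predictions
y = rng.integers(0, 3, size=5)
z_hats   = [rng.standard_normal((5, 4)), rng.standard_normal((5, 3))]
z_cleans = [rng.standard_normal((5, 4)), rng.standard_normal((5, 3))]
cost = ladder_total_cost(log_p, y, z_hats, z_cleans, lambdas=[1.0, 0.1])
```

Both terms are differentiable, which is why the encoder, decoder, and denoising parameters can all be updated with ordinary backpropagation and no layer-wise pre-training stage is needed.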