Greedy Layer-Wise Training of Deep Networks

Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle
This paper introduces a greedy layer-wise training algorithm for deep belief networks (DBNs), deep architectures with multiple layers of hidden variables. The algorithm is designed to overcome the optimization difficulties typically encountered when training deep networks. The key idea is to pre-train each layer with unsupervised learning, which initializes the weights in a region near a good local minimum and leads to better generalization; a supervised fine-tuning step then optimizes the whole network for the ultimate task.

The paper discusses the theoretical advantages of deep architectures over shallow ones, particularly in terms of computational efficiency and the ability to represent highly non-linear functions, and it addresses the main obstacles to training deep networks: the difficulty of the optimization problem and the need for a good initialization.
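To make the procedure concrete, here is a minimal NumPy sketch of greedy layer-wise pre-training with binary RBMs trained by one-step contrastive divergence (CD-1). It illustrates the general recipe rather than the authors' implementation; the class and function names, hyperparameters, and the full-batch training loop are assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary-binary restricted Boltzmann machine trained with CD-1 (sketch)."""

    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        # p(h_j = 1 | v) for each hidden unit
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        # p(v_i = 1 | h) for each visible unit
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_update(self, v0):
        """One contrastive-divergence step on a mini-batch v0 (rows = examples)."""
        h0 = self.hidden_probs(v0)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)   # one-step reconstruction
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)

def greedy_pretrain(data, hidden_sizes, epochs=10):
    """Train one RBM per layer, each on the representation produced by the layer below."""
    rbms, x = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(x.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_update(x)
        x = rbm.hidden_probs(x)  # propagate upward as input to the next layer
        rbms.append(rbm)
    return rbms
```

After pre-training, the weight matrices and hidden biases of the stacked RBMs would initialize the hidden layers of a feed-forward network, with a randomly initialized output layer added on top; the whole network is then fine-tuned by gradient descent on the supervised objective.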
The greedy layer-wise strategy is shown to initialize the network layers with meaningful representations that act as high-level abstractions of the input data. The paper also extends restricted Boltzmann machines (RBMs) to handle continuous-valued inputs, exploring Gaussian and exponential units that allow more flexible modeling of continuous data and lead to better predictive models (a variant with Gaussian visible units is sketched at the end of this summary).

Experiments on classification and regression tasks show that greedy layer-wise training significantly improves performance over shallow networks and over deep networks trained without pre-training; the results indicate that much of the benefit comes from the good weight initialization the procedure provides. For cases where the input distribution is not informative enough about the target variable, the paper combines unsupervised and supervised updates during pre-training, a hybrid referred to as partially supervised training, which yields further significant improvements.

Overall, the paper provides a comprehensive analysis of greedy layer-wise training for deep networks, demonstrating its effectiveness across tasks and highlighting the importance of proper initialization and of combining unsupervised and supervised learning.
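As a complement, the sketch below shows one way the continuous-input extension mentioned above can be expressed: the visible units become linear units with Gaussian noise (unit variance assumed here), so the conditional mean of p(v | h) is an affine function of the hidden units. This is an illustrative adaptation of the RBM class from the earlier sketch, not the paper's exact formulation or code.

```python
class GaussianRBM(RBM):
    """RBM variant with linear Gaussian visible units (unit variance assumed)
    and binary hidden units, for continuous-valued inputs."""

    def visible_probs(self, h):
        # p(v | h) is Gaussian with mean W h + b_v; using the mean as the
        # reconstruction is a common simplification inside CD-1.
        return h @ self.W.T + self.b_v

# Continuous inputs are typically standardized first, and a smaller learning
# rate helps keep CD-1 stable with Gaussian visible units, e.g.:
# rbms = greedy_pretrain((x - x.mean(0)) / x.std(0), hidden_sizes=[200, 200])
```

For partially supervised pre-training, the same layer-wise update would be mixed with a gradient step from a supervised predictor attached to the layer's hidden representation, which is the regime the paper recommends when the input distribution alone is not informative about the target.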