Exploring Generalization in Deep Learning

6 Jul 2017 | Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester, Nathan Srebro
This paper explores the factors that drive generalization in deep learning, examining several proposed explanations: norm-based capacity control, sharpness, and robustness. It highlights the importance of scale normalization, connects sharpness to PAC-Bayes theory, and asks how well each measure explains the generalization phenomena observed in practice.

Deep neural networks generalize well despite having far more parameters than training examples, yet simply minimizing training error is not sufficient for good generalization. The choice of optimization algorithm also matters: Path-SGD, an optimization algorithm invariant to weight rescaling, generalizes better than standard SGD.

The paper discusses several complexity measures for neural networks, including norms, margins, and sharpness, and emphasizes the need to relate the scale of the parameters to the scale of the outputs, for example by normalizing a norm-based measure by the classification margin. Sharpness alone is not sufficient to guarantee generalization, but combining it with norm-based measures through a PAC-Bayes analysis yields a more effective complexity measure.
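To see why scale normalization matters, note that a ReLU network computes the same function when one layer's weights are multiplied by a positive constant and the next layer's are divided by the same constant, even though per-layer l2 norms change arbitrarily; a measure (or optimizer) sensitive to this rescaling cannot by itself track generalization. The following minimal sketch, using PyTorch with arbitrary toy dimensions chosen here for illustration, demonstrates the invariance:

```python
import torch

# Two-layer bias-free ReLU net: scale the first layer up by c and the second down by c.
# ReLU is positively homogeneous, so the network computes the same function,
# while the per-layer l2 norms change drastically. (Toy sizes are illustrative.)
torch.manual_seed(0)
w1, w2 = torch.randn(256, 784), torch.randn(10, 256)
x = torch.randn(5, 784)

def net(a, b, x):
    return torch.relu(x @ a.t()) @ b.t()

c = 10.0
out_orig = net(w1, w2, x)
out_scaled = net(c * w1, w2 / c, x)
print(torch.allclose(out_orig, out_scaled, rtol=1e-4, atol=1e-3))  # True: same function
print(w1.norm().item(), (c * w1).norm().item())                    # l2 norm grew 10x
```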
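One way to make a scale-insensitive measure concrete is to combine a rescaling-invariant norm with the classification margin. The sketch below computes the l2 path norm of a bias-free fully connected ReLU network and divides it by a low-percentile margin over the training set. It is a minimal illustration rather than the paper's evaluation code: the two-layer architecture, the 784/256/10 dimensions, the random data, and the 10th-percentile margin are assumptions made here (the paper also considers other norms, such as l2 and spectral norms).

```python
import torch
import torch.nn as nn

def path_norm(model, input_dim):
    """Path norm of a bias-free fully connected ReLU net: the square root of the
    sum, over all input-to-output paths, of the product of squared weights.
    Equivalently: push an all-ones input through the squared weight matrices."""
    x = torch.ones(1, input_dim)
    for layer in model:
        if isinstance(layer, nn.Linear):
            x = x @ (layer.weight.t() ** 2)
    return x.sum().sqrt()

def margins(model, inputs, labels):
    """Per-example margin: score of the true class minus the best competing score."""
    with torch.no_grad():
        scores = model(inputs)
    true = scores.gather(1, labels.unsqueeze(1)).squeeze(1)
    rest = scores.clone()
    rest.scatter_(1, labels.unsqueeze(1), float('-inf'))
    return true - rest.max(dim=1).values

# Toy bias-free model and random data (hypothetical, for illustration only).
model = nn.Sequential(nn.Linear(784, 256, bias=False), nn.ReLU(),
                      nn.Linear(256, 10, bias=False))
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))

# Normalize the norm by a low-percentile margin; for a trained network that fits
# the data this margin is positive, so the ratio is a rescaling-invariant capacity proxy.
gamma = torch.quantile(margins(model, x, y), 0.1).clamp(min=1e-6)
print(f"margin-normalized path norm: {(path_norm(model, 784) / gamma).item():.3f}")
```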
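Sharpness measures how much the training loss can increase under small perturbations of the weights. The sketch below estimates it with random perturbations bounded by alpha * (|w| + 1); this random search is only a stand-in for the constrained maximization used in practice, and the toy model, data, and the values of alpha and trials are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def sharpness(model, inputs, labels, alpha=5e-4, trials=10):
    """Crude estimate of sharpness: the largest relative increase in training loss
    found when every parameter is perturbed by at most alpha * (|w| + 1).
    Random search stands in for a proper constrained maximization."""
    with torch.no_grad():
        base = F.cross_entropy(model(inputs), labels).item()
        worst = base
        for _ in range(trials):
            perturbed = copy.deepcopy(model)
            for p in perturbed.parameters():
                bound = alpha * (p.abs() + 1)
                p.add_((2 * torch.rand_like(p) - 1) * bound)
            worst = max(worst, F.cross_entropy(perturbed(inputs), labels).item())
    return (worst - base) / (1 + base)

# Toy model and data (hypothetical, for illustration only).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))
print(f"estimated sharpness: {sharpness(model, x, y):.4f}")
```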
The paper evaluates candidate complexity measures on two criteria: their theoretical ability to guarantee generalization and their empirical ability to explain observed phenomena. Norms and margins can explain much of the observed behavior, while sharpness on its own cannot; the role of Lipschitz continuity and robustness is also examined. Empirically, models trained on true labels generalize better than models trained on random labels, and increasing the number of hidden units can improve generalization even without reducing training error. The paper concludes that a combination of sharpness and norms, made precise through PAC-Bayes, provides a more effective measure of complexity for explaining generalization in deep learning.
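The PAC-Bayes view makes the combination of sharpness and norms concrete: perturbing the weights with Gaussian noise yields an expected-sharpness term, while the KL divergence between the perturbed weights and a zero-mean Gaussian prior reduces to a weight-norm term. The sketch below estimates both by Monte Carlo; the toy model, the choice of sigma, and the constants used when combining the terms are illustrative assumptions rather than the paper's exact bound.

```python
import copy
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def pac_bayes_terms(model, inputs, labels, sigma=0.01, samples=10):
    """Monte-Carlo estimate of the two quantities combined in the PAC-Bayes view:
      expected sharpness  E_u[L(w + u)] - L(w),  u ~ N(0, sigma^2 I)
      KL term             ||w||^2 / (2 sigma^2)  (Gaussian posterior vs. zero-mean prior)."""
    with torch.no_grad():
        base = F.cross_entropy(model(inputs), labels).item()
        noisy_loss = 0.0
        for _ in range(samples):
            noisy = copy.deepcopy(model)
            for p in noisy.parameters():
                p.add_(sigma * torch.randn_like(p))
            noisy_loss += F.cross_entropy(noisy(inputs), labels).item()
        expected_sharpness = noisy_loss / samples - base
        weight_norm_sq = sum((p ** 2).sum().item() for p in model.parameters())
        kl_term = weight_norm_sq / (2 * sigma ** 2)
    return expected_sharpness, kl_term

# Toy model and data (hypothetical, for illustration only).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))
es, kl = pac_bayes_terms(model, x, y)

# Illustrative McAllester-style combination of the two terms; the constants here
# are not the paper's exact statement.
m = x.shape[0]
gap = es + 4 * math.sqrt((kl + math.log(2 * m / 0.05)) / m)
print(f"expected sharpness: {es:.4f}, KL term: {kl:.1f}, illustrative bound gap: {gap:.2f}")
```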