4 Nov 2017 | Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell
This paper proposes a simple and scalable method, deep ensembles, for estimating predictive uncertainty in deep neural networks (NNs). The recipe has three ingredients: train NNs with a proper scoring rule (such as the negative log-likelihood or the Brier score) as the training criterion, optionally use adversarial training to smooth the predictive distribution, and train an ensemble of independently initialized networks whose predictions are combined as a uniform mixture. The approach is compared against Bayesian methods such as variational inference and MCMC and produces uncertainty estimates that are as well calibrated or better. On classification and regression benchmarks the method yields higher uncertainty on out-of-distribution examples, and it scales to large models, with predictive uncertainty evaluated on ImageNet. It is simple to implement, requires minimal hyperparameter tuning, and is well suited to distributed computation. The paper also analyzes the separate contributions of ensembling and adversarial training to predictive uncertainty, and shows that the method outperforms MC-dropout in calibration and in generalization to unknown classes. The results establish deep ensembles as a strong baseline for predictive uncertainty estimation and a promising alternative to Bayesian methods.
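The regression version of the recipe is easy to sketch: each ensemble member outputs a predictive mean and variance, is trained on the Gaussian negative log-likelihood (a proper scoring rule), optionally sees fast-gradient-sign adversarial examples during training, and the ensemble is combined as a uniform mixture of Gaussians. The following is a minimal illustrative sketch assuming a PyTorch setup; the network size, epsilon, ensemble size, and toy data are placeholder choices, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianMLP(nn.Module):
    """Small MLP that outputs a predictive mean and variance per input."""
    def __init__(self, in_dim=1, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.var_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        mean = self.mean_head(h)
        # softplus keeps the variance positive; a small floor aids stability
        var = F.softplus(self.var_head(h)) + 1e-6
        return mean, var

def nll(mean, var, y):
    """Gaussian negative log-likelihood, a proper scoring rule."""
    return (0.5 * torch.log(var) + 0.5 * (y - mean) ** 2 / var).mean()

def train_one(model, x, y, epsilon=0.01, steps=2000, lr=1e-3):
    """Train a single ensemble member with NLL + adversarial smoothing."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # loss on the clean batch
        x_req = x.clone().requires_grad_(True)
        loss_clean = nll(*model(x_req), y)
        # fast-gradient-sign perturbation of the inputs
        grad_x, = torch.autograd.grad(loss_clean, x_req, retain_graph=True)
        x_adv = (x + epsilon * grad_x.sign()).detach()
        loss_adv = nll(*model(x_adv), y)
        (loss_clean + loss_adv).backward()
        opt.step()
    return model

# Toy 1-D regression data (illustrative only)
torch.manual_seed(0)
x = torch.linspace(-3, 3, 200).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

# Train an ensemble of independently initialized networks
ensemble = [train_one(GaussianMLP(), x, y) for _ in range(5)]

# Combine members as a uniform mixture: average the means, and recover the
# mixture variance from the members' means and variances
with torch.no_grad():
    means, vars_ = zip(*(m(x) for m in ensemble))
    means, vars_ = torch.stack(means), torch.stack(vars_)
    mix_mean = means.mean(0)
    mix_var = (vars_ + means ** 2).mean(0) - mix_mean ** 2
```

The mixture variance decomposes into the average aleatoric term (the members' predicted variances) plus the disagreement between members' means, which is what drives the higher uncertainty reported on out-of-distribution inputs. For classification, the same idea applies with the softmax cross-entropy (a proper scoring rule) and simple averaging of the members' predicted probabilities.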