2024 | James Harrison¹, John Willes², Jasper Snoek¹
This paper introduces a deterministic variational formulation for training Bayesian last layer neural networks (VBLL), enabling sampling-free, single-pass model and loss computation with improved uncertainty estimation. VBLL is computationally efficient, with quadratic complexity in the last layer width, and integrates easily into standard architectures. The paper presents three methods for learning VBLL models: full training, post-training, and feature uncertainty. The authors also investigate combining VBLL with variational Bayesian feature learning, yielding a lower-variance collapsed variational inference method for Bayesian neural networks.

Experiments on UCI regression datasets, CIFAR-10 and CIFAR-100 image classification, and sentiment classification using language models show that VBLL improves predictive accuracy, calibration, and out-of-distribution detection across regression and classification tasks, and VBLL models outperform baselines in contextual bandits. The paper concludes that VBLL provides a simple, computationally efficient approach to Bayesian deep learning, with potential for integration with large-scale language models.
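To make the "sampling-free, single-pass" idea concrete, here is a minimal sketch of a variational Bayesian last layer for regression: a Gaussian variational posterior over the last-layer weights yields a closed-form Gaussian predictive and a deterministic ELBO-style loss, so no weight sampling is needed. This is not the paper's implementation or API; for brevity it uses a diagonal posterior covariance (the paper's formulation allows denser covariance parameterizations, with the quadratic cost in layer width noted above), and the class name and hyperparameters are illustrative.

```python
import math

import torch
import torch.nn as nn


class VariationalBayesianLastLayer(nn.Module):
    """Sketch of a sampling-free Bayesian last layer for regression.

    Variational posterior over last-layer weights: N(w_mean, diag(w_var)).
    Both the predictive distribution and the training loss are computed in
    closed form in a single forward pass (assumption: diagonal covariance
    and a known Gaussian noise variance, for simplicity).
    """

    def __init__(self, feature_dim: int, noise_std: float = 0.1, prior_std: float = 1.0):
        super().__init__()
        self.w_mean = nn.Parameter(torch.zeros(feature_dim))
        # Parameterize the diagonal posterior covariance via its log-std.
        self.w_logstd = nn.Parameter(torch.full((feature_dim,), -2.0))
        self.noise_var = noise_std ** 2
        self.prior_var = prior_std ** 2

    def predictive(self, phi: torch.Tensor):
        """Closed-form Gaussian predictive for features phi of shape (batch, feature_dim)."""
        w_var = torch.exp(2.0 * self.w_logstd)
        mean = phi @ self.w_mean
        # phi^T Sigma phi for a diagonal Sigma, plus observation noise.
        var = (phi ** 2) @ w_var + self.noise_var
        return mean, var

    def loss(self, phi: torch.Tensor, y: torch.Tensor, dataset_size: int):
        """Negative ELBO: Gaussian NLL under the predictive + KL to the prior."""
        mean, var = self.predictive(phi)
        nll = 0.5 * (torch.log(2.0 * math.pi * var) + (y - mean) ** 2 / var).mean()
        w_var = torch.exp(2.0 * self.w_logstd)
        # KL( N(w_mean, w_var) || N(0, prior_var) ), summed over dimensions.
        kl = 0.5 * (
            (w_var + self.w_mean ** 2) / self.prior_var
            - 1.0
            + torch.log(self.prior_var / w_var)
        ).sum()
        # Scale the KL by 1/N since the NLL term is a per-example average.
        return nll + kl / dataset_size
```

In use, phi would be the output of any standard feature backbone. Loosely, optimizing the backbone jointly with this layer corresponds to the full-training mode named in the summary, while fitting only the last layer on frozen features corresponds to post-training.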