Understanding Stochastic Gradient Hamiltonian Monte Carlo

Hamiltonian Monte Carlo (HMC) sampling methods define distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling efficient exploration of the state space. However, simulating the Hamiltonian dynamics requires computing the gradient over the full dataset, which is infeasible for large datasets or streaming data. This paper explores the properties of stochastic gradient HMC, where a noisy gradient estimate computed from a subset of the data is used instead. Surprisingly, the natural implementation of stochastic gradient HMC can be arbitrarily bad. To address this, the authors introduce a variant that uses second-order Langevin dynamics with a friction term to counteract the effects of the noisy gradient, maintaining the target distribution as the invariant distribution. Theoretical results and empirical validation on simulated data demonstrate the effectiveness of this approach. The method is also applied to a classification task using neural networks and to online Bayesian matrix factorization, showing its practical value.
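
To make the friction idea concrete, here is a minimal sketch of a friction-based update of the kind described above, applied to a toy problem. Everything specific in it is an assumption chosen for illustration rather than taken from this summary: the target is a one-dimensional standard Gaussian with potential U(theta) = theta^2 / 2, the mass is 1, the gradient noise is simulated as Gaussian with known standard deviation `grad_noise_std`, and the function name `sghmc_gaussian` and all parameter values are hypothetical. The noise estimate `b_hat` is set to half the step size times the gradient-noise variance, so that the injected noise plus the gradient noise together balance the friction term.

```python
import numpy as np


def sghmc_gaussian(n_steps=50_000, eps=0.1, friction=1.0,
                   grad_noise_std=1.0, seed=0):
    """Toy friction-based stochastic gradient HMC on U(theta) = theta**2 / 2.

    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Assumed estimate of the gradient-noise contribution: 0.5 * eps * V,
    # where V is the (here known) variance of the gradient noise.
    b_hat = 0.5 * eps * grad_noise_std ** 2

    # Standard deviation of the injected noise; requires friction >= b_hat.
    inject_std = np.sqrt(2.0 * (friction - b_hat) * eps)

    theta, r = 0.0, 0.0  # position and momentum (unit mass)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        # Position update driven by the momentum.
        theta += eps * r
        # Simulated noisy gradient: exact gradient of U plus Gaussian noise.
        noisy_grad = theta + grad_noise_std * rng.standard_normal()
        # Momentum update: noisy gradient step, friction, injected noise.
        r += (-eps * noisy_grad
              - eps * friction * r
              + inject_std * rng.standard_normal())
        samples[i] = theta
    return samples


samples = sghmc_gaussian()
print("mean:", samples.mean(), "variance:", samples.var())
```

If the friction term and the injected noise are both removed, the gradient noise keeps adding energy to the momentum and the samples drift away from the target, which is one way to see why the naive variant can fail. In practice the gradient-noise variance is not known and has to be estimated or simply set to zero with a small step size; the exact value used here is a convenience of the toy setup.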

Stochastic Gradient Hamiltonian Monte Carlo

12 May 2014 | Tianqi Chen, Emily B. Fox, Carlos Guestrin