Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

9 Sep 2019 | Qiang Liu, Dilin Wang
The paper introduces a general-purpose variational inference algorithm that acts as a natural counterpart to gradient descent for optimization. The method iteratively transports a set of particles toward the target distribution, performing a form of functional gradient descent that minimizes the KL divergence. It rests on a new theoretical result connecting the derivative of the KL divergence under smooth transforms with Stein's identity and the kernelized Stein discrepancy; this connection yields a closed-form expression for the optimal smooth perturbation direction, which is what makes the algorithm practical. Because the update requires only gradients of the (possibly unnormalized) log-density, the method can be applied wherever gradient descent can, making it simple to use and efficient on large datasets. Empirical studies on a range of real-world models and datasets show that it is competitive with state-of-the-art inference methods. The paper also discusses related work and provides experimental results supporting the method's effectiveness.
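The closed-form perturbation direction described above takes the form phi(x) = E_{y~q}[ k(y, x) * grad_y log p(y) + grad_y k(y, x) ], where k is a positive-definite kernel: the first term pulls particles toward high-density regions, the second acts as a repulsive force that keeps them spread out. As a rough illustration (not the authors' reference code), here is a minimal NumPy sketch of one SVGD update with an RBF kernel; the fixed bandwidth `h=1.0` is a simplifying assumption (the paper uses a median heuristic):

```python
import numpy as np

def rbf_kernel(x, h=1.0):
    """RBF kernel matrix and its gradient for particles x of shape (n, d).

    Fixed bandwidth h is an illustrative simplification; in practice a
    median-distance heuristic is commonly used.
    """
    diffs = x[:, None, :] - x[None, :, :]            # diffs[j, i] = x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)           # (n, n)
    k = np.exp(-sq_dists / h)                        # k[j, i] = k(x_j, x_i)
    grad_k = -2.0 / h * diffs * k[:, :, None]        # d k(x_j, x_i) / d x_j
    return k, grad_k

def svgd_step(x, grad_logp, stepsize=0.1, h=1.0):
    """One SVGD update.

    phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * grad_logp(x_j)   (attraction)
                               + grad_{x_j} k(x_j, x_i) ]     (repulsion)
    """
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    phi = (k.T @ grad_logp + grad_k.sum(axis=0)) / n
    return x + stepsize * phi
```

For a standard normal target, `grad_logp` is simply `-x`; iterating `svgd_step` moves an initial particle cloud until its empirical distribution approximates the target.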