This paper presents a fully Bayesian treatment of Probabilistic Matrix Factorization (PMF) using Markov chain Monte Carlo (MCMC) methods. PMF is a probabilistic linear model for collaborative filtering in which a user's preference for a movie is modeled by the inner product of the corresponding user and movie factor vectors. Earlier PMF models were trained by maximizing the log-posterior over the factor vectors (MAP estimation), which requires careful tuning of the prior (regularization) parameters; the fully Bayesian treatment instead integrates over both model parameters and hyperparameters, so model capacity is controlled automatically. The authors show that Bayesian PMF models can be trained efficiently with MCMC on the Netflix dataset, which contains over 100 million user/movie ratings, and that the resulting models achieve significantly higher prediction accuracy than PMF models trained using MAP estimation.
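Concretely, in the notation of the PMF model, the observed ratings are conditionally Gaussian around the factor inner products:

$$
p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N} \prod_{j=1}^{M} \Big[ \mathcal{N}\big(R_{ij} \mid U_i^\top V_j, \sigma^2\big) \Big]^{I_{ij}},
$$

where $U_i$ and $V_j$ are the $D$-dimensional user and movie factor vectors and $I_{ij}$ indicates whether user $i$ rated movie $j$.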
The Bayesian PMF model places Gaussian-Wishart priors on the hyperparameters (the means and precision matrices of the Gaussian priors over the factor vectors) and integrates out both model parameters and hyperparameters to obtain the predictive distribution. This yields more accurate predictions than MAP-trained models, especially for users with few ratings. The authors also show that MCMC-trained Bayesian PMF models improve on their MAP counterparts by a larger margin than variationally trained ones, whose approximating distribution assumes independence between the user and movie factors.
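The predictive distribution has no closed form, so it is approximated by averaging over posterior samples drawn with MCMC:

$$
p(R^{*}_{ij} \mid R) \approx \frac{1}{K} \sum_{k=1}^{K} p\big(R^{*}_{ij} \mid U_i^{(k)}, V_j^{(k)}\big),
$$

where $\{U^{(k)}, V^{(k)}\}$ are the factor matrices from the $k$-th Gibbs sample.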
The paper describes a Gibbs sampling algorithm for Bayesian PMF that cycles through the latent variables (the factor vectors and their hyperparameters), sampling each from its conditional distribution given the current values of all the others. Applied to the Netflix dataset, Bayesian PMF achieves significantly lower root mean squared error (RMSE) on both the validation and test sets than MAP-trained models, with the largest gains on users with few ratings, since the Bayesian models account for uncertainty in their predictions.
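As a minimal sketch of the sampler's structure (toy data, and with the Gaussian prior hyperparameters held fixed rather than resampled from their Gaussian-Wishart conditionals, as the full algorithm does), a single Gibbs sweep over the user and movie factors might look like this in NumPy; all sizes and variable names here are illustrative, not the paper's reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem sizes (illustrative only)
N_USERS, N_MOVIES, D = 50, 40, 5
alpha = 2.0                                   # rating precision 1/sigma^2
R = rng.integers(1, 6, (N_USERS, N_MOVIES)).astype(float)
mask = rng.random((N_USERS, N_MOVIES)) < 0.2  # which ratings are observed

U = rng.standard_normal((N_USERS, D))         # user factor vectors
V = rng.standard_normal((N_MOVIES, D))        # movie factor vectors

# Fixed Gaussian prior on the factors; the full algorithm also resamples
# (mu, Lambda) for users and movies from their Gaussian-Wishart conditionals.
mu0, Lambda0 = np.zeros(D), np.eye(D)

def sample_factors(R, mask, other, alpha, mu0, Lambda0, rng):
    """Draw each row's factor vector from its Gaussian conditional
    (standard Gaussian conjugacy given the other side's factors)."""
    new = np.empty((R.shape[0], other.shape[1]))
    for i in range(R.shape[0]):
        F = other[mask[i]]                    # factors of the items rated by row i
        prec = Lambda0 + alpha * F.T @ F      # conditional precision (D x D)
        b = Lambda0 @ mu0 + alpha * F.T @ R[i, mask[i]]
        mean = np.linalg.solve(prec, b)       # conditional mean
        L = np.linalg.cholesky(prec)          # the O(D^3) step per vector
        # x = mean + L^{-T} z has covariance prec^{-1}
        new[i] = mean + np.linalg.solve(L.T, rng.standard_normal(len(mean)))
    return new

for sweep in range(10):                       # one Gibbs sweep per iteration
    U = sample_factors(R, mask, V, alpha, mu0, Lambda0, rng)
    V = sample_factors(R.T, mask.T, U, alpha, mu0, Lambda0, rng)
```

Each row update requires factorizing a $D \times D$ matrix, which is the $O(D^3)$ per-vector cost the paper discusses below.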
The authors conclude that Bayesian PMF models handle predictive uncertainty more effectively and achieve higher accuracy than MAP-trained models. The computational cost of training is higher, however: each Gibbs update requires inverting a $D \times D$ matrix per feature vector, an $O(D^3)$ operation. Despite this, the Bayesian approach is shown to scale to large problems, and the results suggest that model complexity need not be limited according to the number of training samples.
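As a rough back-of-envelope estimate (ours, not the paper's): with $|R|$ observed ratings, $N$ users, and $M$ movies, one Gibbs sweep costs on the order of

$$
O\big(|R|\,D^2 + (N + M)\,D^3\big),
$$

since assembling each conditional precision matrix touches every observed rating once at $D^2$ cost, and each of the $N + M$ factor vectors then needs its own $D \times D$ inversion or factorization.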