- Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
- Conference on Knowledge Discovery and Data Mining (KDD2015)
- Book/source title: KDD'15: proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: August 10-13, 2015, Sydney, Australia
- New York, NY: Association for Computing Machinery
- Document type: Conference contribution
- Faculty of Science (FNWI)
- Informatics Institute (IVI)
Despite attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid overfitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, not only matches the prediction accuracy of standard MCMC methods like Gibbs sampling, but is also as fast and simple as stochastic gradient descent. In our experiments, we show that our algorithm can achieve the same level of prediction accuracy as Gibbs sampling an order of magnitude faster. We also show that our method reduces the prediction error as fast as distributed stochastic gradient descent, achieving a 4.1% improvement in RMSE for the Netflix dataset and a 1.8% improvement for the Yahoo music dataset.
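The core idea the abstract describes, stochastic gradient Langevin dynamics (SGLD) applied to matrix factorization, can be sketched on a single machine as follows. This is a minimal toy illustration, not the paper's distributed DSGLD algorithm: the Gaussian likelihood and prior precisions (`tau`, `lam`), the fixed step size `eps`, the minibatch size, and the synthetic ratings data are all assumptions made for demonstration.

```python
import numpy as np

# Toy single-machine SGLD for Bayesian matrix factorization: R ~ U V^T
# with Gaussian noise and Gaussian priors on the factors. All
# hyperparameters below are illustrative assumptions, not the paper's.
rng = np.random.default_rng(0)
n_users, n_items, rank = 30, 20, 3

# Synthetic low-rank ratings, with roughly half the entries observed.
U_true = rng.normal(size=(n_users, rank))
V_true = rng.normal(size=(n_items, rank))
R = U_true @ V_true.T + 0.1 * rng.normal(size=(n_users, n_items))
mask = rng.random((n_users, n_items)) < 0.5
obs = np.argwhere(mask)          # list of observed (user, item) pairs
N = len(obs)                     # total number of observations

tau, lam = 1.0, 0.1              # likelihood / prior precisions (assumed)
batch, eps = 256, 1e-3           # minibatch size, step size (assumed)
U = 0.1 * rng.normal(size=(n_users, rank))
V = 0.1 * rng.normal(size=(n_items, rank))

for t in range(2000):
    idx = obs[rng.integers(0, N, size=batch)]
    u, i = idx[:, 0], idx[:, 1]
    err = R[u, i] - np.sum(U[u] * V[i], axis=1)   # minibatch residuals
    # Stochastic gradient of the log posterior: minibatch likelihood
    # gradient rescaled by N/batch, plus the prior gradient.
    gU = np.zeros_like(U)
    gV = np.zeros_like(V)
    np.add.at(gU, u, tau * err[:, None] * V[i])
    np.add.at(gV, i, tau * err[:, None] * U[u])
    gU = (N / batch) * gU - lam * U
    gV = (N / batch) * gV - lam * V
    # SGLD update: half-step along the gradient plus Gaussian noise
    # with variance eps, so the iterates sample (approximately) from
    # the posterior rather than collapsing to a point estimate.
    U += 0.5 * eps * gU + np.sqrt(eps) * rng.normal(size=U.shape)
    V += 0.5 * eps * gV + np.sqrt(eps) * rng.normal(size=V.shape)

rmse = np.sqrt(np.mean((R[mask] - (U @ V.T)[mask]) ** 2))
```

The distributed algorithm in the paper goes further, partitioning users and items into blocks so that workers can run such updates in parallel on disjoint sub-matrices; the sketch above only shows the sampling mechanics that replace a plain SGD step.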