- Bayesian posterior sampling via stochastic gradient Fisher scoring
- 29th International Conference Machine Learning (ICML2012)
- Book/source title
- Proceedings of Twenty-Ninth International Conference Machine Learning. - Vol. 2
- Pages (from-to)
- Madison, WI: International Machine Learning Society
- Document type
- Conference contribution
- Faculty of Science (FNWI)
- Informatics Institute (IVI)
In this paper we address the following question: "Can we approximately sample from a Bayesian posterior distribution if we are only allowed to touch a small mini-batch of data-items for every sample we generate?". An algorithm based on the Langevin equation with stochastic gradients (SGLD) was previously proposed to solve this, but its mixing rate was slow. By leveraging the Bayesian Central Limit Theorem, we extend the SGLD algorithm so that at high mixing rates it will sample from a normal approximation of the posterior, while for slow mixing rates it will mimic the behavior of SGLD with a pre-conditioner matrix. As a bonus, the proposed algorithm is reminiscent of Fisher scoring (with stochastic gradients) and as such an efficient optimizer during burn-in.
If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library, or send a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.