Bayesian Dark Knowledge

Open Access
Authors
Publication date 2015
Host editors
  • C. Cortes
  • N.D. Lawrence
  • D.D. Lee
  • M. Sugiyama
  • R. Garnett
Book title 29th Annual Conference on Neural Information Processing Systems 2015
Book subtitle Montreal, Canada, 7-12 December 2015
ISBN
  • 9781510825024
Series Advances in Neural Information Processing Systems
Event Neural Information Processing Systems (NIPS2015)
Volume | Issue number 4
Pages (from-to) 3438-3446
Publisher Red Hook, NY: Curran Associates
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/or where we need accurate posterior predictive densities p(y|x, D), e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unfortunately, such a method needs to store many copies of the parameters (which wastes memory), and needs to make predictions using many versions of the model (which wastes time). We describe a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network. We compare to two very recent approaches to Bayesian neural networks, namely an approach based on expectation propagation [HLA15] and an approach based on variational Bayes [BCKW15]. Our method performs better than both of these, is much simpler to implement, and uses less computation at test time.
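The abstract's starting point — SGLD and the Monte Carlo posterior predictive it produces — can be sketched on a toy model. The following is a minimal illustration, not the paper's method: it runs SGLD on a hypothetical Bayesian logistic regression problem and averages predictions over the collected parameter samples, which is exactly the ensemble-style predictive density the paper then proposes to distill into a single network. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy data for Bayesian logistic regression (illustrative only).
rng = np.random.default_rng(0)
N, d = 200, 2
X = rng.normal(size=(N, d))
w_true = np.array([2.0, -1.0])
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgld_samples(X, y, n_steps=2000, batch=32, eps=1e-3, prior_var=1.0):
    """Collect parameter samples with stochastic gradient Langevin dynamics.

    Each step takes a half-step of minibatch stochastic gradient ascent on
    the log-posterior (likelihood rescaled by N/batch, plus a Gaussian
    log-prior) and adds N(0, eps) noise, per the SGLD update rule.
    """
    N, d = X.shape
    w = np.zeros(d)
    samples = []
    for t in range(n_steps):
        idx = rng.choice(N, size=batch, replace=False)
        p = sigmoid(X[idx] @ w)
        grad = (N / batch) * X[idx].T @ (y[idx] - p) - w / prior_var
        w = w + 0.5 * eps * grad + np.sqrt(eps) * rng.normal(size=d)
        if t > n_steps // 2:  # discard the first half as burn-in
            samples.append(w.copy())
    return np.array(samples)

S = sgld_samples(X, y)
# Monte Carlo posterior predictive: average predictions over all samples.
# Storing S and averaging at test time is the memory/time cost the paper
# targets; its "dark knowledge" step fits one student network to p_pred.
p_pred = sigmoid(X @ S.T).mean(axis=1)
```

The expensive part at test time is the average over all rows of `S`; the paper's contribution is to train a single compact network whose output matches `p_pred`, so the sample set can be discarded.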
Document type Conference contribution
Language English
Published at http://papers.nips.cc/paper/5965-bayesian-dark-knowledge
Downloads
5965-bayesian-dark-knowledge (Accepted author manuscript)