- Title: Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation
- Conference: Conference on Knowledge Discovery and Data Mining (KDD)
- Book/source title: KDD '13: the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 11-14, 2013, Chicago, Illinois, USA
- Pages (from-to):
- Publisher: ACM, New York
- Document type: Conference contribution
- Affiliations: Faculty of Science (FNWI); Informatics Institute (IVI)
There has been an explosion in the amount of digital text information available in recent years, leading to challenges of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference algorithms for latent Dirichlet allocation (LDA) have made it feasible to learn topic models on very large-scale corpora, but these methods do not currently take full advantage of the collapsed representation of the model. We propose a stochastic algorithm for collapsed variational Bayesian inference for LDA, which is simpler and more efficient than the state-of-the-art method. In experiments on large-scale text corpora, the algorithm was found to converge faster, and often to a better solution, than previous methods. Human-subject experiments also demonstrated that the method can learn coherent topics in seconds on small corpora, facilitating the use of topic models in interactive document analysis software.
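To make the idea concrete, the following is a minimal, illustrative sketch of a stochastic collapsed variational update for LDA in the style the abstract describes: the algorithm keeps expected count statistics of the collapsed model and refreshes them with Robbins-Monro step sizes as it streams over tokens. The toy corpus, hyperparameter values, and step-size schedule are assumptions for illustration, not the paper's experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: each document is a list of word ids from a W-word vocabulary
# (illustrative data, not from the paper).
docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 2, 0, 3]]
W, K = 4, 2                      # vocabulary size, number of topics
alpha, eta = 0.1, 0.01           # Dirichlet hyperparameters (toy values)
C = sum(len(d) for d in docs)    # total token count in the corpus

# Expected-count statistics maintained by the collapsed scheme.
N_phi = rng.random((W, K))             # expected word-topic counts
N_z = N_phi.sum(axis=0)                # expected per-topic counts
N_theta = rng.random((len(docs), K))   # expected document-topic counts

step = 0
for epoch in range(50):
    for j, doc in enumerate(docs):
        for w in doc:
            step += 1
            rho = 1.0 / (10 + step) ** 0.7  # Robbins-Monro step size (assumed schedule)
            # Collapsed variational posterior over the token's topic assignment.
            gamma = (N_phi[w] + eta) * (N_theta[j] + alpha) / (N_z + W * eta)
            gamma /= gamma.sum()
            # Stochastic updates of the expected counts toward this token's estimate.
            N_theta[j] = (1 - rho) * N_theta[j] + rho * len(doc) * gamma
            N_phi[w] = (1 - rho) * N_phi[w] + rho * C * gamma
            N_z = (1 - rho) * N_z + rho * C * gamma

# Normalized topic-word distributions (columns sum to one).
phi = (N_phi + eta) / (N_phi + eta).sum(axis=0)
print(np.round(phi, 3))
```

Because only expected counts are stored and each token touches a handful of entries, an update of this shape is cheap per token, which is consistent with the abstract's claim that coherent topics can be learned in seconds on small corpora.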