- Exact Sampling and Decoding in High-Order Hidden Markov Models
- 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
- Book/source title: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning: EMNLP-CoNLL 2012: proceedings of the conference: July 12-14, 2012, Jeju Island, Korea
- Publisher: Stroudsburg, PA: Association for Computational Linguistics
- Document type: Conference contribution
- Faculty of Science (FNWI)
- Informatics Institute (IVI)
We present a method for exact optimization and sampling in high-order Hidden Markov Models (HMMs), which are generally handled by approximation techniques. Motivated by adaptive rejection sampling and heuristic search, we propose a strategy based on sequentially refining a lower-order language model that is an upper bound on the true model we wish to decode and sample from. This allows us to build tractable variable-order HMMs. The ARPA format for language models is extended to enable efficient use of the max-backoff quantities required to compute the upper bound. We evaluate our approach on two problems: an SMS-retrieval task and a POS-tagging experiment using 5-gram models. Results show that the same approach can be used for exact optimization and sampling, while explicitly constructing only a fraction of the total implicit state-space.
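The central quantity in the abstract is the max-backoff: for an n-gram model, the probability of a word given a shortened context is replaced by the maximum over all longer contexts consistent with it, yielding an admissible upper bound that a low-order search can use before refining. The following is a minimal toy sketch of that bound property, with made-up trigram probabilities; the variable names and numbers are illustrative, not taken from the paper.

```python
from collections import defaultdict

# Toy trigram probabilities p(w | (u, v)) -- illustrative numbers only.
trigram = {
    (("a", "b"), "c"): 0.7,
    (("x", "b"), "c"): 0.5,
    (("a", "b"), "d"): 0.3,
    (("x", "b"), "d"): 0.5,
}

# Max-backoff to the bigram context: q(w | v) = max over u of p(w | (u, v)).
# q upper-bounds every trigram probability that shares the shortened context,
# so a decoder can score hypotheses with the cheaper low-order quantity
# and refine to the full-order model only where the bound is loose.
max_backoff = defaultdict(float)
for ((u, v), w), p in trigram.items():
    max_backoff[(v, w)] = max(max_backoff[(v, w)], p)

# The bound property: every trigram probability <= its max-backoff.
assert all(p <= max_backoff[(v, w)] for ((u, v), w), p in trigram.items())
print(max_backoff[("b", "c")])  # 0.7
```

In the paper's setting these quantities are stored alongside the language model (hence the extended ARPA format), so the refinement loop never has to enumerate the full high-order state space up front.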