Term clouds as surrogates for user generated speech

Authors
Publication date 2008
Host editors
  • S.-H. Myaeng
  • D.W. Oard
  • F. Sebastiani
  • T.-S. Chua
  • M.-K. Leong
Book title ACM SIGIR 2008: Thirty-first Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 20-24 July 2008, Singapore: Proceedings
ISBN
  • 9781605581644
Event 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), Singapore
Pages (from-to) 773-774
Publisher New York, NY: Association for Computing Machinery (ACM)
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract User-generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology, and content-based audio surrogates derived from ASR transcripts must be robust to errors. An investigation of the use of term clouds as surrogates for podcasts demonstrates that ASR term clouds closely approximate term clouds derived from human-generated transcripts across a range of cloud sizes. A user study confirms that ASR clouds are viable surrogates for depicting the content of podcasts.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/1390334.1390497