Term clouds as surrogates for user generated speech

Authors	M. Tsagkias, M. Larson, M. de Rijke
Publication date	2008
Host editors	S.-H. Myaeng, D.W. Oard, F. Sebastiani, T.-S. Chua, M.-K. Leong
Book title	ACM SIGIR 2008: Thirty-first Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 20-24 July 2008, Singapore: Proceedings
ISBN	9781605581644
Event	31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), Singapore
Pages (from-to)	773-774
Publisher	New York, NY: Association for Computing Machinery (ACM)
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	User-generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology, and content-based audio surrogates derived from ASR transcripts must be robust to recognition errors. An investigation of the use of term clouds as surrogates for podcasts demonstrates that ASR term clouds closely approximate term clouds derived from human-generated transcripts across a range of cloud sizes. A user study confirms that ASR clouds are viable surrogates for depicting the content of podcasts.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1145/1390334.1390497 (Final published version)
