Using term clouds to represent segment-level semantic content of podcasts
| Authors | |
|---|---|
| Publication date | 2008 |
| Host editors | |
| Book title | Proceedings of the ACM SIGIR Workshop 'Searching Spontaneous Conversational Speech' |
| ISBN | |
| Event | 2nd SIGIR Workshop on Searching Spontaneous Conversational Speech (SSCS 2008), Singapore |
| Pages (from-to) | 12-19 |
| Publisher | Enschede: Centre for Telematics and Information Technology (CTIT) |
| Organisations | |
| Abstract | Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface that provides semantically annotated jump points to signal the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. The quality of segment-level term clouds is measured quantitatively, and their utility is investigated in a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to generate segments as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the generated segments are comparable with human-selected segment boundaries. |
| Document type | Conference contribution |
| Published at | http://ilps.science.uva.nl/SSCS2008/Proceedings/sscs08_proceedings.pdf |
