| Authors |
|
| Publication date |
2008
|
| Host editors |
- A. Popescu-Belis
- R. Stiefelhagen
|
| Book title |
Machine Learning for Multimodal Interaction
|
| Book subtitle |
5th International Workshop, MLMI 2008, Utrecht, The Netherlands, September 8-10, 2008 : proceedings
|
| ISBN |
|
| ISBN (electronic) |
|
| Series |
Lecture Notes in Computer Science
|
| Event |
5th Joint Workshop on Machine Learning and Multimodal Interaction (MLMI 2008), Utrecht, the Netherlands
|
| Pages (from-to) |
98-109
|
| Publisher |
Berlin: Springer
|
| Organisations |
- Faculty of Science (FNWI) - Informatics Institute (IVI)
|
| Abstract |
In this paper we present a sound probabilistic approach to speaker diarization. We use a hybrid framework in which a discriminative model estimates a distribution over the number of speakers at each point of a multimodal stream. The output of this process is used as input to a generative model that can adapt to a novel test set and perform high-accuracy speaker diarization. We deal efficiently with the less common, and therefore harder, segments, such as silence and multiple-speaker parts, in a principled probabilistic manner.
|
| Document type |
Conference contribution
|
| Language |
English
|
| Published at |
https://doi.org/10.1007/978-3-540-85853-9_9
|