Speaker detection for conversational robots using synchrony between audio and video

Authors
Publication date 2010
Host editors
  • M. Hanheide
  • H. Zender
Book title Proceedings of the ICRA 2010 Workshop on Interactive Communication for Autonomous Intelligent Robots (ICAIR): making robots articulate what they understand, intend, and do
Event ICRA 2010 Workshop on Interactive Communication for Autonomous Intelligent Robots (ICAIR), Anchorage, AK, USA
Pages (from-to) 11-16
Publisher Saarbrücken: German Research Center for Artificial Intelligence
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract This paper compares methods for detecting the speaking person when multiple people interact with a robot. We evaluate state-of-the-art speaker detection methods on the iCat robot. These methods use the synchrony between audio and video to locate the most probable speaker. We compare them to simple motion-based speaker detection and present a simple heuristic with low computational requirements that performs as well as the audiovisual methods on a set of multi-person recordings at a fraction of the computational cost, making real-time interaction possible.
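The motion-based heuristic described in the abstract could be sketched as follows. This is an illustrative assumption, not the paper's actual implementation: it scores each detected face region by frame-difference motion energy and, while speech is detected on the audio channel, attributes the speech to the face with the most motion. The function names, the region format `(x, y, w, h)`, and the binary `audio_active` flag are all hypothetical.

```python
import numpy as np

def motion_energy(prev_frame, frame, region):
    """Mean absolute frame difference inside a face region (x, y, w, h).

    Hypothetical helper: both frames are grayscale 2-D arrays.
    """
    x, y, w, h = region
    diff = np.abs(frame[y:y + h, x:x + w].astype(float)
                  - prev_frame[y:y + h, x:x + w].astype(float))
    return diff.mean()

def pick_speaker(prev_frame, frame, face_regions, audio_active):
    """Return the index of the most probable speaker, or None without speech.

    A minimal motion-based heuristic: no audiovisual synchrony model,
    just "the moving face is talking" while audio activity is detected.
    """
    if not audio_active:
        return None
    energies = [motion_energy(prev_frame, frame, r) for r in face_regions]
    return int(np.argmax(energies))
```

Because it only needs one frame difference per face region and a voice-activity flag, a heuristic of this shape runs in real time, which matches the computational-cost argument made in the abstract.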
Document type Conference contribution
Language English
Published at http://www.dfki.de/cosy/www/events/icair-icra2010/icair2010-proceedings.pdf