Future-Supervised Retrieval of Unseen Queries for Live Video

Open Access
Authors
Publication date 2017
Book title MM'17
Book subtitle proceedings of the 2017 ACM Multimedia Conference : October 23-27, 2017, Mountain View, CA, USA
ISBN (electronic)
  • 9781450349062
Event 25th ACM international conference on Multimedia
Pages (from-to) 28-36
Publisher New York, NY: Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Live streaming video presents new challenges for retrieval and content understanding. Its live nature means that video representations should be relevant to current content, and not necessarily to past content. We investigate retrieval of previously unseen queries for live video content. Drawing from existing whole-video techniques, we focus on adapting image-trained semantic models to the video domain. We introduce the use of future frame representations as a supervision signal for learning temporally aware semantic representations on unlabeled video data. Additionally, we introduce an approach for broadening a query's representation within a pre-constructed semantic space, with the aim of increasing overlap between embedded visual semantics and the query semantics. We demonstrate the efficacy of these contributions for unseen query retrieval on live videos. We further explore their applicability to tasks such as no-example whole-video action classification and no-example live video action prediction, and demonstrate state-of-the-art results.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/3123266.3123437
Other links https://ivi.fnwi.uva.nl/isis/publications/2017/CappalloICM2017
Downloads
p28-cappallo (Final published version)