Future-Supervised Retrieval of Unseen Queries for Live Video
| Authors | |
|---|---|
| Publication date | 2017 |
| Book title | MM'17 |
| Book subtitle | Proceedings of the 2017 ACM Multimedia Conference: October 23-27, 2017, Mountain View, CA, USA |
| ISBN (electronic) | |
| Event | 25th ACM international conference on Multimedia |
| Pages (from-to) | 28-36 |
| Publisher | New York, NY: Association for Computing Machinery |
| Organisations | |
| Abstract | Live streaming video presents new challenges for retrieval and content understanding. Its live nature means that video representations should be relevant to current content, and not necessarily to past content. We investigate retrieval of previously unseen queries for live video content. Drawing from existing whole-video techniques, we focus on adapting image-trained semantic models to the video domain. We introduce the use of future frame representations as a supervision signal for learning temporally aware semantic representations on unlabeled video data. Additionally, we introduce an approach for broadening a query's representation within a pre-constructed semantic space, with the aim of increasing overlap between embedded visual semantics and the query semantics. We demonstrate the efficacy of these contributions for unseen query retrieval on live videos. We further explore their applicability to tasks such as no-example, whole-video action classification and no-example live video action prediction, and demonstrate state-of-the-art results. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/3123266.3123437 |
| Other links | https://ivi.fnwi.uva.nl/isis/publications/2017/CappalloICM2017 |
| Downloads | p28-cappallo (Final published version) |
| Permalink to this page | |
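The abstract's core idea, using future frame representations as a supervision signal on unlabeled video, can be sketched minimally. The following is an illustrative assumption, not the paper's actual model: synthetic frame embeddings stand in for real ones, a plain linear map plays the role of the temporally aware representation, and the future offset of 5 frames is arbitrary.

```python
import numpy as np

# Sketch of future-frame supervision: on unlabeled video, the embedding
# of a *future* frame serves as the training target for the current
# frame, so no manual labels are needed. All specifics here (linear
# model, synthetic embeddings, offset) are illustrative assumptions.

rng = np.random.default_rng(0)

T, d = 200, 16                      # frames per video, embedding size
offset = 5                          # predict this many frames ahead

steps = rng.normal(size=(T, d)) / 10
frames = np.cumsum(steps, axis=0)   # smooth random-walk "frame embeddings"

X = frames[:-offset]                # current-frame embeddings (inputs)
Y = frames[offset:]                 # future-frame embeddings (targets)

# Closed-form least squares: W maps a current embedding toward the
# embedding it will have `offset` frames in the future.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

mse = np.mean((X @ W - Y) ** 2)
baseline = np.mean((X - Y) ** 2)    # naive "future == present" predictor
```

On its training data the learned map cannot do worse than the naive "future equals present" predictor, because the identity map is itself one of the linear maps the least-squares fit considers.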
