VTC: Improving Video-Text Retrieval with User Comments

Authors
  • L. Hanu
  • J. Thewlis
  • Y.M. Asano
  • C. Rupprecht
Publication date 2022
Host editors
  • S. Avidan
  • G. Brostow
  • M. Cissé
  • G.M. Farinella
  • T. Hassner
Book title Computer Vision – ECCV 2022
Book subtitle 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings
ISBN
  • 9783031198328
ISBN (electronic)
  • 9783031198335
Series Lecture Notes in Computer Science
Event European Conference on Computer Vision (ECCV), 2022
Volume XXXV
Pages (from-to) 616–633
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Multi-modal retrieval is an important problem for many applications, such as recommendation and search. Current benchmarks and even datasets are often manually constructed and consist of mostly clean samples where all modalities are well-correlated with the content. Thus, the current video-text retrieval literature largely focuses on video titles or audio transcripts, while ignoring user comments, since users often tend to discuss topics only vaguely related to the video. Despite the ubiquity of user comments online, there are currently no multi-modal representation learning datasets that include comments. In this paper, we a) introduce a new dataset of videos, titles and comments; b) present an attention-based mechanism that allows the model to learn from sometimes irrelevant data such as comments; c) show that by using comments, our method is able to learn better, more contextualised representations for images, videos and audio. Project page: https://unitaryai.github.io/vtc-paper.
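For intuition only: the attention-based mechanism mentioned in the abstract can be read as the video embedding attending over the set of comment embeddings, so that vaguely related comments receive low weight before fusion. The PyTorch sketch below illustrates that general idea under our own assumptions; the function name aggregate_comments, the single-head scaled dot-product form, the temperature parameter, and the residual fusion are illustrative choices, not the authors' implementation (see the paper at the DOI below for the actual method).

```python
import torch
import torch.nn.functional as F


def aggregate_comments(video_emb: torch.Tensor,
                       comment_embs: torch.Tensor,
                       temperature: float = 1.0) -> torch.Tensor:
    """Fuse a video embedding with a set of comment embeddings via
    dot-product attention, so loosely related comments get low weight.

    video_emb:    (batch, dim)
    comment_embs: (batch, n_comments, dim)
    """
    dim = video_emb.shape[-1]
    # Relevance score of each comment for its video (scaled dot product).
    scores = torch.einsum("bd,bnd->bn", video_emb, comment_embs) / dim ** 0.5
    # Attention weights over the comment set; off-topic comments -> near zero.
    weights = F.softmax(scores / temperature, dim=-1)
    # Weighted sum of comment embeddings: a "comment context" vector.
    context = torch.einsum("bn,bnd->bd", weights, comment_embs)
    # Residual fusion, L2-normalised for cosine-similarity retrieval.
    return F.normalize(video_emb + context, dim=-1)


# Toy usage: 2 videos, 8 comments each, 512-d embeddings.
video = torch.randn(2, 512)
comments = torch.randn(2, 8, 512)
fused = aggregate_comments(video, comments)
print(fused.shape)  # torch.Size([2, 512])
```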
Document type Conference contribution
Note With supplementary material. Correction published online 10 January 2023.
Language English
Published at https://doi.org/10.1007/978-3-031-19833-5_36
Other links
  • https://doi.org/10.1007/978-3-031-19833-5_43
  • https://unitaryai.github.io/vtc-paper