Localizing the Common Action Among a Few Videos

Open Access
Authors
Publication date 2020
Host editors
  • A. Vedaldi
  • H. Bischof
  • T. Brox
  • J.-M. Frahm
Book title Computer Vision – ECCV 2020
Book subtitle 16th European Conference, Glasgow, UK, August 23–28, 2020 : proceedings
ISBN
  • 9783030585709
ISBN (electronic)
  • 9783030585716
Series Lecture Notes in Computer Science
Event 16th European Conference on Computer Vision
Volume | Issue number VII
Pages (from-to) 505-521
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This paper strives to localize the temporal extent of an action in a long untrimmed video. Where existing work leverages many examples with their start, their ending, and/or the class of the action during training time, we propose few-shot common action localization. The start and end of an action in a long untrimmed video are determined based on just a handful of trimmed video examples containing the same action, without knowing their common class label. To address this task, we introduce a new 3D convolutional network architecture able to align representations from the support videos with the relevant query video segments. The network contains: (i) a mutual enhancement module to simultaneously complement the representation of the few trimmed support videos and the untrimmed query video; (ii) a progressive alignment module that iteratively fuses the support videos into the query branch; and (iii) a pairwise matching module to weigh the importance of different support videos. Evaluation of few-shot common action localization in untrimmed videos containing a single or multiple action instances demonstrates the effectiveness and general applicability of our proposal.
Document type Conference contribution
Note With supplementary material.
Language English
Published at https://doi.org/10.1007/978-3-030-58571-6_30
Other links https://github.com/PengWan-Yang/commonLocalization