Repetitive Activity Counting by Sight and Sound

Open Access
Authors
Publication date 2021
Book title Proceedings, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Book subtitle virtual, 19-25 June 2021
ISBN
  • 9781665445108
ISBN (electronic)
  • 9781665445092
Series CVPR
Event 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Pages (from-to) 14065-14074
Publisher Los Alamitos, California: Conference Publishing Services, IEEE Computer Society
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This paper strives for repetitive activity counting in videos. Different from existing works, which all analyze the visual video content only, we incorporate for the first time the corresponding sound into the repetition counting process. This benefits accuracy in challenging vision conditions such as occlusion, dramatic camera view changes, low resolution, etc. We propose a model that starts with analyzing the sight and sound streams separately. Then an audiovisual temporal stride decision module and a reliability estimation module are introduced to exploit cross-modal temporal interaction. For learning and evaluation, an existing dataset is repurposed and reorganized to allow for repetition counting with sight and sound. We also introduce a variant of this dataset for repetition counting under challenging vision conditions. Experiments demonstrate the benefit of sound, as well as the other introduced modules, for repetition counting. Our sight-only model already outperforms the state-of-the-art by itself; when we add sound, results improve notably, especially under harsh vision conditions. The code and datasets are available at https://github.com/xiaobai1217/RepetitionCounting.
Document type Conference contribution
Note With supplemental material
Language English
Published at
  • https://doi.org/10.48550/arXiv.2103.13096
  • https://doi.org/10.1109/CVPR46437.2021.01385
  • https://openaccess.thecvf.com/content/CVPR2021/html/Zhang_Repetitive_Activity_Counting_by_Sight_and_Sound_CVPR_2021_paper.html
Other links
  • https://github.com/xiaobai1217/RepetitionCounting
  • https://www.proceedings.com/60773.html
Downloads
2103.13096 (Accepted author manuscript)
Supplementary materials