How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?

doi:https://doi.org/10.1007/978-3-031-19830-4_36

How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?

Authors	F.M. Thoker H. Doughty P. Bagad C.G.M. Snoek
Publication date	2022
Host editors	S. Avidan G. Brostow M. Cissé G.M. Farinella T. Hassner
Book title	Computer Vision – ECCV 2022
Book subtitle	17th European Conference, Tel Aviv, Israel, October 23–27, 2022 : proceedings
ISBN	9783031198298
ISBN (electronic)	9783031198304
Series	Lecture Notes in Computer Science
Event	European Conference on Computer Vision (ECCV), 2022
Volume \| Issue number	XXXIV
Pages (from-to)	632–652
Publisher	Cham: Springer
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Despite the recent success of video self-supervised learning models, there is much still to be understood about their generalization capability. In this paper, we investigate how sensitive video self-supervised learning is to the current conventional benchmark and whether methods generalize beyond the canonical evaluation setting. We do this across four different factors of sensitivity: domain, samples, actions and task. Our study which encompasses over 500 experiments on 7 video datasets, 9 self-supervised methods and 6 video understanding tasks, reveals that current benchmarks in video self-supervised learning are not good indicators of generalization along these sensitivity factors. Further, we find that self-supervised methods considerably lag behind vanilla supervised pre-training, especially when domain shift is large and the amount of available downstream samples are low. From our analysis we distill the SEVERE-benchmark, a subset of our experiments, and discuss its implication for evaluating the generalizability of representations obtained by existing and future self-supervised video learning methods. Code is available at https://github.com/fmthoker/SEVERE-BENCHMARK.
Document type	Conference contribution
Note	With supplementary file
Language	English
Published at	https://doi.org/10.1007/978-3-031-19830-4_36
Other links	https://github.com/fmthoker/SEVERE-BENCHMARK
Supplementary materials	540003_1_En_36_MOESM1_ESM
Permalink to this page

Back

UvA-DARE

Digital Academic Repository

How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?