Actor-Transformers for Group Activity Recognition

K. Gavrilyuk; R. Sanford; M. Javan; C.G.M. Snoek

doi:https://doi.org/10.1109/CVPR42600.2020.00092

Authors	K. Gavrilyuk; R. Sanford; M. Javan; C.G.M. Snoek
Publication date	2020
Book title	2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Book subtitle	proceedings : virtual, 14-19 June 2020
ISBN	9781728171692
ISBN (electronic)	9781728171685
Series	CVPR
Event	2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Pages (from-to)	836-845
Publisher	Los Alamitos, California: IEEE Computer Society
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	This paper strives to recognize individual actions and group activities from videos. While existing solutions for this challenging problem explicitly model spatial and temporal relationships based on location of individual actors, we propose an actor-transformer model able to learn and selectively extract information relevant for group activity recognition. We feed the transformer with rich actor-specific static and dynamic representations expressed by features from a 2D pose network and 3D CNN, respectively. We empirically study different ways to combine these representations and show their complementary benefits. Experiments show what is important to transform and how it should be transformed. What is more, actor-transformers achieve state-of-the-art results on two publicly available benchmarks for group activity recognition, outperforming the previous best published results by a considerable margin.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1109/CVPR42600.2020.00092 (Final published version)
Downloads	09156959 (Final published version)