The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection

P. Mettes; D.C. Koelma; C.G.M. Snoek

doi:https://doi.org/10.1145/2911996.2912036

Authors	P. Mettes; D.C. Koelma; C.G.M. Snoek
Publication date	2016
Book title	ICMR'16
Book subtitle	proceedings of the 2016 ACM International Conference on Multimedia Retrieval: June 6-9, 2016, New York, NY, USA
ISBN	9781450343596
Event	ACM International Conference on Multimedia Retrieval 2016
Pages (from-to)	175-182
Publisher	New York, NY: Association for Computing Machinery
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	This paper strives for video event detection using a representation learned from deep convolutional neural networks. Unlike the leading approaches, which all learn from the 1,000 classes defined in the ImageNet Large Scale Visual Recognition Challenge, we investigate how to leverage the complete ImageNet hierarchy for pre-training deep networks. To deal with the problems of over-specific classes and classes with few images, we introduce a bottom-up and a top-down approach for reorganization of the ImageNet hierarchy based on all its 21,814 classes and more than 14 million images. Experiments on the TRECVID Multimedia Event Detection 2013 and 2015 datasets show that video representations derived from the layers of a deep neural network pre-trained with our reorganized hierarchy i) improve over standard pre-training, ii) are complementary among different reorganizations, iii) maintain the benefits of fusion with other modalities, and iv) lead to state-of-the-art event detection results. The reorganized hierarchies and their derived Caffe models are publicly available at http://tinyurl.com/imagenetshuffle.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1145/2911996.2912036 (Final published version)
Downloads	MettesICMR2016 (Accepted author manuscript); 2911996.2912036 (Final published version)

UvA-DARE

Digital Academic Repository
