Equivariant convolutional networks

T.S. Cohen

Authors	T.S. Cohen
Supervisors	M. Welling
Award date	09-06-2021
Number of pages	233
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Deep neural networks can solve many kinds of learning problems, but only if a lot of data is available. For many problems (e.g. in medical imaging), it is expensive to acquire a large amount of labelled data, so it would be highly desirable to improve the statistical efficiency of deep learning methods. In this thesis we explore ways to leverage symmetries to improve the ability of convolutional neural networks to generalize from relatively small samples. We argue and show empirically that in the context of deep learning it is better to learn equivariant rather than invariant representations, because invariant ones lose information too early on in the network. We present a sequence of increasingly general group equivariant convolutional neural networks (G-CNNs), adapted to the particular symmetries of various spaces. Specifically, we present roto-translation equivariant networks for planar images and volumetric signals, rotation equivariant spherical CNNs for analyzing spherical signals such as global weather patterns and omnidirectional images, and gauge equivariant CNNs for the analysis of signals on general manifolds. We have evaluated these networks on problems such as image classification and segmentation in vision and medical imaging, 3D model classification, detection of extreme weather events, quantum chemistry, and protein structure classification. We show that across the board, G-CNNs outperform conventional translation equivariant CNNs on problems that exhibit symmetries. In Part II we present a general mathematical theory of G-CNNs. The theory describes convolutional feature spaces as spaces of fields over a manifold, i.e. spaces of sections of an associated vector bundle. Symmetries are described as groups acting on a principal bundle by automorphisms, and layers of the network are described as linear and non-linear equivariant maps between spaces of fields. 
Through the use of a common mathematical language, an analogy to theoretical physics (especially gauge theory) is established. We show that in general, convolution-like maps arise from symmetry principles, and specifically that each one of the generalized convolutions used in Part I is recovered from symmetry principles as the most general class of linear maps that is equivariant to a certain group of symmetries.
Document type	PhD thesis
Language	English

UvA-DARE

Digital Academic Repository