Information theory for representation learning

Open Access
Award date 23-06-2025
Number of pages 200
Organisations
  • Faculty of Science (FNWI) - Korteweg-de Vries Institute for Mathematics (KdVI)
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This thesis explores the application of information theory to deep representation learning, with the aim of enhancing the understanding and generalization capabilities of deep learning models. Central to this work is the challenge of identifying and discarding unnecessary information, allowing models to capture the most relevant aspects of input data without requiring explicit label supervision.
We develop new theoretical frameworks and practical learning objectives that extend classical information-theoretic principles to a range of machine learning scenarios, including self-supervised learning, multi-view learning, and temporal modeling. These methods address key challenges in mutual information estimation, redundancy reduction, and efficient representation extraction, providing more robust and interpretable representations. In particular, we demonstrate how these approaches improve the handling of complex, high-dimensional data, capture the essential dynamics of time-dependent systems, and help describe and mitigate the effects of distribution shifts.
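The mutual information estimation mentioned above can be illustrated with a minimal, dependency-free sketch of the InfoNCE lower bound, a standard estimator used in self-supervised learning. The function name and critic scores here are illustrative assumptions, not taken from the thesis:

```python
import math

def infonce_lower_bound(scores):
    """InfoNCE lower bound on mutual information I(X; Z).

    scores[i][j] is a learned critic score f(x_i, z_j); the diagonal
    holds scores of positive (paired) samples. The per-sample bound is
    f(x_i, z_i) - log((1/N) * sum_j exp(f(x_i, z_j))), which is at
    most log N.
    """
    n = len(scores)
    total = 0.0
    for i in range(n):
        # log-sum-exp over the row, shifted by the row max for stability
        m = max(scores[i])
        lse = m + math.log(sum(math.exp(s - m) for s in scores[i]))
        total += scores[i][i] - lse + math.log(n)
    return total / n
```

With uninformative (all-zero) scores the bound is 0 nats; as positive pairs score higher than negatives, it approaches its ceiling of log N.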
Through extensive empirical validation, we show that information-theoretic methods can effectively balance compression and prediction, supporting more adaptable and data-efficient machine learning systems.
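The balance between compression and prediction referred to above is classically formalized by the information bottleneck Lagrangian; the notation below is a standard illustration (the symbols X, Y, Z, and the trade-off parameter beta are not taken from the abstract):

```latex
% Information bottleneck: learn a stochastic representation Z of the
% input X that compresses X while remaining predictive of a target Y;
% \beta controls the compression-prediction trade-off.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```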
Collectively, this work advances the theoretical understanding and practical application of information theory in deep learning, offering new insights into the nature of effective representations and the challenges of learning from complex data.
Document type PhD thesis
Language English