A coding perspective on deep latent variable models

Open Access
Award date 11-09-2020
Number of pages 138
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this thesis, "A Coding Perspective on Deep Latent Variable Models", we discuss how statistical inference in Deep Latent Variable Models (DLVMs) relates to coding.
In particular, we examine the minimum description length (MDL) principle as a guide for statistical inference, and explore its relation to Bayesian inference. We shall see that, although both lead to similar algorithms, the MDL principle makes no assumption about the data-generating process. We merely restrict ourselves to finding regularity in the observed data, where regularity is connected to the ability to compress. We thus find that learning DLVMs is equivalent to minimizing the cost of communicating (compressing) a set of observations. One common approach to communication is to send a hypothesis (or model), and subsequently the data misfit under that model. This is known as the two-part code. In this thesis, we will mainly focus on the so-called Bayesian code -- a theoretically more efficient code than the two-part code.
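The gap between the two code lengths described above can be made concrete with a tiny numerical sketch (the models, prior, and likelihood values below are illustrative assumptions, not taken from the thesis): the two-part code pays for the single best hypothesis plus the data misfit under it, while the Bayesian code codes the data directly with the mixture over hypotheses and is never longer.

```python
import math

# Hypothetical setup: two candidate models H1, H2 with a prior P(H)
# and a likelihood P(D|H) for one observed dataset D.
prior = {"H1": 0.5, "H2": 0.5}
likelihood = {"H1": 0.10, "H2": 0.02}

# Two-part code: pick the single best model, paying -log2 P(H) bits
# for the model plus -log2 P(D|H) bits for the data misfit under it.
two_part = min(-math.log2(prior[h]) - math.log2(likelihood[h])
               for h in prior)

# Bayesian code: code the data with the mixture P(D) = sum_H P(H) P(D|H),
# paying -log2 P(D) bits; by the mixture inequality this is never longer
# than the two-part code.
bayes = -math.log2(sum(prior[h] * likelihood[h] for h in prior))

print(round(two_part, 3), round(bayes, 3))  # 4.322 4.059
```

With these numbers the Bayesian code saves roughly a quarter of a bit, because it does not commit to a single hypothesis up front.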
Somewhat counter-intuitively, the Bayesian inference method allows us to compute the code length without knowing either the code or the coding scheme that achieves it. The purpose of this thesis is to close this gap by developing the respective coding schemes. Inspired and guided by the MDL principle, we look for codes that achieve the code length predicted by MDL. A special focus lies on differentiable functions, and more precisely, deep neural networks, learned from large quantities of high-dimensional data. We investigate both model compression and source compression through the lens of the MDL principle.
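One well-known way to actually achieve a Bayesian-style code length with a latent variable model is bits-back accounting (the abstract does not name a specific scheme, and the toy numbers below are illustrative assumptions): the sender encodes a latent z under the prior and the data x under the likelihood, then recovers -log2 q(z|x) bits from the approximate posterior, so the expected net length equals the negative ELBO in bits.

```python
import math

# Hypothetical tiny model: latent z in {0, 1}, one fixed observation x.
p_z = {0: 0.5, 1: 0.5}          # prior p(z)
p_x_given_z = {0: 0.2, 1: 0.6}  # likelihood p(x|z)
q_z = {0: 0.5, 1: 0.5}          # approximate posterior q(z|x)

# Bits-back: pay -log2 p(z) bits for z and -log2 p(x|z) bits for x,
# then get -log2 q(z|x) bits back. Averaging over q gives the expected
# net length, i.e. the negative ELBO in bits.
net = sum(q_z[z] * (-math.log2(p_z[z]) - math.log2(p_x_given_z[z])
                    + math.log2(q_z[z]))
          for z in q_z)

# The ideal (Bayesian) code length -log2 p(x) uses the exact marginal.
marginal = -math.log2(sum(p_z[z] * p_x_given_z[z] for z in p_z))

print(net >= marginal)  # True: the net length never beats -log2 p(x)
```

The gap between `net` and `marginal` is exactly the KL divergence from q(z|x) to the true posterior, so the bound tightens as the approximate posterior improves.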
Document type PhD thesis
Language English