Inverting Supervised Representations with Autoregressive Neural Density Models

Nash, Charlie; Kushman, Nate; Williams, Christopher K. I.

arXiv:1806.00400v1 (stat)

[Submitted on 1 Jun 2018 (this version), latest version 2 Jan 2019 (v2)]

Abstract: Understanding the nature of representations learned by supervised machine learning models is a significant goal in the machine learning community. We present a method for feature interpretation that uses recent advances in autoregressive density estimation models to invert model representations. We train generative inversion models to express a distribution over input features conditioned on intermediate model representations. Viewing samples from these inversion models gives insight into the invariances learned by supervised models. In addition, we can use these inversion models to estimate the mutual information between a model's inputs and its intermediate representations, thus quantifying the amount of information preserved by the network at different stages. Using this method, we examine the types of information preserved at different layers of convolutional neural networks and explore the invariances induced by different architectural choices. Finally, we show that the mutual information between inputs and network layers decreases over the course of training, supporting recent work by Shwartz-Ziv and Tishby (2017) on the information bottleneck theory of deep learning.
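
The mutual information estimate mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard decomposition I(X; Z) = H(X) - H(X | Z), with each entropy approximated by the average negative log-likelihood of a density model on held-out inputs. The abstract does not spell out the estimator, so this decomposition, the function name `estimate_mutual_information_bits`, and both argument names are assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import fmean


def estimate_mutual_information_bits(log_px, log_px_given_z):
    """Approximate I(X; Z) = H(X) - H(X | Z) in bits.

    log_px: per-example log-likelihoods log p(x), in nats, from an
        unconditional density model (hypothetical input).
    log_px_given_z: per-example log-likelihoods log q(x | z), in nats,
        from a conditional inversion model evaluated on the same examples
        (hypothetical input).
    """
    if len(log_px) != len(log_px_given_z):
        raise ValueError("both sequences must cover the same examples")
    # The average negative log-likelihood is a cross-entropy, which upper-bounds
    # the true entropy, so each term below is only an approximation.
    h_x = -fmean(log_px)
    h_x_given_z = -fmean(log_px_given_z)
    # Convert the difference from nats to bits.
    return (h_x - h_x_given_z) / math.log(2)
```

Because both terms are model-based approximations, the difference should be read as an estimate rather than a strict bound. A representation that preserves more about the input makes the inversion model's conditional log-likelihood higher, which raises the estimate.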

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1806.00400 [stat.ML]
	(or arXiv:1806.00400v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1806.00400

Submission history

From: Charlie Nash
[v1] Fri, 1 Jun 2018 15:38:58 UTC (1,390 KB)
[v2] Wed, 2 Jan 2019 12:14:44 UTC (1,398 KB)
