
On the Limitations of Fractal Dimension as a Measure of Generalization

Authors: Charlie Tan, Inés García-Redondo, Qiquan Wang, Michael M. Bronstein, Anthea Monod

Abstract: Bounding and predicting the generalization gap of overparameterized neural networks remains a central open problem in theoretical machine learning. Neural network optimization trajectories have been proposed to possess fractal structure, leading to bounds and generalization measures based on notions of fractal dimension on these trajectories. Prominently, both the Hausdorff dimension and the persistent homology dimension have been proposed to correlate with generalization gap, thus serving as a measure of generalization. This work performs an extended evaluation of these topological generalization measures. We demonstrate that fractal dimension fails to predict generalization of models trained from poor initializations. We further identify that the $\ell^2$ norm of the final parameter iterate, one of the simplest complexity measures in learning theory, correlates more strongly with the generalization gap than these notions of fractal dimension. Finally, our study reveals the intriguing manifestation of model-wise double descent in persistent homology-based generalization measures. This work lays the ground for a deeper investigation of the causal relationships between fractal geometry, topological data analysis, and neural network optimization.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 17 pages, 6 figures

arXiv:2405.20174 [pdf, other]

Tropical Expressivity of Neural Networks

Authors: Shiv Bhatia, Yueqi Cao, Paul Lezeau, Anthea Monod

Abstract: We propose an algebraic geometric framework to study the expressivity of linear activation neural networks. A particular quantity that has been actively studied in the field of deep learning is the number of linear regions, which gives an estimate of the information capacity of the architecture. To study and evaluate information capacity and expressivity, we work in the setting of tropical geometry -- a combinatorial and polyhedral variant of algebraic geometry -- where there are known connections between tropical rational maps and feedforward neural networks. Our work builds on and expands this connection to capitalize on the rich theory of tropical geometry to characterize and study various architectural aspects of neural networks. Our contributions are threefold: we provide a novel tropical geometric approach to selecting sampling domains among linear regions; an algebraic result allowing for a guided restriction of the sampling domain for network architectures with symmetries; and an open source library to analyze neural networks as tropical Puiseux rational maps. We provide a comprehensive set of proof-of-concept numerical experiments demonstrating the breadth of neural network architectures to which tropical geometric theory can be applied to reveal insights on expressivity characteristics of a network. Our work provides the foundations for the adaptation of both theory and existing software from computational tropical geometry and symbolic computation to deep learning.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2310.05767 [pdf, other]

Topological Community Detection: A Sheaf-Theoretic Approach

Authors: Arne Wolf, Anthea Monod

Abstract: We propose a model for network community detection using topological data analysis, a branch of modern data science that leverages theory from algebraic topology to statistical analysis and machine learning. Specifically, we use cellular sheaves, which relate local to global properties of various algebraic topological constructions, to propose three new algorithms for vertex clustering over networks to detect communities. We apply our algorithms to real social network data in numerical experiments and obtain near optimal results in terms of modularity. Our work is the first implementation of sheaves on real social network data and provides a solid proof-of-concept for future work using sheaves as tools to study complex systems captured by networks and simplicial complexes.

Submitted 9 October, 2023; originally announced October 2023.

Comments: 11 pages, 11 figures

arXiv:2209.02854 [pdf, other]

Video Restoration with a Deep Plug-and-Play Prior

Authors: Antoine Monod, Julie Delon, Matias Tassano, Andrés Almansa

Abstract: This paper presents a novel method for restoring digital videos via a Deep Plug-and-Play (PnP) approach. Under a Bayesian formalism, the method consists in using a deep convolutional denoising network in place of the proximal operator of the prior in an alternating optimization scheme. We distinguish ourselves from prior PnP work by directly applying that method to restore a digital video from a degraded video observation. This way, a network trained once for denoising can be repurposed for other video restoration tasks. Our experiments in video deblurring, super-resolution, and interpolation of random missing pixels all show a clear benefit to using a network specifically designed for video denoising, as it yields better restoration performance and better temporal stability than a single image network with similar denoising performance using the same PnP formulation. Moreover, our method compares favorably to applying a different state-of-the-art PnP scheme separately on each frame of the sequence. This opens new perspectives in the field of video restoration.

Submitted 15 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: 10 pages + 4 pages supplementary; code at github.com/amonod/pnp-video

ACM Class: I.4

arXiv:2208.06701 [pdf, other]

Learning Linear Non-Gaussian Polytree Models

Authors: Daniele Tramontano, Anthea Monod, Mathias Drton

Abstract: In the context of graphical causal discovery, we adapt the versatile framework of linear non-Gaussian acyclic models (LiNGAMs) to propose new algorithms to efficiently learn graphs that are polytrees. Our approach combines the Chow--Liu algorithm, which first learns the undirected tree structure, with novel schemes to orient the edges. The orientation schemes assess algebraic relations among moments of the data-generating distribution and are computationally inexpensive. We establish high-dimensional consistency results for our approach and compare different algorithmic versions in numerical experiments.

Submitted 13 August, 2022; originally announced August 2022.

arXiv:2207.08026 [pdf, other]

Rewiring Networks for Graph Neural Network Training Using Discrete Geometry

Authors: Jakub Bober, Anthea Monod, Emil Saucan, Kevin N. Webster

Abstract: Information over-squashing is a phenomenon of inefficient information propagation between distant nodes on networks. It is an important problem that is known to significantly impact the training of graph neural networks (GNNs), as the receptive field of a node grows exponentially. To mitigate this problem, a preprocessing procedure known as rewiring is often applied to the input network. In this paper, we investigate the use of discrete analogues of classical geometric notions of curvature to model information flow on networks and rewire them. We show that these classical notions achieve state-of-the-art performance in GNN training accuracy on a variety of real-world network datasets. Moreover, compared to the current state-of-the-art, these classical notions exhibit a clear advantage in computational runtime by several orders of magnitude.

Submitted 16 July, 2022; originally announced July 2022.

Comments: 21 pages, 8 figures, 7 tables

arXiv:2204.09155 [pdf, other]

Approximating Persistent Homology for Large Datasets

Authors: Yueqi Cao, Anthea Monod

Abstract: Persistent homology is an important methodology from topological data analysis which adapts theory from algebraic topology to data settings and has been successfully implemented in many applications. It produces a statistical summary in the form of a persistence diagram, which captures the shape and size of the data. Despite its widespread use, persistent homology is simply impossible to implement when a dataset is very large. In this paper we address the problem of finding a representative persistence diagram for prohibitively large datasets. We adapt the classical statistical method of bootstrapping, namely, drawing and studying smaller multiple subsamples from the large dataset. We show that the mean of the persistence diagrams of subsamples -- taken as a mean persistence measure computed from the subsamples -- is a valid approximation of the true persistent homology of the larger dataset. We give the rate of convergence of the mean persistence diagram to the true persistence diagram in terms of the number of subsamples and size of each subsample. Given the complex algebraic and geometric nature of persistent homology, we adapt the convexity and stability properties in the space of persistence diagrams together with random set theory to achieve our theoretical results for the general setting of point cloud data. We demonstrate our approach on simulated and real data, including an application of shape clustering on complex large-scale point cloud data.

Submitted 18 May, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: 24 pages, 9 figures

arXiv:2110.09354 [pdf, other]

doi 10.5201/ipol.2021.336

An Analysis and Implementation of the HDR+ Burst Denoising Method

Authors: Antoine Monod, Julie Delon, Thomas Veit

Abstract: HDR+ is an image processing pipeline presented by Google in 2016. At its core lies a denoising algorithm that uses a burst of raw images to produce a single higher quality image. Since it is designed as a versatile solution for smartphone cameras, it does not necessarily aim for the maximization of standard denoising metrics, but rather for the production of natural, visually pleasing images. In this article, we specifically discuss and analyze the HDR+ burst denoising algorithm architecture and the impact of its various parameters. With this publication, we provide an open source Python implementation of the algorithm, along with an interactive demo.

Submitted 18 October, 2021; originally announced October 2021.

Comments: 28 pages, 15 figures, published at https://doi.org/10.5201/ipol.2021.336, code on https://github.com/amonod/hdrplus-python

ACM Class: I.4; F.2.1

Journal ref: Image Processing On Line, 11 (2021), pp. 142-169

arXiv:2110.03413 [pdf, other]

Curved Markov Chain Monte Carlo for Network Learning

Authors: John Sigbeku, Emil Saucan, Anthea Monod

Abstract: We present a geometrically enhanced Markov chain Monte Carlo sampler for networks based on a discrete curvature measure defined on graphs. Specifically, we incorporate the concept of graph Forman curvature into sampling procedures on both the nodes and edges of a network explicitly, via the transition probability of the Markov chain, as well as implicitly, via the target stationary distribution, which gives a novel, curved Markov chain Monte Carlo approach to learning networks. We show that integrating curvature into the sampler results in faster convergence to a wide range of network statistics demonstrated on deterministic networks drawn from real-world data.

Submitted 11 October, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: 12 pages, 5 figures. To appear in Studies in Computational Intelligence: Proceedings of The 10th International Conference on Complex Networks and Their Applications (2021)

arXiv:2104.01672 [pdf, other]

doi 10.1093/imaiai/iaad022

Topological Information Retrieval with Dilation-Invariant Bottleneck Comparative Measures

Authors: Yueqi Cao, Athanasios Vlontzos, Luca Schmidtke, Bernhard Kainz, Anthea Monod

Abstract: Appropriately representing elements in a database so that queries may be accurately matched is a central task in information retrieval; recently, this has been achieved by embedding the graphical structure of the database into a manifold in a hierarchy-preserving manner using a variety of metrics. Persistent homology is a tool commonly used in topological data analysis that is able to rigorously characterize a database in terms of both its hierarchy and connectivity structure. Computing persistent homology on a variety of embedded datasets reveals that some commonly used embeddings fail to preserve the connectivity. We show that those embeddings which successfully retain the database topology coincide in persistent homology by introducing two dilation-invariant comparative measures to capture this effect: in particular, they address the issue of metric distortion on manifolds. We provide an algorithm for their computation that exhibits greatly reduced time complexity over existing methods. We use these measures to perform the first instance of topology-based information retrieval and demonstrate its increased performance over the standard bottleneck distance for persistent homology. We showcase our approach on databases of different data varieties including text, videos, and medical images.

Submitted 6 July, 2022; v1 submitted 4 April, 2021; originally announced April 2021.

Comments: 29 pages, 10 figures, 4 tables

MSC Class: 68P15; 68P20; 55N31

Journal ref: Information and Inference: A Journal of the IMA, Volume 12, Issue 3 (2023)

Showing 1–10 of 10 results for author: Monod, A