Search | arXiv e-print repository

A New Algorithm for Whitney Stratification of Varieties

Abstract: We describe a new algorithm to compute Whitney stratifications of real and complex algebraic varieties. This algorithm is a modification of the algorithm of Helmer and Nanda (HN), but is made more efficient by using techniques for equidimensional decomposition rather than computing the set of associated primes of a polynomial ideal at a key step in the HN algorithm. We note that this modified algorithm may fail to produce a minimal Whitney stratification even when the HN algorithm would produce a minimal stratification. We, additionally, present an algorithm to coarsen any Whitney stratification of a complex variety to a minimal Whitney stratification; the theoretical basis for our approach is a classical result of Teissier.

Submitted 24 June, 2024; originally announced June 2024.

MSC Class: 14B05; 14Q20; 32S60; 32S15

arXiv:2402.03144

Computing Generic Fibres of Polynomial Ideals with FGLM and Hensel Lifting

Authors: Jérémy Berthomieu, Rafael Mohr

Abstract: We describe a version of the FGLM algorithm that can be used to compute generic fibers of positive-dimensional polynomial ideals. It combines the FGLM algorithm with a Hensel lifting strategy. We show that this algorithm has a complexity quasi-linear in the number of lifting steps. Some provided experimental data also demonstrates the practical efficacy of our algorithm. Additionally, we sketch a related Hensel lifting method to compute Gröbner bases using so-called tracers.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2302.09160

Identifying Equivalent Training Dynamics

Authors: William T. Redman, Juan M. Bello-Rivas, Maria Fonoberova, Ryan Mohr, Ioannis G. Kevrekidis, Igor Mezić

Abstract: Study of the nonlinear evolution deep neural network (DNN) parameters undergo during training has uncovered regimes of distinct dynamical behavior. While a detailed understanding of these phenomena has the potential to advance improvements in training efficiency and robustness, the lack of methods for identifying when DNN models have equivalent dynamics limits the insight that can be gained from prior work. Topological conjugacy, a notion from dynamical systems theory, provides a precise definition of dynamical equivalence, offering a possible route to address this need. However, topological conjugacies have historically been challenging to compute. By leveraging advances in Koopman operator theory, we develop a framework for identifying conjugate and non-conjugate training dynamics. To validate our approach, we demonstrate that it can correctly identify a known equivalence between online mirror descent and online gradient descent. We then utilize it to: identify non-conjugate training dynamics between shallow and wide fully connected neural networks; characterize the early phase of training dynamics in convolutional neural networks; uncover non-conjugate training dynamics in Transformers that do and do not undergo grokking. Our results, across a range of DNN architectures, illustrate the flexibility of our framework and highlight its potential for shedding new light on training dynamics.

Submitted 4 June, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: 18 pages, 6 figures, 3 supplemental figures

arXiv:2302.08174

A Direttissimo Algorithm for Equidimensional Decomposition

Authors: Christian Eder, Pierre Lairez, Rafael Mohr, Mohab Safey El Din

Abstract: We describe a recursive algorithm that decomposes an algebraic set into locally closed equidimensional sets, i.e. sets which each have irreducible components of the same dimension. At the core of this algorithm, we combine ideas from the theory of triangular sets, a.k.a. regular chains, with Gröbner bases to encode and work with locally closed algebraic sets. Equipped with this, our algorithm avoids projections of the algebraic sets that are decomposed and certain genericity assumptions frequently made when decomposing polynomial systems, such as assumptions about Noether position. This makes it produce fine decompositions on more structured systems where ensuring genericity assumptions often destroys the structure of the system at hand. Practical experiments demonstrate its efficiency compared to state-of-the-art implementations.

Submitted 9 June, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: Some minor revisions, corrects a mistake in the proof of lemma 2.2

arXiv:2209.06374

Algorithmic (Semi-)Conjugacy via Koopman Operator Theory

Authors: William T. Redman, Maria Fonoberova, Ryan Mohr, Ioannis G. Kevrekidis, Igor Mezić

Abstract: Iterative algorithms are of utmost importance in decision and control. With an ever growing number of algorithms being developed, distributed, and proprietarized, there is a similarly growing need for methods that can provide classification and comparison. By viewing iterative algorithms as discrete-time dynamical systems, we leverage Koopman operator theory to identify (semi-)conjugacies between algorithms using their spectral properties. This provides a general framework with which to classify and compare algorithms.

Submitted 13 September, 2022; originally announced September 2022.

Comments: 6 pages, 5 figures, accepted to IEEE CDC 2022

arXiv:2202.13784

doi 10.1016/j.jsc.2023.02.001

A Signature-based Algorithm for Computing the Nondegenerate Locus of a Polynomial System

Authors: Christian Eder, Pierre Lairez, Rafael Mohr, Mohab Safey El Din

Abstract: Polynomial system solving arises in many application areas to model non-linear geometric properties. In such settings, polynomial systems may come with degeneration which the end-user wants to exclude from the solution set. The nondegenerate locus of a polynomial system is the set of points where the codimension of the solution set matches the number of equations. Computing the nondegenerate locus is classically done through ideal-theoretic operations in commutative algebra such as saturation ideals or equidimensional decompositions to extract the component of maximal codimension. By exploiting the algebraic features of signature-based Gröbner basis algorithms we design an algorithm which computes a Gröbner basis of the equations describing the closure of the nondegenerate locus of a polynomial system, without computing first a Gröbner basis for the whole polynomial system.

Submitted 22 July, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

Comments: 22 pages, 2 figures. Substantial rewrite of content of the parts of the paper involving signature-based Gröbner basis algorithms, both the exposition and the description of the core algorithm of the paper changed

MSC Class: 13P10; 13P05

ACM Class: I.1.2; G.4

Journal ref: Journal of Symbolic Computation 119, 2023
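The definition of the nondegenerate locus given in the abstract above can be illustrated on a small system. The example below is chosen here for illustration and is not taken from the paper; it only shows the classical saturation route that the abstract mentions, not the signature-based algorithm.

```latex
\[
  f_1 = xy, \qquad f_2 = xz \quad \text{in } \mathbb{C}[x,y,z], \qquad
  V(f_1, f_2) = \{x = 0\} \,\cup\, \{y = z = 0\}.
\]
% The plane {x = 0} has codimension 1, which is less than the number of
% equations (2), so it is a degenerate component. The line {y = z = 0} has
% codimension 2, but at the origin it meets the plane, where the local
% codimension drops to 1.
\[
  \text{nondegenerate locus} = \{\, y = z = 0,\; x \neq 0 \,\}, \qquad
  \overline{\{\, y = z = 0,\; x \neq 0 \,\}} = V(y, z).
\]
% In this example the closure is cut out by a saturation ideal:
\[
  \langle xy,\, xz \rangle : x^{\infty} = \langle y,\, z \rangle .
\]
```

Here the saturation by x happens to remove exactly the degenerate component; in general the choice of what to saturate by depends on the system.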

arXiv:2110.14856

An Operator Theoretic View on Pruning Deep Neural Networks

Authors: William T. Redman, Maria Fonoberova, Ryan Mohr, Ioannis G. Kevrekidis, Igor Mezić

Abstract: The discovery of sparse subnetworks that are able to perform as well as full models has found broad applied and theoretical interest. While many pruning methods have been developed to this end, the naïve approach of removing parameters based on their magnitude has been found to be as robust as more complex, state-of-the-art algorithms. The lack of theory behind magnitude pruning's success, especially pre-convergence, and its relation to other pruning methods, such as gradient based pruning, are outstanding open questions in the field that are in need of being addressed. We make use of recent advances in dynamical systems theory, namely Koopman operator theory, to define a new class of theoretically motivated pruning algorithms. We show that these algorithms can be equivalent to magnitude and gradient based pruning, unifying these seemingly disparate methods, and find that they can be used to shed light on magnitude pruning's performance during the early part of training.

Submitted 12 March, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: 14 pages, 5 figures
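For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of plain magnitude pruning: remove the parameters with the smallest absolute value. It is not the Koopman-based method of the paper; the function name `magnitude_prune`, the flat weight list, and the `sparsity` parameter are assumptions made for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with the smallest magnitude.

    `weights` is a flat list of floats (a hypothetical stand-in for a layer).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by increasing |w|; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.6]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

With a sparsity of 0.5, the three smallest-magnitude entries of the six are set to zero and the rest are kept unchanged.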

arXiv:2012.11734

doi 10.3390/e23010037

Predicting the Critical Number of Layers for Hierarchical Support Vector Regression

Authors: Ryan Mohr, Maria Fonoberova, Zlatko Drmač, Iva Manojlović, Igor Mezić

Abstract: Hierarchical support vector regression (HSVR) models a function from data as a linear combination of SVR models at a range of scales, starting at a coarse scale and moving to finer scales as the hierarchy continues. In the original formulation of HSVR, there were no rules for choosing the depth of the model. In this paper, we observe in a number of models a phase transition in the training error -- the error remains relatively constant as layers are added, until a critical scale is passed, at which point the training error drops close to zero and remains nearly constant for added layers. We introduce a method to predict this critical scale a priori with the prediction based on the support of either a Fourier transform of the data or the Dynamic Mode Decomposition (DMD) spectrum. This allows us to determine the required number of layers prior to training any models.

Submitted 21 December, 2020; originally announced December 2020.

Comments: 18 pages, 9 figures

MSC Class: 68Q32; 68T05; 68T07; 37M05; 37M10; 37M25

ACM Class: I.5.2; F.2.1; G.1.3; G.1.6
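The abstract above bases its prediction on the support of the Fourier transform of the data. As a rough sketch of that first ingredient only, the code below estimates the highest frequency carrying non-negligible energy in a sampled signal. The rule that turns this into a number of HSVR layers is not stated in the abstract and is not reproduced here; the function names, the relative threshold `rel_tol`, and the test signal are all assumptions.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Normalized DFT magnitudes for frequency indices 0 .. n // 2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


def highest_active_frequency(samples, rel_tol=0.05):
    """Largest frequency index whose magnitude exceeds rel_tol * peak."""
    mags = dft_magnitudes(samples)
    threshold = rel_tol * max(mags)
    active = [k for k, m in enumerate(mags) if m > threshold]
    return max(active) if active else 0


# Hypothetical two-scale signal: a coarse mode (k = 2) and a fine mode (k = 9).
n = 64
signal = [math.sin(2 * math.pi * 2 * t / n) + 0.5 * math.sin(2 * math.pi * 9 * t / n)
          for t in range(n)]
k_max = highest_active_frequency(signal)
print(k_max)
```

The finest scale present in the data corresponds to the returned index; a direct O(n^2) transform is used only to keep the sketch dependency-free.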

arXiv:2006.11765

Applications of Koopman Mode Analysis to Neural Networks

Authors: Iva Manojlović, Maria Fonoberova, Ryan Mohr, Aleksandr Andrejčuk, Zlatko Drmač, Yannis Kevrekidis, Igor Mezić

Abstract: We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space. Each epoch is an application of the map induced by the optimization algorithm and the loss function. Using this induced map, we can apply observables on the weight space and measure their evolution. The evolution of the observables is given by the Koopman operator associated with the induced dynamical system. We use the spectrum and modes of the Koopman operator to realize the above objectives. Our methods can help to, a priori, determine the network depth; determine if we have a bad initialization of the network weights, allowing a restart before training too long; and speed up training. Additionally, our methods help enable noise rejection and improve robustness. We show how the Koopman spectrum can be used to determine the number of layers required for the architecture. Additionally, we show how we can elucidate the convergence versus non-convergence of the training process by monitoring the spectrum, in particular, how the existence of eigenvalues clustering around 1 determines when to terminate the learning process. We also show how using Koopman modes we can selectively prune the network to speed up the training procedure. Finally, we show that incorporating loss functions based on negative Sobolev norms can allow for the reconstruction of a multi-scale signal polluted by very large amounts of noise.

Submitted 21 June, 2020; originally announced June 2020.

ACM Class: I.2.6
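The idea of monitoring a spectrum computed from a training trajectory can be sketched with Dynamic Mode Decomposition (DMD), a standard data-driven approximation of the Koopman operator. The abstract does not say which numerical method the authors use, so DMD is swapped in here as an assumption, and the two-dimensional synthetic "weight" trajectory is hypothetical.

```python
import cmath


def dmd_eigenvalues_2d(snapshots):
    """Eigenvalues of the least-squares linear map A with x_{k+1} ~ A x_k.

    `snapshots` is a list of 2-tuples (successive states). A is computed as
    (Y X^T)(X X^T)^{-1}, where X holds x_0..x_{m-1} and Y holds x_1..x_m.
    """
    xs, ys = snapshots[:-1], snapshots[1:]
    gram = [[sum(x[i] * x[j] for x in xs) for j in range(2)] for i in range(2)]
    cross = [[sum(y[i] * x[j] for x, y in zip(xs, ys)) for j in range(2)]
             for i in range(2)]
    det_g = gram[0][0] * gram[1][1] - gram[0][1] * gram[1][0]
    if abs(det_g) < 1e-12:
        raise ValueError("snapshots do not span the 2D state space")
    g_inv = [[gram[1][1] / det_g, -gram[0][1] / det_g],
             [-gram[1][0] / det_g, gram[0][0] / det_g]]
    a = [[sum(cross[i][k] * g_inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # Eigenvalues of a 2x2 matrix from its trace and determinant.
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


# Synthetic trajectory: two coordinates contracting at rates 0.5 and 0.9.
trajectory = [(0.5 ** k, 0.9 ** k) for k in range(6)]
eigs = dmd_eigenvalues_2d(trajectory)
moduli = sorted(abs(lam) for lam in eigs)
print(moduli)
```

In this toy setting the eigenvalue moduli recover the two contraction rates; an eigenvalue close to 1 would indicate a direction in which the state is barely changing from one step to the next.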

Showing 1–9 of 9 results for author: Mohr, R