Showing 1–9 of 9 results for author: Cornacchia, E

Searching in archive cs.
  1. arXiv:2306.16921  [pdf, other]

    cs.LG stat.ML

    Provable Advantage of Curriculum Learning on Parity Targets with Mixed Inputs

    Authors: Emmanuel Abbe, Elisabetta Cornacchia, Aryo Lotfi

    Abstract: Experimental results have shown that curriculum learning, i.e., presenting simpler examples before more complex ones, can improve the efficiency of learning. Some recent theoretical results also showed that changing the sampling distribution can help neural networks learn parities, with formal results only for large learning rates and one-step arguments. Here we show a separation result in the num…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 34 pages, 8 figures

  2. arXiv:2306.16255  [pdf, other]

    math.OC cs.IT math.ST

    Theory and applications of the Sum-Of-Squares technique

    Authors: Francis Bach, Elisabetta Cornacchia, Luca Pesce, Giovanni Piccioli

    Abstract: The Sum-of-Squares (SOS) approximation method is a technique used in optimization problems to derive lower bounds on the optimal value of an objective function. By representing the objective function as a sum of squares in a feature space, the SOS method transforms non-convex global optimization problems into solvable semidefinite programs. This note presents an overview of the SOS method. We star…

    Submitted 11 March, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: These are notes from the lecture of Francis Bach given at the summer school "Statistical Physics & Machine Learning", that took place in Les Houches School of Physics in France from 4th to 29th July 2022. The school was organized by Florent Krzakala and Lenka Zdeborová from EPFL. 19 pages, 4 figures

  3. arXiv:2301.13833  [pdf, other]

    cs.LG

    A Mathematical Model for Curriculum Learning for Parities

    Authors: Elisabetta Cornacchia, Elchanan Mossel

    Abstract: Curriculum learning (CL) - training using samples that are generated and presented in a meaningful order - was introduced in the machine learning context around a decade ago. While CL has been extensively used and analysed empirically, there has been very little mathematical justification for its advantages. We introduce a CL model for learning the class of k-parities on d bits of a binary string…

    Submitted 22 April, 2024; v1 submitted 31 January, 2023; originally announced January 2023.

    Journal ref: ICML 2023

  4. arXiv:2205.13647  [pdf, other]

    cs.LG stat.ML

    Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures

    Authors: Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon Kleinberg, Aryo Lotfi, Maithra Raghu, Chiyuan Zhang

    Abstract: This paper considers the Pointer Value Retrieval (PVR) benchmark introduced in [ZRKB21], where a 'reasoning' function acts on a string of digits to produce the label. More generally, the paper considers the learning of logical functions with gradient descent (GD) on neural networks. It is first shown that in order to learn logical functions with gradient descent on symmetric neural networks, the g…

    Submitted 20 October, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: To appear in NeurIPS 2022

  5. arXiv:2203.12094  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning curves for the multi-class teacher-student perceptron

    Authors: Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

    Abstract: One of the most classical results in high-dimensional learning theory provides a closed-form expression for the generalisation error of binary classification with the single-layer teacher-student perceptron on i.i.d. Gaussian inputs. Both Bayes-optimal estimation and empirical risk minimisation (ERM) were extensively analysed for this setting. At the same time, a considerable part of modern machin…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: 14 pages + appendix

    Journal ref: Machine Learning: Science and Technology 4 015019 (2022)

  6. arXiv:2202.12846  [pdf, other]

    cs.LG

    An initial alignment between neural network and target is needed for gradient descent to learn

    Authors: Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła, Christopher Marquis

    Abstract: This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the ta…

    Submitted 16 August, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the International Conference on Machine Learning, 2022

  7. arXiv:2111.02154  [pdf, ps, other]

    cs.LG cs.NE

    Regularization by Misclassification in ReLU Neural Networks

    Authors: Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff

    Abstract: We study the implicit bias of ReLU neural networks trained by a variant of SGD where at each step, the label is changed with probability $p$ to a random label (label smoothing being a close variant of this procedure). Our experiments demonstrate that label noise propels the network to a sparse solution in the following sense: for a typical input, a small fraction of neurons are active, and the fir…

    Submitted 3 November, 2021; originally announced November 2021.

  8. arXiv:2101.12601  [pdf, other]

    math.PR cs.IT

    Stochastic block model entropy and broadcasting on trees with survey

    Authors: Emmanuel Abbe, Elisabetta Cornacchia, Yuzhou Gu, Yury Polyanskiy

    Abstract: The limit of the entropy in the stochastic block model (SBM) has been characterized in the sparse regime for the special case of disassortative communities [COKPZ17] and for the classical case of assortative communities but in the dense regime [DAM16]. The problem has not been closed in the classical sparse and assortative case. This paper establishes the result in this case for any SNR besides fo…

    Submitted 29 January, 2021; originally announced January 2021.

  9. arXiv:2006.05251  [pdf, other]

    cs.MA math.PR

    Polarization in Attraction-Repulsion Models

    Authors: Elisabetta Cornacchia, Neta Singer, Emmanuel Abbe

    Abstract: This paper introduces a model for opinion dynamics, where at each time step, randomly selected agents see their opinions - modeled as scalars in [0,1] - evolve depending on a local interaction function. In the classical Bounded Confidence Model, agents' opinions get attracted when they are close enough. The proposed model extends this by adding a repulsion component, which models the effect of opin…

    Submitted 9 June, 2020; originally announced June 2020.