Showing 1–16 of 16 results for author: Sandon, C

  1. arXiv:2406.06467  [pdf, other]

    cs.LG cs.AI stat.ML

    How Far Can Transformers Reason? The Locality Barrier and Inductive Scratchpad

    Authors: Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Colin Sandon, Omid Saremi

    Abstract: Can Transformers predict new syllogisms by composing established ones? More generally, what type of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of expressivity, but this does not address the learnability objective. This paper puts forward the notion of 'distribution locality' to capture when weak learning is efficiently ac…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 38 pages, 11 figures

  2. arXiv:2312.04329  [pdf, other]

    cs.IT cs.DM math.CO

    Reed-Muller codes have vanishing bit-error probability below capacity: a simple tighter proof via camellia boosting

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: This paper shows that a class of codes such as Reed-Muller (RM) codes have vanishing bit-error probability below capacity on symmetric channels. The proof relies on the notion of `camellia codes': a class of symmetric codes decomposable into `camellias', i.e., set systems that differ from sunflowers by allowing for scattered petal overlaps. The proof then follows from a boosting argument on the ca…

    Submitted 7 December, 2023; originally announced December 2023.

  3. arXiv:2304.02509  [pdf, other]

    cs.IT cs.DM

    A proof that Reed-Muller codes achieve Shannon capacity on symmetric channels

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: Reed-Muller codes were introduced in 1954, with a simple explicit construction based on polynomial evaluations, and have long been conjectured to achieve Shannon capacity on symmetric channels. Major progress was made towards a proof over the last decades; using combinatorial weight enumerator bounds, a breakthrough on the erasure channel from sharp thresholds, hypercontractivity arguments, and po…

    Submitted 5 April, 2023; originally announced April 2023.

  4. arXiv:2210.05893  [pdf, other]

    math.ST cs.DS

    The Power of Two Matrices in Spectral Algorithms

    Authors: Souvik Dhara, Julia Gaudio, Elchanan Mossel, Colin Sandon

    Abstract: Spectral algorithms are some of the main tools in optimization and inference problems on graphs. Typically, the graph is encoded as a matrix and eigenvectors and eigenvalues of the matrix are then used to solve the given graph problem. Spectral algorithms have been successfully used for graph partitioning, hidden clique recovery and graph coloring. In this paper, we study the power of spectral alg…

    Submitted 7 March, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: 34 pages, 1 figure. Added results on more than two communities; corrected proof of statistical achievability

  5. arXiv:2203.11847  [pdf, other]

    cs.DS math.PR stat.ML

    Spectral Algorithms Optimally Recover Planted Sub-structures

    Authors: Souvik Dhara, Julia Gaudio, Elchanan Mossel, Colin Sandon

    Abstract: Spectral algorithms are an important building block in machine learning and graph algorithms. We are interested in studying when such algorithms can be applied directly to provide optimal solutions to inference tasks. Previous works by Abbe, Fan, Wang and Zhong (2020) and by Dhara, Gaudio, Mossel and Sandon (2022) showed the optimality for community detection in the Stochastic Block Model (SBM), a…

    Submitted 11 October, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: 28 pages, 2 figures; New content on submatrix localization

  6. arXiv:2108.04190  [pdf, ps, other]

    cs.LG stat.ML

    On the Power of Differentiable Learning versus PAC and SQ Learning

    Authors: Emmanuel Abbe, Pritish Kamath, Eran Malach, Colin Sandon, Nathan Srebro

    Abstract: We study the power of learning via mini-batch stochastic gradient descent (SGD) on the population loss, and batch Gradient Descent (GD) on the empirical loss, of a differentiable model or neural network, and ask what learning problems can be learnt using these paradigms. We show that SGD and GD can always simulate learning with statistical queries (SQ), but their ability to go beyond that depends…

    Submitted 5 February, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

  7. arXiv:2106.08393  [pdf, ps, other]

    cs.LG cs.CR cs.DS

    Spoofing Generalization: When Can't You Trust Proprietary Models?

    Authors: Ankur Moitra, Elchanan Mossel, Colin Sandon

    Abstract: In this work, we study the computational complexity of determining whether a machine learning model that perfectly fits the training data will generalize to unseen data. In particular, we study the power of a malicious agent whose goal is to construct a model g that fits its training data and nothing else, but is indistinguishable from an accurate model f. We say that g strongly spoofs f if no po…

    Submitted 23 March, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

  8. arXiv:2101.06178  [pdf, ps, other]

    cs.LG

    Learning to Sample from Censored Markov Random Fields

    Authors: Ankur Moitra, Elchanan Mossel, Colin Sandon

    Abstract: We study learning Censored Markov Random Fields (abbreviated CMRFs). These are Markov Random Fields where some of the nodes are censored (not observed). We present an algorithm for learning high-temperature CMRFs within o(n) transportation distance. Crucially, our algorithm makes no assumption about the structure of the graph or the number or location of the observed nodes. We obtain stronger results…

    Submitted 15 January, 2021; originally announced January 2021.

  9. arXiv:2001.02992  [pdf, other]

    cs.LG cs.CC cs.IT stat.ML

    Poly-time universality and limitations of deep learning

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: The goal of this paper is to characterize function distributions that deep learning can or cannot learn in poly-time. A universality result is proved for SGD-based deep learning and a non-universality result is proved for GD-based deep learning; this also gives a separation between SGD-based deep learning and statistical query algorithms: (1) {\it Deep learning with SGD is efficiently universal.…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1812.06369

  10. arXiv:1904.05483  [pdf, ps, other]

    cs.CC

    Parallels Between Phase Transitions and Circuit Complexity?

    Authors: Ankur Moitra, Elchanan Mossel, Colin Sandon

    Abstract: In many natural average-case problems, there are or there are believed to be critical values in the parameter space where the structure of the space of solutions changes in a fundamental way. These phase transitions are often believed to coincide with drastic changes in the computational complexity of the associated problem. In this work, we study the circuit complexity of inference in the broad…

    Submitted 9 December, 2019; v1 submitted 10 April, 2019; originally announced April 2019.

    Comments: The paper was titled "The Circuit Complexity of Inference" in the first version

  11. arXiv:1812.06369  [pdf, other]

    cs.LG cs.CC cs.IT stat.ML

    Provable limitations of deep learning

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: As the success of deep learning reaches more grounds, one would like to also envision the potential limits of deep learning. This paper gives a first set of results proving that certain deep learning algorithms fail at learning certain efficiently learnable functions. The results put forward a notion of cross-predictability that characterizes when such failures take place. Parity functions provide…

    Submitted 29 April, 2019; v1 submitted 15 December, 2018; originally announced December 2018.

  12. arXiv:1809.04818  [pdf, other]

    cs.DS cs.DM math.PR

    Graph powering and spectral robustness

    Authors: Emmanuel Abbe, Enric Boix, Peter Ralli, Colin Sandon

    Abstract: Spectral algorithms, such as principal component analysis and spectral clustering, typically require careful data transformations to be effective: upon observing a matrix $A$, one may look at the spectrum of $ψ(A)$ for a properly chosen $ψ$. The issue is that the spectrum of $A$ might be contaminated by non-informational top eigenvalues, e.g., due to scale variations in the data, and the applicat…

    Submitted 13 September, 2018; originally announced September 2018.

  13. arXiv:1512.09080  [pdf, other]

    math.PR cs.CC cs.IT cs.LG cs.SI

    Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: In a paper that initiated the modern study of the stochastic block model, Decelle et al., backed by Mossel et al., made the following conjecture: Denote by $k$ the number of balanced communities, $a/n$ the probability of connecting inside communities and $b/n$ across, and set $\mathrm{SNR}=(a-b)^2/(k(a+(k-1)b))$; for any $k \geq 2$, it is possible to detect communities efficiently whenever…

    Submitted 14 September, 2016; v1 submitted 30 December, 2015; originally announced December 2015.

    Comments: Extended version with further details on the algorithms and methods

  14. arXiv:1506.03729  [pdf, other]

    math.PR cs.IT cs.LG cs.SI

    Recovering communities in the general stochastic block model without knowing the parameters

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: Most recent developments on the stochastic block model (SBM) rely on the knowledge of the model parameters, or at least on the number of communities. This paper introduces efficient algorithms that do not require such knowledge and yet achieve the optimal information-theoretic tradeoffs identified in [AS15] for linear size communities. The results are three-fold: (i) in the constant degree regime,…

    Submitted 11 June, 2015; originally announced June 2015.

    Comments: arXiv admin note: substantial text overlap with arXiv:1503.00609

  15. arXiv:1503.00609  [pdf, other]

    math.PR cs.IT cs.SI

    Community detection in general stochastic block models: fundamental limits and efficient recovery algorithms

    Authors: Emmanuel Abbe, Colin Sandon

    Abstract: New phase transition phenomena have recently been discovered for the stochastic block model, for the special case of two non-overlapping symmetric communities. This gives rise in particular to new algorithmic challenges driven by the thresholds. This paper investigates whether a general phenomenon takes place for multiple communities, without imposing symmetry. In the general stochastic block m…

    Submitted 4 April, 2015; v1 submitted 2 March, 2015; originally announced March 2015.

  16. arXiv:1401.6528  [pdf, other]

    cs.IT

    Linear Boolean classification, coding and "the critical problem"

    Authors: Emmanuel Abbe, Noga Alon, Afonso S. Bandeira, Colin Sandon

    Abstract: The problem of constructing a minimal rank matrix over GF(2) whose kernel does not intersect a given set S is considered. In the case where S is a Hamming ball centered at 0, this is equivalent to finding linear codes of largest dimension. For a general set, this is an instance of "the critical problem" posed by Crapo and Rota in 1970. This work focuses on the case where S is an annulus. As oppose…

    Submitted 27 June, 2015; v1 submitted 25 January, 2014; originally announced January 2014.