Showing 1–7 of 7 results for author: Mingard, C

Searching in archive cs.
  1. arXiv:2407.04371

    quant-ph cs.AI

    Exploiting the equivalence between quantum neural networks and perceptrons

    Authors: Chris Mingard, Jessica Pointing, Charles London, Yoonsoo Nam, Ard A. Louis

    Abstract: Quantum machine learning models based on parametrized quantum circuits, also called quantum neural networks (QNNs), are considered to be among the most promising candidates for applications on near-term quantum devices. Here we explore the expressivity and inductive bias of QNNs by exploiting an exact mapping from QNNs with inputs $x$ to classical perceptrons acting on $x \otimes x$ (generalised t…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2404.17563

    cs.LG cond-mat.dis-nn stat.ML

    An exactly solvable model for emergence and scaling laws

    Authors: Yoonsoo Nam, Nayara Fonseca, Seok Hyeong Lee, Chris Mingard, Ard A. Louis

    Abstract: Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of ne…

    Submitted 14 July, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  3. arXiv:2304.06670

    cs.LG cs.AI stat.ML

    Do deep neural networks have an inbuilt Occam's razor?

    Authors: Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis

    Abstract: The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components, we apply a Bayesian picture, based on the functions expressed by a DNN, to supervised learning. The prior over functions is determined by the network, and is varied by exploiting…

    Submitted 13 April, 2023; originally announced April 2023.

  4. arXiv:2304.05187

    cs.LG cs.AI cs.NE math.NA stat.ML

    Automatic Gradient Descent: Deep Learning without Hyperparameters

    Authors: Jeremy Bernstein, Chris Mingard, Kevin Huang, Navid Azizan, Yisong Yue

    Abstract: The architecture of a deep neural network is defined explicitly in terms of the number of layers, the width of each layer and the general network topology. Existing optimisation frameworks neglect this information in favour of implicit architectural information (e.g. second-order methods) or architecture-agnostic distance functions (e.g. mirror descent). Meanwhile, the most popular optimiser in pr…

    Submitted 11 April, 2023; originally announced April 2023.

  5. arXiv:2110.11749

    stat.ML cs.LG

    Feature Learning and Signal Propagation in Deep Neural Networks

    Authors: Yizhang Lou, Chris Mingard, Yoonsoo Nam, Soufiane Hayou

    Abstract: Recent work by Baratin et al. (2021) sheds light on an intriguing pattern that occurs during the training of deep neural networks: some layers align much more with data compared to other layers (where the alignment is defined as the Euclidean product of the tangent features matrix and the data labels matrix). The curve of the alignment as a function of layer index (generally) exhibits an ascent-de…

    Submitted 22 May, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: 35 pages

    Journal ref: International Conference on Machine Learning. PMLR, 2022

  6. arXiv:2006.15191

    cs.LG stat.ML

    Is SGD a Bayesian sampler? Well, almost

    Authors: Chris Mingard, Guillermo Valle-Pérez, Joar Skalse, Ard A. Louis

    Abstract: Overparameterised deep neural networks (DNNs) are highly expressive and so can, in principle, generate almost any function that fits a training dataset with zero error. The vast majority of these functions will perform poorly on unseen data, and yet in practice DNNs often generalise remarkably well. This success suggests that a trained DNN must have a strong inductive bias towards functions with l…

    Submitted 24 October, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

    Journal ref: Journal of Machine Learning Research, 22 79 (2021), 1-64

  7. arXiv:1909.11522

    cs.LG stat.ML

    Neural networks are a priori biased towards Boolean functions with low entropy

    Authors: Chris Mingard, Joar Skalse, Guillermo Valle-Pérez, David Martínez-Rubio, Vladimir Mikulik, Ard A. Louis

    Abstract: Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks -- a single-layer perceptron with n input neurons, one output neuron, and no threshold bias term -- we prove that upon random initialisation of weights, the a priori probability $P(t)$ that it represents a Boolean function that classifies t points…

    Submitted 2 January, 2020; v1 submitted 25 September, 2019; originally announced September 2019.