Showing 1–7 of 7 results for author: Clerico, E

  1. arXiv:2406.12600  [pdf, ps, other]

    cs.LG

    Generalization bounds for mixing processes via delayed online-to-PAC conversions

    Authors: Baptiste Abeles, Eugenio Clerico, Gergely Neu

    Abstract: We study the generalization error of statistical learning algorithms in a non-i.i.d. setting, where the training data is sampled from a stationary mixing process. We develop an analytic framework for this scenario based on a reduction to online learning with delayed feedback. In particular, we show that the existence of an online learning algorithm with bounded regret (against a fixed statistical…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2312.13259  [pdf, ps, other]

    stat.ML cs.LG

    A note on regularised NTK dynamics with an application to PAC-Bayesian training

    Authors: Eugenio Clerico, Benjamin Guedj

    Abstract: We establish explicit dynamics for neural networks whose training objective has a regularising term that constrains the parameters to remain close to their initial value. This keeps the network in a lazy training regime, where the dynamics can be linearised around the initialisation. The standard neural tangent kernel (NTK) governs the evolution during the training in the infinite-width limit, alt…

    Submitted 20 December, 2023; originally announced December 2023.

  3. arXiv:2209.02525  [pdf, other]

    stat.ML cs.LG

    Generalisation under gradient descent via deterministic PAC-Bayes

    Authors: Eugenio Clerico, Tyler Farghly, George Deligiannidis, Benjamin Guedj, Arnaud Doucet

    Abstract: We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution…

    Submitted 4 April, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

  4. arXiv:2203.00977  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Chained Generalisation Bounds

    Authors: Eugenio Clerico, Amitis Shidani, George Deligiannidis, Arnaud Doucet

    Abstract: This work discusses how to derive upper bounds for the expected generalisation error of supervised learning algorithms by means of the chaining technique. By developing a general theoretical framework, we establish a duality between generalisation bounds based on the regularity of the loss function, and their chained counterparts, which can be obtained by lifting the regularity assumption from the…

    Submitted 30 June, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Journal ref: Proceedings of the 35th Conference on Learning Theory, PMLR 178:4212-4257, 2022

  5. arXiv:2110.11886  [pdf, other]

    cs.LG stat.ML

    Conditionally Gaussian PAC-Bayes

    Authors: Eugenio Clerico, George Deligiannidis, Arnaud Doucet

    Abstract: Recent studies have empirically investigated different methods to train stochastic neural networks on a classification task by optimising a PAC-Bayesian bound via stochastic gradient descent. Most of these procedures need to replace the misclassification error with a surrogate loss, leading to a mismatch between the optimisation objective and the actual generalisation bound. The present paper prop…

    Submitted 24 February, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Journal ref: Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:2311-2329, 2022

  6. arXiv:2106.09798  [pdf, other]

    stat.ML cs.LG

    Wide stochastic networks: Gaussian limit and PAC-Bayesian training

    Authors: Eugenio Clerico, George Deligiannidis, Arnaud Doucet

    Abstract: The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during…

    Submitted 13 February, 2023; v1 submitted 17 June, 2021; originally announced June 2021.

    Journal ref: The 34th International Conference on Algorithmic Learning Theory (ALT 2023)

  7. arXiv:2010.12859  [pdf, other]

    cs.LG stat.ML

    Stable ResNet

    Authors: Soufiane Hayou, Eugenio Clerico, Bobby He, George Deligiannidis, Arnaud Doucet, Judith Rousseau

    Abstract: Deep ResNet architectures have achieved state of the art performance on many tasks. While they solve the problem of gradient vanishing, they might suffer from gradient exploding as the depth becomes large (Yang et al. 2017). Moreover, recent results have shown that ResNet might lose expressivity as the depth goes to infinity (Yang et al. 2017, Hayou et al. 2019). To resolve these issues, we introd…

    Submitted 18 March, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: 43 pages, 4 figures