Showing 1–22 of 22 results for author: Berthet, Q

Searching in archive cs.
  1. arXiv:2402.05468  [pdf, other]

    cs.LG

    Implicit Diffusion: Efficient Optimization through Stochastic Sampling

    Authors: Pierre Marion, Anna Korba, Peter Bartlett, Mathieu Blondel, Valentin De Bortoli, Arnaud Doucet, Felipe Llinares-López, Courtney Paquette, Quentin Berthet

    Abstract: We present a new algorithm to optimize distributions defined implicitly by parameterized stochastic diffusions. Doing so allows us to modify the outcome distribution of sampling processes by optimizing over their parameters. We introduce a general framework for first-order optimization of these processes, that performs jointly, in a single loop, optimization and sampling steps. This approach is in…

    Submitted 22 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 38 pages, 16 figures. Updated with additional experiments

  2. arXiv:2402.02992  [pdf, other]

    cs.LG cs.AI cs.CL

    Decoding-time Realignment of Language Models

    Authors: Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel

    Abstract: Aligning language models with human preferences is crucial for reducing errors and biases in these models. Alignment techniques, such as reinforcement learning from human feedback (RLHF), are typically cast as optimizing a tradeoff between human preference rewards and a proximity regularization term that encourages staying close to the unaligned model. Selecting an appropriate level of regularizat…

    Submitted 24 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: In Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  3. arXiv:2305.16358  [pdf, other]

    cs.LG cs.AI stat.ML

    Differentiable Clustering with Perturbed Spanning Forests

    Authors: Lawrence Stewart, Francis S Bach, Felipe Llinares López, Quentin Berthet

    Abstract: We introduce a differentiable clustering method based on stochastic perturbations of minimum-weight spanning forests. This allows us to include clustering in end-to-end trainable pipelines, with efficient gradients. We show that our method performs well even in difficult settings, such as data sets with high noise and challenging geometries. We also formulate an ad hoc loss to efficiently learn fr…

    Submitted 6 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Journal ref: 37th Conference on Neural Information Processing Systems, Dec 2023, New Orleans, United States

  4. arXiv:2211.10420  [pdf, other]

    cs.LG stat.ML

    Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes

    Authors: Marin Ballu, Quentin Berthet

    Abstract: Optimal transport is an important tool in machine learning, allowing to capture geometric properties of the data through a linear program on transport polytopes. We present a single-loop optimization algorithm for minimizing general convex objectives on these domains, utilizing the principles of Sinkhorn matrix scaling and mirror descent. The proposed algorithm is robust to noise, and can be used…

    Submitted 20 June, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: ICML 2023

  5. arXiv:2211.05641  [pdf, other]

    cs.LG cs.AI stat.ML

    Regression as Classification: Influence of Task Formulation on Neural Network Features

    Authors: Lawrence Stewart, Francis Bach, Quentin Berthet, Jean-Philippe Vert

    Abstract: Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross entropy loss results in better performance. By focusing on two-layer ReLU networks, which can be fully characterized by measures over their feature spa…

    Submitted 1 March, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

  6. arXiv:2205.12751  [pdf, ps, other]

    math.OC cs.LG

    Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization

    Authors: Benjamin Dubois-Taine, Francis Bach, Quentin Berthet, Adrien Taylor

    Abstract: We consider the problem of minimizing the sum of two convex functions. One of those functions has Lipschitz-continuous gradients, and can be accessed via stochastic oracles, whereas the other is "simple". We provide a Bregman-type algorithm with accelerated convergence in function values to a ball containing the minimum. The radius of this ball depends on problem-dependent constants, including the…

    Submitted 12 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

  7. arXiv:2201.02262  [pdf, other]

    cs.NE

    A unified software/hardware scalable architecture for brain-inspired computing based on self-organizing neural models

    Authors: Artem R. Muliukov, Laurent Rodriguez, Benoit Miramond, Lyes Khacef, Joachim Schmidt, Quentin Berthet, Andres Upegui

    Abstract: The field of artificial intelligence has significantly advanced over the past decades, inspired by discoveries from the fields of biology and neuroscience. The idea of this work is inspired by the process of self-organization of cortical areas in the human brain from both afferent and lateral/internal connections. In this work, we develop an original brain-inspired neural model associating Self-Or…

    Submitted 6 January, 2022; originally announced January 2022.

  8. arXiv:2105.15183  [pdf, other]

    cs.LG math.NA stat.ML

    Efficient and Modular Implicit Differentiation

    Authors: Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert

    Abstract: Automatic differentiation (autodiff) has revolutionized machine learning. It allows to express complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently, differentiation of optimization problem solutions has attracted widespread attention with applications such as optimization layers, and in bi-level problems suc…

    Submitted 12 October, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: V3: added more related work and Jacobian precision figure

  9. arXiv:2103.09879  [pdf, other]

    cs.SD cs.AI eess.AS

    Self-Supervised Learning of Audio Representations from Permutations with Differentiable Ranking

    Authors: Andrew N Carr, Quentin Berthet, Mathieu Blondel, Olivier Teboul, Neil Zeghidour

    Abstract: Self-supervised pre-training using so-called "pretext" tasks has recently shown impressive performance across a wide range of modalities. In this work, we advance self-supervised learning from permutations, by pre-training a model to reorder shuffled parts of the spectrogram of an audio signal, to improve downstream classification performance. We make two main contributions. First, we overcome the…

    Submitted 17 March, 2021; originally announced March 2021.

  10. arXiv:2004.12508  [pdf, other]

    stat.ME cs.LG stat.AP

    Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design

    Authors: Marco Cuturi, Olivier Teboul, Quentin Berthet, Arnaud Doucet, Jean-Philippe Vert

    Abstract: When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually. Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting (tests can be mistaken) to decide adaptively (looking at past results) which groups to test next, with the goal to converge to a g…

    Submitted 22 July, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: Latest version, with updated experiments, new conclusions on LBP vs SMC decoding and new approach

  11. arXiv:2002.08871  [pdf, other]

    stat.ML cs.LG

    Fast Differentiable Sorting and Ranking

    Authors: Mathieu Blondel, Olivier Teboul, Quentin Berthet, Josip Djolonga

    Abstract: The sorting operation is one of the most commonly used building blocks in computer programming. In machine learning, it is often used for robust statistics. However, seen as a function, it is piecewise linear and as a result includes many kinks where it is non-differentiable. More problematic is the related ranking operator, often used for order statistics and ranking metrics. It is a piecewise co…

    Submitted 29 June, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: In proceedings of ICML 2020

  12. arXiv:2002.08695  [pdf, other]

    cs.LG math.OC stat.ML

    Stochastic Optimization for Regularized Wasserstein Estimators

    Authors: Marin Ballu, Quentin Berthet, Francis Bach

    Abstract: Optimal transport is a foundational problem in optimization, that allows to compare probability distributions while taking into account geometric aspects. Its optimal objective value, the Wasserstein distance, provides an important loss between distributions that has been used in many applications throughout machine learning and statistics. Recent algorithmic progress on this problem and its regul…

    Submitted 20 February, 2020; originally announced February 2020.

  13. arXiv:2002.08676  [pdf, other]

    cs.LG math.OC stat.ML

    Learning with Differentiable Perturbed Optimizers

    Authors: Quentin Berthet, Mathieu Blondel, Olivier Teboul, Marco Cuturi, Jean-Philippe Vert, Francis Bach

    Abstract: Machine learning pipelines often rely on optimization procedures to make discrete decisions (e.g., sorting, picking closest neighbors, or shortest paths). Although these discrete decisions are easily computed, they break the back-propagation of computational graphs. In order to expand the scope of learning problems that can be solved in an end-to-end fashion, we propose a systematic method to tran…

    Submitted 9 June, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

  14. arXiv:1810.05065  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Regularized Contextual Bandits

    Authors: Xavier Fontaine, Quentin Berthet, Vianney Perchet

    Abstract: We consider the stochastic contextual bandit problem with additional regularization. The motivation comes from problems where the policy of the agent must be close to some baseline policy which is known to perform well on the task. To tackle this problem we use a nonparametric model and propose an algorithm splitting the context space into bins, and solving simultaneously - and independently - reg…

    Submitted 5 June, 2019; v1 submitted 11 October, 2018; originally announced October 2018.

    Comments: AISTATS 2019, 23 pages, 2 figures

    Journal ref: Proceedings of Machine Learning Research, PMLR 89:2144-2153, 2019

  15. arXiv:1808.01857  [pdf, other]

    math.ST cs.LG stat.ML

    Statistical Windows in Testing for the Initial Distribution of a Reversible Markov Chain

    Authors: Quentin Berthet, Varun Kanade

    Abstract: We study the problem of hypothesis testing between two discrete distributions, where we only have access to samples after the action of a known reversible Markov chain, playing the role of noise. We derive instance-dependent minimax rates for the sample complexity of this problem, and show how its dependence in time is related to the spectral properties of the Markov chain. We show that there exis…

    Submitted 6 August, 2018; originally announced August 2018.

    MSC Class: 62C20

  16. arXiv:1805.11222  [pdf, other]

    cs.LG cs.CL stat.ML

    Unsupervised Alignment of Embeddings with Wasserstein Procrustes

    Authors: Edouard Grave, Armand Joulin, Quentin Berthet

    Abstract: We consider the task of aligning two sets of points in high dimension, which has many applications in natural language processing and computer vision. As an example, it was recently shown that it is possible to infer a bilingual lexicon, without supervised data, by aligning word embeddings trained on monolingual data. These recent advances are based on adversarial training to learn the mapping bet…

    Submitted 28 May, 2018; originally announced May 2018.

  17. arXiv:1702.06917  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe

    Authors: Quentin Berthet, Vianney Perchet

    Abstract: We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback. In this problem, the objective is to minimize a global loss function of all the actions, not necessarily a cumulative loss. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To s…

    Submitted 6 September, 2017; v1 submitted 22 February, 2017; originally announced February 2017.

  18. arXiv:1612.03880  [pdf, other]

    math.ST cs.DS

    Exact recovery in the Ising blockmodel

    Authors: Quentin Berthet, Philippe Rigollet, Piyush Srivastava

    Abstract: We consider the problem associated to recovering the block structure of an Ising model given independent observations on the binary hypercube. This new model, called the Ising blockmodel, is a perturbation of the mean field approximation of the Ising model known as the Curie-Weiss model: the sites are partitioned into two blocks of equal size and the interaction between those of the same block is…

    Submitted 2 February, 2017; v1 submitted 12 December, 2016; originally announced December 2016.

    MSC Class: 62H30

  19. arXiv:1605.09646  [pdf, other]

    cs.LG cs.CC math.ST stat.ML

    Average-case Hardness of RIP Certification

    Authors: Tengyao Wang, Quentin Berthet, Yaniv Plan

    Abstract: The restricted isometry property (RIP) for design matrices gives guarantees for optimal recovery in sparse linear models. It is of high interest in compressed sensing and statistical learning. This property is particularly important for computationally efficient recovery methods. As a consequence, even though it is in general NP-hard to check that RIP holds, there have been substantial efforts to…

    Submitted 31 May, 2016; originally announced May 2016.

  20. arXiv:1502.06144  [pdf, ps, other]

    math.ST cs.CC cs.LG

    Detection of Planted Solutions for Flat Satisfiability Problems

    Authors: Quentin Berthet, Jordan S. Ellenberg

    Abstract: We study the detection problem of finding planted solutions in random instances of flat satisfiability problems, a generalization of boolean satisfiability formulas. We describe the properties of random instances of flat satisfiability, as well of the optimal rates of detection of the associated hypothesis testing problem. We also study the performance of an algorithmically efficient testing proce…

    Submitted 6 March, 2019; v1 submitted 21 February, 2015; originally announced February 2015.

    MSC Class: 62C20; 68R01; 60C05

  21. arXiv:1401.2205  [pdf, ps, other]

    math.ST cs.CC math.PR

    Optimal Testing for Planted Satisfiability Problems

    Authors: Quentin Berthet

    Abstract: We study the problem of detecting planted solutions in a random satisfiability formula. Adopting the formalism of hypothesis testing in statistical analysis, we describe the minimax optimal rates of detection. Our analysis relies on the study of the number of satisfying assignments, for which we prove new results. We also address algorithmic issues, and give a computationally efficient test with o…

    Submitted 7 February, 2015; v1 submitted 9 January, 2014; originally announced January 2014.

  22. arXiv:1304.0828  [pdf, ps, other]

    math.ST cs.CC stat.ML

    Computational Lower Bounds for Sparse PCA

    Authors: Quentin Berthet, Philippe Rigollet

    Abstract: In the context of sparse principal component detection, we bring evidence towards the existence of a statistical price to pay for computational efficiency. We measure the performance of a test by the smallest signal strength that it can detect and we propose a computationally efficient method based on semidefinite programming. We also prove that the statistical performance of this test cannot be s…

    Submitted 26 April, 2013; v1 submitted 2 April, 2013; originally announced April 2013.

    Comments: Alternate title: "Complexity Theoretic Lower Bounds for Sparse Principal Component Detection"

    MSC Class: 62C20