Showing 1–43 of 43 results for author: Gelly, S

  1. arXiv:2104.04191  [pdf, other]

    cs.CV cs.AI cs.LG

    SI-Score: An image dataset for fine-grained analysis of robustness to object location, rotation and size

    Authors: Jessica Yung, Rob Romijnders, Alexander Kolesnikov, Lucas Beyer, Josip Djolonga, Neil Houlsby, Sylvain Gelly, Mario Lucic, Xiaohua Zhai

    Abstract: Before deploying machine learning models it is critical to assess their robustness. In the context of deep neural networks for image understanding, changing the object location, rotation and size may affect the predictions in non-trivial ways. In this work we perform a fine-grained analysis of robustness with respect to these factors of variation using SI-Score, a synthetic dataset. In particular,…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: 4 pages (10 pages including references and appendix), 10 figures. Accepted at the ICLR 2021 RobustML Workshop. arXiv admin note: text overlap with arXiv:2007.08558

  2. arXiv:2104.02638  [pdf, other]

    cs.LG cs.CV

    Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark

    Authors: Vincent Dumoulin, Neil Houlsby, Utku Evci, Xiaohua Zhai, Ross Goroshin, Sylvain Gelly, Hugo Larochelle

    Abstract: Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we perform a cross-family study of the best transfer a…

    Submitted 6 April, 2021; originally announced April 2021.

  3. arXiv:2010.14766  [pdf, other]

    cs.LG stat.ML

    A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation

    Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

    Abstract: The idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of d…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1811.12359

    Journal ref: Journal of Machine Learning Research 2020, Volume 21, Number 209

  4. arXiv:2010.11929  [pdf, other]

    cs.CV cs.AI cs.LG

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Authors: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby

    Abstract: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not nece…

    Submitted 3 June, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Fine-tuning code and pre-trained models are available at https://github.com/google-research/vision_transformer. ICLR camera-ready version with 2 small modifications: 1) Added a discussion of CLS vs GAP classifier in the appendix, 2) Fixed an error in exaFLOPs computation in Figure 5 and Table 6 (relative performance of models is basically not affected)

  5. arXiv:2009.13239  [pdf, other]

    cs.LG cs.CV stat.ML

    Scalable Transfer Learning with Expert Models

    Authors: Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Cedric Renggli, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby

    Abstract: Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, representations used for transfer are usually generic, and are not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple, yet effective, strategy. We train a diverse set of experts by exploit…

    Submitted 28 September, 2020; originally announced September 2020.

  6. arXiv:2007.14184  [pdf, other]

    cs.LG cs.AI stat.ML

    A Commentary on the Unsupervised Learning of Disentangled Representations

    Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

    Abstract: The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision. In this paper, we summarize the results of Locatello et al., 2019, and focus on their implications for practitioners. We discuss the theoretical result showing that the unsupervised learning of disentangled representations…

    Submitted 28 July, 2020; originally announced July 2020.

    Journal ref: The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020 (AAAI-20)

  7. arXiv:2007.08558  [pdf, other]

    cs.CV cs.LG

    On Robustness and Transferability of Convolutional Neural Networks

    Authors: Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic

    Abstract: Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts. However, several recent breakthroughs in transfer learning suggest that these networks can cope with severe distribution shifts and successfully adapt to new tasks from a few training examples. In this work we study the interplay between out-of-distribution and transfer performance of m…

    Submitted 23 March, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Accepted at CVPR 2021

  8. arXiv:2006.10455  [pdf, other]

    stat.ML cs.LG

    What Do Neural Networks Learn When Trained With Random Labels?

    Authors: Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers

    Abstract: We study deep neural networks (DNNs) trained on natural image data with entirely random labels. Despite its popularity in the literature, where it is often used to study memorization, generalization, and other phenomena, little is known about what DNNs learn in this setting. In this paper, we show analytically for convolutional and fully connected networks that an alignment between the principal c…

    Submitted 11 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Accepted at NeurIPS 2020

  9. arXiv:2006.05990  [pdf, other]

    cs.LG stat.ML

    What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

    Authors: Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphael Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem

    Abstract: In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literatur…

    Submitted 10 June, 2020; originally announced June 2020.

  10. arXiv:2002.11448  [pdf, other]

    stat.ML cs.LG

    Predicting Neural Network Accuracy from Weights

    Authors: Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin

    Abstract: We show experimentally that the accuracy of a trained neural network can be predicted surprisingly well by looking only at its weights, without evaluating it on input data. We motivate this task and introduce a formal setting for it. Even when using simple statistics of the weights, the predictors are able to rank neural networks by their performance with very high accuracy (R2 score more than 0.9…

    Submitted 9 April, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: Updated the Small CNN Zoo dataset: reduced the maximal learning rate and got rid of multiple bad runs. Replaced all the experiments with the new numbers. Added MLP. Fixed typo in the abstract (R2 score instead of Kendall's tau). Added several earlier related works to the literature overview

  11. arXiv:2001.08049  [pdf, other]

    stat.ML cs.LG

    On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation

    Authors: Nicolas Brosse, Carlos Riquelme, Alice Martin, Sylvain Gelly, Éric Moulines

    Abstract: Uncertainty quantification for deep learning is a challenging open problem. Bayesian statistics offer a mathematically grounded framework to reason about uncertainties; however, approximate posteriors for modern neural networks still require prohibitive computational costs. We propose a family of algorithms which split the classification task into two stages: representation learning and uncertaint…

    Submitted 22 January, 2020; originally announced January 2020.

  12. arXiv:1912.11370  [pdf, other]

    cs.CV cs.LG

    Big Transfer (BiT): General Visual Representation Learning

    Authors: Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

    Abstract: Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training, and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components,…

    Submitted 5 May, 2020; v1 submitted 24 December, 2019; originally announced December 2019.

    Comments: The first three authors contributed equally. Results on ObjectNet are reported in v3

  13. arXiv:1912.02783  [pdf, other]

    cs.CV cs.LG

    Self-Supervised Learning of Video-Induced Visual Invariances

    Authors: Michael Tschannen, Josip Djolonga, Marvin Ritter, Aravindh Mahendran, Xiaohua Zhai, Neil Houlsby, Sylvain Gelly, Mario Lucic

    Abstract: We propose a general framework for self-supervised learning of transferable visual representations based on Video-Induced Visual Invariances (VIVI). We consider the implicit hierarchy present in the videos and make use of (i) frame-level invariances (e.g. stability to color and contrast perturbations), (ii) shot/clip-level invariances (e.g. robustness to changes in object orientation and lighting…

    Submitted 1 April, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: CVPR 2020

  14. arXiv:1911.11357  [pdf, other]

    cs.LG cs.CV stat.ML

    Semantic Bottleneck Scene Generation

    Authors: Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic

    Abstract: Coupling the high-fidelity generation capabilities of label-conditional image synthesis methods with the flexibility of unconditional generative models, we propose a semantic bottleneck GAN model for unconditional synthesis of complex scenes. We assume pixel-wise segmentation labels are available during training and use them to learn the scene structure. During inference, our model first synthesiz…

    Submitted 26 November, 2019; originally announced November 2019.

  15. arXiv:1910.04867  [pdf, other]

    cs.CV cs.LG stat.ML

    A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

    Authors: Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby

    Abstract: Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, r…

    Submitted 21 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  16. arXiv:1907.13625  [pdf, other]

    cs.LG stat.ML

    On Mutual Information Maximization for Representation Learning

    Authors: Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic

    Abstract: Many recent methods for unsupervised or self-supervised representation learning train feature extractors by maximizing an estimate of the mutual information (MI) between different views of the data. This comes with several immediate problems: For example, MI is notoriously hard to estimate, and using it as an objective for representation learning may lead to highly entangled representations due to…

    Submitted 23 January, 2020; v1 submitted 31 July, 2019; originally announced July 2019.

    Comments: ICLR 2020. Michael Tschannen and Josip Djolonga contributed equally

  17. arXiv:1907.11180  [pdf, other]

    cs.LG stat.ML

    Google Research Football: A Novel Reinforcement Learning Environment

    Authors: Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly

    Abstract: Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator…

    Submitted 14 April, 2020; v1 submitted 25 July, 2019; originally announced July 2019.

  18. arXiv:1907.00868  [pdf, other]

    cs.LG cs.AI stat.ML

    MULEX: Disentangling Exploitation from Exploration in Deep RL

    Authors: Lucas Beyer, Damien Vincent, Olivier Teboul, Sylvain Gelly, Matthieu Geist, Olivier Pietquin

    Abstract: An agent learning through interactions should balance its action selection process between probing the environment to discover new rewards and using the information acquired in the past to adopt useful behaviour. This trade-off is usually obtained by perturbing either the agent's actions (e.g., ε-greedy or Gibbs sampling) or the agent's parameters (e.g., NoisyNet), or by modifying the reward it re…

    Submitted 1 July, 2019; originally announced July 2019.

  19. arXiv:1906.07987  [pdf, other]

    cs.LG cs.AI stat.ML

    Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

    Authors: Hugo Penedones, Carlos Riquelme, Damien Vincent, Hartmut Maennel, Timothy Mann, Andre Barreto, Sylvain Gelly, Gergely Neu

    Abstract: We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation. The two methods are known to achieve complementary bias-variance trade-off properties, with TD tending to achieve lower variance but potentially higher bias. In this pa…

    Submitted 19 June, 2019; originally announced June 2019.

  20. arXiv:1905.11866  [pdf, ps, other]

    cs.LG stat.ML

    When can unlabeled data improve the learning rate?

    Authors: Christina Göpfert, Shai Ben-David, Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Ruth Urner

    Abstract: In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in order to produce a better classifier than with labeled data alone. However, the conditions under which such an improvement is possible are not fully understood…

    Submitted 9 February, 2022; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: Small correction in proof of Theorem 1

    Journal ref: Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:1500-1518, 2019

  21. arXiv:1905.10768  [pdf, other]

    cs.LG stat.ML

    Precision-Recall Curves Using Information Divergence Frontiers

    Authors: Josip Djolonga, Mario Lucic, Marco Cuturi, Olivier Bachem, Olivier Bousquet, Sylvain Gelly

    Abstract: Despite the tremendous progress in the estimation of generative models, the development of tools for diagnosing their failures and assessing their performance has advanced at a much slower pace. Recent developments have investigated metrics that quantify which parts of the true distribution are modeled well, and, on the contrary, what the model fails to capture, akin to precision and recall in info…

    Submitted 8 June, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: Updated to the AISTATS 2020 version

  22. arXiv:1903.02271  [pdf, other]

    cs.LG cs.CV stat.ML

    High-Fidelity Image Generation With Fewer Labels

    Authors: Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly

    Abstract: Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the latest models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work…

    Submitted 14 May, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: Mario Lucic, Michael Tschannen, and Marvin Ritter contributed equally to this work. ICML 2019 camera-ready version. Code available at https://github.com/google/compare_gan

  23. arXiv:1902.08077  [pdf, other]

    cs.LG stat.ML

    Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities

    Authors: Octavian-Eugen Ganea, Sylvain Gelly, Gary Bécigneul, Aliaksei Severyn

    Abstract: The Softmax function on top of a final linear layer is the de facto method to output probability distributions in neural networks. In many applications such as language models or text generation, this model has to produce distributions over large output vocabularies. Recently, this has been shown to have limited representational capacity due to its connection with the rank bottleneck in matrix fac…

    Submitted 13 May, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

    Journal ref: ICML 2019

  24. arXiv:1902.00751  [pdf, other]

    cs.LG cs.CL stat.ML

    Parameter-Efficient Transfer Learning for NLP

    Authors: Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly

    Abstract: Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can…

    Submitted 13 June, 2019; v1 submitted 2 February, 2019; originally announced February 2019.

  25. arXiv:1812.01717  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE stat.ML

    Towards Accurate Generative Models of Video: A New Metric & Challenges

    Authors: Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphael Marinier, Marcin Michalski, Sylvain Gelly

    Abstract: Recent advances in deep generative models have led to remarkable progress in synthesizing high quality images. Following their successful application in image processing and representation learning, an important next step is to consider videos. Learning generative models of video is a much harder task, requiring a model to capture the temporal dynamics of a scene, in addition to the visual presen…

    Submitted 27 March, 2019; v1 submitted 2 December, 2018; originally announced December 2018.

  26. arXiv:1811.12359  [pdf, other]

    cs.LG cs.AI stat.ML

    Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations

    Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

    Abstract: The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangle…

    Submitted 18 June, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

    Journal ref: Proceedings of the 36th International Conference on Machine Learning (ICML 2019)

  27. Investigating Object Compositionality in Generative Adversarial Networks

    Authors: Sjoerd van Steenkiste, Karol Kurach, Jürgen Schmidhuber, Sylvain Gelly

    Abstract: Deep generative models seek to recover the process with which the observed data was generated. They may be used to synthesize new samples or to subsequently extract representations. Successful approaches in the domain of images are driven by several core inductive biases. However, a bias to account for the compositional way in which humans structure a visual scene in terms of objects has frequentl…

    Submitted 24 July, 2020; v1 submitted 17 October, 2018; originally announced October 2018.

    Comments: A preliminary version of this work (arXiv v1) appeared under the title "A Case for Object Compositionality in Deep Generative Models of Images" as a workshop paper at the NeurIPS 2018 workshop on "Modeling the Physical World: Perception, Learning, and Control", and at the NeurIPS 2018 workshop on "Relational Representation Learning"

    MSC Class: I.2.6; ACM Class: I.2.6

  28. arXiv:1810.02274  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Episodic Curiosity through Reachability

    Authors: Nikolay Savinov, Anton Raichuk, Raphaël Marinier, Damien Vincent, Marc Pollefeys, Timothy Lillicrap, Sylvain Gelly

    Abstract: Rewards are sparse in the real world and most of today's reinforcement learning algorithms struggle with such sparsity. One solution to this problem is to allow the agent to create rewards for itself - thus making rewards dense and more suitable for learning. In particular, inspired by curious behaviour in animals, observing something novel could be rewarded with a bonus. Such bonus is summed up w…

    Submitted 6 August, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: Accepted to ICLR 2019. Code at https://github.com/google-research/episodic-curiosity/. Videos at https://sites.google.com/view/episodic-curiosity/

  29. arXiv:1810.01365  [pdf, other]

    cs.LG cs.CV stat.ML

    On Self Modulation for Generative Adversarial Networks

    Authors: Ting Chen, Mario Lucic, Neil Houlsby, Sylvain Gelly

    Abstract: Training Generative Adversarial Networks (GANs) is notoriously challenging. We propose and study an architectural modification, self-modulation, which improves GAN performance across different data sets, architectures, losses, regularizers, and hyperparameter settings. Intuitively, self-modulation allows the intermediate feature maps of a generator to change as a function of the input noise vector…

    Submitted 2 May, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

  30. arXiv:1807.04720  [pdf, other]

    cs.LG stat.ML

    A Large-Scale Study on Regularization and Normalization in GANs

    Authors: Karol Kurach, Mario Lucic, Xiaohua Zhai, Marcin Michalski, Sylvain Gelly

    Abstract: Generative adversarial networks (GANs) are a class of deep generative models which aim to learn a target distribution in an unsupervised fashion. While they were successfully applied to many problems, training a GAN is a notoriously challenging task and requires a significant amount of hyperparameter tuning, neural architecture engineering, and a non-trivial amount of "tricks". The success in many…

    Submitted 14 May, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: Revision accepted to ICML'19: More focus on regularization and normalization aspects. Added recent references and promising future directions

  31. arXiv:1807.03064  [pdf, other]

    cs.LG stat.ML

    Temporal Difference Learning with Neural Networks - Study of the Leakage Propagation Problem

    Authors: Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy Mann, Andre Barreto

    Abstract: Temporal-Difference learning (TD) [Sutton, 1988] with function approximation can converge to solutions that are worse than those obtained by Monte-Carlo regression, even in the simple case of on-policy evaluation. To increase our understanding of the problem, we investigate the issue of approximation errors in areas of sharp discontinuities of the value function being further propagated by bootstr…

    Submitted 9 July, 2018; originally announced July 2018.

  32. arXiv:1806.04936  [pdf, other]

    cs.CL

    On Accurate Evaluation of GANs for Language Generation

    Authors: Stanislau Semeniuta, Aliaksei Severyn, Sylvain Gelly

    Abstract: Generative Adversarial Networks (GANs) are a promising approach to language generation. The latest works introducing novel GAN models for language generation use n-gram based metrics for evaluation and only report single scores of the best run. In this paper, we argue that this often misrepresents the true picture and does not tell the full story, as GAN models can be extremely sensitive to the ra…

    Submitted 18 July, 2019; v1 submitted 13 June, 2018; originally announced June 2018.

  33. arXiv:1806.00035  [pdf, other]

    stat.ML cs.LG

    Assessing Generative Models via Precision and Recall

    Authors: Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly

    Abstract: Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison. Commonly used evaluation methods, such as the Frechet Inception Distance (FID), correlate well with the perceived quality of samples and are sensitive to mode dropping. However, these metrics are unable to distinguish between different failure cases since t…

    Submitted 28 October, 2018; v1 submitted 31 May, 2018; originally announced June 2018.

    Comments: NIPS 2018

  34. arXiv:1804.11130  [pdf, other]

    cs.LG cs.AI stat.ML

    Competitive Training of Mixtures of Independent Deep Generative Models

    Authors: Francesco Locatello, Damien Vincent, Ilya Tolstikhin, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf

    Abstract: A common assumption in causal modeling posits that the data is generated by a set of independent mechanisms, and algorithms should aim to recover this structure. Standard unsupervised learning, however, is often concerned with training a single model to capture the overall distribution or aspects thereof. Inspired by clustering approaches, we consider mixtures of implicit generative models that "…

    Submitted 3 March, 2019; v1 submitted 30 April, 2018; originally announced April 2018.

  35. arXiv:1803.11203  [pdf, other]

    cs.LG

    MemGEN: Memory is All You Need

    Authors: Sylvain Gelly, Karol Kurach, Marcin Michalski, Xiaohua Zhai

    Abstract: We propose a new learning paradigm called Deep Memory. It has the potential to completely revolutionize the Machine Learning field. Surprisingly, this paradigm has not been reinvented yet, unlike Deep Learning. At the core of this approach is the Learning By Heart principle, well studied in primary schools all over the world. Inspired by poem recitation, or by π decimal memorization,…

    Submitted 29 March, 2018; originally announced March 2018.

  36. arXiv:1803.08367  [pdf, other]

    stat.ML cs.LG

    Gradient Descent Quantizes ReLU Network Features

    Authors: Hartmut Maennel, Olivier Bousquet, Sylvain Gelly

    Abstract: Deep neural networks are often trained in the over-parametrized regime (i.e. with far more parameters than training examples), and understanding why the training converges to solutions that generalize remains an open problem. Several studies have highlighted the fact that the training procedure, i.e. mini-batch Stochastic Gradient Descent (SGD) leads to solutions that have specific properties in t…

    Submitted 22 March, 2018; originally announced March 2018.

  37. arXiv:1711.10337  [pdf, other]

    stat.ML cs.LG

    Are GANs Created Equal? A Large-Scale Study

    Authors: Mario Lucic, Karol Kurach, Marcin Michalski, Sylvain Gelly, Olivier Bousquet

    Abstract: Generative adversarial networks (GAN) are a powerful subclass of generative models. Despite a very rich research activity leading to numerous interesting GAN algorithms, it is still very hard to assess which algorithm(s) perform better than others. We conduct a neutral, multi-faceted large-scale empirical study on state-of-the-art models and evaluation measures. We find that most models can reach…

    Submitted 29 October, 2018; v1 submitted 28 November, 2017; originally announced November 2017.

    Comments: NIPS'18: Added a section on the limitations of the study and additional empirical results

  38. arXiv:1711.01558  [pdf, other]

    stat.ML cs.LG

    Wasserstein Auto-Encoders

    Authors: Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf

    Abstract: We propose the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution. WAE minimizes a penalized form of the Wasserstein distance between the model distribution and the target distribution, which leads to a different regularizer than the one used by the Variational Auto-Encoder (VAE). This regularizer encourages the encoded training distribution t…

    Submitted 5 December, 2019; v1 submitted 5 November, 2017; originally announced November 2017.

    Comments: Published at ICLR 2018. Included a much wider hyperparameter sweep, resulting in significant improvements in FIDs on CelebA

  39. arXiv:1706.03200  [pdf, other]

    cs.LG

    Critical Hyper-Parameters: No Random, No Cry

    Authors: Olivier Bousquet, Sylvain Gelly, Karol Kurach, Olivier Teytaud, Damien Vincent

    Abstract: The selection of hyper-parameters is critical in Deep Learning. Because of the long training time of complex models and the availability of compute resources in the cloud, "one-shot" optimization schemes - where the sets of hyper-parameters are selected in advance (e.g. on a grid or in a random manner) and the training is executed in parallel - are commonly used. It is known that grid search is su…

    Submitted 10 June, 2017; originally announced June 2017.

  40. arXiv:1706.03199  [pdf, other]

    cs.LG

    Toward Optimal Run Racing: Application to Deep Learning Calibration

    Authors: Olivier Bousquet, Sylvain Gelly, Karol Kurach, Marc Schoenauer, Michele Sebag, Olivier Teytaud, Damien Vincent

    Abstract: This paper aims at one-shot learning of deep neural nets, where a highly parallel setting is considered to address the algorithm calibration problem - selecting the best neural architecture and learning hyper-parameter values depending on the dataset at hand. The notoriously expensive calibration problem is optimally reduced by detecting and early stopping non-optimal runs. The theoretical contrib…

    Submitted 20 June, 2017; v1 submitted 10 June, 2017; originally announced June 2017.

  41. arXiv:1705.08386  [pdf, other]

    cs.CL cs.CV cs.LG

    Better Text Understanding Through Image-To-Text Transfer

    Authors: Karol Kurach, Sylvain Gelly, Michal Jastrzebski, Philip Haeusser, Olivier Teytaud, Damien Vincent, Olivier Bousquet

    Abstract: Generic text embeddings are successfully used in a variety of tasks. However, they are often learnt by capturing the co-occurrence structure from pure text corpora, resulting in limitations of their ability to generalize. In this paper, we explore models that incorporate visual information into the text representation. Based on comprehensive ablation studies, we propose a conceptually simple, yet…

    Submitted 26 May, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

  42. arXiv:1701.02386  [pdf, other]

    stat.ML cs.LG

    AdaGAN: Boosting Generative Models

    Authors: Ilya Tolstikhin, Sylvain Gelly, Olivier Bousquet, Carl-Johann Simon-Gabriel, Bernhard Schölkopf

    Abstract: Generative Adversarial Networks (GAN) (Goodfellow et al., 2014) are an effective method for training generative models of complex data such as natural images. However, they are notoriously hard to train and can suffer from the problem of missing modes where the model is not able to produce examples in certain regions of the space. We propose an iterative procedure, called AdaGAN, where at every st…

    Submitted 24 May, 2017; v1 submitted 9 January, 2017; originally announced January 2017.

    Comments: Updated with MNIST pictures and discussions + Unrolled GAN experiments

  43. arXiv:cs/0511093  [pdf, ps, other]

    cs.GT cs.AI

    Artificial Agents and Speculative Bubbles

    Authors: Yann Semet, Sylvain Gelly, Marc Schoenauer, Michèle Sebag

    Abstract: Pertaining to Agent-based Computational Economics (ACE), this work presents two models for the rise and downfall of speculative bubbles through an exchange price fixing based on double auction mechanisms. The first model is based on a finite time horizon context, where the expected dividends decrease over time. The second model follows the greater fool hypothesis; the agent behaviour depe…

    Submitted 28 November, 2005; originally announced November 2005.