Showing 1–18 of 18 results for author: Galashov, A

  1. arXiv:2405.06780  [pdf, other]

    cs.LG cs.AI

    Deep MMD Gradient Flow without adversarial training

    Authors: Alexandre Galashov, Valentin de Bortoli, Arthur Gretton

    Abstract: We propose a gradient flow procedure for generative modeling by transporting particles from an initial source distribution to a target distribution, where the gradient field on the particles is given by a noise-adaptive Wasserstein Gradient of the Maximum Mean Discrepancy (MMD). The noise-adaptive MMD is trained on data distributions corrupted by increasing levels of noise, obtained via a forward…

    Submitted 10 May, 2024; originally announced May 2024.

  2. arXiv:2403.01518  [pdf, other]

    cs.CL cs.LG

    Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models

    Authors: Amal Rannen-Triki, Jorg Bornschein, Razvan Pascanu, Marcus Hutter, Andras György, Alexandre Galashov, Yee Whye Teh, Michalis K. Titsias

    Abstract: We consider the problem of online fine tuning the parameters of a language model at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when considering distributional shift between training and evaluation data, we here emphasize the perspective that online adaptation turns parameters into temporally ch…

    Submitted 3 March, 2024; originally announced March 2024.

  3. arXiv:2306.08448  [pdf, other]

    cs.LG cs.AI

    Kalman Filter for Online Classification of Non-Stationary Data

    Authors: Michalis K. Titsias, Alexandre Galashov, Amal Rannen-Triki, Razvan Pascanu, Yee Whye Teh, Jorg Bornschein

    Abstract: In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps. Important challenges in OCL are concerned with automatic adaptation to the particular non-stationary structure of the data, and with quantification of predictive uncertainty. Motivated by these challenges we introduce a probabilistic Bayesian online learning model…

    Submitted 14 June, 2023; originally announced June 2023.

  4. arXiv:2304.13164  [pdf, other]

    cs.LG cs.AI

    Towards Compute-Optimal Transfer Learning

    Authors: Massimo Caccia, Alexandre Galashov, Arthur Douillard, Amal Rannen-Triki, Dushyant Rao, Michela Paganini, Laurent Charlin, Marc'Aurelio Ranzato, Razvan Pascanu

    Abstract: The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet…

    Submitted 25 April, 2023; originally announced April 2023.

  5. arXiv:2211.11747  [pdf, other]

    cs.LG cs.CV

    NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research

    Authors: Jorg Bornschein, Alexandre Galashov, Ross Hemsley, Amal Rannen-Triki, Yutian Chen, Arslan Chaudhry, Xu Owen He, Arthur Douillard, Massimo Caccia, Qixuang Feng, Jiajun Shen, Sylvestre-Alvise Rebuffi, Kitty Stacpoole, Diego de las Casas, Will Hawkins, Angeliki Lazaridou, Yee Whye Teh, Andrei A. Rusu, Razvan Pascanu, Marc'Aurelio Ranzato

    Abstract: A shared goal of several machine learning communities like continual learning, meta-learning and transfer learning, is to design algorithms and models that efficiently and robustly adapt to unseen tasks. An even more ambitious goal is to build models that never stop adapting, and that become increasingly more efficient through time by suitably transferring the accrued knowledge. Beyond the study o…

    Submitted 16 May, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

  6. arXiv:2205.11448  [pdf, other]

    cs.LG cs.AI

    Data augmentation for efficient learning from parametric experts

    Authors: Alexandre Galashov, Josh Merel, Nicolas Heess

    Abstract: We present a simple, yet powerful data-augmentation technique to enable data-efficient learning from parametric experts for reinforcement and imitation learning. We focus on what we call the policy cloning setting, in which we use online or offline queries of an expert or expert policy to inform the behavior of a student policy. This setting arises naturally in a number of problems, for instance a…

    Submitted 23 May, 2022; originally announced May 2022.

  7. arXiv:2011.09192  [pdf, other]

    cs.AI cs.GT cs.MA

    Game Plan: What AI can do for Football, and What Football can do for AI

    Authors: Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Perolat, Bart De Vylder, et al. (11 additional authors not shown)

    Abstract: The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with t…

    Submitted 18 November, 2020; originally announced November 2020.

  8. arXiv:2010.14274  [pdf, other]

    cs.AI cs.LG

    Behavior Priors for Efficient Reinforcement Learning

    Authors: Dhruva Tirumala, Alexandre Galashov, Hyeonwoo Noh, Leonard Hasenclever, Razvan Pascanu, Jonathan Schwarz, Guillaume Desjardins, Wojciech Marian Czarnecki, Arun Ahuja, Yee Whye Teh, Nicolas Heess

    Abstract: As we deploy reinforcement learning agents to solve increasingly challenging problems, methods that allow us to inject prior knowledge about the structure of the world and effective solution strategies become increasingly important. In this work we consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Submitted to Journal of Machine Learning Research (JMLR)

  9. arXiv:2010.08587  [pdf, other]

    cs.RO cs.AI

    Learning Dexterous Manipulation from Suboptimal Experts

    Authors: Rae Jeong, Jost Tobias Springenberg, Jackie Kay, Daniel Zheng, Yuxiang Zhou, Alexandre Galashov, Nicolas Heess, Francesco Nori

    Abstract: Learning dexterous manipulation in high-dimensional state-action spaces is an important open challenge with exploration presenting a major bottleneck. Although in many cases the learning process could be guided by demonstrations or other suboptimal experts, current RL algorithms for continuous action spaces often fail to effectively utilize combinations of highly off-policy expert data and on-poli…

    Submitted 5 January, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

  10. arXiv:2010.02255  [pdf, other]

    cs.AI cs.LG stat.ML

    Temporal Difference Uncertainties as a Signal for Exploration

    Authors: Sebastian Flennerhag, Jane X. Wang, Pablo Sprechmann, Francesco Visin, Alexandre Galashov, Steven Kapturowski, Diana L. Borsa, Nicolas Heess, Andre Barreto, Razvan Pascanu

    Abstract: An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that involve function approximators, obtaining accurate uncertainty estimates is almost as challenging a problem. In this paper, we highlight that value estimates are ea…

    Submitted 1 July, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: 9 pages, 11 figures, 5 tables

  11. arXiv:2009.04875  [pdf, other]

    cs.LG cs.AI stat.ML

    Importance Weighted Policy Learning and Adaptation

    Authors: Alexandre Galashov, Jakub Sygnowski, Guillaume Desjardins, Jan Humplik, Leonard Hasenclever, Rae Jeong, Yee Whye Teh, Nicolas Heess

    Abstract: The ability to exploit prior experience to solve novel problems rapidly is a hallmark of biological learning systems and of great practical importance for artificial ones. In the meta reinforcement learning literature much recent work has focused on the problem of optimizing the learning process itself. In this paper we study a complementary approach which is conceptually simple, general, modular…

    Submitted 4 June, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

  12. arXiv:2009.03228  [pdf, other]

    cs.LG cs.AI stat.ML

    Information Theoretic Meta Learning with Gaussian Processes

    Authors: Michalis K. Titsias, Francisco J. R. Ruiz, Sotirios Nikoloutsopoulos, Alexandre Galashov

    Abstract: We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck. The idea is to learn a stochastic representation or encoding of the task description, given by a training set, that is highly informative about predicting the validation set. By making use of variational approximations to the mutual information, we derive a general and tracta…

    Submitted 5 July, 2021; v1 submitted 7 September, 2020; originally announced September 2020.

    Comments: 15 pages, 2 figures

  13. arXiv:1906.05201  [pdf, other]

    stat.ML cs.LG cs.NE

    Task Agnostic Continual Learning via Meta Learning

    Authors: Xu He, Jakub Sygnowski, Alexandre Galashov, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu

    Abstract: While neural networks are powerful function approximators, they suffer from catastrophic forgetting when the data distribution is not stationary. One particular formalism that studies learning under non-stationary distribution is provided by continual learning, where the non-stationarity is imposed by a sequence of distinct tasks. Most methods in this space assume, however, the knowledge of task b…

    Submitted 12 June, 2019; originally announced June 2019.

  14. arXiv:1905.06424  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta reinforcement learning as task inference

    Authors: Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess

    Abstract: Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes proposals to learn the learning algorithm itself, an idea also known as meta learning. One formal interpretation of this idea is as a partially observable multi-tas…

    Submitted 22 October, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

  15. arXiv:1905.01240  [pdf, other]

    cs.LG cs.AI stat.ML

    Information asymmetry in KL-regularized RL

    Authors: Alexandre Galashov, Siddhant M. Jayakumar, Leonard Hasenclever, Dhruva Tirumala, Jonathan Schwarz, Guillaume Desjardins, Wojciech M. Czarnecki, Yee Whye Teh, Razvan Pascanu, Nicolas Heess

    Abstract: Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL regularized expected reward objective which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we lea…

    Submitted 3 May, 2019; originally announced May 2019.

    Comments: Accepted as a conference paper at ICLR 2019

  16. arXiv:1903.11907  [pdf, other]

    stat.ML cs.LG

    Meta-Learning surrogate models for sequential decision making

    Authors: Alexandre Galashov, Jonathan Schwarz, Hyunjik Kim, Marta Garnelo, David Saxton, Pushmeet Kohli, S. M. Ali Eslami, Yee Whye Teh

    Abstract: We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning. This is accomplished by a probabilistic model-based approach that explains observed data while capturing predictive uncertainty during the decision making process. Crucially, this probabilistic model is chosen to be a Me…

    Submitted 12 June, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

  17. arXiv:1903.07438  [pdf, other]

    cs.LG stat.ML

    Exploiting Hierarchy for Learning and Transfer in KL-regularized RL

    Authors: Dhruva Tirumala, Hyeonwoo Noh, Alexandre Galashov, Leonard Hasenclever, Arun Ahuja, Greg Wayne, Razvan Pascanu, Yee Whye Teh, Nicolas Heess

    Abstract: As reinforcement learning agents are tasked with solving more challenging and diverse tasks, the ability to incorporate prior knowledge into the learning system and to exploit reusable structure in solution space is likely to become increasingly important. The KL-regularized expected reward objective constitutes one possible tool to this end. It introduces an additional component, a default or pri…

    Submitted 23 January, 2020; v1 submitted 18 March, 2019; originally announced March 2019.

  18. arXiv:1811.11711  [pdf, other]

    cs.LG cs.AI cs.RO

    Neural probabilistic motor primitives for humanoid control

    Authors: Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess

    Abstract: We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this model entirely offline to compress thousands of expert…

    Submitted 15 January, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

    Comments: Accepted as a conference paper at ICLR 2019