Showing 1–16 of 16 results for author: Buesing, L

  1. arXiv:2201.05119  [pdf, other]

    cs.CV cs.LG stat.ML

    Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?

    Authors: Nenad Tomasev, Ioana Bica, Brian McWilliams, Lars Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic

    Abstract: Despite recent progress made by self-supervised methods in representation learning with residual networks, they still underperform supervised learning on the ImageNet classification benchmark, limiting their applicability in performance-critical settings. Building on prior theoretical insights from ReLIC [Mitrovic et al., 2021], we include additional inductive biases into self-supervised learning.…

    Submitted 3 November, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

  2. arXiv:2010.07922  [pdf, other]

    cs.LG cs.CV stat.ML

    Representation Learning via Invariant Causal Mechanisms

    Authors: Jovana Mitrovic, Brian McWilliams, Jacob Walker, Lars Buesing, Charles Blundell

    Abstract: Self-supervised learning has emerged as a strategy to reduce the reliance on costly supervised signal by pretraining representations only using unlabeled data. These methods combine heuristic proxy classification tasks with data augmentations and have achieved significant success, but our theoretical understanding of this success remains limited. In this paper we analyze self-supervised representa…

    Submitted 15 October, 2020; originally announced October 2020.

  3. arXiv:2010.01298  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban

    Authors: Peter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber

    Abstract: Intelligent robots need to achieve abstract objectives using concrete, spatiotemporally complex sensory information and motor control. Tabula rasa deep reinforcement learning (RL) has tackled demanding tasks in terms of either visual, abstract, or physical reasoning, but solving these jointly remains a formidable challenge. One recent, unsolved benchmark task that integrates these challenges is Mu…

    Submitted 3 October, 2020; originally announced October 2020.

  4. arXiv:2006.06380  [pdf, other]

    stat.ML cs.DS cs.LG

    Pointer Graph Networks

    Authors: Petar Veličković, Lars Buesing, Matthew C. Overlan, Razvan Pascanu, Oriol Vinyals, Charles Blundell

    Abstract: Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving. In absence of reliable domain expertise, one might resort to inferring the latent graph structure, which is often difficult due…

    Submitted 18 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: To appear at NeurIPS 2020 (Spotlight talk)

  5. arXiv:2004.11410  [pdf, other]

    cs.LG cs.AI stat.ML

    Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning

    Authors: Giambattista Parascandolo, Lars Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B. Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber

    Abstract: Standard planners for sequential decision making (including Monte Carlo planning, tree search, dynamic programming, etc.) are constrained by an implicit sequential planning assumption: The order in which a plan is constructed is the same in which it is executed. We consider alternatives to this assumption for the class of goal-directed Reinforcement Learning (RL) problems. Instead of an environmen…

    Submitted 23 April, 2020; originally announced April 2020.

  6. arXiv:2002.08329  [pdf, other]

    cs.LG stat.ML

    Value-driven Hindsight Modelling

    Authors: Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess

    Abstract: Value estimation is a critical component of the reinforcement learning (RL) paradigm. The question of how to effectively learn value predictors from data is one of the major problems studied by the RL community, and different approaches exploit structure in the problem domain in different ways. Model learning can make use of the rich transition structure present in sequences of observations, but t…

    Submitted 20 October, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: 9 pages + reference + appendix. NeurIPS 2020 version

  7. arXiv:2002.02836  [pdf, other]

    cs.LG cs.AI stat.ML

    Causally Correct Partial Models for Reinforcement Learning

    Authors: Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing

    Abstract: In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this pa…

    Submitted 7 February, 2020; originally announced February 2020.

  8. arXiv:1912.02807  [pdf, other]

    cs.LG stat.ML

    Combining Q-Learning and Search with Amortized Value Estimates

    Authors: Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Tobias Pfaff, Theophane Weber, Lars Buesing, Peter W. Battaglia

    Abstract: We introduce "Search with Amortized Value Estimates" (SAVE), an approach for combining model-free Q-learning with model-based Monte-Carlo Tree Search (MCTS). In SAVE, a learned prior over state-action values is used to guide MCTS, which estimates an improved set of state-action values. The new Q-estimates are then used in combination with real experience to update the prior. This effectively amort…

    Submitted 10 January, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: Published as a conference paper at ICLR 2020

  9. arXiv:1901.01761  [pdf, other]

    cs.LG stat.ML

    Credit Assignment Techniques in Stochastic Computation Graphs

    Authors: Théophane Weber, Nicolas Heess, Lars Buesing, David Silver

    Abstract: Stochastic computation graphs (SCGs) provide a formalism to represent structured optimization problems arising in artificial intelligence, including supervised, unsupervised, and reinforcement learning. Previous work has shown that an unbiased estimator of the gradient of the expected loss of SCGs can be derived from a single principle. However, this estimator often has high variance and requires…

    Submitted 7 January, 2019; originally announced January 2019.

  10. arXiv:1811.06272  [pdf, other]

    cs.LG stat.ML

    Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search

    Authors: Lars Buesing, Theophane Weber, Yori Zwols, Sebastien Racaniere, Arthur Guez, Jean-Baptiste Lespiau, Nicolas Heess

    Abstract: Learning policies on data synthesized by models can in principle quench the thirst of reinforcement learning algorithms for large amounts of real experience, which is often costly to acquire. However, simulating plausible experience de novo is a hard problem for many complex environments, often resulting in biases for model-based policy evaluation and search. Instead of de novo synthesis of data,…

    Submitted 15 November, 2018; originally announced November 2018.

  11. arXiv:1806.03107  [pdf, other]

    cs.LG stat.ML

    Temporal Difference Variational Auto-Encoder

    Authors: Karol Gregor, George Papamakarios, Frederic Besse, Lars Buesing, Theophane Weber

    Abstract: To act and plan in complex environments, we posit that agents should have a mental simulator of the world with three characteristics: (a) it should build an abstract state representing the condition of the world; (b) it should form a belief which represents uncertainty on the world; (c) it should go beyond simple step-by-step simulation, and exhibit temporal abstraction. Motivated by the absence o…

    Submitted 2 January, 2019; v1 submitted 8 June, 2018; originally announced June 2018.

  12. arXiv:1711.01846  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Fast amortized inference of neural activity from calcium imaging data with variational autoencoders

    Authors: Artur Speiser, Jinyao Yan, Evan Archer, Lars Buesing, Srinivas C. Turaga, Jakob H. Macke

    Abstract: Calcium imaging permits optical measurement of neural activity. Since intracellular calcium concentration is an indirect measurement of neural activity, computational tools are necessary to infer the true underlying spiking activity from fluorescence measurements. Bayesian model inversion can be used to solve this problem, but typically requires either computationally expensive MCMC sampling, or f…

    Submitted 6 November, 2017; originally announced November 2017.

    Comments: NIPS 2017

  13. arXiv:1707.06203  [pdf, other]

    cs.LG cs.AI stat.ML

    Imagination-Augmented Agents for Deep Reinforcement Learning

    Authors: Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra

    Abstract: We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in…

    Submitted 14 February, 2018; v1 submitted 19 July, 2017; originally announced July 2017.

  14. arXiv:1707.06170  [pdf, other]

    cs.AI cs.LG cs.NE stat.ML

    Learning model-based planning from scratch

    Authors: Razvan Pascanu, Yujia Li, Oriol Vinyals, Nicolas Heess, Lars Buesing, Sebastien Racanière, David Reichert, Théophane Weber, Daan Wierstra, Peter Battaglia

    Abstract: Conventional wisdom holds that model-based planning is a powerful approach to sequential decision-making. It is often very challenging in practice, however, because while a model can be used to evaluate a plan, it does not prescribe how to construct a plan. Here we introduce the "Imagination-based Planner", the first model-based, sequential decision-making agent that can learn to construct, evalua…

    Submitted 19 July, 2017; originally announced July 2017.

  15. arXiv:1511.07367  [pdf, other]

    stat.ML

    Black box variational inference for state space models

    Authors: Evan Archer, Il Memming Park, Lars Buesing, John Cunningham, Liam Paninski

    Abstract: Latent variable time-series models are among the most heavily used tools from machine learning and applied statistics. These models have the advantage of learning latent structure both from noisy observations and from the temporal ordering in the data, where it is assumed that meaningful correlation structure exists across time. A few highly-structured models, such as the linear dynamical system w…

    Submitted 23 November, 2015; originally announced November 2015.

  16. arXiv:1410.6791  [pdf, other]

    stat.ML

    Bayesian Manifold Learning: The Locally Linear Latent Variable Model (LL-LVM)

    Authors: Mijung Park, Wittawat Jitkrittum, Ahmad Qamar, Zoltan Szabo, Lars Buesing, Maneesh Sahani

    Abstract: We introduce the Locally Linear Latent Variable Model (LL-LVM), a probabilistic model for non-linear manifold discovery that describes a joint distribution over observations, their manifold coordinates and locally linear maps conditioned on a set of neighbourhood relationships. The model allows straightforward variational optimisation of the posterior distribution on coordinates and locally linear…

    Submitted 1 December, 2015; v1 submitted 24 October, 2014; originally announced October 2014.

    Comments: accepted to NIPS 2015

    MSC Class: 62F15

    ACM Class: G.3; I.2.6