Showing 1–50 of 56 results for author: Botvinick, M

  1. arXiv:2404.15059

    cs.AI cs.CY cs.GT

    Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem

    Authors: Raphael Koster, Miruna Pîslar, Andrea Tacchetti, Jan Balaguer, Leqi Liu, Romuald Elie, Oliver P. Hauser, Karl Tuyls, Matt Botvinick, Christopher Summerfield

    Abstract: A canonical social dilemma arises when finite resources are allocated to a group of people, who can choose to either reciprocate with interest, or keep the proceeds for themselves. What resource allocation mechanisms will encourage levels of reciprocation that sustain the commons? Here, in an iterated multiplayer trust game, we use deep reinforcement learning (RL) to design an allocation mechanism…

    Submitted 23 April, 2024; originally announced April 2024.

  2. arXiv:2312.03759

    cs.CL cs.AI cs.CY cs.DL

    How should the advent of large language models affect the practice of science?

    Authors: Marcel Binz, Stephan Alaniz, Adina Roskies, Balazs Aczel, Carl T. Bergstrom, Colin Allen, Daniel Schad, Dirk Wulff, Jevin D. West, Qiong Zhang, Richard M. Shiffrin, Samuel J. Gershman, Ven Popov, Emily M. Bender, Marco Marelli, Matthew M. Botvinick, Zeynep Akata, Eric Schulz

    Abstract: Large language models (LLMs) are being increasingly incorporated into scientific workflows. However, we have yet to fully grasp the implications of this integration. How should the advent of large language models affect the practice of science? For this opinion piece, we have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate. Schu…

    Submitted 5 December, 2023; originally announced December 2023.

  3. arXiv:2305.12907

    cs.CL cs.AI cs.LG

    Meta-in-context learning in large language models

    Authors: Julian Coda-Forno, Marcel Binz, Zeynep Akata, Matthew Botvinick, Jane X. Wang, Eric Schulz

    Abstract: Large language models have shown tremendous performance in a variety of tasks. In-context learning -- the ability to improve at a task after being provided with a number of demonstrations -- is seen as one of the main contributors to their success. In the present paper, we demonstrate that the in-context learning abilities of large language models can be recursively improved via in-context learnin…

    Submitted 22 May, 2023; originally announced May 2023.

  4. arXiv:2304.06729

    cs.AI cs.LG

    Meta-Learned Models of Cognition

    Authors: Marcel Binz, Ishita Dasgupta, Akshay Jagadish, Matthew Botvinick, Jane X. Wang, Eric Schulz

    Abstract: Meta-learning is a framework for learning learning algorithms through repeated interactions with an environment as opposed to designing them by hand. In recent years, this framework has established itself as a promising tool for building models of human cognition. Yet, a coherent research program around meta-learned models of cognition is still missing. The purpose of this article is to synthesize…

    Submitted 12 April, 2023; originally announced April 2023.

  5. arXiv:2304.05823

    q-bio.MN cs.LG q-bio.GN

    DiscoGen: Learning to Discover Gene Regulatory Networks

    Authors: Nan Rosemary Ke, Sara-Jane Dunn, Jorg Bornschein, Silvia Chiappa, Melanie Rey, Jean-Baptiste Lespiau, Albin Cassirer, Jane Wang, Theophane Weber, David Barrett, Matthew Botvinick, Anirudh Goyal, Mike Mozer, Danilo Rezende

    Abstract: Accurately inferring Gene Regulatory Networks (GRNs) is a critical and challenging task in biology. GRNs model the activatory and inhibitory interactions between genes and are inherently causal in nature. To accurately identify GRNs, perturbational data is required. However, most GRN discovery methods only operate on observational data. Recent advances in neural network-based causal discovery meth…

    Submitted 12 April, 2023; originally announced April 2023.

  6. arXiv:2211.15006

    cs.LG cs.CL

    Fine-tuning language models to find agreement among humans with diverse preferences

    Authors: Michiel A. Bakker, Martin J. Chadwick, Hannah R. Sheahan, Michael Henry Tessler, Lucy Campbell-Gillingham, Jan Balaguer, Nat McAleese, Amelia Glaese, John Aslanides, Matthew M. Botvinick, Christopher Summerfield

    Abstract: Recent work in large language modeling (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. Here, we embrace the heterogeneity of human preferences to consider a different challenge: how might…

    Submitted 27 November, 2022; originally announced November 2022.

  7. arXiv:2210.08340

    cs.AI q-bio.NC

    Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution

    Authors: Anthony Zador, Sean Escola, Blake Richards, Bence Ölveczky, Yoshua Bengio, Kwabena Boahen, Matthew Botvinick, Dmitri Chklovskii, Anne Churchland, Claudia Clopath, James DiCarlo, Surya Ganguli, Jeff Hawkins, Konrad Koerding, Alexei Koulakov, Yann LeCun, Timothy Lillicrap, Adam Marblestone, Bruno Olshausen, Alexandre Pouget, Cristina Savin, Terrence Sejnowski, Eero Simoncelli, Sara Solla, David Sussillo, et al. (2 additional authors not shown)

    Abstract: Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts…

    Submitted 22 February, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: White paper, 10 pages + 8 pages of references, 1 figure

  8. arXiv:2210.08085

    cs.AI q-bio.NC

    Adaptive patch foraging in deep reinforcement learning agents

    Authors: Nathan J. Wispinski, Andrew Butcher, Kory W. Mathewson, Craig S. Chapman, Matthew M. Botvinick, Patrick M. Pilarski

    Abstract: Patch foraging is one of the most heavily studied behavioral optimization challenges in biology. However, despite its importance to biological intelligence, this behavioral optimization problem is understudied in artificial intelligence research. Patch foraging is especially amenable to study given that it has a known optimal solution, which may be difficult to discover given current techniques in…

    Submitted 21 April, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR). See: https://openreview.net/pdf?id=a0T3nOP9sB

  9. arXiv:2207.08258

    cs.LG

    Minimum Description Length Control

    Authors: Ted Moskovitz, Ta-Chu Kao, Maneesh Sahani, Matthew M. Botvinick

    Abstract: We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle. In this approach, which we term MDL-control (MDL-C), the agent learns the common structure among the tasks with which it is faced and then distills it into a simpler representation which facilitates faster convergence and generalization to new tasks. In doing so, MDL-C natural…

    Submitted 24 July, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

  10. arXiv:2203.09498

    cs.AI cs.CL cs.LG cs.MA

    The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents

    Authors: Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White

    Abstract: Learned communication between agents is a powerful tool when approaching decision-making problems that are hard to overcome by any single agent in isolation. However, continual coordination and communication learning between machine agents or human-machine partnerships remains a challenging open problem. As a stepping stone toward solving the continual communication learning problem, in this paper…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: 54 pages, 29 figures, 4 tables

  11. arXiv:2202.10890

    cs.CV

    HiP: Hierarchical Perceiver

    Authors: Joao Carreira, Skanda Koppula, Daniel Zoran, Adria Recasens, Catalin Ionescu, Olivier Henaff, Evan Shelhamer, Relja Arandjelovic, Matt Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle

    Abstract: General perception systems such as Perceivers can process arbitrary modalities in any combination and are able to handle up to a few hundred thousand inputs. They achieve this generality by using exclusively global attention operations. This however hinders them from scaling up to the inputs sizes required to process raw high-resolution images or video. In this paper, we show that some degree of l…

    Submitted 3 November, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

  12. arXiv:2202.10122

    cs.MA cs.AI cs.LG econ.GN

    HCMD-zero: Learning Value Aligned Mechanisms from Data

    Authors: Jan Balaguer, Raphael Koster, Ari Weinstein, Lucy Campbell-Gillingham, Christopher Summerfield, Matthew Botvinick, Andrea Tacchetti

    Abstract: Artificial learning agents are mediating a larger and larger number of interactions among humans, firms, and organizations, and the intersection between mechanism design and machine learning has been heavily investigated in recent years. However, mechanism design methods often make strong assumptions on how participants behave (e.g. rationality), on the kind of knowledge designers have access to a…

    Submitted 20 May, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

  13. arXiv:2202.07765

    cs.LG cs.AI cs.CV cs.SD eess.AS

    General-purpose, long-context autoregressive modeling with Perceiver AR

    Authors: Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, Charlie Nash, Mateusz Malinowski, Sander Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Simon, Hannah Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel

    Abstract: Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic…

    Submitted 14 June, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: ICML 2022

  14. arXiv:2201.11441

    cs.AI cs.HC cs.MA econ.GN

    Human-centered mechanism design with Democratic AI

    Authors: Raphael Koster, Jan Balaguer, Andrea Tacchetti, Ari Weinstein, Tina Zhu, Oliver Hauser, Duncan Williams, Lucy Campbell-Gillingham, Phoebe Thacker, Matthew Botvinick, Christopher Summerfield

    Abstract: Building artificial intelligence (AI) that aligns with human values is an unsolved problem. Here, we developed a human-in-the-loop research pipeline called Democratic AI, in which reinforcement learning is used to design a social mechanism that humans prefer by majority. A large group of humans played an online investment game that involved deciding whether to keep a monetary endowment or to share…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: 18 pages, 4 figures, 54 pages including supplemental materials

  15. arXiv:2112.07774

    cs.AI cs.HC cs.MA

    Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study

    Authors: Dylan J. A. Brenneis, Adam S. Parker, Michael Bradley Johanson, Andrew Butcher, Elnaz Davoodi, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White, Patrick M. Pilarski

    Abstract: Artificial intelligence systems increasingly involve continual learning to enable flexibility in general situations that are not encountered during system training. Human interaction with autonomous systems is broadly studied, but research has hitherto under-explored interactions that occur while the system is actively learning, and can noticeably change its behaviour in minutes. In this pilot stu…

    Submitted 22 April, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  16. arXiv:2110.08176

    cs.LG cs.HC cs.MA

    Collaborating with Humans without Human Data

    Authors: DJ Strouse, Kevin R. McKee, Matt Botvinick, Edward Hughes, Richard Everett

    Abstract: Collaborating with humans requires rapidly adapting to their individual strengths, weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement learning techniques, such as self-play (SP) or population play (PP), produce agents that overfit to their training partners and do not generalize well to humans. Alternatively, researchers can collect human data, train a human model…

    Submitted 7 January, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021 (spotlight)

  17. arXiv:2107.14795

    cs.LG cs.CL cs.CV cs.SD eess.AS

    Perceiver IO: A General Architecture for Structured Inputs & Outputs

    Authors: Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira

    Abstract: A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible. Current architectures, however, cannot be applied beyond a small set of stereotyped settings, as they bake in domain & task assumptions or scale poorly to large inputs or outputs. In this work, we propose Perceiver IO, a general-purpose architecture that handles data f…

    Submitted 15 March, 2022; v1 submitted 30 July, 2021; originally announced July 2021.

    Comments: ICLR 2022 camera ready. Code: https://dpmd.ai/perceiver-code

  18. arXiv:2106.03849

    cs.CV cs.LG

    SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

    Authors: Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess

    Abstract: To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is especially difficult when scene structure needs to be inferred while also estimating the agent's location/viewpoint, as the two variables jointly give rise to t…

    Submitted 6 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Animated figures are available at https://sites.google.com/view/simone-scene-understanding/

  19. arXiv:2103.04982

    cs.MA cs.AI cs.GT

    A multi-agent reinforcement learning model of reputation and cooperation in human groups

    Authors: Kevin R. McKee, Edward Hughes, Tina O. Zhu, Martin J. Chadwick, Raphael Koster, Antonio Garcia Castaneda, Charlie Beattie, Thore Graepel, Matt Botvinick, Joel Z. Leibo

    Abstract: Collective action demands that individuals efficiently coordinate how much, where, and when to cooperate. Laboratory experiments have extensively explored the first part of this process, demonstrating that a variety of social-cognitive mechanisms influence how much individuals choose to invest in group efforts. However, experimental research has been unable to shed light on how social cognitive me…

    Submitted 22 February, 2023; v1 submitted 8 March, 2021; originally announced March 2021.

  20. arXiv:2102.12425

    cs.LG

    Synthetic Returns for Long-Term Credit Assignment

    Authors: David Raposo, Sam Ritter, Adam Santoro, Greg Wayne, Theophane Weber, Matt Botvinick, Hado van Hasselt, Francis Song

    Abstract: Since the earliest days of reinforcement learning, the workhorse method for assigning credit to actions over time has been temporal-difference (TD) learning, which propagates credit backward timestep-by-timestep. This approach suffers when delays between actions and rewards are long and when intervening unrelated events contribute variance to long-term returns. We propose state-associative (SA) le…

    Submitted 24 February, 2021; originally announced February 2021.

  21. arXiv:2102.02926

    cs.LG cs.AI

    Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents

    Authors: Jane X. Wang, Michael King, Nicolas Porcel, Zeb Kurth-Nelson, Tina Zhu, Charlie Deck, Peter Choy, Mary Cassin, Malcolm Reynolds, Francis Song, Gavin Buttimore, David P. Reichert, Neil Rabinowitz, Loic Matthey, Demis Hassabis, Alexander Lerchner, Matthew Botvinick

    Abstract: There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled anal…

    Submitted 20 October, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Published in Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 2021

  22. arXiv:2012.08508

    cs.CV cs.AI cs.CL cs.LG

    Attention over learned object embeddings enables complex visual reasoning

    Authors: David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick

    Abstract: Neural networks have achieved success in a wide array of perceptual tasks but often fail at tasks involving both perception and higher-level reasoning. On these more challenging tasks, bespoke approaches (such as modular symbolic components, independent dynamics models or semantic parsers) targeted towards that specific type of task have typically performed better. The downside to these targeted a…

    Submitted 26 October, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

    Comments: 22 pages, 5 figures

  23. arXiv:2010.09054

    cs.MA

    Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences

    Authors: Raphael Köster, Kevin R. McKee, Richard Everett, Laura Weidinger, William S. Isaac, Edward Hughes, Edgar A. Duéñez-Guzmán, Thore Graepel, Matthew Botvinick, Joel Z. Leibo

    Abstract: Game theoretic views of convention generally rest on notions of common knowledge and hyper-rational models of individual behavior. However, decades of work in behavioral economics have questioned the validity of both foundations. Meanwhile, computational neuroscience has contributed a modernized 'dual process' account of decision-making where model-free (MF) reinforcement learning trades off with…

    Submitted 14 December, 2020; v1 submitted 18 October, 2020; originally announced October 2020.

  24. arXiv:2007.03750

    cs.AI cs.LG q-bio.NC

    Deep Reinforcement Learning and its Neuroscientific Implications

    Authors: Matthew Botvinick, Jane X. Wang, Will Dabney, Kevin J. Miller, Zeb Kurth-Nelson

    Abstract: The emergence of powerful artificial intelligence is defining new research directions in neuroscience. To date, this research has focused largely on deep neural networks trained using supervised learning, in tasks such as image classification. However, there is another area of recent AI work which has so far received less attention from neuroscientists, but which may have profound neuroscientific…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 22 pages, 5 figures

  25. arXiv:2006.03662

    cs.LG cs.AI cs.NE stat.ML

    Rapid Task-Solving in Novel Environments

    Authors: Sam Ritter, Ryan Faulkner, Laurent Sartran, Adam Santoro, Matt Botvinick, David Raposo

    Abstract: We propose the challenge of rapid task-solving in novel environments (RTS), wherein an agent must solve a series of tasks as rapidly as possible in an unfamiliar environment. An effective RTS agent must balance between exploring the unfamiliar environment and solving its current task, all while building a model of the new environment over which it can plan when faced with later tasks. While modern…

    Submitted 19 April, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

  26. arXiv:2004.11935

    stat.ML cs.LG

    The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget

    Authors: Anirudh Goyal, Yoshua Bengio, Matthew Botvinick, Sergey Levine

    Abstract: In many applications, it is desirable to extract only the relevant information from complex input data, which involves making a decision about which input features are relevant. The information bottleneck method formalizes this as an information-theoretic optimization problem by maintaining an optimal tradeoff between compression (throwing away irrelevant input information), and predicting the tar…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: Published as a conference paper at ICLR 2020

  27. arXiv:2001.10913

    cs.LG cs.AI

    MEMO: A Deep Network for Flexible Combination of Episodic Memories

    Authors: Andrea Banino, Adrià Puigdomènech Badia, Raphael Köster, Martin J. Chadwick, Vinicius Zambaldi, Demis Hassabis, Caswell Barry, Matthew Botvinick, Dharshan Kumaran, Charles Blundell

    Abstract: Recent research developing neural network architectures with external memory has often used the benchmark bAbI question and answering dataset which provides a challenging number of tasks requiring reasoning. Here we employed a classic associative inference task from the memory-based reasoning neuroscience literature in order to more carefully probe the reasoning capacity of existing memory-augmen…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: 9 pages, 2 figures, 3 tables, to be published as a conference paper at ICLR 2020

    ACM Class: I.2.6

  28. arXiv:1910.06764

    cs.LG cs.AI stat.ML

    Stabilizing Transformers for Reinforcement Learning

    Authors: Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell

    Abstract: Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state-of-the-art results in domains such as language modeling and machine translation. Harnessing the transformer's ability to process long time horizons o…

    Submitted 13 October, 2019; originally announced October 2019.

  29. arXiv:1910.00571

    cs.AI

    Environmental drivers of systematicity and generalization in a situated agent

    Authors: Felix Hill, Andrew Lampinen, Rosalia Schneider, Stephen Clark, Matthew Botvinick, James L. McClelland, Adam Santoro

    Abstract: The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we consider tests of out-of-sample generalisation that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room. We first describe a comparative…

    Submitted 19 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  30. arXiv:1909.12238

    cs.AI cs.LG

    V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control

    Authors: H. Francis Song, Abbas Abdolmaleki, Jost Tobias Springenberg, Aidan Clark, Hubert Soyer, Jack W. Rae, Seb Noury, Arun Ahuja, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Dan Belov, Martin Riedmiller, Matthew M. Botvinick

    Abstract: Some of the most successful applications of deep reinforcement learning to challenging domains in discrete and continuous control have used policy gradient methods in the on-policy setting. However, policy gradients can suffer from large variance that may limit performance, and in practice require carefully tuned entropy regularization to prevent policy collapse. As an alternative to policy gradie…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: * equal contribution

  31. arXiv:1905.03030

    cs.LG cs.AI stat.ML

    Meta-learning of Sequential Strategies

    Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

    Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal pred…

    Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

    Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

  32. arXiv:1905.02691

    cs.AI cs.HC cs.LG

    Learned human-agent decision-making, communication and joint action in a virtual reality environment

    Authors: Patrick M. Pilarski, Andrew Butcher, Michael Johanson, Matthew M. Botvinick, Andrew Bolt, Adam S. R. Parker

    Abstract: Humans make decisions and act alongside other humans to pursue both short-term and long-term goals. As a result of ongoing progress in areas such as computing science and automation, humans now also interact with non-human agents of varying complexity as part of their day-to-day activities; substantial work is being done to integrate increasingly intelligent machine agents into human work and play…

    Submitted 7 May, 2019; originally announced May 2019.

    Comments: 5 pages, 3 figures. Accepted to The 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making, July 7-10, 2019, McGill University, Montreal, Quebec, Canada

  33. arXiv:1904.10396

    q-bio.NC cs.AI cs.LG

    Is coding a relevant metaphor for building AI? A commentary on "Is coding a relevant metaphor for the brain?", by Romain Brette

    Authors: Adam Santoro, Felix Hill, David Barrett, David Raposo, Matthew Botvinick, Timothy Lillicrap

    Abstract: Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does. Here, we argue that it is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.

    Submitted 18 April, 2019; originally announced April 2019.

  34. arXiv:1903.00450

    cs.LG cs.CV stat.ML

    Multi-Object Representation Learning with Iterative Variational Inference

    Authors: Klaus Greff, Raphaël Lopez Kaufman, Rishabh Kabra, Nick Watters, Chris Burgess, Daniel Zoran, Loic Matthey, Matthew Botvinick, Alexander Lerchner

    Abstract: Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even considering multiple objects, or treats segmentation as an (often supervised) preprocessing step. Instead, we argue for the importance of learning to segment and repres…

    Submitted 27 July, 2020; v1 submitted 1 March, 2019; originally announced March 2019.

    Journal ref: ICML 2019 (PMLR 97:2424-2433)

  35. arXiv:1901.11390

    cs.CV cs.LG stat.ML

    MONet: Unsupervised Scene Decomposition and Representation

    Authors: Christopher P. Burgess, Loic Matthey, Nicholas Watters, Rishabh Kabra, Irina Higgins, Matt Botvinick, Alexander Lerchner

    Abstract: The ability to decompose scenes in terms of abstract building blocks is crucial for general intelligence. Where those basic building blocks share meaningful properties, interactions and other regularities across scenes, such decompositions can simplify reasoning and facilitate imagination of novel scenarios. In particular, representing perceptual observations in terms of entities should improve da…

    Submitted 22 January, 2019; originally announced January 2019.

  36. arXiv:1901.10902

    stat.ML cs.LG

    InfoBot: Transfer and Exploration via the Information Bottleneck

    Authors: Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Matthew Botvinick, Hugo Larochelle, Yoshua Bengio, Sergey Levine

    Abstract: A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We p…

    Submitted 5 December, 2023; v1 submitted 30 January, 2019; originally announced January 2019.

    Comments: Accepted at ICLR'19

  37. arXiv:1901.08162

    cs.LG cs.AI stat.ML

    Causal Reasoning from Meta-reinforcement Learning

    Authors: Ishita Dasgupta, Jane Wang, Silvia Chiappa, Jovana Mitrovic, Pedro Ortega, David Raposo, Edward Hughes, Peter Battaglia, Matthew Botvinick, Zeb Kurth-Nelson

    Abstract: Discovering and exploiting the causal structure in the environment is a crucial challenge for intelligent agents. Here we explore whether causal reasoning can emerge via meta-reinforcement learning. We train a recurrent network with model-free reinforcement learning to solve a range of problems that each contain causal structure. We find that the trained agent can perform causal reasoning in novel…

    Submitted 23 January, 2019; originally announced January 2019.

  38. arXiv:1811.01458

    cs.MA cs.AI cs.LG

    Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

    Authors: Jakob N. Foerster, Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew Botvinick, Michael Bowling

    Abstract: When observing the actions of others, humans make inferences about why they acted as they did, and what this implies about the world; humans also use the fact that their actions will be interpreted in this manner, allowing them to act informatively and thereby communicate efficiently with others. Although learning algorithms have recently achieved superhuman performance in a number of two-player,…

    Submitted 10 September, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

  39. arXiv:1809.11044  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Relational Forward Models for Multi-Agent Learning

    Authors: Andrea Tacchetti, H. Francis Song, Pedro A. M. Mediano, Vinicius Zambaldi, Neil C. Rabinowitz, Thore Graepel, Matthew Botvinick, Peter W. Battaglia

    Abstract: The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them. Here we introduce Relational Forward Models (RFM) for multi-agent learning, networks that can learn to make accurate predictions of agents' future behavior in multi-agent environments. Because these mode…

    Submitted 28 September, 2018; originally announced September 2018.

  40. arXiv:1808.02093  [pdf, other]

    cs.AI cs.IT cs.LG cs.MA stat.ML

    Learning to Share and Hide Intentions using Information Regularization

    Authors: DJ Strouse, Max Kleiman-Weiner, Josh Tenenbaum, Matt Botvinick, David Schwab

    Abstract: Learning to cooperate with friends and compete with foes is a key component of multi-agent reinforcement learning. Typically to do so, one requires access to either a model of or interaction with the other agent(s). Here we show how to learn effective strategies for cooperation and competition in an asymmetric information game with no such model or interaction. Our approach is to encourage an agen…

    Submitted 1 January, 2019; v1 submitted 6 August, 2018; originally announced August 2018.

    Comments: Presented at the 32nd Conference on Neural Information Processing Systems (NIPS 2018)

  41. arXiv:1806.01830  [pdf, other]

    cs.LG stat.ML

    Relational Deep Reinforcement Learning

    Authors: Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

    Abstract: We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and…

    Submitted 28 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

  42. arXiv:1806.01261  [pdf, other]

    cs.LG cs.AI stat.ML

    Relational inductive biases, deep learning, and graph networks

    Authors: Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, et al. (2 additional authors not shown)

    Abstract: Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, rema…

    Submitted 17 October, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

  43. arXiv:1805.09692  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Been There, Done That: Meta-Learning with Episodic Recall

    Authors: Samuel Ritter, Jane X. Wang, Zeb Kurth-Nelson, Siddhant M. Jayakumar, Charles Blundell, Razvan Pascanu, Matthew Botvinick

    Abstract: Meta-learning agents excel at rapidly learning new tasks from open-ended task distributions; yet, they forget what they learn about each task as soon as the next begins. When tasks reoccur - as they do in natural environments - meta-learning agents must explore again instead of immediately exploiting previously discovered solutions. We propose a formalism for generating open-ended yet repetitious e…

    Submitted 6 July, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

    Comments: ICML 2018

  44. arXiv:1804.01128  [pdf, other]

    cs.AI

    Probing Physics Knowledge Using Tools from Developmental Psychology

    Authors: Luis Piloto, Ari Weinstein, Dhruva TB, Arun Ahuja, Mehdi Mirza, Greg Wayne, David Amos, Chia-chun Hung, Matt Botvinick

    Abstract: In order to build agents with a rich understanding of their environment, one key objective is to endow them with a grasp of intuitive physics; an ability to reason about three-dimensional objects, their dynamic interactions, and responses to forces. While some work on this problem has taken the approach of building in components such as ready-made physics engines, other research aims to extract ge…

    Submitted 3 April, 2018; originally announced April 2018.

  45. arXiv:1803.10760  [pdf, other]

    cs.LG stat.ML

    Unsupervised Predictive Memory in a Goal-Directed Agent

    Authors: Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap

    Abstract: Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement l…

    Submitted 28 March, 2018; originally announced March 2018.

  46. arXiv:1803.06959  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    On the importance of single directions for generalization

    Authors: Ari S. Morcos, David G. T. Barrett, Neil C. Rabinowitz, Matthew Botvinick

    Abstract: Despite their ability to memorize large datasets, deep neural networks often achieve good generalization performance. However, the differences between the learned solutions of networks which generalize and those which do not remain unclear. Additionally, the tuning properties of single directions (defined as the activation of a single unit or some linear combination of units in response to some in…

    Submitted 22 May, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

    Comments: ICLR 2018 conference paper; added additional methodological details

  47. arXiv:1802.07740  [pdf, other]

    cs.AI

    Machine Theory of Mind

    Authors: Neil C. Rabinowitz, Frank Perbet, H. Francis Song, Chiyuan Zhang, S. M. Ali Eslami, Matthew Botvinick

    Abstract: Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans' ability to represent the mental states of others, including their desires, beliefs, and intentions. We propose to train a machine to build such models too. We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone.…

    Submitted 12 March, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: 21 pages, 15 figures

  48. arXiv:1801.08116  [pdf, other]

    cs.AI cs.NE q-bio.NC

    Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents

    Authors: Joel Z. Leibo, Cyprien de Masson d'Autume, Daniel Zoran, David Amos, Charles Beattie, Keith Anderson, Antonio García Castañeda, Manuel Sanchez, Simon Green, Audrunas Gruslys, Shane Legg, Demis Hassabis, Matthew M. Botvinick

    Abstract: Psychlab is a simulated psychology laboratory inside the first-person 3D game world of DeepMind Lab (Beattie et al. 2016). Psychlab enables implementations of classical laboratory psychological experiments so that they work with both human and artificial agents. Psychlab has a simple and flexible API that enables users to easily create their own tasks. As examples, we are releasing Psychlab implem…

    Submitted 4 February, 2018; v1 submitted 24 January, 2018; originally announced January 2018.

    Comments: 28 pages, 11 figures

  49. arXiv:1711.08378  [pdf]

    cs.AI

    Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017

    Authors: M. Botvinick, D. G. T. Barrett, P. Battaglia, N. de Freitas, D. Kumaran, J. Z Leibo, T. Lillicrap, J. Modayil, S. Mohamed, N. C. Rabinowitz, D. J. Rezende, A. Santoro, T. Schaul, C. Summerfield, G. Wayne, T. Weber, D. Wierstra, S. Legg, D. Hassabis

    Abstract: We agree with Lake and colleagues on their list of key ingredients for building humanlike intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand-engineering. We believe an approac…

    Submitted 22 November, 2017; originally announced November 2017.

  50. arXiv:1707.08475  [pdf, other]

    stat.ML cs.AI cs.LG

    DARLA: Improving Zero-Shot Transfer in Reinforcement Learning

    Authors: Irina Higgins, Arka Pal, Andrei A. Rusu, Loic Matthey, Christopher P Burgess, Alexander Pritzel, Matthew Botvinick, Charles Blundell, Alexander Lerchner

    Abstract: Domain adaptation is an important open problem in deep reinforcement learning (RL). In many scenarios of interest data is hard to obtain, so agents may learn a source policy in a setting where data is readily available, with the hope that it generalises well to the target domain. We propose a new multi-stage RL agent, DARLA (DisentAngled Representation Learning Agent), which learns to see before l…

    Submitted 6 June, 2018; v1 submitted 26 July, 2017; originally announced July 2017.

    Comments: ICML 2017