Showing 1–15 of 15 results for author: Köster, R

  1. arXiv:2404.15059  [pdf]

    cs.AI cs.CY cs.GT

    Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem

    Authors: Raphael Koster, Miruna Pîslar, Andrea Tacchetti, Jan Balaguer, Leqi Liu, Romuald Elie, Oliver P. Hauser, Karl Tuyls, Matt Botvinick, Christopher Summerfield

    Abstract: A canonical social dilemma arises when finite resources are allocated to a group of people, who can choose to either reciprocate with interest, or keep the proceeds for themselves. What resource allocation mechanisms will encourage levels of reciprocation that sustain the commons? Here, in an iterated multiplayer trust game, we use deep reinforcement learning (RL) to design an allocation mechanism…

    Submitted 23 April, 2024; originally announced April 2024.

  2. arXiv:2305.13786  [pdf, other]

    cs.CV cs.AI cs.LG

    Perception Test: A Diagnostic Benchmark for Multimodal Video Models

    Authors: Viorica Pătrăucean, Lucas Smaira, Ankush Gupta, Adrià Recasens Continente, Larisa Markeeva, Dylan Banarse, Skanda Koppula, Joseph Heyward, Mateusz Malinowski, Yi Yang, Carl Doersch, Tatiana Matejovicova, Yury Sulsky, Antoine Miech, Alex Frechette, Hanna Klimczak, Raphael Koster, Junlin Zhang, Stephanie Winkler, Yusuf Aytar, Simon Osindero, Dima Damen, Andrew Zisserman, João Carreira

    Abstract: We propose a novel multimodal video benchmark - the Perception Test - to evaluate the perception and reasoning skills of pre-trained multimodal models (e.g. Flamingo, SeViLA, or GPT-4). Compared to existing benchmarks that focus on computational tasks (e.g. classification, detection or tracking), the Perception Test focuses on skills (Memory, Abstraction, Physics, Semantics) and types of reasoning…

    Submitted 30 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks

  3. arXiv:2211.13746  [pdf, other]

    cs.MA cs.AI cs.GT cs.NE

    Melting Pot 2.0

    Authors: John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

    Abstract: Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures ge…

    Submitted 30 October, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: 69 pages, 54 figures. arXiv admin note: text overlap with arXiv:2107.06857

  4. arXiv:2202.10135  [pdf, other]

    cs.MA cs.AI cs.LG econ.GN

    The Good Shepherd: An Oracle Agent for Mechanism Design

    Authors: Jan Balaguer, Raphael Koster, Christopher Summerfield, Andrea Tacchetti

    Abstract: From social networks to traffic routing, artificial learning agents are playing a central role in modern institutions. We must therefore understand how to leverage these systems to foster outcomes and behaviors that align with our own values and aspirations. While multiagent learning has received considerable attention in recent years, artificial agents have been primarily evaluated when interacti…

    Submitted 21 February, 2022; originally announced February 2022.

  5. arXiv:2202.10122  [pdf, other]

    cs.MA cs.AI cs.LG econ.GN

    HCMD-zero: Learning Value Aligned Mechanisms from Data

    Authors: Jan Balaguer, Raphael Koster, Ari Weinstein, Lucy Campbell-Gillingham, Christopher Summerfield, Matthew Botvinick, Andrea Tacchetti

    Abstract: Artificial learning agents are mediating a larger and larger number of interactions among humans, firms, and organizations, and the intersection between mechanism design and machine learning has been heavily investigated in recent years. However, mechanism design methods often make strong assumptions on how participants behave (e.g. rationality), on the kind of knowledge designers have access to a…

    Submitted 20 May, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

  6. arXiv:2201.11441  [pdf]

    cs.AI cs.HC cs.MA econ.GN

    Human-centered mechanism design with Democratic AI

    Authors: Raphael Koster, Jan Balaguer, Andrea Tacchetti, Ari Weinstein, Tina Zhu, Oliver Hauser, Duncan Williams, Lucy Campbell-Gillingham, Phoebe Thacker, Matthew Botvinick, Christopher Summerfield

    Abstract: Building artificial intelligence (AI) that aligns with human values is an unsolved problem. Here, we developed a human-in-the-loop research pipeline called Democratic AI, in which reinforcement learning is used to design a social mechanism that humans prefer by majority. A large group of humans played an online investment game that involved deciding whether to keep a monetary endowment or to share…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: 18 pages, 4 figures, 54 pages including supplemental materials

  7. arXiv:2112.06751  [pdf, other]

    cs.AI cs.HC

    Role of Human-AI Interaction in Selective Prediction

    Authors: Elizabeth Bondi, Raphael Koster, Hannah Sheahan, Martin Chadwick, Yoram Bachrach, Taylan Cemgil, Ulrich Paquet, Krishnamurthy Dvijotham

    Abstract: Recent work has shown the potential benefit of selective prediction systems that can learn to defer to a human when the predictions of the AI are unreliable, particularly to improve the reliability of AI systems in high-stakes applications like healthcare or conservation. However, most prior work assumes that human behavior remains unchanged when they solve a prediction task as part of a human-AI…

    Submitted 16 May, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Published in AAAI 2022; added link to data, small formatting corrections for camera-ready, including small changes to Fig 6-7 that do not change conclusions

  8. arXiv:2107.06857  [pdf, other]

    cs.MA cs.AI

    Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

    Authors: Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

    Abstract: Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks). Our contribution, Melting Pot, is a MARL evaluation suite that fills this gap, and uses reinforcement learning to reduce the human labor required to create novel test scenarios. This works because one agent's b…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: Accepted to ICML 2021 and presented as a long talk; 33 pages; 9 figures

    Journal ref: In International Conference on Machine Learning 2021 (pp. 6187-6199). PMLR

  9. arXiv:2106.09012  [pdf, other]

    cs.MA

    A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings

    Authors: Eugene Vinitsky, Raphael Köster, John P. Agapiou, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, Joel Z. Leibo

    Abstract: Society is characterized by the presence of a variety of social norms: collective patterns of sanctioning that can prevent miscoordination and free-riding. Inspired by this, we aim to construct learning dynamics where potentially beneficial social norms can emerge. Since social norms are underpinned by sanctioning, we introduce a training regime where agents can access all sanctioning events but l…

    Submitted 27 September, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

  10. arXiv:2103.04982  [pdf, other]

    cs.MA cs.AI cs.GT

    A multi-agent reinforcement learning model of reputation and cooperation in human groups

    Authors: Kevin R. McKee, Edward Hughes, Tina O. Zhu, Martin J. Chadwick, Raphael Koster, Antonio Garcia Castaneda, Charlie Beattie, Thore Graepel, Matt Botvinick, Joel Z. Leibo

    Abstract: Collective action demands that individuals efficiently coordinate how much, where, and when to cooperate. Laboratory experiments have extensively explored the first part of this process, demonstrating that a variety of social-cognitive mechanisms influence how much individuals choose to invest in group efforts. However, experimental research has been unable to shed light on how social cognitive me…

    Submitted 22 February, 2023; v1 submitted 8 March, 2021; originally announced March 2021.

  11. arXiv:2010.09054  [pdf, other]

    cs.MA

    Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences

    Authors: Raphael Köster, Kevin R. McKee, Richard Everett, Laura Weidinger, William S. Isaac, Edward Hughes, Edgar A. Duéñez-Guzmán, Thore Graepel, Matthew Botvinick, Joel Z. Leibo

    Abstract: Game theoretic views of convention generally rest on notions of common knowledge and hyper-rational models of individual behavior. However, decades of work in behavioral economics have questioned the validity of both foundations. Meanwhile, computational neuroscience has contributed a modernized 'dual process' account of decision-making where model-free (MF) reinforcement learning trades off with…

    Submitted 14 December, 2020; v1 submitted 18 October, 2020; originally announced October 2020.

  12. arXiv:2001.10913  [pdf, other]

    cs.LG cs.AI

    MEMO: A Deep Network for Flexible Combination of Episodic Memories

    Authors: Andrea Banino, Adrià Puigdomènech Badia, Raphael Köster, Martin J. Chadwick, Vinicius Zambaldi, Demis Hassabis, Caswell Barry, Matthew Botvinick, Dharshan Kumaran, Charles Blundell

    Abstract: Recent research developing neural network architectures with external memory have often used the benchmark bAbI question and answering dataset which provides a challenging number of tasks requiring reasoning. Here we employed a classic associative inference task from the memory-based reasoning neuroscience literature in order to more carefully probe the reasoning capacity of existing memory-augmen…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: 9 pages, 2 figures, 3 tables, to be published as a conference paper at ICLR 2020

    ACM Class: I.2.6

  13. arXiv:2001.09318  [pdf, other]

    cs.MA cs.AI

    Silly rules improve the capacity of agents to learn stable enforcement and compliance behaviors

    Authors: Raphael Köster, Dylan Hadfield-Menell, Gillian K. Hadfield, Joel Z. Leibo

    Abstract: How can societies learn to enforce and comply with social norms? Here we investigate the learning dynamics and emergence of compliance and enforcement of social norms in a foraging game, implemented in a multi-agent reinforcement learning setting. In this spatiotemporally extended game, individuals are incentivized to implement complex berry-foraging policies and punish transgressions against soci…

    Submitted 25 January, 2020; originally announced January 2020.

  14. arXiv:1803.08884  [pdf, other]

    cs.NE cs.AI cs.GT cs.MA q-bio.PE

    Inequity aversion improves cooperation in intertemporal social dilemmas

    Authors: Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel

    Abstract: Groups of humans are often able to find ways to cooperate with one another in complex, temporally extended social dilemmas. Models based on behavioral economics are only able to explain this phenomenon for unrealistic stateless matrix games. Recently, multi-agent reinforcement learning has been applied to generalize social dilemma problems to temporally and spatially extended Markov games. However…

    Submitted 27 September, 2018; v1 submitted 23 March, 2018; originally announced March 2018.

    Comments: 15 pages, 8 figures

  15. Decision Rules for Robotic Mobile Fulfillment Systems

    Authors: Marius Merschformann, Tim Lamballais, René de Koster, Leena Suhl

    Abstract: The Robotic Mobile Fulfillment Systems (RMFS) is a new type of robotized, parts-to-picker material handling system, designed especially for e-commerce warehouses. Robots bring movable shelves, called pods, to workstations where inventory is put on or removed from the pods. This paper simulates both the pick and replenishment process and studies the order assignment, pod selection and pod storage a…

    Submitted 20 January, 2018; originally announced January 2018.