
Showing 51–58 of 58 results for author: Tuyls, K

  1. arXiv:1711.00832  [pdf, other]

    cs.AI cs.GT cs.LG cs.MA

    A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

    Authors: Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel

    Abstract: To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to t…

    Submitted 7 November, 2017; v1 submitted 2 November, 2017; originally announced November 2017.

    Comments: Camera-ready copy of NIPS 2017 paper, including appendix

  2. arXiv:1707.06600  [pdf, other]

    cs.MA cs.NE q-bio.PE

    A multi-agent reinforcement learning model of common-pool resource appropriation

    Authors: Julien Perolat, Joel Z. Leibo, Vinicius Zambaldi, Charles Beattie, Karl Tuyls, Thore Graepel

    Abstract: Humanity faces numerous problems of common-pool resource appropriation. This class of multi-agent social dilemma includes the problems of ensuring sustainable use of fresh water, common fisheries, grazing pastures, and irrigation systems. Abstract models of common-pool resource appropriation based on non-cooperative game theory predict that self-interested agents will generally fail to find social…

    Submitted 6 September, 2017; v1 submitted 20 July, 2017; originally announced July 2017.

    Comments: 15 pages, 11 figures

  3. arXiv:1707.04402  [pdf, other]

    cs.MA cs.AI cs.LG

    Lenient Multi-Agent Deep Reinforcement Learning

    Authors: Gregory Palmer, Karl Tuyls, Daan Bloembergen, Rahul Savani

    Abstract: Much of the success of single agent deep reinforcement learning (DRL) in recent years can be attributed to the use of experience replay memories (ERM), which allow Deep Q-Networks (DQNs) to be trained efficiently through sampling stored state transitions. However, care is required when using ERMs for multi-agent deep reinforcement learning (MA-DRL), as stored transitions can become outdated becaus…

    Submitted 27 February, 2018; v1 submitted 14 July, 2017; originally announced July 2017.

    Comments: 9 pages, 6 figures, AAMAS2018 Conference Proceedings

  4. arXiv:1706.05296  [pdf, other]

    cs.AI

    Value-Decomposition Networks For Cooperative Multi-Agent Learning

    Authors: Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel

    Abstract: We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the "lazy agent" problem, which arises due to partial observab…

    Submitted 16 June, 2017; originally announced June 2017.

    ACM Class: I.2.11

  5. Efficient Optical flow and Stereo Vision for Velocity Estimation and Obstacle Avoidance on an Autonomous Pocket Drone

    Authors: Kimberly McGuire, Guido de Croon, Christophe De Wagter, Karl Tuyls, Hilbert Kappen

    Abstract: Miniature Micro Aerial Vehicles (MAV) are very suitable for flying in indoor environments, but autonomous navigation is challenging due to their strict hardware limitations. This paper presents a highly efficient computer vision algorithm called Edge-FS for the determination of velocity and depth. It runs at 20 Hz on a 4 g stereo camera with an embedded STM32F4 microprocessor (168 MHz, 192 kB) and…

    Submitted 14 March, 2017; v1 submitted 20 December, 2016; originally announced December 2016.

    Comments: 7 pages, 10 figures, Published at IEEE Robotics and Automation Letters

    Journal ref: IEEE Robotics and Automation Letters, 2017, 2, 1070-1076

  6. Local Histogram Matching for Efficient Optical Flow Computation Applied to Velocity Estimation on Pocket Drones

    Authors: Kimberly McGuire, Guido de Croon, Christophe de Wagter, Bart Remes, Karl Tuyls, Hilbert Kappen

    Abstract: Autonomous flight of pocket drones is challenging due to the severe limitations on on-board energy, sensing, and processing power. However, tiny drones have great potential as their small size allows maneuvering through narrow spaces while their small weight provides significant safety advantages. This paper presents a computationally efficient algorithm for determining optical flow, which can be…

    Submitted 14 March, 2017; v1 submitted 24 March, 2016; originally announced March 2016.

    Comments: 7 pages, 10 figures, Changes: format changed one column to two columns, used url package for links

    Journal ref: 2016 IEEE International Conference on Robotics and Automation (ICRA), 3255-3260

  7. arXiv:1401.3465  [pdf]

    cs.GT cs.SI physics.soc-ph

    Learning to Reach Agreement in a Continuous Ultimatum Game

    Authors: Steven de Jong, Simon Uyttendaele, Karl Tuyls

    Abstract: It is well-known that acting in an individually rational manner, according to the principles of classical game theory, may lead to sub-optimal solutions in a class of problems named social dilemmas. In contrast, humans generally do not have much difficulty with social dilemmas, as they are able to balance personal benefit and group benefit. As agents in multi-agent systems are regularly confronted…

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal of Artificial Intelligence Research, Volume 33, pages 551-574, 2008

  8. arXiv:0803.1555  [pdf, ps, other]

    cs.DB cs.LG

    Privacy Preserving ID3 over Horizontally, Vertically and Grid Partitioned Data

    Authors: Bart Kuijpers, Vanessa Lemmens, Bart Moelans, Karl Tuyls

    Abstract: We consider privacy preserving decision tree induction via ID3 in the case where the training data is horizontally or vertically distributed. Furthermore, we consider the same problem in the case where the data is both horizontally and vertically distributed, a situation we refer to as grid partitioned data. We give an algorithm for privacy preserving ID3 over horizontally partitioned data invol…

    Submitted 11 March, 2008; originally announced March 2008.

    Comments: 25 pages

    ACM Class: E.1; E.3; H.2.8; H.3.3