Showing 1–25 of 25 results for author: Dulac-Arnold, G

  1. arXiv:2311.00899  [pdf, other]

    cs.RO

    RoboVQA: Multimodal Long-Horizon Reasoning for Robotics

    Authors: Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi, Pete Florence, Wei Han, Robert Baruch, Yao Lu, Suvir Mirchandani, Peng Xu, Pannag Sanketi, Karol Hausman, Izhak Shafran, Brian Ichter, Yuan Cao

    Abstract: We present a scalable, bottom-up and intrinsically diverse data collection scheme that can be used for high-level reasoning with long and medium horizons and that has 2.2x higher throughput compared to traditional narrow top-down step-by-step collection. We collect realistic data by performing any user requests within the entirety of 3 office buildings and using multiple robot and human embodiment…

    Submitted 1 November, 2023; originally announced November 2023.

  2. arXiv:2305.14654  [pdf, other]

    cs.RO cs.AI

    Barkour: Benchmarking Animal-level Agility with Quadruped Robots

    Authors: Ken Caluwaerts, Atil Iscen, J. Chase Kew, Wenhao Yu, Tingnan Zhang, Daniel Freeman, Kuang-Huei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Omar Cortes, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Edward Lee, et al. (19 additional authors not shown)

    Abstract: Animals have evolved various agile locomotion strategies, such as sprinting, leaping, and jumping. There is a growing interest in developing legged robots that move like their biological counterparts and show various agile skills to navigate complex environments quickly. Despite the interest, the field lacks systematic benchmarks to measure the performance of control policies and hardware in agili…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: 17 pages, 19 figures

  3. arXiv:2305.01400  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Get Back Here: Robust Imitation by Return-to-Distribution Planning

    Authors: Geoffrey Cideron, Baruch Tabanpour, Sebastian Curi, Sertan Girgin, Leonard Hussenot, Gabriel Dulac-Arnold, Matthieu Geist, Olivier Pietquin, Robert Dadashi

    Abstract: We consider the Imitation Learning (IL) setup where expert data are not collected on the actual deployment environment but on a different version. To address the resulting distribution shift, we combine behavior cloning (BC) with a planner that is tasked to bring the agent back to states visited by the expert whenever the agent deviates from the demonstration distribution. The resulting algorithm,…

    Submitted 2 May, 2023; originally announced May 2023.

  4. arXiv:2302.04009  [pdf, other]

    cs.LG

    Investigating the role of model-based learning in exploration and transfer

    Authors: Jacob Walker, Eszter Vértes, Yazhe Li, Gabriel Dulac-Arnold, Ankesh Anand, Théophane Weber, Jessica B. Hamrick

    Abstract: State of the art reinforcement learning has enabled training agents on tasks of ever increasing complexity. However, the current paradigm tends to favor training agents from scratch on every new task or on collections of tasks with a view towards generalizing to novel task configurations. The former suffers from poor data efficiency while the latter is difficult when test tasks are out-of-distribu…

    Submitted 8 February, 2023; originally announced February 2023.

  5. arXiv:2211.09019  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Learning Reward Functions for Robotic Manipulation by Observing Humans

    Authors: Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid

    Abstract: Observing a human demonstrator manipulate objects provides a rich, scalable and inexpensive source of data for learning robotic policies. However, transferring skills from human videos to a robotic manipulator poses several challenges, not least a difference in action and observation spaces. In this work, we use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-…

    Submitted 7 March, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

  6. arXiv:2211.03521  [pdf, other]

    cs.AI

    On the importance of data collection for training general goal-reaching policies

    Authors: Alexis Jacq, Manu Orsini, Gabriel Dulac-Arnold, Olivier Pietquin, Matthieu Geist, Olivier Bachem

    Abstract: Recent advances in ML suggest that the quantity of data available to a model is one of the primary bottlenecks to high performance. Although for language-based tasks there exist almost unlimited amounts of reasonably coherent data to train from, this is generally not the case for Reinforcement Learning, especially when dealing with a novel environment. In effect, even a relatively trivial continuo…

    Submitted 20 February, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

  7. arXiv:2109.14311  [pdf, other]

    cs.LG cs.RO

    Learning Dynamics Models for Model Predictive Agents

    Authors: Michael Lutter, Leonard Hasenclever, Arunkumar Byravan, Gabriel Dulac-Arnold, Piotr Trochim, Nicolas Heess, Josh Merel, Yuval Tassa

    Abstract: Model-Based Reinforcement Learning involves learning a \textit{dynamics model} from data, and then using this model to optimise behaviour, most often with an online \textit{planner}. Much of the recent research along these lines presents a particular set of design choices, involving problem definition, model learning and planning. Given the multiple contributions, it is difficult to evaluate the e…

    Submitted 29 September, 2021; originally announced September 2021.

  8. arXiv:2106.08050  [pdf, other]

    cs.LG

    Residual Reinforcement Learning from Demonstrations

    Authors: Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid

    Abstract: Residual reinforcement learning (RL) has been proposed as a way to solve challenging robotic tasks by adapting control actions from a conventional feedback controller to maximize a reward signal. We extend the residual formulation to learn from visual inputs and sparse rewards using demonstrations. Learning from images, proprioceptive inputs and a sparse task-completion reward relaxes the requirem…

    Submitted 15 June, 2021; originally announced June 2021.

  9. arXiv:2103.03104  [pdf, other]

    cs.LG eess.SY

    Learning to run a Power Network Challenge: a Retrospective Analysis

    Authors: Antoine Marot, Benjamin Donnot, Gabriel Dulac-Arnold, Adrian Kelly, Aïdan O'Sullivan, Jan Viebahn, Mariette Awad, Isabelle Guyon, Patrick Panciatici, Camilo Romero

    Abstract: Power networks, responsible for transporting electricity across large geographical regions, are complex infrastructures on which modern life critically depends. Variations in demand and production profiles, with increasing renewable energy integration, as well as the high voltage network technology, constitute a real challenge for human operators when optimizing electricity transportation while avo…

    Submitted 21 October, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Journal ref: Proceedings of Machine Learning Research, 2021 NeurIPS 2020 Competition and Demonstration Track

  10. arXiv:2011.07318  [pdf, other]

    cs.LG cs.AI

    A Geometric Perspective on Self-Supervised Policy Adaptation

    Authors: Cristian Bodnar, Karol Hausman, Gabriel Dulac-Arnold, Rico Jonschkowski

    Abstract: One of the most challenging aspects of real-world reinforcement learning (RL) is the multitude of unpredictable and ever-changing distractions that could divert an agent from what it was tasked to do in its training environment. While an agent could learn from reward signals to ignore them, the complexity of the real-world can make rewards hard to acquire, or, at best, extremely sparse. A recent clas…

    Submitted 14 November, 2020; originally announced November 2020.

    Comments: Contains 17 pages, 18 figures

  11. arXiv:2008.05556  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY stat.ML

    Model-Based Offline Planning

    Authors: Arthur Argenson, Gabriel Dulac-Arnold

    Abstract: Offline learning is a key part of making reinforcement learning (RL) useable in real systems. Offline RL looks at scenarios where there is data from a system's operation, but no direct access to the system when learning a policy. Recent work on training RL policies from offline data has shown results both with model-free policies learned directly from the data, or with planning on top of learnt mo…

    Submitted 17 March, 2021; v1 submitted 12 August, 2020; originally announced August 2020.

  12. arXiv:2006.13888  [pdf, other]

    cs.LG stat.ML

    RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

    Authors: Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

    Abstract: Offline methods for reinforcement learning have a potential to help bridge the gap between reinforcement learning research and real-world applications. They make it possible to learn policies from offline datasets, thus overcoming concerns associated with online data collection in the real-world, including cost, safety, or ethical concerns. In this paper, we propose a benchmark called RL Unplugged…

    Submitted 12 February, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: NeurIPS paper. 21 pages including supplementary material, the github link for the datasets: https://github.com/deepmind/deepmind-research/rl_unplugged

  13. arXiv:2006.00979  [pdf, other]

    cs.LG cs.AI

    Acme: A Research Framework for Distributed Reinforcement Learning

    Authors: Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, et al. (14 additional authors not shown)

    Abstract: Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce publishe…

    Submitted 20 September, 2022; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme

  14. arXiv:2003.11881  [pdf, other]

    cs.LG cs.AI

    An empirical investigation of the challenges of real-world reinforcement learning

    Authors: Gabriel Dulac-Arnold, Nir Levine, Daniel J. Mankowitz, Jerry Li, Cosmin Paduraru, Sven Gowal, Todd Hester

    Abstract: Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, much of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the di…

    Submitted 4 March, 2021; v1 submitted 24 March, 2020; originally announced March 2020.

    Comments: arXiv admin note: text overlap with arXiv:1904.12901

  15. arXiv:1910.09036  [pdf, other]

    cs.LG stat.ML

    Differentiable Deep Clustering with Cluster Size Constraints

    Authors: Aude Genevay, Gabriel Dulac-Arnold, Jean-Philippe Vert

    Abstract: Clustering is a fundamental unsupervised learning approach. Many clustering algorithms -- such as $k$-means -- rely on the euclidean distance as a similarity measure, which is often not the most relevant metric for high dimensional data such as images. Learning a lower-dimensional embedding that can better reflect the geometry of the dataset is therefore instrumental for performance. We propose a…

    Submitted 20 October, 2019; originally announced October 2019.

  16. arXiv:1905.12909  [pdf, other]

    cs.LG stat.ML

    Deep multi-class learning from label proportions

    Authors: Gabriel Dulac-Arnold, Neil Zeghidour, Marco Cuturi, Lucas Beyer, Jean-Philippe Vert

    Abstract: We propose a learning algorithm capable of learning from label proportions instead of direct data labels. In this scenario, our data are arranged into various bags of a certain size, and only the proportions of each label within a given bag are known. This is a common situation in cases where per-data labeling is lengthy, but a more general label is easily accessible. Several approaches have been…

    Submitted 26 June, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

  17. arXiv:1904.12901  [pdf, ps, other]

    cs.LG cs.AI cs.RO stat.ML

    Challenges of Real-World Reinforcement Learning

    Authors: Gabriel Dulac-Arnold, Daniel Mankowitz, Todd Hester

    Abstract: Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, much of the research advances in RL are often hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. We present a set of nine unique challenges that must be addressed to productionize RL t…

    Submitted 29 April, 2019; originally announced April 2019.

  18. arXiv:1704.03732  [pdf, ps, other]

    cs.AI cs.LG

    Deep Q-learning from Demonstrations

    Authors: Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys

    Abstract: Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable for a simulator, but it severely limits the applicability of deep RL to many real-world…

    Submitted 22 November, 2017; v1 submitted 12 April, 2017; originally announced April 2017.

    Comments: Published at AAAI 2018. Previously on arxiv as "Learning from Demonstrations for Real World Reinforcement Learning"

  19. arXiv:1612.08810  [pdf, other]

    cs.LG cs.AI cs.NE

    The Predictron: End-To-End Learning and Planning

    Authors: David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris

    Abstract: One of the key challenges of artificial intelligence is to learn models that are effective in the context of planning. In this document we introduce the predictron architecture. The predictron consists of a fully abstract model, represented by a Markov reward process, that can be rolled forward multiple "imagined" planning steps. Each forward pass of the predictron accumulates internal rewards and…

    Submitted 20 July, 2017; v1 submitted 28 December, 2016; originally announced December 2016.

    Comments: Camera-ready version, ICML 2017, with supplement

  20. arXiv:1512.07679  [pdf, other]

    cs.AI cs.LG cs.NE stat.ML

    Deep Reinforcement Learning in Large Discrete Action Spaces

    Authors: Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

    Abstract: Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems. Recommender systems, industrial plants and language models are only some of the many real-world tasks involving large numbers of discrete actions for which current methods are difficult or even often impossible to apply. An ability to general…

    Submitted 4 April, 2016; v1 submitted 23 December, 2015; originally announced December 2015.

  21. arXiv:1512.01124  [pdf, other]

    cs.AI cs.HC cs.LG

    Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions

    Authors: Peter Sunehag, Richard Evans, Gabriel Dulac-Arnold, Yori Zwols, Daniel Visentin, Ben Coppin

    Abstract: Many real-world problems come with action spaces represented as feature vectors. Although high-dimensional control is a largely unsolved problem, there has recently been progress for modest dimensionalities. Here we report on a successful attempt at addressing problems of dimensionality as high as $2000$, of a particular form. Motivated by important applications such as recommendation systems that…

    Submitted 16 December, 2015; v1 submitted 3 December, 2015; originally announced December 2015.

  22. arXiv:1312.6594  [pdf, other]

    cs.CV cs.LG

    Sequentially Generated Instance-Dependent Image Representations for Classification

    Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Nicolas Thome, Matthieu Cord, Patrick Gallinari

    Abstract: In this paper, we investigate a new framework for image classification that adaptively generates spatial representations. Our strategy is based on a sequential process that learns to explore the different regions of any image in order to infer its category. In particular, the choice of regions is specific to each image, directed by the actual content of previously selected regions. The capacity of…

    Submitted 11 February, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

  23. arXiv:1203.0203  [pdf, other]

    cs.LG stat.ML

    Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization

    Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari

    Abstract: The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework where the optimal policy is obtained throug…

    Submitted 29 February, 2012; originally announced March 2012.

    MSC Class: 68T05

  24. Datum-Wise Classification: A Sequential Approach to Sparsity

    Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari

    Abstract: We propose a novel classification technique whose aim is to select an appropriate representation for each datapoint, in contrast to the usual approach of selecting a representation encompassing the whole dataset. This datum-wise representation is found by using a sparsity inducing empirical risk, which is a relaxation of the standard $L_0$ regularized risk. The classification problem is modeled as a…

    Submitted 29 August, 2011; originally announced August 2011.

    Comments: ECML2011

    Journal ref: Lecture Notes in Computer Science, 2011, Volume 6911/2011, 375-390

  25. Text Classification: A Sequential Reading Approach

    Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Patrick Gallinari

    Abstract: We propose to model the text classification process as a sequential decision process. In this process, an agent learns to classify documents into topics while reading the document sentences sequentially and learns to stop as soon as enough information was read for deciding. The proposed algorithm is based on a modelisation of Text Classification as a Markov Decision Process and learns by using Rei…

    Submitted 29 August, 2011; v1 submitted 7 July, 2011; originally announced July 2011.

    Comments: ECIR2011

    Journal ref: Lecture Notes in Computer Science, 2011, Volume 6611/2011, 411-423