
Showing 1–10 of 10 results for author: Clavera, I

Searching in archive cs.
  1. arXiv:2010.13303  [pdf, other]

    cs.LG

    Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

    Authors: Younggyo Seo, Kimin Lee, Ignasi Clavera, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel

    Abstract: Model-based reinforcement learning (RL) has shown great potential in various control tasks in terms of both sample-efficiency and final performance. However, learning a generalizable dynamics model robust to changes in dynamics remains a challenge since the target transition dynamics follow a multi-modal distribution. In this paper, we present a new model-based RL algorithm, coined trajectory-wise…

    Submitted 25 October, 2020; originally announced October 2020.

    Comments: Accepted at NeurIPS 2020. First two authors contributed equally. Website: https://sites.google.com/view/trajectory-mcl Code: https://github.com/younggyoseo/trajectory_mcl

  2. arXiv:2005.08114  [pdf, other]

    cs.CV cs.AI stat.ML

    Mutual Information Maximization for Robust Plannable Representations

    Authors: Yiming Ding, Ignasi Clavera, Pieter Abbeel

    Abstract: Extending the capabilities of robotics to real-world complex, unstructured environments requires the need of developing better perception systems while maintaining low sample complexity. When dealing with high-dimensional state spaces, current methods are either model-free or model-based based on reconstruction objectives. The sample inefficiency of the former constitutes a major barrier for apply…

    Submitted 16 May, 2020; originally announced May 2020.

    Comments: Accepted at NeurIPS 2019 Workshop on Robot Learning: Control and Interaction in the Real World

  3. arXiv:2005.08068  [pdf, other]

    cs.LG cs.AI stat.ML

    Model-Augmented Actor-Critic: Backpropagating through Paths

    Authors: Ignasi Clavera, Violet Fu, Pieter Abbeel

    Abstract: Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator to augment the data for policy optimization or value function learning. In this paper, we show how to make more effective use of the model by exploiting its differentiability. We construct a policy optimization algorithm that uses the pathwise derivative of the learned model and policy acros…

    Submitted 16 May, 2020; originally announced May 2020.

    Comments: Accepted paper at ICLR 2020

  4. arXiv:1910.12453  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Asynchronous Methods for Model-Based Reinforcement Learning

    Authors: Yunzhi Zhang, Ignasi Clavera, Boren Tsai, Pieter Abbeel

    Abstract: Significant progress has been made in the area of model-based reinforcement learning. State-of-the-art algorithms are now able to match the asymptotic performance of model-free methods while being significantly more data efficient. However, this success has come at a price: state-of-the-art model-based methods require significant computation interleaved with data collection, resulting in run times…

    Submitted 28 October, 2019; originally announced October 2019.

    Comments: 10 pages, CoRL 2019

  5. arXiv:1907.02057  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Benchmarking Model-Based Reinforcement Learning

    Authors: Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba

    Abstract: Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL. However, research in model-based RL has not been very standardized. It is fairly common for authors to experiment with self-designed environments, and there are several separate lines of research, which are sometimes closed-sourced or not reproducible. Acco…

    Submitted 3 July, 2019; originally announced July 2019.

    Comments: 8 main pages, 8 figures; 14 appendix pages, 25 figures

  6. arXiv:1906.05862  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Sub-policy Adaptation for Hierarchical Reinforcement Learning

    Authors: Alexander C. Li, Carlos Florensa, Ignasi Clavera, Pieter Abbeel

    Abstract: Hierarchical reinforcement learning is a promising approach to tackle long-horizon decision-making problems with sparse rewards. Unfortunately, most methods still decouple the lower-level skill acquisition process and the training of a higher level that controls the skills in a new task. Leaving the skills fixed can lead to significant sub-optimality in the transfer setting. In this work, we propo…

    Submitted 13 May, 2020; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: ICLR 2020

  7. arXiv:1810.06784  [pdf, other]

    cs.LG stat.ML

    ProMP: Proximal Meta-Policy Search

    Authors: Jonas Rothfuss, Dennis Lee, Ignasi Clavera, Tamim Asfour, Pieter Abbeel

    Abstract: Credit assignment in Meta-reinforcement learning (Meta-RL) is still poorly understood. Existing methods either neglect credit assignment to pre-adaptation behavior or implement it naively. This leads to poor sample-efficiency during meta-training as well as ineffective task identification strategies. This paper provides a theoretical analysis of credit assignment in gradient-based Meta-RL. Buildin…

    Submitted 11 February, 2022; v1 submitted 15 October, 2018; originally announced October 2018.

    Comments: The first three authors contributed equally. Published at ICLR 2019

  8. arXiv:1809.05214  [pdf, other]

    cs.LG cs.AI stat.ML

    Model-Based Reinforcement Learning via Meta-Policy Optimization

    Authors: Ignasi Clavera, Jonas Rothfuss, John Schulman, Yasuhiro Fujita, Tamim Asfour, Pieter Abbeel

    Abstract: Model-based reinforcement learning approaches carry the promise of being data efficient. However, due to challenges in learning dynamics models that sufficiently match the real-world dynamics, they struggle to achieve the same asymptotic performance as model-free methods. We propose Model-Based Meta-Policy-Optimization (MB-MPO), an approach that foregoes the strong reliance on accurate learned dyn…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: First 2 authors contributed equally. Accepted for Conference on Robot Learning (CoRL)

  9. arXiv:1803.11347  [pdf, other]

    cs.LG cs.RO stat.ML

    Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning

    Authors: Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn

    Abstract: Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time. Given that it is impractical to train separate policies to accommodate all situations the agent may see in th…

    Submitted 27 February, 2019; v1 submitted 30 March, 2018; originally announced March 2018.

    Comments: First 2 authors contributed equally. Website: https://sites.google.com/berkeley.edu/metaadaptivecontrol

  10. arXiv:1802.10592  [pdf, other]

    cs.LG cs.AI cs.RO

    Model-Ensemble Trust-Region Policy Optimization

    Authors: Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel

    Abstract: Model-free reinforcement learning (RL) methods are succeeding in a growing number of tasks, aided by recent advances in deep learning. However, they tend to suffer from high sample complexity, which hinders their use in real-world domains. Alternatively, model-based reinforcement learning promises to reduce sample complexity, but tends to require careful tuning and to date have succeeded mainly in…

    Submitted 5 October, 2018; v1 submitted 28 February, 2018; originally announced February 2018.