
Showing 1–22 of 22 results for author: Andrychowicz, M

Searching in archive cs.
  1. arXiv:2306.06079  [pdf, other]

    physics.ao-ph cs.LG

    Deep Learning for Day Forecasts from Sparse Observations

    Authors: Marcin Andrychowicz, Lasse Espeholt, Di Li, Samier Merchant, Alexander Merose, Fred Zyda, Shreya Agrawal, Nal Kalchbrenner

    Abstract: Deep neural networks offer an alternative paradigm for modeling weather conditions. The ability of neural models to make a prediction in less than a second once the data is available and to do so with very high temporal and spatial resolution, and the ability to learn directly from atmospheric observations, are just some of these models' unique advantages. Neural models trained using atmospheric o…

    Submitted 6 July, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  2. arXiv:2111.15382  [pdf, other]

    cs.LG cs.AI

    Continuous Control With Ensemble Deep Deterministic Policy Gradients

    Authors: Piotr Januszewski, Mateusz Olko, Michał Królikowski, Jakub Świątkowski, Marcin Andrychowicz, Łukasz Kuciński, Piotr Miłoś

    Abstract: The growth of deep reinforcement learning (RL) has brought multiple exciting tools and methods to the field. This rapid expansion makes it important to understand the interplay between individual elements of the RL toolbox. We approach this task from an empirical perspective by conducting a study in the continuous control setting. We present multiple insights of fundamental nature, including: an a…

    Submitted 30 November, 2021; originally announced November 2021.

  3. arXiv:2108.07041  [pdf, other]

    cs.LG

    Implicitly Regularized RL with Implicit Q-Values

    Authors: Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist

    Abstract: The $Q$-function is a central quantity in many Reinforcement Learning (RL) algorithms for which RL agents behave following a (soft)-greedy policy w.r.t. to $Q$. It is a powerful tool that allows action selection without a model of the environment and even without explicitly modeling the policy. Yet, this scheme can only be used in discrete action tasks, with small numbers of actions, as the softma…

    Submitted 31 May, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: AISTATS 2022

  4. arXiv:2106.00672  [pdf, other]

    cs.LG cs.AI cs.NE

    What Matters for Adversarial Imitation Learning?

    Authors: Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz

    Abstract: Adversarial imitation learning has become a popular framework for imitation in continuous control. Over the years, several variations of its components were proposed to enhance the performance of the learned policies as well as the sample complexity of the algorithm. In practice, these choices are rarely tested all together in rigorous empirical studies. It is therefore difficult to discuss and un…

    Submitted 1 June, 2021; originally announced June 2021.

  5. arXiv:2105.12034  [pdf, other]

    cs.LG

    Hyperparameter Selection for Imitation Learning

    Authors: Leonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Lukasz Stafiniak, Sertan Girgin, Raphael Marinier, Nikola Momchev, Sabela Ramos, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin

    Abstract: We address the issue of tuning hyperparameters (HPs) for imitation learning algorithms in the context of continuous-control, when the underlying reward function of the demonstrating expert cannot be observed at any time. The vast literature in imitation learning mostly considers this reward function to be available for HP selection, but this is not a realistic setting. Indeed, would this reward fu…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: ICML 2021

  6. arXiv:2006.05990  [pdf, other]

    cs.LG stat.ML

    What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

    Authors: Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphael Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem

    Abstract: In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literatur…

    Submitted 10 June, 2020; originally announced June 2020.

  7. arXiv:1910.07113  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Solving Rubik's Cube with a Robot Hand

    Authors: OpenAI, Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron, Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, Jonas Schneider, Nikolas Tezak, Jerry Tworek, Peter Welinder, Lilian Weng, Qiming Yuan, Wojciech Zaremba, Lei Zhang

    Abstract: We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by two key components: a novel algorithm, which we call automatic domain randomization (ADR) and a robot platform built for machine learning. ADR automatically generates a distribution over randomized environments of ever-increasing di…

    Submitted 15 October, 2019; originally announced October 2019.

  8. arXiv:1808.00177  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Learning Dexterous In-Hand Manipulation

    Authors: OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng, Wojciech Zaremba

    Abstract: We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system like friction coefficients and an object's appearance. Our policies transfer to the physical robot despite…

    Submitted 18 January, 2019; v1 submitted 1 August, 2018; originally announced August 2018.

    Comments: Making OpenAI the first author. We wish this paper to be cited as "Learning Dexterous In-Hand Manipulation" by OpenAI et al. We are replicating the approach from the physics community: arXiv:1812.06489

  9. arXiv:1802.09464  [pdf, other]

    cs.LG cs.AI cs.RO

    Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

    Authors: Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba

    Abstract: The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware. The tasks include pushing, sliding and pick & place with a Fetch robotic arm as well as in-hand object manipulation with a Shadow Dexterous Hand. All tasks have sparse binary rewards and follow a Mu…

    Submitted 10 March, 2018; v1 submitted 26 February, 2018; originally announced February 2018.

  10. arXiv:1710.06542  [pdf, other]

    cs.RO cs.AI cs.LG

    Asymmetric Actor Critic for Image-Based Robot Learning

    Authors: Lerrel Pinto, Marcin Andrychowicz, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

    Abstract: Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision making domains. However, Robotics poses many challenges for RL, most notably training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring p…

    Submitted 17 October, 2017; originally announced October 2017.

    Comments: Videos of experiments can be found at http://www.goo.gl/b57WTs

  11. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

    Authors: Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

    Abstract: Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real world counterparts…

    Submitted 2 March, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

  12. arXiv:1710.06425  [pdf, other]

    cs.RO cs.LG

    Domain Randomization and Generative Models for Robotic Grasping

    Authors: Joshua Tobin, Lukas Biewald, Rocky Duan, Marcin Andrychowicz, Ankur Handa, Vikash Kumar, Bob McGrew, Jonas Schneider, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

    Abstract: Deep learning-based robotic grasping has made significant progress thanks to algorithmic improvements and increased data availability. However, state-of-the-art models are often trained on as few as hundreds or thousands of unique object instances, and as a result generalization can be a challenge. In this work, we explore a novel data generation pipeline for training a deep neural network to pe…

    Submitted 3 April, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

    Comments: 8 pages, 11 figures. Submitted to 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018)

  13. arXiv:1709.10089  [pdf, other]

    cs.LG cs.AI cs.NE cs.RO

    Overcoming Exploration in Reinforcement Learning with Demonstrations

    Authors: Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

    Abstract: Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL). Many tasks are natural to specify with a sparse reward, and manually shaping a reward function can result in suboptimal performance. However, finding a non-zero reward is exponentially more difficult with increasing task horizon or action dimensionality. This puts many real-world tasks out…

    Submitted 25 February, 2018; v1 submitted 28 September, 2017; originally announced September 2017.

    Comments: 8 pages, ICRA 2018

  14. arXiv:1707.01495  [pdf, other]

    cs.LG cs.AI cs.NE cs.RO

    Hindsight Experience Replay

    Authors: Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba

    Abstract: Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit…

    Submitted 23 February, 2018; v1 submitted 5 July, 2017; originally announced July 2017.

  15. arXiv:1706.01905  [pdf, other]

    cs.LG cs.AI cs.NE cs.RO stat.ML

    Parameter Space Noise for Exploration

    Authors: Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz

    Abstract: Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and requir…

    Submitted 31 January, 2018; v1 submitted 6 June, 2017; originally announced June 2017.

    Comments: Updated to camera-ready ICLR submission

  16. arXiv:1703.07326  [pdf, other]

    cs.AI cs.LG cs.NE cs.RO

    One-Shot Imitation Learning

    Authors: Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba

    Abstract: Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from what we desire: ideally, robots should be able to learn from very few demonstrations of any given task, and instantly generalize to new situations of the same task, without requiring task-specific engineer…

    Submitted 4 December, 2017; v1 submitted 21 March, 2017; originally announced March 2017.

  17. arXiv:1606.04474  [pdf, other]

    cs.NE cs.LG

    Learning to learn by gradient descent by gradient descent

    Authors: Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas

    Abstract: The move from hand-designed features to learned features in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand. In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way. Our learned algorithms…

    Submitted 30 November, 2016; v1 submitted 14 June, 2016; originally announced June 2016.

  18. arXiv:1602.03218  [pdf, ps, other]

    cs.LG

    Learning Efficient Algorithms with Hierarchical Attentive Memory

    Authors: Marcin Andrychowicz, Karol Kurach

    Abstract: In this paper, we propose and investigate a novel memory architecture for neural networks called Hierarchical Attentive Memory (HAM). It is based on a binary tree with leaves corresponding to memory cells. This allows HAM to perform memory access in O(log n) complexity, which is a significant improvement over the standard attention mechanism that requires O(n) operations, where n is the size of th…

    Submitted 23 February, 2016; v1 submitted 9 February, 2016; originally announced February 2016.

    Comments: Added soft attention appendix

  19. arXiv:1511.06392  [pdf, other]

    cs.LG cs.NE

    Neural Random-Access Machines

    Authors: Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

    Abstract: In this paper, we propose and investigate a new neural network architecture called Neural Random Access Machine. It can manipulate and dereference pointers to an external variable-size random-access memory. The model is trained from pure input-output examples using backpropagation. We evaluate the new model on a number of simple algorithmic tasks whose solutions require pointer manipulation and…

    Submitted 9 February, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

    Comments: ICLR submission, 17 pages, 9 figures, 6 tables (with bibliography and appendix)

  20. arXiv:1405.1861  [pdf, other]

    cs.CR

    Modeling Bitcoin Contracts by Timed Automata

    Authors: Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, Łukasz Mazurek

    Abstract: Bitcoin is a peer-to-peer cryptographic currency system. Since its introduction in 2008, Bitcoin has gained noticeable popularity, mostly due to its following properties: (1) the transaction fees are very low, and (2) it is not controlled by any central authority, which in particular means that nobody can "print" the money to generate inflation. Moreover, the transaction syntax allows to create th…

    Submitted 27 June, 2014; v1 submitted 8 May, 2014; originally announced May 2014.

  21. arXiv:1312.3230  [pdf, ps, other]

    cs.CR

    How to deal with malleability of BitCoin transactions

    Authors: Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, Łukasz Mazurek

    Abstract: BitCoin transactions are malleable in a sense that given a transaction an adversary can easily construct an equivalent transaction which has a different hash. This can pose a serious problem in some BitCoin distributed contracts in which changing a transaction's hash may result in the protocol disruption and a financial loss. The problem mostly concerns protocols, which use a "refund" transaction…

    Submitted 11 December, 2013; originally announced December 2013.

  22. arXiv:1209.4820  [pdf, ps, other]

    cs.CR

    Efficient Refreshing Protocol for Leakage-Resilient Storage Based on the Inner-Product Extractor

    Authors: Marcin Andrychowicz

    Abstract: A recent trend in cryptography is to protect data and computation against various side-channel attacks. Dziembowski and Faust (TCC 2012) have proposed a general way to protect arbitrary circuits against any continual leakage assuming that: (i) the memory is divided into the parts, which leaks independently (ii) the leakage in each observation is bounded (iii) the circuit has an access to a leak-fr…

    Submitted 21 September, 2012; originally announced September 2012.