Showing 1–6 of 6 results for author: Arnob, S Y

  1. arXiv:2212.14405  [pdf, other]

    cs.LG

    Offline Policy Optimization in RL with Variance Regularization

    Authors: Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Animesh Garg, Zhaoran Wang, Lihong Li, Doina Precup

    Abstract: Learning policies from fixed offline datasets is a key challenge to scale up reinforcement learning (RL) algorithms towards practical applications. This is often because off-policy RL algorithms suffer from distributional shift, due to mismatch between dataset and the target policy, leading to high variance and over-estimation of value functions. In this work, we propose variance regularization fo…

    Submitted 29 December, 2022; originally announced December 2022.

    Comments: Old Draft, Offline RL Workshop, NeurIPS'20

  2. arXiv:2212.13835  [pdf, other]

    cs.LG

    Representation Learning in Deep RL via Discrete Information Bottleneck

    Authors: Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb

    Abstract: Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in th…

    Submitted 30 May, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: AISTATS 2023

  3. arXiv:2112.15579  [pdf, other]

    cs.LG

    Single-Shot Pruning for Offline Reinforcement Learning

    Authors: Samin Yeasar Arnob, Riyasat Ohib, Sergey Plis, Doina Precup

    Abstract: Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems. Large neural networks employed in the framework are traditionally associated with better generalization capabilities, but their increased size entails the drawbacks of extensive training duration, substantial hardware resources, and longer inference times. One way to tackle this problem is to prune ne…

    Submitted 31 December, 2021; originally announced December 2021.

  4. arXiv:2112.15578  [pdf, other]

    cs.LG cs.AI

    Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning

    Authors: Samin Yeasar Arnob, Riashat Islam, Doina Precup

    Abstract: We hypothesize that empirically studying the sample complexity of offline reinforcement learning (RL) is crucial for the practical applications of RL in the real world. Several recent works have demonstrated the ability to learn policies directly from offline data. In this work, we ask the question of the dependency on the number of samples for learning from offline data. Our objective is to empha…

    Submitted 31 December, 2021; originally announced December 2021.

  5. arXiv:2005.01138  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Off-Policy Adversarial Inverse Reinforcement Learning

    Authors: Samin Yeasar Arnob

    Abstract: Adversarial Imitation Learning (AIL) is a class of algorithms in reinforcement learning (RL), which tries to imitate an expert without taking any reward from the environment and does not provide expert behavior directly to the policy training. Rather, an agent learns a policy distribution that minimizes the difference from expert behavior in an adversarial setting. Adversarial Inverse Reinforcemen…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: 15 pages, 10 figures

  6. arXiv:1912.05109  [pdf, other]

    cs.LG stat.ML

    Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning

    Authors: Riashat Islam, Raihan Seraj, Samin Yeasar Arnob, Doina Precup

    Abstract: We study the problem of off-policy critic evaluation in several variants of value-based off-policy actor-critic algorithms. Off-policy actor-critic algorithms require an off-policy critic evaluation step, to estimate the value of the new policy after every policy gradient update. Despite enormous success of off-policy policy gradients on control tasks, existing general methods suffer from high var…

    Submitted 10 December, 2019; originally announced December 2019.

    Comments: In Submission; Appeared at NeurIPS 2019 Workshop on Safety and Robustness in Decision Making