Showing 1–21 of 21 results for author: Fakoor, R

  1. arXiv:2406.17768

    cs.RO cs.AI cs.LG

    EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data

    Authors: Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor

    Abstract: Most reinforcement learning (RL) methods focus on learning optimal policies over low-level action spaces. While these methods can perform well in their training environments, they lack the flexibility to transfer to new tasks. Instead, RL agents that can act over useful, temporally extended skills rather than low-level actions can learn new tasks more easily. Prior work in skill-based RL either re…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 22 pages, 13 figures

  2. arXiv:2406.01838

    cs.LG cs.AI

    Learning the Target Network in Function Space

    Authors: Kavosh Asadi, Yao Liu, Shoham Sabach, Ming Yin, Rasool Fakoor

    Abstract: We focus on the task of learning the value function in the reinforcement learning (RL) setting. This task is often solved by updating a pair of online and target networks while ensuring that the parameters of these two networks are equivalent. We propose Lookahead-Replicate (LR), a new value-function approximation algorithm that is agnostic to this parameter-space equivalence. Instead, the LR algo…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted to International Conference on Machine Learning (ICML24)

  3. arXiv:2310.05905

    cs.LG cs.AI cs.RO

    TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models

    Authors: Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor

    Abstract: The full potential of large pretrained models remains largely untapped in control domains like robotics. This is mainly because of the scarcity of data and the computational challenges associated with training or fine-tuning these large models for such applications. Prior work mainly emphasizes either effective pretraining of large models for decision-making or single-task adaptation. But real-wor…

    Submitted 8 March, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Published at ICLR 2024

  4. arXiv:2307.06328

    cs.LG cs.AI

    Budgeting Counterfactual for Offline RL

    Authors: Yao Liu, Pratik Chaudhari, Rasool Fakoor

    Abstract: The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action? These circumstances frequently give rise to extrapolation errors, which tend to accumulate exponentially with the problem horizon. Hence, it becomes crucial to acknowle…

    Submitted 21 May, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: Published at NeurIPS 2023

  5. arXiv:2306.17833

    cs.LG cs.AI

    Resetting the Optimizer in Deep RL: An Empirical Study

    Authors: Kavosh Asadi, Rasool Fakoor, Shoham Sabach

    Abstract: We focus on the task of approximating the optimal value function in deep reinforcement learning. This iterative process is comprised of solving a sequence of optimization problems where the loss function changes per iteration. The common approach to solving this sequence of problems is to employ modern variants of the stochastic gradient descent algorithm such as Adam. These optimizers maintain th…

    Submitted 14 November, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted at Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  6. arXiv:2306.17750

    cs.LG

    TD Convergence: An Optimization Perspective

    Authors: Kavosh Asadi, Shoham Sabach, Yao Liu, Omer Gottesman, Rasool Fakoor

    Abstract: We study the convergence behavior of the celebrated temporal-difference (TD) learning algorithm. By looking at the algorithm through the lens of optimization, we first argue that TD can be viewed as an iterative optimization algorithm where the function to be minimized changes per iteration. By carefully investigating the divergence displayed by TD on a classical counter example, we identify two f…

    Submitted 8 November, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted at Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  7. arXiv:2210.01422

    cs.LG

    Time-Varying Propensity Score to Bridge the Gap between the Past and Present

    Authors: Rasool Fakoor, Jonas Mueller, Zachary C. Lipton, Pratik Chaudhari, Alexander J. Smola

    Abstract: Real-world deployment of machine learning models is challenging because data evolves over time. While no model can work when data evolves in an arbitrary fashion, if there is some pattern to these changes, we might be able to design methods to address it. This paper addresses situations when data evolves gradually. We introduce a time-varying propensity score that can detect gradual shifts in the…

    Submitted 2 May, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: Published at ICLR 2024

  8. arXiv:2205.14495

    cs.LG

    Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges

    Authors: Massimo Caccia, Jonas Mueller, Taesup Kim, Laurent Charlin, Rasool Fakoor

    Abstract: Continual learning (CL) enables the development of models and agents that learn from a sequence of tasks while addressing the limitations of standard deep learning approaches, such as catastrophic forgetting. In this work, we investigate the factors that contribute to the performance differences between task-agnostic CL and multi-task (MTL) agents. We pose two hypotheses: (1) task-agnostic methods…

    Submitted 17 May, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

    Journal ref: CoLLAs 2023

  9. arXiv:2112.05848

    cs.LG cs.AI

    Faster Deep Reinforcement Learning with Slower Online Network

    Authors: Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

    Abstract: Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping. In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with u…

    Submitted 17 April, 2023; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: Published at the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

  10. arXiv:2103.00083

    stat.ML cs.LG

    Flexible Model Aggregation for Quantile Regression

    Authors: Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani

    Abstract: Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive. For instance, epidemiological forecasts, cost estimates, and revenue predictions all benefit from being able to quantify the range of possible values accurately. As such, many models have been developed for…

    Submitted 15 April, 2023; v1 submitted 26 February, 2021; originally announced March 2021.

    Comments: Accepted at JMLR 2023

  11. arXiv:2102.09225

    cs.LG stat.ML

    Continuous Doubly Constrained Batch Reinforcement Learning

    Authors: Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola

    Abstract: Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration. We propose an algorithm for batch RL, where effective policies are learned using only a fixed offline dataset instead of online interactions with the environment. The limited data in batch RL produc…

    Submitted 6 December, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2021 conference paper

  12. arXiv:2006.15199

    cs.LG stat.ML

    DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning

    Authors: Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

    Abstract: This paper prescribes a suite of techniques for off-policy Reinforcement Learning (RL) that simplify the training process and reduce the sample complexity. First, we show that simple Deterministic Policy Gradient works remarkably well as long as the overestimation bias is controlled. This is in contrast to existing literature which creates sophisticated off-policy techniques. Second, we pinpoint trai…

    Submitted 26 June, 2020; originally announced June 2020.

  13. arXiv:2006.14284

    cs.LG stat.ML

    Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation

    Authors: Rasool Fakoor, Jonas Mueller, Nick Erickson, Pratik Chaudhari, Alexander J. Smola

    Abstract: Automated machine learning (AutoML) can produce complex model ensembles by stacking, bagging, and boosting many individual models like trees, deep networks, and nearest neighbor estimators. While highly accurate, the resulting predictors are large, slow, and opaque as compared to their constituents. To improve the deployment of AutoML on tabular data, we propose FAST-DAD to distill arbitrarily com…

    Submitted 25 June, 2020; originally announced June 2020.

    Journal ref: NeurIPS 2020

  14. arXiv:2004.02441

    cs.LG stat.ML

    TraDE: Transformers for Density Estimation

    Authors: Rasool Fakoor, Pratik Chaudhari, Jonas Mueller, Alexander J. Smola

    Abstract: We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data. Our model is trained using a penalized maximum likelihood objective, which ensures that samples from the density estimate resemble the training data distribution. The use of self-attention means that the model need not retain conditional sufficient statistics durin…

    Submitted 14 October, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  15. arXiv:1910.00125

    cs.LG stat.ML

    Meta-Q-Learning

    Authors: Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

    Abstract: This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL). MQL builds upon three simple ideas. First, we show that Q-learning is competitive with state-of-the-art meta-RL algorithms if given access to a context variable that is a representation of the past trajectory. Second, a multi-task objective to maximize the average reward across the tr…

    Submitted 4 April, 2020; v1 submitted 30 September, 2019; originally announced October 2019.

    Comments: ICLR 2020 conference paper

  16. arXiv:1905.01756

    cs.LG stat.ML

    P3O: Policy-on Policy-off Policy Optimization

    Authors: Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

    Abstract: On-policy reinforcement learning (RL) algorithms have high sample complexity while off-policy algorithms are difficult to tune. Merging the two holds the promise to develop efficient algorithms that generalize across diverse environments. It is however challenging in practice to find suitable hyper-parameters that govern this trade off. This paper develops a simple algorithm named P3O that interle…

    Submitted 15 July, 2019; v1 submitted 5 May, 2019; originally announced May 2019.

    Comments: UAI 2019 conference paper. Code: https://github.com/rasoolfa/P3O

  17. arXiv:1810.12464

    cs.LG stat.ML

    Differentiable Greedy Networks

    Authors: Thomas Powers, Rasool Fakoor, Siamak Shakeri, Abhinav Sethy, Amanjit Kainth, Abdel-rahman Mohamed, Ruhi Sarikaya

    Abstract: Optimal selection of a subset of items from a given set is a hard problem that requires combinatorial optimization. In this paper, we propose a subset selection algorithm that is trainable with gradient-based methods yet achieves near-optimal performance via submodular optimization. We focus on the task of identifying a relevant set of sentences for claim verification in the context of the FEVER t…

    Submitted 29 October, 2018; originally announced October 2018.

    Comments: Work in progress and under review

  18. arXiv:1810.00679

    cs.IR cs.CL cs.LG stat.ML

    Direct optimization of F-measure for retrieval-based personal question answering

    Authors: Rasool Fakoor, Amanjit Kainth, Siamak Shakeri, Christopher Winestock, Abdel-rahman Mohamed, Ruhi Sarikaya

    Abstract: Recent advances in spoken language technologies and the introduction of many customer facing products, have given rise to a wide customer reliance on smart personal assistants for many of their daily tasks. In this paper, we present a system to reduce users' cognitive load by extending personal assistants with long-term personal memory where users can store and retrieve by voice, arbitrary pieces…

    Submitted 27 September, 2018; originally announced October 2018.

    Comments: accepted at SLT2018

  19. arXiv:1802.05874

    cs.LG

    Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy

    Authors: Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar

    Abstract: For a speech-enhancement algorithm, it is highly desirable to simultaneously improve perceptual quality and recognition rate. Thanks to computational costs and model complexities, it is challenging to train a model that effectively optimizes both metrics at the same time. In this paper, we propose a method for speech enhancement that combines local and global contextual structures information thro…

    Submitted 16 February, 2018; originally announced February 2018.

    Comments: Published as a conference paper at ICASSP 2018

  20. arXiv:1711.10791

    cs.LG

    Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality

    Authors: Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar

    Abstract: Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range bound to specific levels of instantaneous input signal-to-noise ratios. In this paper, we present a new approach to improve the adaptivity of such algorithms enabling them to perform robustly across a wide range of input signal and noise types. Our meth…

    Submitted 27 July, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

    Comments: NIPS 2017, Machine Learning for Audio Signal Processing workshop

  21. arXiv:1611.02261

    cs.CV cs.LG cs.NE

    Memory-augmented Attention Modelling for Videos

    Authors: Rasool Fakoor, Abdel-rahman Mohamed, Margaret Mitchell, Sing Bing Kang, Pushmeet Kohli

    Abstract: We present a method to improve video description generation by modeling higher-order interactions between video frames and described concepts. By storing past visual attention in the video associated to previously generated words, the system is able to decide what to look at and describe in light of what it has already looked at and described. This enables not only more effective local attention,…

    Submitted 24 April, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

    Comments: Revised version, minor changes, add the link for the source codes