Showing 1–17 of 17 results for author: Ajay, A

Searching in archive cs.
  1. arXiv:2406.00681

    cs.LG

    Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient

    Authors: Zechu Li, Rickmer Krohn, Tao Chen, Anurag Ajay, Pulkit Agrawal, Georgia Chalvatzaki

    Abstract: Deep reinforcement learning (RL) algorithms typically parameterize the policy as a deep network that outputs either a deterministic action or a stochastic one modeled as a Gaussian distribution, hence restricting learning to a single behavioral mode. Meanwhile, diffusion models emerged as a powerful framework for multimodal learning. However, the use of diffusion policies in online RL is hindered…

    Submitted 2 June, 2024; originally announced June 2024.

  2. arXiv:2405.17247

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2309.08587

    cs.LG cs.AI cs.RO

    Compositional Foundation Models for Hierarchical Planning

    Authors: Anurag Ajay, Seungwook Han, Yilun Du, Shuang Li, Abhi Gupta, Tommi Jaakkola, Josh Tenenbaum, Leslie Kaelbling, Akash Srivastava, Pulkit Agrawal

    Abstract: To make effective decisions in novel environments with long-horizon goals, it is crucial to engage in hierarchical reasoning across spatial and temporal scales. This entails planning abstract subgoal sequences, visually reasoning about the underlying plans, and executing actions in accordance with the devised plan through visual-motor control. We propose Compositional Foundation Models for Hierarc…

    Submitted 21 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: Website: https://hierarchical-planning-foundation-model.github.io/

  4. arXiv:2308.11780

    cs.LG cs.CL

    Few-shot Anomaly Detection in Text with Deviation Learning

    Authors: Anindya Sundar Das, Aravind Ajay, Sriparna Saha, Monowar Bhuyan

    Abstract: Most current methods for detecting anomalies in text concentrate on constructing models solely relying on unlabeled data. These models operate on the presumption that no labeled anomalous examples are available, which prevents them from utilizing prior knowledge of anomalies that are typically present in small numbers in many real-world applications. Furthermore, these models prioritize learning f…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: Accepted in ICONIP 2023

  5. arXiv:2307.12983

    cs.LG cs.AI cs.RO

    Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

    Authors: Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal

    Abstract: Reinforcement learning is time-consuming for complex tasks due to the need for large amounts of training data. Recent advances in GPU-based simulation, such as Isaac Gym, have sped up data collection thousands of times on a commodity GPU. Most prior works used on-policy methods like PPO due to their simplicity and ease of scaling. Off-policy methods are more data efficient but challenging to scale…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: Accepted by ICML 2023

  6. arXiv:2302.13934

    cs.LG stat.ML

    Statistical Learning under Heterogeneous Distribution Shift

    Authors: Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

    Abstract: This paper studies the prediction of a target $\mathbf{z}$ from a pair of random variables $(\mathbf{x},\mathbf{y})$, where the ground-truth predictor is additive $\mathbb{E}[\mathbf{z} \mid \mathbf{x},\mathbf{y}] = f_\star(\mathbf{x}) +g_{\star}(\mathbf{y})$. We study the performance of empirical risk minimization (ERM) over functions $f+g$, $f \in F$ and $g \in G$, fit on a given training distri…

    Submitted 27 October, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  7. arXiv:2211.15657

    cs.LG cs.AI

    Is Conditional Generative Modeling all you need for Decision-Making?

    Authors: Anurag Ajay, Yilun Du, Abhi Gupta, Joshua Tenenbaum, Tommi Jaakkola, Pulkit Agrawal

    Abstract: Recent improvements in conditional generative modeling have made it possible to generate high-quality images from language descriptions alone. We investigate whether these methods can directly address the problem of sequential decision-making. We view decision-making not through the lens of reinforcement learning (RL), but rather through conditional generative modeling. To our surprise, we find th…

    Submitted 10 July, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Website: https://anuragajay.github.io/decision-diffuser/

  8. arXiv:2210.03104

    cs.LG cs.AI

    Distributionally Adaptive Meta Reinforcement Learning

    Authors: Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal

    Abstract: Meta-reinforcement learning algorithms provide a data-driven way to acquire policies that quickly adapt to many tasks with varying rewards or dynamics functions. However, learned meta-policies are often effective only on the exact task distribution on which they were trained and struggle in the presence of distribution shift of test-time rewards or transition dynamics. In this work, we develop a f…

    Submitted 10 July, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  9. arXiv:2207.02200

    cs.LG cs.AI stat.ML

    Offline RL Policies Should be Trained to be Adaptive

    Authors: Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine

    Abstract: Offline RL algorithms must account for the fact that the dataset they are provided may leave many facets of the environment unknown. The most common way to approach this challenge is to employ pessimistic or conservative methods, which avoid behaviors that are too dissimilar from those in the training dataset. However, relying exclusively on conservatism has drawbacks: performance is sensitive to…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: ICML 2022 (long talk)

  10. arXiv:2206.04672

    cs.LG stat.ML

    Overcoming the Spectral Bias of Neural Value Approximation

    Authors: Ge Yang, Anurag Ajay, Pulkit Agrawal

    Abstract: Value approximation using deep neural networks is at the heart of off-policy deep reinforcement learning, and is often the primary module that provides learning signals to the rest of the algorithm. While multi-layer perceptron networks are universal function approximators, recent works in neural kernel regression suggest the presence of a spectral bias, where fitting high-frequency components of…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: Code and analysis available at https://geyang.github.io/ffn . First two authors contributed equally

  11. arXiv:2010.13611

    cs.LG

    OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

    Authors: Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum

    Abstract: Reinforcement learning (RL) has achieved impressive performance in a variety of online settings in which an agent's ability to query the environment for transitions and rewards is effectively unlimited. However, in many practical applications, the situation is reversed: an agent may have access to large amounts of undirected offline experience data, while access to the online environment is severe…

    Submitted 4 May, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: https://sites.google.com/view/opal-iclr

  12. arXiv:2009.03994

    cs.RO

    Long-Horizon Prediction and Uncertainty Propagation with Residual Point Contact Learners

    Authors: Nima Fazeli, Anurag Ajay, Alberto Rodriguez

    Abstract: The ability to simulate and predict the outcome of contacts is paramount to the successful execution of many robotic tasks. Simulators are powerful tools for the design of robots and their behaviors, yet the discrepancy between their predictions and observed data limits their usability. In this paper, we propose a self-supervised approach to learning residual models for rigid-body simulators that e…

    Submitted 8 September, 2020; originally announced September 2020.

    Comments: 7 pages, 4 figures, ICRA 2020 submission accepted

  13. arXiv:1904.06580

    cs.RO cs.LG

    Combining Physical Simulators and Object-Based Networks for Control

    Authors: Anurag Ajay, Maria Bauza, Jiajun Wu, Nima Fazeli, Joshua B. Tenenbaum, Alberto Rodriguez, Leslie P. Kaelbling

    Abstract: Physics engines play an important role in robot planning and control; however, many real-world control problems involve complex contact dynamics that cannot be characterized analytically. Most physics engines therefore employ approximations that lead to a loss in precision. In this paper, we propose a hybrid dynamics model, simulator-augmented interaction networks (SAIN), combining a physics eng…

    Submitted 13 April, 2019; originally announced April 2019.

    Comments: ICRA 2019; Project page: http://sain.csail.mit.edu

  14. arXiv:1808.03246

    cs.RO cs.LG

    Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing

    Authors: Anurag Ajay, Jiajun Wu, Nima Fazeli, Maria Bauza, Leslie P. Kaelbling, Joshua B. Tenenbaum, Alberto Rodriguez

    Abstract: An efficient, generalizable physical simulator with universal uncertainty estimates has wide applications in robot state estimation, planning, and control. In this paper, we build such a simulator for two scenarios, planar pushing and ball bouncing, by augmenting an analytical rigid-body simulator with a neural network that learns to model uncertainty as residuals. Combining symbolic, deterministi…

    Submitted 9 August, 2018; originally announced August 2018.

    Comments: IROS 2018

  15. arXiv:1610.01112

    cs.LG cs.RO

    Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States

    Authors: William Montgomery, Anurag Ajay, Chelsea Finn, Pieter Abbeel, Sergey Levine

    Abstract: Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without requiring extensive manual engineering. However, robotic skill learning methods typically make one of several trade-offs to enable practical real-world learning, such as requiring manually designed policy or value function representations, initialization from human-provided demonstra…

    Submitted 6 October, 2016; v1 submitted 4 October, 2016; originally announced October 2016.

  16. arXiv:1605.07148

    cs.LG cs.AI

    Backprop KF: Learning Discriminative Deterministic State Estimators

    Authors: Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

    Abstract: Generative state estimators based on probabilistic filters and smoothers are one of the most popular classes of state estimators for robots and autonomous vehicles. However, generative models have limited capacity to handle rich sensory observations, such as camera images, since they must model the entire distribution over sensor readings. Discriminative models do not suffer from this limitation,…

    Submitted 30 September, 2017; v1 submitted 23 May, 2016; originally announced May 2016.

    Comments: NIPS 2016

  17. arXiv:1303.3605

    cs.RO cs.CV cs.LG

    A survey on sensing methods and feature extraction algorithms for SLAM problem

    Authors: Adheen Ajay, D. Venkataraman

    Abstract: This paper is a survey work for a bigger project for designing a Visual SLAM robot to generate a 3D dense map of an unknown unstructured environment. A lot of factors have to be considered while designing a SLAM robot. The sensing method of the SLAM robot should be determined by considering the kind of environment to be modeled. Similarly, the type of environment determines the suitable feature extractio…

    Submitted 14 March, 2013; originally announced March 2013.

    Comments: 5 pages, 1 figure, 2 tables