Showing 1–50 of 216 results for author: Aggarwal, V

  1. arXiv:2406.11481  [pdf, other]

    cs.LG cs.AI

    Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms

    Authors: Vaneet Aggarwal, Washim Uddin Mondal, Qinbo Bai

    Abstract: Reinforcement Learning (RL) serves as a versatile framework for sequential decision-making, finding applications across diverse domains such as robotics, autonomous driving, recommendation systems, supply chain optimization, biology, mechanics, and finance. The primary objective in these applications is to maximize the average reward. Real-world scenarios often necessitate adherence to specific co…

    Submitted 17 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.02042

  2. arXiv:2406.02844  [pdf, other]

    cs.IR cs.CL

    Item-Language Model for Conversational Recommendation

    Authors: Li Yang, Anushya Subbiah, Hardik Patel, Judith Yue Li, Yanwei Song, Reza Mirghaderi, Vikram Aggarwal

    Abstract: Large-language Models (LLMs) have been extremely successful at tasks like complex dialogue understanding, reasoning and coding due to their emergent abilities. These emergent abilities have been extended with multi-modality to include image, audio, and video capabilities. Recommender systems, on the other hand, have been critical for information seeking and item discovery needs. Recently, there ha…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 15 pages, 3 figures

  3. arXiv:2405.16386  [pdf, other]

    cs.LG cs.AI

    Variational Offline Multi-agent Skill Discovery

    Authors: Jiayu Chen, Bhargav Ganguly, Tian Lan, Vaneet Aggarwal

    Abstract: Skills are effective temporal abstractions established for sequential decision making tasks, which enable efficient hierarchical learning for long-horizon tasks and facilitate multi-task learning through their transferability. Despite extensive research, research gaps remain in multi-agent scenarios, particularly for automatically extracting subgroup coordination patterns in a multi-agent task. In…

    Submitted 25 May, 2024; originally announced May 2024.

  4. arXiv:2405.10624  [pdf, ps, other]

    cs.LG cs.AI

    Sample-Efficient Constrained Reinforcement Learning with General Parameterization

    Authors: Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: We consider a constrained Markov Decision Problem (CMDP) where the goal of an agent is to maximize the expected discounted sum of rewards over an infinite horizon while ensuring that the expected discounted sum of costs exceeds a certain threshold. Building on the idea of momentum-based acceleration, we develop the Primal-Dual Accelerated Natural Policy Gradient (PD-ANPG) algorithm that guarantees…

    Submitted 17 May, 2024; originally announced May 2024.

  5. arXiv:2405.10310  [pdf, other]

    cs.LG cs.AI cs.PF cs.RO stat.ML

    Stochastic Q-learning for Large Discrete Action Spaces

    Authors: Fares Fourati, Vaneet Aggarwal, Mohamed-Slim Alouini

    Abstract: In complex environments with large discrete action spaces, effective decision-making is critical in reinforcement learning (RL). Despite the widespread use of value-based RL approaches like Q-learning, they come with a computational burden, necessitating the maximization of a value function over all actions in each iteration. This burden becomes particularly challenging when addressing large-scale…

    Submitted 16 May, 2024; originally announced May 2024.

  6. arXiv:2405.05950  [pdf, other]

    cs.LG cs.AI cs.DM cs.MA stat.ML

    Federated Combinatorial Multi-Agent Multi-Armed Bandits

    Authors: Fares Fourati, Mohamed-Slim Alouini, Vaneet Aggarwal

    Abstract: This paper introduces a federated learning framework tailored for online combinatorial optimization with bandit feedback. In this setting, agents select subsets of arms, observe noisy rewards for these subsets without accessing individual arm information, and can cooperate and share information at specific intervals. Our framework transforms any offline resilient single-agent $(α-ε)$-approximation…

    Submitted 9 May, 2024; originally announced May 2024.

  7. arXiv:2405.01843  [pdf, ps, other]

    cs.LG cs.AI

    Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization

    Authors: Mudit Gaur, Amrit Singh Bedi, Di Wang, Vaneet Aggarwal

    Abstract: The current state-of-the-art theoretical analysis of Actor-Critic (AC) algorithms significantly lags in addressing the practical aspects of AC implementations. This crucial gap needs bridging to bring the analysis in line with practical implementations of AC. To address this, we advocate for considering the MMCLG criteria: \textbf{M}ulti-layer neural network parametrization for actor/critic, \text…

    Submitted 11 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted at ICML 2024. This is a revised version of arXiv:2306.10486, where we have gone from finite action space to continuous action space, from average iterate convergence to last iterate convergence and from $ε^{-4}$ to $ε^{-3}$ sample complexity

  8. arXiv:2405.00065  [pdf, other]

    math.OC cs.CC cs.LG stat.ML

    From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization

    Authors: Mohammad Pedramfar, Vaneet Aggarwal

    Abstract: This paper introduces the notion of upper linearizable/quadratizable functions, a class that extends concavity and DR-submodularity in various settings, including monotone and non-monotone cases over different convex sets. A general meta-algorithm is devised to convert algorithms for linear/quadratic maximization into ones that optimize upper quadratizable functions, offering a unified approach to…

    Submitted 13 May, 2024; v1 submitted 27 April, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.08621

  9. arXiv:2404.17225  [pdf, other]

    cs.CR cs.AI cs.RO

    Enhancing Privacy and Security of Autonomous UAV Navigation

    Authors: Vatsal Aggarwal, Arjun Ramesh Kaushik, Charanjit Jutla, Nalini Ratha

    Abstract: Autonomous Unmanned Aerial Vehicles (UAVs) have become essential tools in defense, law enforcement, disaster response, and product delivery. These autonomous navigation systems require a wireless communication network, and of late are deep learning based. In critical scenarios such as border protection or disaster response, ensuring the secure navigation of autonomous UAVs is paramount. But, these…

    Submitted 26 April, 2024; originally announced April 2024.

  10. arXiv:2404.15616  [pdf, other]

    quant-ph cs.AI

    A Bi-directional Quantum Search Algorithm

    Authors: Debanjan Konar, Zain Hafeez, Vaneet Aggarwal

    Abstract: Grover's search algorithms, including various partial Grover searches, experience scaling problems as the number of iterations rises with increased qubits, making implementation more computationally expensive. This paper combines Partial Grover's search algorithm and Bi-directional Search to create a fast Grover's quantum search algorithm, referred to as Bi-Directional Grover Search (BDGS). We inc…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 7 pages

  11. arXiv:2404.10518  [pdf, other]

    cs.CV

    MobileNetV4 -- Universal Models for the Mobile Ecosystem

    Authors: Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard

    Abstract: We present the latest generation of MobileNets, known as MobileNetV4 (MNv4), featuring universally efficient architecture designs for mobile devices. At its core, we introduce the Universal Inverted Bottleneck (UIB) search block, a unified and flexible structure that merges Inverted Bottleneck (IB), ConvNext, Feed Forward Network (FFN), and a novel Extra Depthwise (ExtraDW) variant. Alongside UIB,…

    Submitted 16 April, 2024; originally announced April 2024.

  12. arXiv:2404.08003  [pdf, other]

    cs.LG cs.DC cs.NI

    Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis

    Authors: Guangchen Lan, Dong-Jun Han, Abolfazl Hashemi, Vaneet Aggarwal, Christopher G. Brinton

    Abstract: To improve the efficiency of reinforcement learning, we propose a novel asynchronous federated reinforcement learning framework termed AFedPG, which constructs a global model through collaboration among $N$ agents using policy gradient (PG) updates. To handle the challenge of lagged policies in asynchronous settings, we design delay-adaptive lookahead and normalized update techniques that can effe…

    Submitted 14 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    ACM Class: I.2.6; I.2.11

  13. arXiv:2404.02108  [pdf, ps, other]

    cs.LG

    Variance-Reduced Policy Gradient Approaches for Infinite Horizon Average Reward Markov Decision Processes

    Authors: Swetha Ganesh, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: We present two Policy Gradient-based methods with general parameterization in the context of infinite horizon average reward Markov Decision Processes. The first approach employs Implicit Gradient Transport for variance reduction, ensuring an expected regret of the order $\tilde{\mathcal{O}}(T^{3/5})$. The second approach, rooted in Hessian-based techniques, ensures an expected regret of the order…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 34 pages

  14. arXiv:2403.11925  [pdf, other]

    cs.LG

    Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles

    Authors: Bhrij Patel, Wesley A. Suttle, Alec Koppel, Vaneet Aggarwal, Brian M. Sadler, Amrit Singh Bedi, Dinesh Manocha

    Abstract: In the context of average-reward reinforcement learning, the requirement for oracle knowledge of the mixing time, a measure of the duration a Markov chain under a fixed policy needs to achieve its stationary distribution, poses a significant challenge for the global convergence of policy gradient methods. This requirement is particularly problematic due to the difficulty and expense of estimating…

    Submitted 20 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: 26 Pages, 2 Figures

  15. arXiv:2403.10063  [pdf, other]

    cs.LG cs.AI cs.CC math.OC

    Unified Projection-Free Algorithms for Adversarial DR-Submodular Optimization

    Authors: Mohammad Pedramfar, Yididiya Y. Nadew, Christopher J. Quinn, Vaneet Aggarwal

    Abstract: This paper introduces unified projection-free Frank-Wolfe type algorithms for adversarial continuous DR-submodular optimization, spanning scenarios such as full information and (semi-)bandit feedback, monotone and non-monotone functions, different constraints, and types of stochastic queries. For every problem considered in the non-monotone setting, the proposed algorithms are either the first wit…

    Submitted 26 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: This paper is published in ICLR 2024. This version includes a correction for regret bounds in the full-information zeroth order feedback setting (see the footnote on page 1 for details)

  16. arXiv:2403.09940  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries

    Authors: Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal

    Abstract: Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision making policy without sharing raw trajectories. However, if a small fraction of these agents are adversarial, it can lead to catastrophic results. We propose a policy gradient based approach that is robust to adversarial agents which can send arbitrary values to the server. Under this setting, our res…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 27 pages, 6 figures

  17. arXiv:2403.07309  [pdf, other]

    cs.LG cs.AI cs.CY

    Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer

    Authors: Dipesh Tamboli, Jiayu Chen, Kiran Pranesh Jotheeswaran, Denny Yu, Vaneet Aggarwal

    Abstract: Sepsis, a life-threatening condition triggered by the body's exaggerated response to infection, demands urgent intervention to prevent severe complications. Existing machine learning methods for managing sepsis struggle in offline scenarios, exhibiting suboptimal performance with survival rates below 50%. This paper introduces the POSNEGDM -- ``Reinforcement Learning with Positive and Negative Dem…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted to IEEE Journal of Biomedical and Health Informatics, Mar 2024

  18. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  19. arXiv:2402.13777  [pdf, other]

    cs.LG cs.AI

    Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

    Authors: Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal

    Abstract: Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline…

    Submitted 25 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: We restructured the paper and added more discussion

  20. arXiv:2402.08790  [pdf, other]

    cs.LG q-bio.QM

    Improving Molecule Generation and Drug Discovery with a Knowledge-enhanced Generative Model

    Authors: Aditya Malusare, Vaneet Aggarwal

    Abstract: Recent advancements in generative models have established state-of-the-art benchmarks in generating molecules and novel drug candidates. Despite these successes, a significant gap persists between generative models and the utilization of extensive biomedical knowledge, often systematized within knowledge graphs, whose potential to inform and enhance generative processes has not been realized. In t…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 12 pages

  21. arXiv:2402.08621  [pdf, other]

    cs.LG math.OC stat.ML

    A Generalized Approach to Online Convex Optimization

    Authors: Mohammad Pedramfar, Vaneet Aggarwal

    Abstract: In this paper, we analyze the problem of online convex optimization in different settings. We show that any algorithm for online linear optimization with fully adaptive adversaries is an algorithm for online convex optimization. We also show that any such algorithm that requires full-information feedback may be transformed to an algorithm with semi-bandit feedback with comparable regret bound. We…

    Submitted 13 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  22. arXiv:2402.06901  [pdf, other]

    cs.NI

    Near-perfect Coverage Manifold Estimation in Cellular Networks via conditional GAN

    Authors: Washim Uddin Mondal, Veni Goyal, Satish V. Ukkusuri, Goutam Das, Di Wang, Mohamed-Slim Alouini, Vaneet Aggarwal

    Abstract: This paper presents a conditional generative adversarial network (cGAN) that translates base station location (BSL) information of any Region-of-Interest (RoI) to location-dependent coverage probability values within a subset of that region, called the region-of-evaluation (RoE). We train our network utilizing the BSL data of India, the USA, Germany, and Brazil. In comparison to the state-of-the-a…

    Submitted 10 February, 2024; originally announced February 2024.

    Journal ref: IEEE Networking Letters, 2024

  23. arXiv:2402.02042  [pdf, ps, other]

    cs.LG cs.AI

    Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm

    Authors: Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: This paper explores the realm of infinite horizon average reward Constrained Markov Decision Processes (CMDP). To the best of our knowledge, this work is the first to delve into the regret and constraint violation analysis of average reward CMDPs with a general policy parametrization. To address this challenge, we propose a primal dual based policy gradient algorithm that adeptly manages the const…

    Submitted 3 March, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: We fixed Lemma 6 in v2 which changed the final result

  24. arXiv:2312.08057  [pdf, other]

    cs.LG cs.AI math.CO math.OC stat.ML

    Combinatorial Stochastic-Greedy Bandit

    Authors: Fares Fourati, Christopher John Quinn, Mohamed-Slim Alouini, Vaneet Aggarwal

    Abstract: We propose a novel combinatorial stochastic-greedy bandit (SGB) algorithm for combinatorial multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time step $t\in [T]$ is observed. SGB adopts an optimized stochastic-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms. Unlike existin…

    Submitted 13 December, 2023; originally announced December 2023.

  25. arXiv:2312.01826  [pdf, other]

    cs.NI

    Terrain-based Coverage Manifold Estimation: Machine Learning, Stochastic Geometry, or Simulation?

    Authors: Ruibo Wang, Washim Uddin Mondal, Mustafa A. Kishk, Vaneet Aggarwal, Mohamed-Slim Alouini

    Abstract: Given the necessity of connecting the unconnected, covering blind spots has emerged as a critical task in the next-generation wireless communication network. A direct solution involves obtaining a coverage manifold that visually showcases network coverage performance at each position. Our goal is to devise different methods that minimize the absolute error between the estimated coverage manifold a…

    Submitted 11 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

  26. arXiv:2311.02333  [pdf, other]

    cs.LG q-bio.GN

    Understanding the Natural Language of DNA using Encoder-Decoder Foundation Models with Byte-level Precision

    Authors: Aditya Malusare, Harish Kothandaraman, Dipesh Tamboli, Nadia A. Lanman, Vaneet Aggarwal

    Abstract: This paper presents the Ensemble Nucleotide Byte-level Encoder-Decoder (ENBED) foundation model, analyzing DNA sequences at byte-level precision with an encoder-decoder Transformer architecture. ENBED uses a sub-quadratic implementation of attention to develop an efficient model capable of sequence-to-sequence transformations, generalizing previous genomic models with encoder-only or decoder-only…

    Submitted 13 February, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: 9 pages

  27. arXiv:2310.20007  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning

    Authors: Ahmadreza Moradipari, Mohammad Pedramfar, Modjtaba Shokrian Zini, Vaneet Aggarwal

    Abstract: In this paper, we prove the first Bayesian regret bounds for Thompson Sampling in reinforcement learning in a multitude of settings. We simplify the learning problem using a discrete set of surrogate environments, and present a refined analysis of the information ratio using posterior consistency. This leads to an upper bound of order $\widetilde{O}(H\sqrt{d_{l_1}T})$ in the time inhomogeneous rei…

    Submitted 6 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  28. arXiv:2310.19807  [pdf, other]

    cs.LG math.OC

    Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates

    Authors: Guangchen Lan, Han Wang, James Anderson, Christopher Brinton, Vaneet Aggarwal

    Abstract: Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of mul…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

    ACM Class: I.2.6

  29. arXiv:2310.11684  [pdf, other]

    cs.LG cs.AI quant-ph

    Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes

    Authors: Bhargav Ganguly, Yang Xu, Vaneet Aggarwal

    Abstract: This paper investigates the potential of quantum acceleration in addressing infinite horizon Markov Decision Processes (MDPs) to enhance average reward outcomes. We introduce an innovative quantum framework for the agent's engagement with an unknown MDP, extending the conventional interaction paradigm. Our approach involves the design of an optimism-driven tabular Reinforcement Learning algorithm…

    Submitted 28 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

  30. arXiv:2310.11677  [pdf, ps, other]

    cs.LG cs.AI

    Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes

    Authors: Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: We consider the problem of designing sample efficient learning algorithms for infinite horizon discounted reward Markov Decision Process. Specifically, we propose the Accelerated Natural Policy Gradient (ANPG) algorithm that utilizes an accelerated stochastic gradient descent process to obtain the natural policy gradient. ANPG achieves $\mathcal{O}({ε^{-2}})$ sample complexity and…

    Submitted 5 February, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Journal ref: AISTATS 2024

  31. arXiv:2310.07367  [pdf, ps, other]

    cs.LG

    Improved Analysis of Sparse Linear Regression in Local Differential Privacy Model

    Authors: Liyang Zhu, Meng Ding, Vaneet Aggarwal, Jinhui Xu, Di Wang

    Abstract: In this paper, we revisit the problem of sparse linear regression in the local differential privacy (LDP) model. Existing research in the non-interactive and sequentially local models has focused on obtaining the lower bounds for the case where the underlying parameter is $1$-sparse, and extending such bounds to the more general $k$-sparse case has proven to be challenging. Moreover, it is unclear…

    Submitted 11 October, 2023; originally announced October 2023.

  32. arXiv:2310.01515  [pdf, other]

    quant-ph cs.LG

    Tensor Ring Optimized Quantum-Enhanced Tensor Neural Networks

    Authors: Debanjan Konar, Dheeraj Peddireddy, Vaneet Aggarwal, Bijaya K. Panigrahi

    Abstract: Quantum machine learning researchers often rely on incorporating Tensor Networks (TN) into Deep Neural Networks (DNN) and variational optimization. However, the standard optimization techniques used for training the contracted trainable weights of each model layer suffer from the correlations and entanglement structure between the model parameters on classical implementations. To address this issu…

    Submitted 2 October, 2023; originally announced October 2023.

  33. arXiv:2309.12814  [pdf, other]

    cs.CV

    Domain Adaptive Few-Shot Open-Set Learning

    Authors: Debabrata Pal, Deeptej More, Sai Bhargav, Dipesh Tamboli, Vaneet Aggarwal, Biplab Banerjee

    Abstract: Few-shot learning has made impressive strides in addressing the crucial challenges of recognizing unknown samples from novel classes in target query sets and managing visual shifts between domains. However, existing techniques fall short when it comes to identifying target outliers under domain shifts by learning to reject pseudo-outliers from the source domain, resulting in an incomplete solution…

    Submitted 22 September, 2023; originally announced September 2023.

    Journal ref: ICCV 2023

  34. arXiv:2309.07230  [pdf, other]

    cs.SE

    ESRO: Experience Assisted Service Reliability against Outages

    Authors: Sarthak Chakraborty, Shubham Agarwal, Shaddy Garg, Abhimanyu Sethia, Udit Narayan Pandey, Videh Aggarwal, Shiv Saini

    Abstract: Modern cloud services are prone to failures due to their complex architecture, making diagnosis a critical process. Site Reliability Engineers (SREs) spend hours leveraging multiple sources of data, including the alerts, error logs, and domain expertise through past experiences to locate the root cause(s). These experiences are documented as natural language text in outage reports for previous out…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: Accepted to 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)

  35. arXiv:2309.01922  [pdf, ps, other]

    cs.LG cs.AI

    Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes

    Authors: Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: In this paper, we consider an infinite horizon average reward Markov Decision Process (MDP). Distinguishing itself from existing works within this context, our approach harnesses the power of the general policy gradient-based algorithm, liberating it from the constraints of assuming a linear MDP structure. We propose a policy gradient-based algorithm and show its global convergence property. We th…

    Submitted 2 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Journal ref: AAAI 2024

  36. arXiv:2308.14897  [pdf, other]

    cs.LG cs.AI cs.DC

    Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

    Authors: Hanhan Zhou, Tian Lan, Vaneet Aggarwal

    Abstract: Offline reinforcement learning aims to utilize datasets of previously gathered environment-action interaction records to learn a policy without access to the real environment. Recent work has shown that offline reinforcement learning can be formulated as a sequence modeling problem and solved via supervised learning with approaches such as decision transformer. While these sequence-based methods a…

    Submitted 28 August, 2023; originally announced August 2023.

  37. arXiv:2307.11629  [pdf, other]

    cs.LG cs.MA

    Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs

    Authors: Jiayu Chen, Jingdi Chen, Tian Lan, Vaneet Aggarwal

    Abstract: Covering skill (a.k.a., option) discovery has been developed to improve the exploration of RL in single-agent scenarios with sparse reward signals, through connecting the most distant states in the embedding space provided by the Fiedler vector of the state transition graph. Given that joint state space grows exponentially with the number of agents in multi-agent systems, existing researches still…

    Submitted 20 August, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: Accepted to NeurIPS 2022. arXiv admin note: substantial text overlap with arXiv:2201.08227

  38. arXiv:2307.03884  [pdf, other]

    quant-ph cs.LG

    Noisy Tensor Ring approximation for computing gradients of Variational Quantum Eigensolver for Combinatorial Optimization

    Authors: Dheeraj Peddireddy, Utkarsh Priyam, Vaneet Aggarwal

    Abstract: Variational Quantum algorithms, especially Quantum Approximate Optimization and Variational Quantum Eigensolver (VQE) have established their potential to provide computational advantage in the realm of combinatorial optimization. However, these algorithms suffer from classically intractable gradients limiting the scalability. This work addresses the scalability challenge for VQE by proposing a cla…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: 12 pages, 13 figures, preprint

  39. arXiv:2306.17054  [pdf, other]

    cs.NI

    Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning

    Authors: Chang-Lin Chen, Hanhan Zhou, Jiayu Chen, Mohammad Pedramfar, Vaneet Aggarwal, Tian Lan, Zheqing Zhu, Chi Zhou, Tim Gasser, Pol Mauri Ruiz, Vijay Menon, Neeraj Kumar, Hongbo Dong

    Abstract: This paper addresses the important need for advanced techniques in continuously allocating workloads on shared infrastructures in data centers, a problem arising due to the growing popularity and scale of cloud computing. It particularly emphasizes the scarcity of research ensuring guaranteed capacity in capacity reservations during large-scale failures. To tackle these issues, the paper presents…

    Submitted 29 June, 2023; originally announced June 2023.

  40. arXiv:2306.10486  [pdf, ps, other]

    cs.LG

    On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization

    Authors: Mudit Gaur, Amrit Singh Bedi, Di Wang, Vaneet Aggarwal

    Abstract: Actor-critic algorithms have shown remarkable success in solving state-of-the-art decision-making problems. However, despite their empirical effectiveness, their theoretical underpinnings remain relatively unexplored, especially with neural network parametrization. In this paper, we delve into the study of a natural actor-critic algorithm that utilizes neural networks to represent the critic. Our…

    Submitted 18 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2211.07675

    ACM Class: F.2.1

  41. arXiv:2306.05411  [pdf, other]

    cs.CV

    R-MAE: Regions Meet Masked Autoencoders

    Authors: Duy-Kien Nguyen, Vaibhav Aggarwal, Yanghao Li, Martin R. Oswald, Alexander Kirillov, Cees G. M. Snoek, Xinlei Chen

    Abstract: In this work, we explore regions as a potential visual analogue of words for self-supervised image representation learning. Inspired by Masked Autoencoding (MAE), a generative pre-training baseline, we propose masked region autoencoding to learn from groups of pixels or regions. Specifically, we design an architecture which efficiently addresses the one-to-many mapping between images and regions,…

    Submitted 4 January, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

  42. arXiv:2306.00989  [pdf, other]

    cs.CV cs.LG

    Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

    Authors: Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer

    Abstract: Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance. While these components lead to effective accuracies and attractive FLOP counts, the added complexity actually makes these transformers slower than their vanilla ViT counterparts. In this paper, we argue that this additional bulk is unnecessary. By pretraini…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: ICML 2023 Oral version. Code+Models: https://github.com/facebookresearch/hiera

  43. arXiv:2305.19153  [pdf, other]

    cs.NI cs.AI

    FERN: Leveraging Graph Attention Networks for Failure Evaluation and Robust Network Design

    Authors: Chenyi Liu, Vaneet Aggarwal, Tian Lan, Nan Geng, Yuan Yang, Mingwei Xu, Qing Li

    Abstract: Robust network design, which aims to guarantee network availability under various failure scenarios while optimizing performance/cost objectives, has received significant attention. Existing approaches often rely on model-based mixed-integer optimization that is hard to scale or employ deep learning to solve specific engineering problems yet with limited generalizability. In this paper, we show th…

    Submitted 30 May, 2023; originally announced May 2023.

    Journal ref: IEEE/ACM Transactions on Networking 2023

  44. arXiv:2305.17327  [pdf, other]

    cs.LG

    Hierarchical Deep Counterfactual Regret Minimization

    Authors: Jiayu Chen, Tian Lan, Vaneet Aggarwal

    Abstract: Imperfect Information Games (IIGs) offer robust models for scenarios where decision-makers face uncertainty or lack complete information. Counterfactual Regret Minimization (CFR) has been one of the most successful family of algorithms for tackling IIGs. The integration of skill-based strategy learning with CFR could potentially mirror more human-like decision-making process and enhance the learni…

    Submitted 26 September, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

  45. arXiv:2305.16671  [pdf, ps, other]

    cs.LG cs.AI cs.CC

    A Unified Approach for Maximizing Continuous DR-submodular Functions

    Authors: Mohammad Pedramfar, Christopher John Quinn, Vaneet Aggarwal

    Abstract: This paper presents a unified approach for maximizing continuous DR-submodular functions that encompasses a range of settings and oracle access types. Our approach includes a Frank-Wolfe type offline algorithm for both monotone and non-monotone functions, with different restrictions on the general convex set. We consider settings where the oracle provides access to either the gradient of the funct…

    Submitted 12 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  46. arXiv:2305.12633  [pdf, other]

    cs.LG

    Multi-task Hierarchical Adversarial Inverse Reinforcement Learning

    Authors: Jiayu Chen, Dipesh Tamboli, Tian Lan, Vaneet Aggarwal

    Abstract: Multi-task Imitation Learning (MIL) aims to train a policy capable of performing a distribution of tasks based on multi-task expert demonstrations, which is essential for general-purpose robots. Existing MIL algorithms suffer from low data efficiency and poor performance on complex long-horizontal tasks. We develop Multi-task Hierarchical Adversarial Inverse Reinforcement Learning (MH-AIRL) to lea…

    Submitted 28 June, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: This paper is accepted at ICML 2023. arXiv admin note: text overlap with arXiv:2210.01969

  47. arXiv:2305.02527  [pdf, ps, other]

    cs.LG cs.AI

    Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward

    Authors: Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: We investigate an infinite-horizon average reward Markov Decision Process (MDP) with delayed, composite, and partially anonymous reward feedback. The delay and compositeness of rewards mean that rewards generated as a result of taking an action at a given state are fragmented into different components, and they are sequentially realized at delayed time instances. The partial anonymity attribute im…

    Submitted 28 August, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

  48. arXiv:2303.13604  [pdf, other]

    cs.LG cs.AI cs.DS

    Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback

    Authors: Mohammad Pedramfar, Vaneet Aggarwal

    Abstract: This paper investigates the problem of combinatorial multiarmed bandits with stochastic submodular (in expectation) rewards and full-bandit delayed feedback, where the delayed feedback is assumed to be composite and anonymous. In other words, the delayed feedback is composed of components of rewards from past actions, with unknown division among the sub-components. Three models of delayed feedback…

    Submitted 23 March, 2023; originally announced March 2023.

  49. arXiv:2303.13496  [pdf, other]

    cs.CV cs.AI cs.LG

    The effectiveness of MAE pre-pretraining for billion-scale pretraining

    Authors: Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, Aaron Adcock, Armand Joulin, Piotr Dollár, Christoph Feichtenhofer, Ross Girshick, Rohit Girdhar, Ishan Misra

    Abstract: This paper revisits the standard pretrain-then-finetune paradigm used in computer vision for visual recognition tasks. Typically, state-of-the-art foundation models are pretrained using large scale (weakly) supervised datasets with billions of images. We introduce an additional pre-pretraining stage that is simple and uses the self-supervised MAE technique to initialize the model. While MAE has on…

    Submitted 24 January, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: ICCV 2023. Models available at https://github.com/facebookresearch/maws/

  50. arXiv:2303.08361  [pdf, other]

    cs.DC cs.LG cs.NI eess.SY

    Towards Cooperative Federated Learning over Heterogeneous Edge/Fog Networks

    Authors: Su Wang, Seyyedali Hosseinalipour, Vaneet Aggarwal, Christopher G. Brinton, David J. Love, Weifeng Su, Mung Chiang

    Abstract: Federated learning (FL) has been promoted as a popular technique for training machine learning (ML) models over edge/fog networks. Traditional implementations of FL have largely neglected the potential for inter-network cooperation, treating edge/fog devices and other infrastructure participating in ML as separate processing elements. Consequently, FL has been vulnerable to several dimensions of n…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: This paper has been accepted for publication in IEEE Communications Magazine