Showing 1–3 of 3 results for author: Hairi, F

Searching in archive cs.

  1. arXiv:2405.03082

    cs.LG

    Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning

    Authors: Tianchen Zhou, FNU Hairi, Haibo Yang, Jia Liu, Tian Tong, Fan Yang, Michinari Momma, Yan Gao

    Abstract: Reinforcement learning with multiple, potentially conflicting objectives is pervasive in real-world applications, while this problem remains theoretically under-explored. This paper tackles the multi-objective reinforcement learning (MORL) problem and introduces an innovative actor-critic algorithm named MOAC which finds a policy by iteratively making trade-offs among conflicting reward signals. N…

    Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted in ICML 2024

  2. arXiv:2403.15935

    cs.LG cs.MA

    Sample and Communication Efficient Fully Decentralized MARL Policy Evaluation via a New Approach: Local TD update

    Authors: Fnu Hairi, Zifan Zhang, Jia Liu

    Abstract: In actor-critic framework for fully decentralized multi-agent reinforcement learning (MARL), one of the key components is the MARL policy evaluation (PE) problem, where a set of $N$ agents work cooperatively to evaluate the value function of the global states for a given policy through communicating with their neighbors. In MARL-PE, a critical challenge is how to lower the sample and communication…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Main body of the paper appeared in AAMAS24

  3. arXiv:2012.06613

    cs.PF math.PR

    Beyond Scaling: Calculable Error Bounds of the Power-of-Two-Choices Mean-Field Model in Heavy-Traffic

    Authors: Fnu Hairi, Xin Liu, Lei Ying

    Abstract: This paper provides a recipe for deriving calculable approximation errors of mean-field models in heavy-traffic with the focus on the well-known load balancing algorithm -- power-of-two-choices (Po2). The recipe combines Stein's method for linearized mean-field models and State Space Concentration (SSC) based on geometric tail bounds. In particular, we divide the state space into two regions, a ne…

    Submitted 29 October, 2021; v1 submitted 11 December, 2020; originally announced December 2020.