Coordinating Multi-Agent Reinforcement Learning via Dual Collaborative Constraints

Chao Li; Shaokang Dong; Shangdong Yang; Yujing Hu; Wenbin Li; Yang Gao

doi:10.1016/j.neunet.2024.106858

Neural Netw. 2024 Nov 12:182:106858. doi: 10.1016/j.neunet.2024.106858. Online ahead of print.

Authors

Chao Li¹, Shaokang Dong², Shangdong Yang³, Yujing Hu⁴, Wenbin Li⁵, Yang Gao⁶

Affiliations

¹ State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China.
² State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China.
³ School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China.
⁴ NetEase Fuxi AI Lab, Netease Inc, Hangzhou, 310052, China.
⁵ State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China.
⁶ State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China.

PMID: 39550797

Abstract

Many real-world multi-agent tasks exhibit a nearly decomposable structure: interactions among agents within the same interaction set are strong, while interactions between different sets are relatively weak. Modeling this structure efficiently and leveraging it to coordinate agents can improve the learning efficiency of multi-agent reinforcement learning algorithms on cooperative tasks, yet existing works typically fail to do so. To overcome this limitation, this paper proposes a novel algorithm named Dual Collaborative Constraints (DCC) that identifies the interaction sets as subtasks and achieves both intra-subtask and inter-subtask coordination. Specifically, DCC employs a bi-level structure to periodically distribute agents into multiple subtasks, and introduces local and global collaborative constraints based on mutual information to facilitate both intra-subtask and inter-subtask coordination among agents. These two constraints ensure that agents within the same subtask reach a consensus on their local action selections and that all agents select superior joint actions that maximize overall task performance. We evaluate DCC on various cooperative multi-agent tasks, where it outperforms multiple state-of-the-art baselines, demonstrating its effectiveness.

Keywords: Cooperative tasks; Coordination; Multi-agent reinforcement learning; Nearly decomposable structure.
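
Illustrative note: the abstract states that the collaborative constraints are based on mutual information but does not say which variables it is computed between or how it is estimated. The sketch below is therefore not the DCC method. It only shows, under assumed inputs, how empirical mutual information between the discrete actions of two agents could be computed as a rough measure of how strongly their action choices depend on each other. The function name `mutual_information` and the action logs are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in nats, of two discrete sequences.

    Hypothetical helper for illustration only; not the estimator used by DCC.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Assumed action logs of two agents (made-up data, not from the paper).
agent_a = [0, 1, 0, 1, 0, 1, 0, 1]
agent_b_same = [0, 1, 0, 1, 0, 1, 0, 1]   # always matches agent_a
agent_b_indep = [0, 0, 1, 1, 0, 0, 1, 1]  # empirically independent of agent_a

mi_coordinated = mutual_information(agent_a, agent_b_same)
mi_independent = mutual_information(agent_a, agent_b_indep)
```

On this toy data, the matching pair yields a mutual information of ln 2 nats and the independent pair yields zero, which is the sense in which a mutual-information term can reward agents whose action choices are informative about one another.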