Complex networks are susceptible to contagious cascades, underscoring the urgency of effective epidemic mitigation strategies. While physical quarantine is a proven mitigation measure, it can lead to substantial economic repercussions if not managed properly. This study presents an innovative approach to selecting quarantine targets within complex networks, aiming for an efficient and economical epidemic response. We model epidemic spread in complex networks as a Markov chain, accounting for stochastic state transitions and node quarantines. We then leverage deep reinforcement learning (DRL) to design a quarantine strategy that minimizes both infection rates and quarantine costs through a sequence of strategic node quarantines. Our DRL agent is trained with the proximal policy optimization algorithm to optimize these dual objectives. Through simulations on both synthetic small-world and real-world community networks, we demonstrate the efficacy of our strategy in controlling epidemics. Notably, we observe a non-linear pattern in the mitigation effect as the daily maximum quarantine scale increases: the mitigation rate is most pronounced at first but plateaus after a critical threshold is reached. This insight is crucial for setting the most effective epidemic mitigation parameters.