A distributed algorithm for solving large-scale p-median problems using expectation maximization

PeerJ Comput Sci. 2024 Nov 21:10:e2446. doi: 10.7717/peerj-cs.2446. eCollection 2024.

Abstract

The p-median problem selects p source locations to serve n destinations such that the average distance between each destination and its corresponding source is minimized. It is a well-studied NP-hard combinatorial optimization problem with many existing heuristic solutions; however, existing algorithms do not scale to large problems. The fast interchange (FI) heuristic, which yields objective function values close to those of the optimal solution, becomes too slow for large-scale problems. We present a novel distributed divide-and-conquer algorithm, EM-FI, to solve large-scale p-median problems quickly even with limited computing resources. The algorithm identifies the existing spatial clusters of the destination locations using expectation maximization (EM) and solves the clusters concurrently as independent p-median problems using integer programming or FI. On synthetic and real datasets, the proposed algorithm showed an order-of-magnitude improvement in run time without loss of quality in terms of the objective function value.
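
The divide-and-conquer idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the spatial clusters are already given (the EM step is omitted), assumes the number of sources assigned to each cluster is supplied by the caller, and uses a naive swap heuristic in place of the FI heuristic. The names `cost`, `interchange`, and `solve_by_cluster` are hypothetical.

```python
from math import dist


def cost(points, medians):
    """p-median objective: total distance from each point to its nearest median."""
    return sum(min(dist(pt, m) for m in medians) for pt in points)


def interchange(points, p):
    """Naive swap heuristic (a simple stand-in for FI, not FI itself).

    Starts from the first p points and keeps swapping a median for a
    non-median point as long as the objective strictly improves.
    """
    medians = list(points[:p])
    best = cost(points, medians)
    improved = True
    while improved:
        improved = False
        for i in range(len(medians)):
            for cand in points:
                if cand in medians:
                    continue
                trial = medians[:i] + [cand] + medians[i + 1:]
                c = cost(points, trial)
                if c < best - 1e-12:
                    medians, best, improved = trial, c, True
    return medians, best


def solve_by_cluster(clusters, p_per_cluster):
    """Solve each cluster as an independent p-median subproblem.

    Assumption: `clusters` comes from a prior clustering step and
    `p_per_cluster` is chosen by the caller. The subproblems share no
    state, so each call could be dispatched to a separate worker.
    """
    medians = []
    for pts, p in zip(clusters, p_per_cluster):
        chosen, _ = interchange(pts, p)
        medians.extend(chosen)
    return medians


# Two well-separated clusters of destinations, one source per cluster.
clusters = [
    [(0, 0), (1, 0), (2, 0)],
    [(10, 0), (11, 0), (12, 0)],
]
medians = solve_by_cluster(clusters, [1, 1])
all_points = [pt for cluster in clusters for pt in cluster]
print(medians, cost(all_points, medians))
```

Because every subproblem touches only its own cluster, the loop in `solve_by_cluster` is the part that a distributed version would run concurrently; the combined result is the union of the per-cluster medians.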

Keywords: Distributed algorithms; Heuristic search; Location allocation; P-median problem; Parallel computing; Spatial data mining.