Agent-based models (ABMs) integrate multiple scales of behavior and data to produce higher-order dynamic phenomena and are increasingly used in the study of important social complex systems in biomedicine, socio-economics and ecology/resource management. However, the development, validation and use of ABMs is hampered by the need to execute very large numbers of simulations in order to identify their behavioral properties, a challenge accentuated by the computational cost of running realistic, large-scale, potentially distributed ABM simulations. In this paper we describe the Extreme-scale Model Exploration with Swift (EMEWS) framework, which is capable of efficiently composing and executing large ensembles of simulations and other "black box" scientific applications while integrating model exploration (ME) algorithms developed with the use of widely available 3rd-party libraries written in popular languages such as R and Python. EMEWS combines novel stateful tasks with traditional run-to-completion many task computing (MTC) and solves many problems relevant to high-performance workflows, including scaling to very large numbers (millions) of tasks, maintaining state and locality information, and enabling effective multiple-language problem solving. We present the high-level programming model of the EMEWS framework and demonstrate how it is used to integrate an active learning ME algorithm to dynamically and efficiently characterize the parameter space of a large and complex, distributed Message Passing Interface (MPI) agent-based infectious disease model.
Keywords: Agent-based modeling; high performance computing; machine learning; metamodeling; parallel processing.