Background: COVID-19 will not be the last pandemic of the twenty-first century. To better prepare for the next one, it is essential that we make honest appraisals of the utility of different responses to COVID-19. In this paper, we focus specifically on epidemiologic forecasting. Characterizing forecast efficacy over the history of the pandemic is challenging, especially given its significant spatial, temporal, and contextual variability. To address this challenge, we introduce the Weighted Contextual Interval Score (WCIS), a new method for retrospective interval forecast evaluation.
Methods: The central tenet of the WCIS is a direct incorporation of contextual utility into the evaluation. This necessitates a specific characterization of forecast efficacy that depends on the use case for predictions, which is accomplished by defining a utility threshold parameter. We generalize this idea to probabilistic interval-form forecasts, the preferred prediction format for epidemiological modeling, by extending the existing Weighted Interval Score (WIS).
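For readers unfamiliar with the WIS that the WCIS extends, the following is a minimal Python sketch of the standard WIS as it is commonly defined in the forecasting literature: a weighted average of the absolute error of the median and the interval scores of a set of central prediction intervals. The function names and the input layout (a mapping from each nominal miscoverage level alpha to its lower and upper bounds) are illustrative assumptions and are not taken from the authors' code. The sketch does not implement the WCIS itself, whose utility threshold and contextual normalization are not specified in this abstract.

```python
from typing import Mapping, Tuple


def interval_score(lower: float, upper: float, y: float, alpha: float) -> float:
    """Interval score of a central (1 - alpha) prediction interval.

    The score is the interval width plus a penalty, scaled by 2 / alpha,
    for the distance by which the observation y falls outside the interval.
    """
    width = upper - lower
    below = (2.0 / alpha) * max(lower - y, 0.0)  # y is under the lower bound
    above = (2.0 / alpha) * max(y - upper, 0.0)  # y is over the upper bound
    return width + below + above


def weighted_interval_score(
    median: float,
    intervals: Mapping[float, Tuple[float, float]],
    y: float,
) -> float:
    """Standard WIS with weights w_0 = 1/2 and w_k = alpha_k / 2.

    `intervals` maps each alpha to (lower, upper); this layout is an
    assumption made for illustration only.
    """
    total = 0.5 * abs(y - median)  # median term, weight w_0 = 1/2
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    # Normalize by K + 1/2, where K is the number of intervals.
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 100, a 50% interval and a 90% interval.
example = weighted_interval_score(
    median=100.0,
    intervals={0.5: (90.0, 110.0), 0.1: (70.0, 130.0)},
    y=120.0,
)
```

In this hypothetical example the observation of 120 lies above the 50% interval but inside the 90% interval, and the score works out to approximately 11.2. Lower values indicate better forecasts, and the score is expressed in the units of the forecast target, which is one reason raw WIS values are hard to compare across settings of very different scale.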
Results: We apply the WCIS to two forecasting scenarios: facility-level hospitalizations for a single state, and state-level hospitalizations for the whole of the United States. We observe that an appropriately parameterized application of the WCIS captures both the relative quality and the overall frequency of useful forecasts. Since the WCIS represents the utility of predictions using contextual normalization, it is easily comparable across highly variable pandemic scenarios while remaining intuitively representative of the in-situ quality of individual forecasts.
Conclusions: The WCIS provides a pragmatic utility-based characterization of probabilistic predictions. This method is expressly intended to enable practitioners and policymakers, who may not have expertise in forecasting but are nevertheless essential partners in epidemic response, to use predictions and provide insightful analysis of them. We note that the WCIS is intended specifically for retrospective forecast evaluation and should not be used as a penalty to be minimized in a competitive context, because it lacks statistical propriety. Code and data used for our analysis are available at https://github.com/maximilian-marshall/wcis.
Keywords: COVID-19; Epidemiology; Public health; Statistics.
© 2024. The Author(s).