Predicting seasonal influenza using supermarket retail records

PLoS Comput Biol. 2021 Jul 12;17(7):e1009087. doi: 10.1371/journal.pcbi.1009087. eCollection 2021 Jul.

Abstract

Increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecast systems. In this paper, we propose the use of a novel data source, namely retail market data, to improve seasonal influenza forecasting. Specifically, we consider supermarket retail data as a proxy signal for influenza, through the identification of sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates for influenza incidence in Italy up to 4 weeks ahead. We use a Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, acting as a proxy that can be used for the real-time analysis of epidemics.
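
The abstract does not specify how the inputs to the regression are constructed. As a minimal sketch only, one plausible way to arrange a weekly incidence series and a weekly retail proxy signal into feature/target pairs for an h-week-ahead forecast is shown below. The function name `make_lagged_pairs`, its parameters, and the feature layout (recent incidence values plus the current retail signal) are illustrative assumptions, not the authors' pipeline.

```python
from typing import List, Sequence, Tuple


def make_lagged_pairs(
    incidence: Sequence[float],
    basket_signal: Sequence[float],
    n_lags: int = 2,
    horizon: int = 1,
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, target) pairs for an h-week-ahead forecast.

    Hypothetical layout: for each week t, the features are the last
    `n_lags` incidence values up to and including week t, followed by
    the retail signal at week t. The target is incidence at t + horizon.
    """
    if len(incidence) != len(basket_signal):
        raise ValueError("both series must have the same length")
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be at least 1")

    features: List[List[float]] = []
    targets: List[float] = []
    # Start at the first week with n_lags values of history and stop
    # early enough that the target week still exists in the series.
    for t in range(n_lags - 1, len(incidence) - horizon):
        past = list(incidence[t - n_lags + 1 : t + 1])
        features.append(past + [basket_signal[t]])
        targets.append(incidence[t + horizon])
    return features, targets


# Toy numbers for illustration only; these are not real surveillance data.
X, y = make_lagged_pairs(
    incidence=[1, 2, 3, 4, 5, 6],
    basket_signal=[10, 20, 30, 40, 50, 60],
    n_lags=2,
    horizon=2,
)
```

Pairs built this way could then be passed to any regressor that accepts a feature matrix and a target vector, for example the `fit(X, y)` method of scikit-learn's `sklearn.svm.SVR`, with one model trained per horizon from 1 to 4 weeks. Dropping the retail column from each feature row would give an incidence-only autoregressive comparison in the same spirit as the baseline the abstract mentions.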

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology
  • Consumer Behavior / statistics & numerical data*
  • Humans
  • Incidence
  • Influenza, Human / epidemiology*
  • Italy / epidemiology
  • Seasons
  • Supermarkets*