Quest for Optimal Regression Models in SARS-CoV-2 Wastewater Based Epidemiology

Int J Environ Res Public Health. 2021 Oct 14;18(20):10778. doi: 10.3390/ijerph182010778.

Abstract

Wastewater-based epidemiology is a recognised source of information for pandemic management. In this study, we investigated the correlation between a SARS-CoV-2 signal derived from wastewater sampling and COVID-19 incidence values monitored by means of individual testing programs. The dataset consists of timelines (duration approx. five months) of both signals at four wastewater treatment plants across Austria, two draining large communities and two draining smaller ones. Eight regression models were investigated to predict the incidence under varying data inputs and pre-processing methods. Population-based normalisation and smoothing of the viral load data as pre-processing steps were found to significantly influence the fit of the regression models. Moreover, the time lag between the wastewater data and the incidence derived from the testing program was found to vary between 2 and 7 days depending on the time period and site. Accounting for this time lag by means of multivariate modelling was found to be necessary to improve the performance of the regression. Comparing the models, none stood out, as all investigated models showed a sufficient correlation for the task. The pre-processing of data and a multivariate model formulation are more important than the model structure.
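
The pre-processing and lag handling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 7-day trailing window, the per-100,000 scaling, the assumption that incidence trails the wastewater signal, and the synthetic data are all assumptions made for illustration only. It shows population-based normalisation, smoothing, a simple lag search by correlation, and the construction of lagged inputs that a multivariate regression model could consume.

```python
import math

def per_capita_load(loads, population):
    """Scale viral loads to a per-100,000-inhabitants basis (assumed scaling)."""
    return [v * 100_000 / population for v in loads]

def moving_average(values, window=7):
    """Trailing moving average; early points use a shorter window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lag(signal, incidence, max_lag=10):
    """Lag in days (incidence assumed to trail the signal) with the highest correlation."""
    scores = {}
    for lag in range(max_lag + 1):
        x = signal[:len(signal) - lag]   # signal at day t - lag
        y = incidence[lag:]              # incidence at day t
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get), scores

def lagged_features(signal, lags):
    """One row per day t >= max(lags), holding signal[t - lag] for each lag."""
    start = max(lags)
    return [[signal[t - lag] for lag in lags] for t in range(start, len(signal))]

# Synthetic demonstration data (NOT from the study): a periodic viral load
# and an incidence series that copies the smoothed signal 4 days later.
raw_load = [100 + 50 * math.sin(2 * math.pi * t / 40) for t in range(120)]
signal = moving_average(per_capita_load(raw_load, population=250_000))
true_lag = 4
incidence = [0.5 * signal[max(0, t - true_lag)] for t in range(len(signal))]

lag, scores = best_lag(signal, incidence)
X = lagged_features(signal, lags=range(2, 8))  # inputs lagged by 2 to 7 days
y = incidence[7:]                              # targets aligned with the rows of X
print("estimated lag (days):", lag)
print("rows, columns of X:", len(X), len(X[0]))
```

Passing several lagged copies of the signal as inputs, rather than a single fixed shift, is one way to let a regression model absorb a lag that changes over time and between sites; the rows of `X` and the targets `y` could then be handed to any regression routine.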

Keywords: SARS-CoV-2; Taylor diagram; incidence; multivariate model; regression; wastewater-based epidemiology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Humans
  • Pandemics
  • RNA, Viral
  • SARS-CoV-2
  • Wastewater
  • Wastewater-Based Epidemiological Monitoring*

Substances

  • RNA, Viral
  • Waste Water