A calibration framework toward model generalization for bacteria concentration estimation in water resource recovery facilities

Sci Rep. 2024 Dec 28;14(1):31218. doi: 10.1038/s41598-024-82598-y.

Abstract

A reduced bacteria concentration in wastewater is a key indicator of the efficacy of water resource recovery facilities (WRRFs). However, monitoring bacterial concentrations in real time at each stage of a WRRF is challenging because it requires taking water samples and processing them offline. Although a few studies have proposed data-driven models to predict bacterial concentrations, generalizing these models to unseen data from different WRRFs remains challenging. This paper proposes a neural-network-based calibration approach that adapts the optimal models across various WRRFs in Saudi Arabia for bacterial estimation at the influent and effluent stages. The calibration relies on an out-of-distribution (OOD) framework applied to the physicochemical water parameters (e.g., pH, COD, TDS, turbidity, conductivity), with a design threshold chosen based on the data distribution of the received unseen samples. The framework continues updating the trained neural network model upon receiving new samples so that the bacterial concentration estimates remain accurate. We tested the proposed calibration scheme on four WRRF datasets in Saudi Arabia, comparing its results against two references: the model before calibration and the model after calibration without the OOD check. The before-calibration model was based on a traditional, optimal neural network approach, typically considered the conventional method for building neural networks. In the after-calibration model without OOD, retraining continued without explicitly checking for the OOD condition. The results showed that, for the selected baseline WRRF, the proposed calibration framework with the OOD scheme improved the worst-case influent bacteria concentration estimation by [Formula: see text] and [Formula: see text] relative to before calibration and after calibration without OOD, respectively. Similarly, the worst-case effluent bacteria concentration estimation was enhanced by [Formula: see text] relative to before calibration and by [Formula: see text] relative to after calibration without the OOD. Our findings highlight the importance of integrating the calibration framework with neural network approaches to achieve model generalization.
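
As a rough illustration of the OOD check described above, the following minimal Python sketch scores newly received samples of physicochemical parameters against a training set and flags those that exceed a threshold. It is not the paper's implementation: the Mahalanobis distance as the OOD score, the 0.95 quantile, the function names, and the synthetic data are all assumptions made for illustration. Taking the threshold as a quantile of the received samples' scores is only one possible reading of "chosen based on the data distribution of the received unseen samples".

```python
import numpy as np

def fit_reference(X_train):
    """Summarize the training data by its mean and inverse covariance."""
    mean = X_train.mean(axis=0)
    cov = np.cov(X_train, rowvar=False)
    # The pseudo-inverse tolerates near-singular covariance (correlated features).
    return mean, np.linalg.pinv(cov)

def ood_scores(X, mean, cov_inv):
    """Mahalanobis distance of each row of X from the training distribution
    (an assumed choice of OOD score, not taken from the paper)."""
    diff = X - mean
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # guard against tiny negative round-off

def flag_ood(X_new, X_train, quantile=0.95):
    """Return a boolean mask marking received samples whose score exceeds a
    threshold derived from the received batch itself (hypothetical rule)."""
    mean, cov_inv = fit_reference(X_train)
    scores = ood_scores(X_new, mean, cov_inv)
    threshold = np.quantile(scores, quantile)
    return scores > threshold

# Synthetic stand-in for five water parameters (e.g., pH, COD, TDS, turbidity,
# conductivity), standardized; these are not real WRRF measurements.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 5))
X_new = np.vstack([rng.normal(size=(20, 5)), np.full((1, 5), 8.0)])

mask = flag_ood(X_new, X_train)
print(mask)
```

The sketch stops at flagging samples. How flagged and unflagged samples feed into the continued updating of the neural network is not detailed in the abstract, so that step is deliberately left out.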

Keywords: Bacteria concentration sensing; Calibration of neural networks; Out-of-distribution (OOD) generalization; Wasserstein generative adversarial network (WGAN); Water resource recovery facilities.

MeSH terms

  • Bacteria* / isolation & purification
  • Calibration
  • Neural Networks, Computer*
  • Saudi Arabia
  • Wastewater / microbiology
  • Water Microbiology
  • Water Resources

Substances

  • Wastewater