A critical analysis of parameter choices in water quality assessment

Water Res. 2024 Jul 1:258:121777. doi: 10.1016/j.watres.2024.121777. Epub 2024 May 16.

Abstract

The determination of water quality depends heavily on which parameters recorded from water samples are selected for the water quality index (WQI). Data-driven methods, including machine learning models and statistical approaches, are frequently used to refine the parameter set for four main reasons: reducing cost, reducing uncertainty, addressing the eclipsing problem, and enhancing the performance of models that predict the WQI. Despite the widespread use of these methods, comprehensive reviews that systematically examine previous studies in this area are lacking. Such reviews are essential to assess the validity of these objectives and to demonstrate the effectiveness of data-driven methods in achieving them. This paper has two primary aims: first, to review the existing literature on methods for selecting parameters; second, to delineate and evaluate the four principal motivations for parameter selection identified in the literature. The manuscript categorizes existing studies into two methodological groups for refining parameters: one focuses on preserving the information within the dataset, and the other on ensuring that predictions remain consistent with those obtained using the full set of parameters. It characterizes each group and evaluates how effectively each approach meets the four objectives. The study finds that the minimal WQI approach, common to both categories, is the only approach that has successfully reduced recording costs. Nonetheless, it notes that simply reducing the number of parameters does not guarantee cost savings. Furthermore, the studies classified as preserving information within the dataset have demonstrated potential to mitigate the eclipsing problem, whereas studies in the consistent prediction group have not been able to mitigate this issue. Additionally, since data-driven approaches still rely on the initial parameters chosen by experts, they do not eliminate the need for expert judgment.
The study further points out that the WQI formula is a straightforward and expedient tool for assessing water quality. Consequently, the paper argues that employing machine learning solely to reduce the number of parameters to enhance WQI prediction is not a standalone solution. Rather, this objective should be integrated with a more comprehensive set of research goals. The critical analysis of research objectives and the characterization of previous studies lay the groundwork for future research. This groundwork will enable subsequent studies to evaluate how their proposed methods can effectively achieve these objectives.
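
To illustrate the eclipsing problem and the effect of dropping a parameter, the following is a minimal sketch that assumes the WQI is aggregated as a weighted arithmetic mean of sub-index scores. This is only one common form, and the abstract does not state which formula the reviewed studies use. The function name, parameter names, scores, and equal weights are all hypothetical values chosen for illustration.

```python
def weighted_wqi(sub_indices, weights):
    """Weighted arithmetic mean of sub-index scores (0 = worst, 100 = best).

    Assumption: this aggregation form is illustrative only; it is not
    taken from the paper.
    """
    total_weight = sum(weights[name] for name in sub_indices)
    weighted_sum = sum(sub_indices[name] * weights[name] for name in sub_indices)
    return weighted_sum / total_weight


# Hypothetical sub-index scores: four parameters are good, one is very poor.
scores = {
    "dissolved_oxygen": 95.0,
    "ph": 90.0,
    "turbidity": 92.0,
    "nitrate": 88.0,
    "lead": 10.0,
}
weights = {name: 1.0 for name in scores}  # equal weights, an assumption

# Index computed from the full parameter set.
full_wqi = weighted_wqi(scores, weights)

# Index computed after dropping the one poorly scoring parameter.
reduced_scores = {k: v for k, v in scores.items() if k != "lead"}
reduced_wqi = weighted_wqi(reduced_scores, weights)

worst_parameter = min(scores, key=scores.get)

print(f"full WQI: {full_wqi}")
print(f"reduced WQI: {reduced_wqi}")
print(f"worst parameter: {worst_parameter} ({scores[worst_parameter]})")
```

In this toy example the aggregate score from the full set sits well above the worst sub-index, so a single severely impaired parameter is partly hidden by the averaging, and removing that parameter raises the score further. It is meant only to show why the choice of parameters can change the reported index, not to reproduce any method evaluated in the review.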

Keywords: Machine learning; Parameter selection; Statistical analysis; Uncertainty; Water quality index.

Publication types

  • Review

MeSH terms

  • Environmental Monitoring / methods
  • Machine Learning
  • Models, Theoretical
  • Water Quality*