Using isometric log-ratio in compositional data analysis for developing a groundwater pollution index

Sci Rep. 2024 May 28;14(1):12196. doi: 10.1038/s41598-024-63178-6.

Abstract

This study introduces a novel groundwater pollution index (GPI) formulated through compositional data analysis (CoDa) and robust principal component analysis (RPCA) to enhance groundwater quality assessment. Using groundwater quality monitoring data from sites impacted by the 2010-2011 foot-and-mouth disease outbreak in South Korea, CoDa uncovers critical hydrochemical differences between leachate-influenced and background groundwater. The GPI was developed by selecting key subcompositional parts (NH₄⁺-N, Cl⁻, and NO₃⁻-N) using RPCA, performing the isometric log-ratio (ILR) transformation, and normalizing the results to environmental standards, thereby providing a more precise and accurate assessment of pollution. Validated against government criteria, the GPI has shown its potential as an alternative assessment tool, with its reliability confirmed by receiver operating characteristic curve analysis. This study highlights the essential role of CoDa, especially the ILR transformation, in overcoming the limitations of traditional statistical methods that often neglect the relative nature of hydrochemical data. Our results emphasize the utility of the GPI in significantly advancing groundwater quality monitoring and management by addressing a methodological gap in the quantitative assessment of groundwater pollution.
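
To make the ILR step concrete, the following is a minimal sketch of an ILR transformation for a three-part subcomposition, using pivot coordinates as one common choice of orthonormal basis. The abstract does not state which basis the authors used, how parts were ordered, or how the result was normalized to environmental standards, so the function names (`closure`, `ilr_pivot`), the part ordering, and the sample concentrations below are illustrative assumptions, not the study's procedure or data.

```python
import numpy as np

def closure(x):
    """Rescale a composition so its parts sum to 1 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def ilr_pivot(x):
    """ILR coordinates in a pivot basis (an assumed basis choice).

    For D strictly positive parts, coordinate i contrasts part i with the
    geometric mean of the parts that follow it:
        z_i = sqrt(r / (r + 1)) * (ln x_i - mean(ln x_{i+1..D})),
    where r = D - i is the number of remaining parts (1-based i).
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("ILR requires strictly positive parts")
    logx = np.log(x)
    n_parts = x.shape[-1]
    coords = []
    for i in range(n_parts - 1):
        r = n_parts - i - 1  # number of parts after part i (0-based i)
        rest_mean = logx[..., i + 1:].mean(axis=-1)
        coords.append(np.sqrt(r / (r + 1)) * (logx[..., i] - rest_mean))
    return np.stack(coords, axis=-1)

# Illustrative concentrations only (mg/L), ordered as NH4+-N, Cl-, NO3--N.
# These values are made up for the example and are not from the study.
sample = np.array([2.0, 30.0, 5.0])
z = ilr_pivot(closure(sample))  # two ILR coordinates for three parts
```

Because the coordinates depend only on ratios between parts, multiplying all three concentrations by the same factor leaves `z` unchanged, which reflects the relative nature of hydrochemical data that the abstract emphasizes.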

Keywords: Compositional data analysis (CoDa); Groundwater pollution index (GPI); Isometric log-ratio (ILR) transformation; Robust principal component analysis (RPCA).