Long-term water quality assessment in coastal and inland waters: An ensemble machine-learning approach using satellite data

Mar Pollut Bull. 2024 Dec;209(Pt B):117036. doi: 10.1016/j.marpolbul.2024.117036. Epub 2024 Nov 16.

Abstract

Accurate estimation of coastal and inland water quality parameters is important for managing water resources and meeting the demands of sustainable development goals. Water quality monitoring based on discrete water sample analysis is limited to specific locations and is less effective at offering a synoptic view of water quality variability at different spatial and temporal scales. Optical remote sensing techniques have proved able to provide a comprehensive and synoptic view of water quality parameters. In conjunction with other products, optical remote sensing data products can be used for the effective management of water bodies while addressing the socio-economic issues faced by local governments and states. In recent years, multiple machine-learning (ML) models have been reported for estimating water quality from remote sensing data, but their performance is limited when extended to diverse water types within coastal and inland water environments. In this study, we present an ensemble machine-learning model for estimating the primary water quality parameters, such as Chlorophyll-a (Chl-a) concentration, colored dissolved organic matter (aCDOM440), and turbidity, in coastal and inland waters. It uses in-situ measurements to train and optimize the ensemble machine-learning models for the spectral measurement data (400-700 nm) provided by MODIS-Aqua, the Sentinel-2 Multi Spectral Instrument (MSI), and PlanetScope (Planet). To develop the prediction models, the in-situ measurement data were split into two parts: a training dataset (70 %) and a testing dataset (30 %). The ensemble machine-learning models were validated using 5-fold cross-validation. The models were trained and tested against distinct datasets encompassing a broad range of variations in water quality parameters collected from open-ocean, coastal, and inland waters.
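The data-handling steps described here (a 70/30 split, 5-fold cross-validation, and an ensemble of models) can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the folds are simple contiguous blocks, and the ensemble is a plain average of base-model outputs. The abstract does not state which base learners were used, how they were combined, or whether cross-validation was run on the training portion only.

```python
import random


def train_test_split(samples, test_fraction=0.3, seed=0):
    """Shuffle samples and return (train, test) lists.

    Hypothetical helper: a 70/30 split corresponds to test_fraction=0.3.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous folds.

    Each sample index appears in exactly one validation fold.
    """
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def ensemble_predict(models, features):
    """Average the predictions of several base models.

    Assumption: simple averaging; the combination rule used in the study
    is not given in the abstract.
    """
    predictions = [model(features) for model in models]
    return sum(predictions) / len(predictions)
```

In use, `train_test_split(range(100))` returns 70 training and 30 testing items, and `k_fold_indices(70, 5)` yields five folds of 14 validation indices each, so every training sample is held out exactly once.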
The validation results demonstrated superior performance of the present ensemble ML models compared with other ML models (Chl-a: R² = 0.96, RMSE = 4.93, MAE = 2.89; aCDOM440: R² = 0.93, RMSE = 0.057, MAE = 0.025; Turbidity: R² = 0.95, RMSE = 4.52, MAE = 1.009). To demonstrate the utility of this study, the ensemble ML models were applied to MODIS-Aqua monthly composite measurements from 2003 to 2022, where they captured pronounced seasonal variations in water quality parameters (WQP) and the Water Quality Index (WQI). For instance, in the Gulf of Khambhat, turbidity decreased at an annual average rate of ∼0.08 NTU and Chl-a increased at an annual average rate of ∼0.004 mg m⁻³ over the past 20 years. Furthermore, we investigated occurrences of Noctiluca scintillans (hereafter N. scintillans) blooms between 2019 and 2021 near the finfish cage culture sites in Mandapam, on the southeast coast of Tamil Nadu, within the Gulf of Mannar, India, which serves as documentation of Harmful Algal Bloom (HAB) incidents. The performance of the ensemble model is further demonstrated using Planet images of the inland turbid waters of the Muthupet lagoon (brackish water) and the Adyar river (urban river) and an MSI image of Chilika lagoon. The proposed ensemble ML models proved to be an effective method for accurately estimating WQP and WQI products and capturing their spatial and temporal variations in regional and global waters, making them an important tool for sustainable development and management of coastal and inland water environments.
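For readers unfamiliar with the reported statistics, the sketch below shows one common way to compute R², RMSE, and MAE from paired observed and predicted values, plus an ordinary least-squares slope of the kind that could underlie an "annual average rate". The function names are hypothetical, and two points are assumptions: R² is taken to be the coefficient of determination (the study may instead report a squared correlation), and the trend is taken to be a simple linear fit of annual values against year.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the measured parameter."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error, in the units of the measured parameter."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    mean_x = sum(years) / len(years)
    mean_y = sum(values) / len(values)
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(years, values))
    variance = sum((x - mean_x) ** 2 for x in years)
    return covariance / variance
```

RMSE penalizes large errors more heavily than MAE, which is why the two can differ noticeably for the same parameter, as in the turbidity figures above. A negative slope from `linear_trend` indicates a decline over the period, such as the turbidity decrease described for the Gulf of Khambhat.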

Keywords: Inland and coastal waters; Machine-learning; Optical remote sensing; Water quality.

MeSH terms

  • Chlorophyll A / analysis
  • Environmental Monitoring* / methods
  • Machine Learning*
  • Remote Sensing Technology
  • Satellite Imagery
  • Seawater / chemistry
  • Water Quality*

Substances

  • Chlorophyll A