Estimating daily air temperature and pollution in Catalonia: A comprehensive spatiotemporal modelling of multiple exposures

Environ Pollut. 2023 Nov 15;337:122501. doi: 10.1016/j.envpol.2023.122501. Epub 2023 Sep 8.

Abstract

Environmental epidemiology studies require models of multiple exposures to adjust for co-exposure and explore interactions. We estimated spatiotemporal exposure to surface air temperature and pollution (PM2.5, PM10, NO2, O3) at high spatiotemporal resolution (daily, 250 m) for 2018-2020 in Catalonia. Innovations include the use of TROPOMI products, a data split for remote sensing gap-filling evaluation, estimation of prediction uncertainty, and use of explainable machine learning. We compiled meteorological and air quality station measurements, climate and atmospheric composition reanalyses, remote sensing products, and other spatiotemporal data. We performed gap-filling of remotely sensed products using Random Forest (RF) models and validated them using Out-Of-Bag (OOB) samples and a structured data split. The exposure modelling workflow consisted of: 1) PM2.5 station imputation with PM10 data; 2) quantile RF (QRF) model fitting; and 3) geostatistical residual spatial interpolation. Prediction uncertainty was estimated using QRF. SHAP values were used to examine variable importance and the fitted relationships. Model performance was assessed via nested cross-validation (CV) at the station level. Evaluation of the gap-filling models using the structured split showed that OOB samples underestimated the error. Temperature models had the best performance (R2 = 0.98), followed by the gaseous air pollutants (R2 = 0.81 for NO2 and 0.86 for O3), while the performance of the PM2.5 and PM10 models was lower (R2 = 0.57 and 0.63, respectively). Predicted exposure patterns captured urban heat island effects, dust advection events, and NO2 hotspots. SHAP values indicated a high importance of TROPOMI tropospheric NO2 columns in PM and NO2 models, and confirmed that the fitted associations conformed to prior knowledge. Our work highlights the importance of correctly validating gap-filling models and the potential of TROPOMI measurements. Moderate performance in PM models can be partly explained by the poor station coverage. Our exposure estimates can be used in epidemiological studies, potentially accounting for exposure uncertainty.
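
The station-level nested CV mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (station_folds, nested_splits), the round-robin fold assignment, the fold counts, and the toy station labels are all assumptions made for illustration. The point it shows is that every observation from a given station falls on the same side of each split, so both the outer performance estimate and the inner tuning step are evaluated on stations the model has not seen.

```python
from collections import defaultdict


def station_folds(station_ids, n_folds):
    """Group row indices into folds so that each station sits in exactly one fold.

    Hypothetical helper: stations are assigned round-robin after sorting.
    """
    stations = sorted(set(station_ids))
    fold_of = {s: i % n_folds for i, s in enumerate(stations)}
    folds = defaultdict(list)
    for row, s in enumerate(station_ids):
        folds[fold_of[s]].append(row)
    return [folds[k] for k in range(n_folds)]


def nested_splits(station_ids, n_outer, n_inner):
    """Yield (inner_splits, train_rows, test_rows) for each outer fold.

    inner_splits is a list of (fit_rows, val_rows) pairs drawn only from
    train_rows, again split by station, for hyperparameter tuning.
    """
    n_rows = len(station_ids)
    for test_rows in station_folds(station_ids, n_outer):
        held_out = set(test_rows)
        train_rows = [r for r in range(n_rows) if r not in held_out]
        train_ids = [station_ids[r] for r in train_rows]
        inner = []
        for val_local in station_folds(train_ids, n_inner):
            val_set = set(val_local)
            # Map positions within the training subset back to original rows.
            val_rows = [train_rows[i] for i in val_local]
            fit_rows = [train_rows[i] for i in range(len(train_rows))
                        if i not in val_set]
            inner.append((fit_rows, val_rows))
        yield inner, train_rows, test_rows


# Toy data (assumed): four stations with two daily observations each.
ids = ["A", "A", "B", "B", "C", "C", "D", "D"]
splits = list(nested_splits(ids, n_outer=4, n_inner=3))
```

In this sketch, hyperparameters would be tuned on the inner (fit_rows, val_rows) pairs and the selected model scored once on test_rows. A random row-level split would instead place observations from the same station in both training and test sets, which tends to give optimistic error estimates for predictions at unmonitored locations.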

Keywords: Air pollution; Air temperature; Explainable machine learning; Remote sensing; TROPOMI.

MeSH terms

  • Air Pollutants* / analysis
  • Air Pollution* / analysis
  • Cities
  • Environmental Monitoring
  • Hot Temperature
  • Nitrogen Dioxide / analysis
  • Particulate Matter / analysis
  • Spain
  • Temperature

Substances

  • Nitrogen Dioxide
  • Air Pollutants
  • Particulate Matter