Improving the construction and prediction strategy of the Air Quality Health Index (AQHI) using machine learning: A case study in Guangzhou, China

Ecotoxicol Environ Saf. 2024 Nov 8:287:117287. doi: 10.1016/j.ecoenv.2024.117287. Online ahead of print.

Abstract

Effectively capturing the risk of air pollution and informing residents is vital to public health. The widely used Air Quality Index (AQI) has been criticized for failing to accurately represent the non-threshold linear relationship between air pollution and health outcomes. Although the Air Quality Health Index (AQHI) was developed to address these limitations, it lacks comprehensive construction criteria. This work proposed a novel construction and prediction strategy for the AQHI using machine learning methods. Our RF-Alasso-QGC method integrated Random Forest (RF), Adaptive Lasso (Alasso), and Quantile-based G-Computation (QGC) for effective pollutant selection and AQHI construction. The RF-Alasso method excluded CO and identified PM10, PM2.5, NO2, SO2, and O3 as major contributors to mortality. The QGC method controlled for the additive and synergistic effects among these air pollutants. Compared to the Standard-AQHI, the new RF-Alasso-QGC-AQHI demonstrated a stronger correlation with health outcomes, with an interquartile range (IQR) increase associated with a 1.80% (1.44%, 2.17%) increase in total mortality, and the best goodness of fit. Additionally, the hybrid Autoregressive Integrated Moving Average-Long Short-Term Memory (ARIMA-LSTM) model successfully forecast the new AQHI, achieving a coefficient of determination (R²) of 0.961. The work demonstrated that the improved AQHI construction and prediction strategy communicates the health risks of multiple air pollutants, and provides early warnings of them, more efficiently.
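The abstract does not give the index formula, so the following is only a minimal sketch of how an AQHI of the commonly used form can be computed once per-pollutant coefficients are available. It assumes a log-linear exposure-response model, sums the percent excess risk (ER) over the selected pollutants, and rescales the total to a 0-10 range. The function names, the coefficient values, and the scaling constant are hypothetical illustrations and are not taken from the study; the RF-Alasso selection and QGC weighting steps are not reproduced here.

```python
import math

def excess_risk_percent(beta, concentration):
    """Percent excess risk for one pollutant under a log-linear model.

    beta is the log-relative-risk per unit concentration (assumed input).
    """
    return 100.0 * (math.exp(beta * concentration) - 1.0)

def aqhi(concentrations, betas, max_total_er):
    """Sum per-pollutant excess risks and rescale to a 0-10 index.

    max_total_er is the largest summed excess risk observed in the
    reference period (an assumed scaling constant).
    """
    total = sum(
        excess_risk_percent(betas[name], value)
        for name, value in concentrations.items()
    )
    return 10.0 * total / max_total_er

# Hypothetical coefficients per ug/m3, NOT values reported by the study.
HYPOTHETICAL_BETAS = {"PM2.5": 0.0004, "NO2": 0.0010, "O3": 0.0005}

# Hypothetical daily mean concentrations in ug/m3.
example_day = {"PM2.5": 35.0, "NO2": 40.0, "O3": 80.0}

# Hypothetical scaling constant: summed excess risk on the worst reference day.
example_index = aqhi(example_day, HYPOTHETICAL_BETAS, max_total_er=20.0)
```

The same `excess_risk_percent` conversion is how a fitted coefficient is usually turned into a statement such as "an IQR increase is associated with an x% increase in mortality": the IQR of the index is passed in place of the concentration.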

Keywords: Adaptive Lasso; Air pollutant selection; Air quality health index; Health risk prediction; Quantile-based G-Computation; Random Forest.