Background: In countries where air pollution monitoring stations are unavailable or scarce, station measurements from other countries and atmospheric remote sensing could jointly provide the information needed to estimate ambient air quality at a resolution fine enough to study the relationship between air pollution exposure and health. Predicting NO2 concentration globally with sufficient spatial and temporal resolution and accuracy for health studies is, however, not a trivial task. The challenges are a shortage of data, both NO2 measurements and NO2 predictors, and the development of a statistical model that can capture regional and continental differences, such as traffic regulations, energy sources, and local weather.
Objective: We investigated the feasibility of mapping daytime and nighttime NO2 globally at a high spatial resolution (25 m), by including TROPOMI (TROPOspheric Monitoring Instrument) data and comparing various statistical learning techniques.
Method: We separated daytime (7:00 am - 9:59 pm) and nighttime (10:00 pm - 6:59 am) based on local time. To study whether models should be built for each country separately, national models were developed for 4 selected countries (the US, China, Germany, Spain). We built the models for 2017 and used 3636 stations. Seven statistical learning techniques were applied, and the impact of the predictors, the model fitting, and the prediction accuracy were compared between different techniques, between national models, between national and global models, and between models with and without the NO2 vertical column density retrieved from TROPOMI.
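As an illustration of the daytime and nighttime split described above, the following minimal Python sketch labels a timestamp by its local clock time. It is not the authors' code: the function names `to_local` and `period_of` and the use of a fixed UTC offset per station are assumptions made for this example. Only the two time windows (7:00 am - 9:59 pm and 10:00 pm - 6:59 am) come from the text.

```python
from datetime import datetime, time, timedelta, timezone

# Window boundaries taken from the Method text.
DAY_START = time(7, 0)     # 7:00 am, first minute of daytime
NIGHT_START = time(22, 0)  # 10:00 pm, first minute of nighttime


def to_local(utc_dt: datetime, utc_offset_hours: float) -> datetime:
    """Convert a timezone-aware UTC timestamp to local time.

    Assumption: a fixed UTC offset per station (no daylight saving time).
    """
    return utc_dt.astimezone(timezone(timedelta(hours=utc_offset_hours)))


def period_of(local_dt: datetime) -> str:
    """Return 'day' for 07:00-21:59 local time and 'night' otherwise."""
    t = local_dt.time()
    return "day" if DAY_START <= t < NIGHT_START else "night"


if __name__ == "__main__":
    # Hypothetical measurement taken at 23:00 UTC at a station with offset +8 h.
    measured = datetime(2017, 1, 1, 23, 0, tzinfo=timezone.utc)
    print(period_of(to_local(measured, 8)))
```

Labelling by local rather than UTC time means the same UTC hour can fall in the daytime window at one station and in the nighttime window at another, which is what a global model with stations in many time zones needs.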
Result and conclusion: The ensemble tree-based methods obtained higher accuracy than the linear regression-based methods in both national and global models. The global tree-based methods obtained accuracy similar to that of the national models. Different spatial prediction patterns were observed even when the prediction accuracy was very similar. Separating day and night can be important for more accurate air pollution exposure assessment. The TROPOMI variable is ranked as one of the most important variables in the statistical learning techniques, but adding it to global models that already contain earlier remote sensing products does not improve the prediction accuracy.
Keywords: Air pollution; Global scale; High resolution; Statistical learning; TROPOMI; Temporal.
Copyright © 2020 The Authors. Published by Elsevier Ltd. All rights reserved.