The Canadian Optimized Statistical Smoke Exposure Model (CanOSSEM): A machine learning approach to estimate national daily fine particulate matter (PM2.5) exposure

Sci Total Environ. 2022 Dec 1;850:157956. doi: 10.1016/j.scitotenv.2022.157956. Epub 2022 Aug 15.

Abstract

Exposure to biomass smoke has been associated with a wide range of acute and chronic health outcomes. Over the past decades, the frequency and intensity of wildfires have increased in many areas, resulting in longer smoke episodes with higher concentrations of fine particulate matter (PM2.5). There are also many communities where seasonal open burning and residential wood heating have short- and long-term impacts on ambient air quality. Understanding the acute and chronic health effects of biomass smoke exposure requires reliable estimates of PM2.5 concentrations during the wildfire season and throughout the year, particularly in areas without regulatory air quality monitoring stations. We have developed a machine learning approach to estimate PM2.5 across all populated regions of Canada from 2010 to 2019. The random forest machine learning model uses potential predictor variables integrated from multiple data sources and estimates daily mean (24-hour) PM2.5 concentrations at a 5 km × 5 km spatial resolution. The training and prediction datasets were generated using observations from the National Air Pollution Surveillance (NAPS) network. The Root Mean Squared Error (RMSE) between predicted and observed PM2.5 concentrations was 2.96 μg/m3 for the entire prediction set, and more than 96% of the predictions were within 5 μg/m3 of the NAPS PM2.5 measurements. The model was evaluated using 10-fold, leave-one-region-out, and leave-one-year-out cross-validations. Overall, CanOSSEM performed well, but performance was sensitive to removal of large wildfire events such as the Fort McMurray interface fire in May 2016 or the extreme 2017 and 2018 wildfire seasons in British Columbia. Exposure estimates from CanOSSEM will be useful for epidemiologic studies on the acute and chronic health effects associated with PM2.5 exposure, especially for populations affected by biomass smoke where routine air quality measurements are not available.
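
The evaluation summarized in the abstract (RMSE, the share of predictions within 5 μg/m3 of the measurements, and leave-one-region-out or leave-one-year-out cross-validation) can be illustrated with a minimal Python sketch. This is not the CanOSSEM code: the function names, the toy concentrations, and the choice to count an absolute error of exactly 5 μg/m3 as "within" are assumptions made only for illustration.

```python
import math
from collections import defaultdict


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(observed, predicted, tolerance=5.0):
    """Share of predictions whose absolute error is at most `tolerance`.

    Assumption: the boundary is inclusive (<=); the abstract does not say.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per distinct group.

    `groups` holds one label per observation, e.g. a year or a region name.
    """
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    for label in sorted(by_group):
        test = by_group[label]
        train = [i for i, g in enumerate(groups) if g != label]
        yield label, train, test


# Hypothetical daily PM2.5 values in μg/m3 (not data from the study).
observed = [4.0, 10.0, 55.0, 7.0]
predicted = [5.0, 8.0, 47.0, 7.0]
years = [2016, 2017, 2017, 2018]

print(round(rmse(observed, predicted), 2))
print(fraction_within(observed, predicted))
for held_out, train_idx, test_idx in leave_one_group_out(years):
    print(held_out, train_idx, test_idx)
```

In a leave-one-year-out run, a model would be refit on the `train` indices and scored on the `test` indices for each held-out year. Holding out a year that contains an extreme smoke season removes the high-concentration examples from training, which is one plausible reading of why the reported performance was sensitive to removing large wildfire events.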

Keywords: Biomass smoke; Exposure assessment; Machine learning; Modeling; PM2.5.

MeSH terms

  • Air Pollutants* / analysis
  • Air Pollution* / analysis
  • British Columbia
  • Machine Learning
  • Particulate Matter / analysis
  • Smoke / analysis

Substances

  • Air Pollutants
  • Particulate Matter
  • Smoke