Improving the transferability of the crash prediction model using the TrAdaBoost.R2 algorithm

Accid Anal Prev. 2020 Jun;141:105551. doi: 10.1016/j.aap.2020.105551. Epub 2020 Apr 23.

Abstract

The crash prediction model is a useful tool for traffic administrators to identify significant risk factors, estimate crash frequency, and screen hazardous locations, but some jurisdictions interested in traffic safety analysis can collect only limited or low-quality data. Existing crash prediction models can be transferred to such jurisdictions if calibrated, but the current aggregate calibration method limits prediction accuracy and the disaggregate method is resource-consuming. Transfer learning is another approach to calibration: it acquires knowledge from old data domains to solve problems in new data domains. An instance-based transfer learning technique, TrAdaBoost.R2, is adopted in this study because it meets the requirements of site-based crash prediction model transfer. TrAdaBoost.R2 was compared with AdaBoost.R2 trained on a simply pooled data set to examine its efficiency in extracting knowledge from a spatially outdated source data domain (old data domain). The target data domain (new data domain) was sampled to test the technique's adaptability to small sample sizes. The calibration factor method, based on a negative binomial model, was used as a benchmark for the predictive performance of the transfer learning technique. Mean square error was calculated to evaluate prediction accuracy. Two cities in China, Shanghai and Guangzhou, each served in turn as the source data domain and the target data domain. Results showed that the models constructed with TrAdaBoost.R2 had better prediction accuracy than the conventional calibration method. TrAdaBoost.R2 is recommended for its predictive performance and its adaptability to small sample sizes. It is also proposed that crash prediction models be constructed separately for peak and off-peak hours.
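As a rough illustration of the conventional baseline mentioned above, the sketch below applies an aggregate calibration factor and scores it with mean square error. It assumes the common definition of the factor as total observed crashes in the target domain divided by total crashes predicted by the source model; the abstract does not spell out the exact formulation used in the study. The function names and the crash counts are hypothetical, and the source-model predictions stand in for the output of a fitted negative binomial model.

```python
import numpy as np

def calibration_factor(observed, predicted):
    """Aggregate calibration factor: total observed / total predicted crashes.

    Assumed definition; the study's exact formulation may differ.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return observed.sum() / predicted.sum()

def mean_square_error(observed, predicted):
    """Mean of squared differences between observed and predicted counts."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((observed - predicted) ** 2))

# Hypothetical target-domain sites: observed crash counts and the
# predictions of a model fitted on the source domain (illustrative only).
observed_target = np.array([3, 0, 5, 2, 4])
source_model_pred = np.array([2.0, 1.0, 3.0, 1.5, 2.5])

# One scalar rescales every site's prediction by the same amount.
c = calibration_factor(observed_target, source_model_pred)
calibrated_pred = c * source_model_pred

mse_uncalibrated = mean_square_error(observed_target, source_model_pred)
mse_calibrated = mean_square_error(observed_target, calibrated_pred)

print(f"calibration factor: {c:.3f}")
print(f"MSE before calibration: {mse_uncalibrated:.3f}")
print(f"MSE after calibration:  {mse_calibrated:.3f}")
```

Because a single scalar adjusts all sites uniformly, this approach matches the total crash count but cannot correct site-level differences between domains, which is the accuracy limitation that motivates an instance-based method such as TrAdaBoost.R2.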

Keywords: Calibration factor; Crash prediction model; Negative binomial model; TrAdaBoost.R2; Transferability.