Benchmarks for retrospective automated driving system crash rate analysis using police-reported crash data

Traffic Inj Prev. 2024 Nov 1:1-15. doi: 10.1080/15389588.2024.2380522. Online ahead of print.

Abstract

Objectives: With fully automated driving system (ADS; SAE Level 4) ride-hailing services expanding in the U.S., we are approaching an inflection point in the history of vehicle safety assessment. Retrospective evaluation of ADS safety impact (as was done for seatbelts, airbags, electronic stability control, etc.) can start to yield statistically credible conclusions. Measuring ADS safety impact requires a comparison to a "benchmark" crash rate. Most benchmarks generated to date have focused on the current human-driven fleet, which enables researchers to understand the impact of the introduced ADS technology on the current crash record status quo. This study aims to address, update, and extend the existing literature by leveraging police-reported crashes to generate human crash rate benchmarks for multiple geographic areas with current ADS deployments.

Methods: All of the data leveraged is publicly accessible, and the benchmark determination methodology is intended to be repeatable and transparent. Generating a benchmark that is comparable to ADS crash data involves several challenges: selecting data sources, handling underreporting and reporting thresholds, identifying the population of drivers and vehicles to compare against, choosing an appropriate severity level to assess, and matching crash and mileage exposure data.
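As an illustration of the exposure-matching step, the sketch below (not from the paper; all counts, mileage figures, and stratum names are hypothetical) shows how a benchmark rate per million vehicle miles traveled could be computed when crashes and mileage are drawn from the same geography and road type.

```python
# Minimal sketch: a benchmark crash rate is a count of police-reported crashes
# divided by matched mileage exposure, expressed per million vehicle miles
# traveled (VMT). All inputs below are hypothetical placeholders.

def benchmark_rate_per_million_miles(crash_count: float, vmt: float) -> float:
    """Police-reported crashes per million VMT for one stratum."""
    return crash_count / vmt * 1_000_000

# Hypothetical strata: crashes and VMT must come from the same geographic area,
# road types, and vehicle population for the comparison to ADS data to be valid.
strata = {
    "city_A_surface_streets": {"crashes": 3_200, "vmt": 1.1e9},
    "city_B_surface_streets": {"crashes": 5_800, "vmt": 2.4e9},
}

for name, s in strata.items():
    rate = benchmark_rate_per_million_miles(s["crashes"], s["vmt"])
    print(f"{name}: {rate:.2f} crashes per million miles")
```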

Results: We identify essential steps for generating benchmarks and present our analyses against a backdrop of existing ADS benchmark literature. One analysis applies established underreporting correction methodology to publicly available police-reported human driver data to improve comparability to publicly available ADS crash data. We also identify several important crash rate dependencies (geographic region, road type, and vehicle type) and show how failing to account for these features in ADS comparisons can bias results.
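To make the underreporting step concrete, here is a minimal sketch (again not from the paper; the correction multipliers and counts are invented placeholders, and established factors vary by severity and data source) of adjusting police-reported counts before computing a rate.

```python
# Minimal sketch: correcting police-reported crash counts for underreporting
# before computing a per-mile rate. The multipliers below are hypothetical
# placeholders, not values from the study.

UNDERREPORTING_FACTOR = {  # reported count -> estimated total count
    "property_damage_only": 2.5,
    "injury": 1.5,
    "serious_injury": 1.1,
}

def corrected_rate_per_million_miles(reported: dict, vmt: float) -> float:
    """Underreporting-corrected crashes per million VMT across severity levels."""
    corrected_total = sum(
        count * UNDERREPORTING_FACTOR[severity]
        for severity, count in reported.items()
    )
    return corrected_total / vmt * 1_000_000

# Hypothetical example for one geographic stratum.
reported = {"property_damage_only": 2_000, "injury": 900, "serious_injury": 100}
vmt = 1.1e9
uncorrected = sum(reported.values()) / vmt * 1_000_000
print(f"uncorrected: {uncorrected:.2f} per million miles")
print(f"corrected:   {corrected_rate_per_million_miles(reported, vmt):.2f} per million miles")
```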

Conclusions: Working with police-reported crash data to create crash rate benchmarks is fraught with challenges, and researchers should be cautious in their selection of crash rate benchmarks. We present these challenges, discuss their consequences, and provide analytical guidance for addressing them. This body of work aims to help the community (researchers, regulators, industry, and experts) reach consensus on how to estimate accurate benchmarks.

Keywords: Automated driving systems; safety impact analysis; traffic safety.