Potential Impact of an Artificial Intelligence-based Mammography Triage Algorithm on Performance and Workload in a Population-based Screening Sample

Alyssa T Watanabe; Hoanh Vu; Chi Y Chim; Andrew W Litt; Tara Retson; Ray C Mayo

doi:10.1093/jbi/wbae056

J Breast Imaging. 2024 Sep 8:wbae056. doi: 10.1093/jbi/wbae056. Online ahead of print.

Authors

Alyssa T Watanabe¹ ², Hoanh Vu², Chi Y Chim², Andrew W Litt³, Tara Retson⁴, Ray C Mayo⁵

Affiliations

¹ Department of Radiology, Keck School of Medicine, University of Southern California (USC), Los Angeles, CA, USA.
² CureMetrix Incorporated, San Diego, CA, USA.
³ Cornice Health Ventures, LLC, Miami Beach, FL, USA.
⁴ Department of Radiology, University of California San Diego (UCSD), La Jolla, CA, USA.
⁵ Department of Breast Imaging, University of Texas MD Anderson Cancer Center, Houston, TX, USA.

PMID: 39245042
DOI: 10.1093/jbi/wbae056

Abstract

Objective: To evaluate potential screening mammography performance and workload impact using a commercial artificial intelligence (AI)-based triage device in a population-based screening sample.

Methods: In this retrospective study, a sample of 2129 women who underwent screening mammography was evaluated. The performance of a commercial AI-based triage device was compared with radiologists' reports, actual outcomes, and national benchmarks using commonly used mammography metrics. Up to 5 years of follow-up examination results were reviewed to establish benignity. The algorithm sorted cases into two groups, "suspicious" and "low suspicion." A theoretical workload reduction was calculated by subtracting cases triaged as "low suspicion" from the sample.
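The workload calculation described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `workload_reduction_pct` and the ten-case label list are hypothetical, and the toy counts are not study data. It assumes each case carries exactly one of the two triage labels named in the abstract.

```python
from collections import Counter


def workload_reduction_pct(triage_labels):
    """Percentage of cases removed from the reading list when every
    case labeled "low suspicion" is subtracted from the sample."""
    total = len(triage_labels)
    if total == 0:
        raise ValueError("no cases to triage")
    counts = Counter(triage_labels)
    return 100.0 * counts["low suspicion"] / total


# Toy example (hypothetical labels, NOT data from the study):
# 10 cases, 4 of them triaged as "low suspicion".
toy_labels = ["suspicious"] * 6 + ["low suspicion"] * 4
print(f"Theoretical workload reduction: {workload_reduction_pct(toy_labels):.1f}%")
```

Under this definition, the reduction is simply the share of the sample triaged as "low suspicion"; it says nothing about reading time per case.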

Results: At the default 93% sensitivity setting, there was significant improvement (P < .05) in the following triage simulation mean performance measures compared with actual outcomes: 45.5% improvement in recall rate (13.4% to 7.3%; 95% CI, 6.2-8.3), 119% improvement in positive predictive value (PPV) 1 (5.3% to 11.6%; 95% CI, 9.96-13.4), 28.5% improvement in PPV2 (24.6% to 31.6%; 95% CI, 24.8-39.1), 20% improvement in sensitivity (83.3% to 100%; 95% CI, 100-100), and 7.2% improvement in specificity (87.2% to 93.5%; 95% CI, 92.4-94.5). A theoretical 62.5% workload reduction was possible. At the ultrahigh 99% sensitivity setting, a theoretical 27% workload reduction was possible. No cancers were missed by the algorithm at either sensitivity setting.
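The improvement percentages in the Results appear to be relative changes, that is, (after − before) / before, rather than differences in percentage points. The sketch below recomputes them from the before-and-after values quoted in the abstract. The helper name `relative_change_pct` is hypothetical, and reading the figures as relative changes is an assumption, though one the arithmetic supports. For recall rate the change is negative, because a lower recall rate is the improvement.

```python
def relative_change_pct(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (actual outcome, triage simulation) pairs as quoted in the abstract, in percent.
reported = {
    "recall rate": (13.4, 7.3),
    "PPV1": (5.3, 11.6),
    "PPV2": (24.6, 31.6),
    "sensitivity": (83.3, 100.0),
    "specificity": (87.2, 93.5),
}

for metric, (before, after) in reported.items():
    change = relative_change_pct(before, after)
    print(f"{metric}: {before}% -> {after}% ({change:+.1f}% relative change)")
```

The recomputed values agree with the reported 45.5%, 119%, 28.5%, 20%, and 7.2% after rounding.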

Conclusion: Artificial intelligence-based triage in this simulation demonstrated potential for significant improvement in mammography performance and predicted substantial theoretical workload reduction without any missed cancers.

Keywords: artificial intelligence; decision support; deep learning; screening mammogram; workload triage.

© Society of Breast Imaging 2024. All rights reserved.