Purpose: Accurate risk assessment is essential for the success of population screening programs in breast cancer. Models with high sensitivity and specificity would enable programs to target more elaborate screening efforts to high-risk populations, while minimizing overtreatment for the rest. Artificial intelligence (AI)-based risk models have demonstrated a significant advance over risk models used today in clinical practice. However, the responsible deployment of novel AI requires careful validation across diverse populations. To this end, we validate our AI-based model, Mirai, across globally diverse screening populations.
Methods: We collected screening mammograms and pathology-confirmed breast cancer outcomes from Massachusetts General Hospital, USA; Novant, USA; Emory, USA; Maccabi-Assuta, Israel; Karolinska, Sweden; Chang Gung Memorial Hospital, Taiwan; and Barretos, Brazil. We evaluated Mirai using Uno's concordance index for predicting the risk of breast cancer within one to five years of the mammogram.
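For readers unfamiliar with the metric, the following is a minimal sketch of Uno's concordance index, written under stated assumptions and not taken from the study's evaluation code. It weights each comparable pair by the inverse squared Kaplan-Meier estimate of the censoring distribution, evaluated just before the earlier event time, and counts tied risk scores as half-concordant. The function names (`km_censoring_left`, `uno_c_index`), the tie-handling convention, and the toy data are all hypothetical illustrations.

```python
from typing import Callable, List, Sequence, Tuple


def km_censoring_left(times: Sequence[float],
                      events: Sequence[bool]) -> Callable[[float], float]:
    """Kaplan-Meier estimate of the censoring survival function.

    Returns G(t-), the estimated probability of remaining uncensored
    just before time t. Using the left limit is an assumption made here
    so that a censoring tied with an event does not down-weight it.
    """
    steps: List[Tuple[float, float]] = []  # (censoring time, G just after it)
    surv = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        censored = sum(1 for x, e in zip(times, events) if x == t and not e)
        if censored:
            surv *= 1.0 - censored / at_risk
            steps.append((t, surv))

    def g_left(t: float) -> float:
        g = 1.0
        for s, value in steps:
            if s < t:
                g = value
            else:
                break
        return g

    return g_left


def uno_c_index(times: Sequence[float], events: Sequence[bool],
                risks: Sequence[float], tau: float) -> float:
    """Inverse-probability-of-censoring-weighted concordance up to tau.

    times  : follow-up time per exam (event or censoring), e.g. in years
    events : True if a cancer diagnosis was observed at that time
    risks  : model risk score per exam (higher means higher risk)
    tau    : truncation horizon; only events before tau anchor a pair
    """
    g_left = km_censoring_left(times, events)
    numerator = 0.0
    denominator = 0.0
    n = len(times)
    for i in range(n):
        if not events[i] or times[i] >= tau:
            continue  # only observed events before the horizon anchor pairs
        weight = 1.0 / g_left(times[i]) ** 2
        for j in range(n):
            if times[j] > times[i]:  # j outlived i, so the pair is comparable
                denominator += weight
                if risks[i] > risks[j]:
                    numerator += weight
                elif risks[i] == risks[j]:
                    numerator += 0.5 * weight  # assumed tie convention
    if denominator == 0.0:
        raise ValueError("no comparable pairs before tau")
    return numerator / denominator


# Toy data, invented purely for illustration (not from the study).
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_events = [True, True, False, True, False]
toy_risks = [0.9, 0.2, 0.3, 0.4, 0.1]
toy_c = uno_c_index(toy_times, toy_events, toy_risks, tau=5.0)
print(f"Uno's C on toy data: {toy_c:.3f}")
```

Unlike Harrell's concordance index, this weighting is intended to keep the estimate from depending on the study-specific censoring pattern, which matters when comparing sites with different follow-up lengths. In practice, a maintained library implementation would normally be preferred over a hand-rolled version like this one.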
Results: A total of 128,793 mammograms from 62,185 patients were collected across the seven sites, of which 3,815 were followed by a cancer diagnosis within 5 years. Mirai obtained concordance indices of 0.75 (95% CI, 0.72 to 0.78), 0.75 (95% CI, 0.70 to 0.80), 0.77 (95% CI, 0.75 to 0.79), 0.77 (95% CI, 0.73 to 0.81), 0.81 (95% CI, 0.79 to 0.82), 0.79 (95% CI, 0.76 to 0.83), and 0.84 (95% CI, 0.81 to 0.88) at Massachusetts General Hospital, Novant, Emory, Maccabi-Assuta, Karolinska, Chang Gung Memorial Hospital, and Barretos, respectively.
Conclusion: Mirai, a mammography-based risk model, maintained its accuracy across globally diverse test sets from seven hospitals across five countries. This is the broadest validation to date of an AI-based breast cancer risk model and suggests that the technology can offer broad and equitable improvements in care.