Nationwide real-world implementation of AI for cancer detection in population-based mammography screening

Nora Eisemann; Stefan Bunk; Trasias Mukama; Hannah Baltus; Susanne A Elsner; Timo Gomille; Gerold Hecht; Sylvia Heywang-Köbrunner; Regine Rathmann; Katja Siegmann-Luz; Thilo Töllner; Toni Werner Vomweg; Christian Leibig; Alexander Katalinic

doi:10.1038/s41591-024-03408-6

Nat Med. 2025 Jan 7. doi: 10.1038/s41591-024-03408-6. Online ahead of print.

Affiliations

¹ Institute for Social Medicine and Epidemiology, University of Lübeck, Lübeck, Germany.
² Vara, Berlin, Germany.
³ Vara, Berlin, Germany.
⁴ Diagnosticum Visiorad, Pinneberg, Germany.
⁵ Reference Center Mammography North, German Breast Cancer Screening Program, Oldenburg, Germany.
⁶ Reference Center Mammography Munich, German Breast Cancer Screening Program and FFB gGmbH, Munich, Germany.
⁷ Radiology Center Schwarzer Bär, Hannover, Germany.
⁸ Reference Center Mammography Berlin, German Breast Cancer Screening Program, Berlin, Germany.
⁹ Clinic Dr. Hancken, Stade, Germany.
¹⁰ Radiological Institute Dr. von Essen, Koblenz, Germany.
¹¹ Institute for Social Medicine and Epidemiology, University of Lübeck, Lübeck, Germany.

# Contributed equally.

PMID: 39775040
DOI: 10.1038/s41591-024-03408-6

Abstract

Artificial intelligence (AI) in mammography screening has shown promise in retrospective evaluations, but few prospective studies exist. PRAIM is an observational, multicenter, real-world, noninferiority, implementation study comparing the performance of AI-supported double reading to standard double reading (without AI) among women (50-69 years old) undergoing organized mammography screening at 12 sites in Germany. Radiologists in this study voluntarily chose whether to use the AI system. From July 2021 to February 2023, a total of 463,094 women were screened (260,739 with AI support) by 119 radiologists. Radiologists in the AI-supported screening group achieved a breast cancer detection rate of 6.7 per 1,000, which was 17.6% (95% confidence interval: +5.7%, +30.8%) higher than and statistically superior to the rate (5.7 per 1,000) achieved in the control group. The recall rate in the AI group was 37.4 per 1,000, which was lower than and noninferior to that (38.3 per 1,000) in the control group (percentage difference: -2.5% (-6.5%, +1.7%)). The positive predictive value (PPV) of recall was 17.9% in the AI group compared to 14.9% in the control group. The PPV of biopsy was 64.5% in the AI group versus 59.2% in the control group. Compared to standard double reading, AI-supported double reading was associated with a higher breast cancer detection rate without negatively affecting the recall rate, strongly indicating that AI can improve mammography screening metrics.
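The headline figures in the abstract can be roughly cross-checked from the rounded rates it reports. The sketch below is a minimal illustration, not the authors' analysis: the function names are hypothetical, the relative differences are crude ratios of the rounded per-1,000 rates (so they land near, but not exactly on, the published 17.6% and -2.5%, which were presumably estimated from unrounded data), and the PPV of recall is approximated as the detection rate divided by the recall rate, assuming every detected cancer was found through a recall.

```python
def relative_difference(rate_ai: float, rate_control: float) -> float:
    """Percentage difference of the AI-group rate relative to the control-group rate."""
    return (rate_ai - rate_control) / rate_control * 100.0


def ppv_of_recall(detection_rate: float, recall_rate: float) -> float:
    """Approximate PPV of recall in percent: cancers detected per recalled woman.

    Assumption: both rates share the same denominator (per 1,000 screened)
    and every detected cancer was found through a recall.
    """
    return detection_rate / recall_rate * 100.0


# Rounded rates per 1,000 women screened, as reported in the abstract.
CDR_AI, CDR_CONTROL = 6.7, 5.7
RECALL_AI, RECALL_CONTROL = 37.4, 38.3

cdr_diff = relative_difference(CDR_AI, CDR_CONTROL)           # crude, from rounded rates
recall_diff = relative_difference(RECALL_AI, RECALL_CONTROL)  # crude, from rounded rates
ppv_ai = ppv_of_recall(CDR_AI, RECALL_AI)
ppv_control = ppv_of_recall(CDR_CONTROL, RECALL_CONTROL)

print(f"Detection rate difference: {cdr_diff:+.1f}%")
print(f"Recall rate difference:    {recall_diff:+.1f}%")
print(f"PPV of recall (AI):        {ppv_ai:.1f}%")
print(f"PPV of recall (control):   {ppv_control:.1f}%")
```

With these inputs the approximate PPVs of recall round to the 17.9% and 14.9% stated in the abstract, which suggests the reported metrics are mutually consistent. The PPV of biopsy cannot be checked this way because the abstract does not give biopsy rates.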