Availability of Evidence for Predictive Machine Learning Algorithms in Primary Care: A Systematic Review

Margot M Rakers; Marieke M van Buchem; Sergej Kucenko; Anne de Hond; Ilse Kant; Maarten van Smeden; Karel G M Moons; Artuur M Leeuwenberg; Niels Chavannes; María Villalobos-Quesada; Hendrikus J A van Os

doi:10.1001/jamanetworkopen.2024.32990

JAMA Netw Open. 2024 Sep 3;7(9):e2432990. doi: 10.1001/jamanetworkopen.2024.32990.

Authors

Margot M Rakers¹,², Marieke M van Buchem³, Sergej Kucenko⁴, Anne de Hond⁵, Ilse Kant⁵, Maarten van Smeden⁶, Karel G M Moons⁶, Artuur M Leeuwenberg⁶, Niels Chavannes¹,², María Villalobos-Quesada¹,², Hendrikus J A van Os¹,²

Affiliations

¹ Department of Public Health and Primary Care, Leiden University Medical Centre, ZA Leiden, the Netherlands.
² National eHealth Living Lab, Leiden University Medical Centre, ZA Leiden, the Netherlands.
³ Department of Information Technology and Digital Innovation, Leiden University Medical Center, ZA Leiden, the Netherlands.
⁴ Hamburg University of Applied Sciences, Department of Health Sciences, Ulmenliet 20, Hamburg, Germany.
⁵ Department of Digital Health, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, CG Utrecht, the Netherlands.
⁶ Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, CG Utrecht, the Netherlands.

Abstract

Importance: The aging and multimorbid population and health personnel shortages pose a substantial burden on primary health care. While predictive machine learning (ML) algorithms have the potential to address these challenges, concerns include transparency and insufficient reporting of model validation and effectiveness of the implementation in the clinical workflow.

Objectives: To systematically identify predictive ML algorithms implemented in primary care from peer-reviewed literature and US Food and Drug Administration (FDA) and Conformité Européenne (CE) registration databases and to ascertain the public availability of evidence, including peer-reviewed literature, gray literature, and technical reports across the artificial intelligence (AI) life cycle.

Evidence review: PubMed, Embase, Web of Science, Cochrane Library, Emcare, Academic Search Premier, IEEE Xplore, ACM Digital Library, MathSciNet, AAAI.org (Association for the Advancement of Artificial Intelligence), arXiv, Epistemonikos, PsycINFO, and Google Scholar were searched for studies published between January 2000 and July 2023, with search terms that were related to AI, primary care, and implementation. The search extended to CE-marked or FDA-approved predictive ML algorithms obtained from relevant registration databases. Three reviewers gathered subsequent evidence using strategies such as product searches, exploration of references, manufacturer website visits, and direct inquiries to authors and product owners. The extent to which the evidence for each predictive ML algorithm aligned with the Dutch AI predictive algorithm (AIPA) guideline requirements was assessed per AI life cycle phase, producing evidence availability scores.

Findings: The systematic search identified 43 predictive ML algorithms, of which 25 were commercially available and CE-marked or FDA-approved. The predictive ML algorithms spanned multiple clinical domains, but most (27 [63%]) focused on cardiovascular diseases and diabetes. Most (35 [81%]) were published within the past 5 years. The availability of evidence varied across different phases of the predictive ML algorithm life cycle, with evidence being reported the least for phase 1 (preparation) and phase 5 (impact assessment) (19% and 30%, respectively). Twelve (28%) predictive ML algorithms achieved approximately half of their maximum individual evidence availability score. Overall, predictive ML algorithms from peer-reviewed literature showed higher evidence availability compared with those from FDA-approved or CE-marked databases (45% vs 29%).

Conclusions and relevance: The findings indicate an urgent need to improve the availability of evidence regarding the predictive ML algorithms' quality criteria. Adopting the Dutch AIPA guideline could facilitate transparent and consistent reporting of the quality criteria, which could foster trust among end users and facilitate large-scale implementation.

Publication types

Systematic Review

MeSH terms

Algorithms*
Humans
Machine Learning*
Primary Health Care* / standards
Primary Health Care* / statistics & numerical data