Objectives: This systematic review aims to provide further insights into the conduct and reporting of clinical prediction model development and validation over time. We focus on assessing the reporting of information necessary to enable external validation by other investigators.
Materials and methods: We searched Embase, Medline, Web-of-Science, Cochrane Library, and Google Scholar to identify studies that developed 1 or more multivariable prognostic prediction models using electronic health record (EHR) data published in the period 2009-2019.
Results: We identified 422 studies that developed a total of 579 clinical prediction models using EHR data. We observed a steep increase over the years in the number of developed models. The percentage of models externally validated in the same paper remained at around 10%. Throughout 2009-2019, for both the target population and the outcome definitions, code lists were provided for less than 20% of the models. For about half of the models that were developed using regression analysis, the final model was not completely presented.
Discussion: Overall, we observed limited improvement over time in the conduct and reporting of clinical prediction model development and validation. In particular, the prediction problem definition was often not clearly reported, and the final model was often not completely presented.
Conclusion: Improvement in the reporting of information necessary to enable external validation by other investigators is still urgently needed to increase clinical adoption of developed models.
Keywords: clinical decision support; clinical prediction model; electronic health record; external validation; machine learning.
© The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association.