Importance: In the US, more than 600 000 adults will experience an acute myocardial infarction (AMI) each year, and up to 20% of the patients will be rehospitalized within 30 days. This study highlights the need for consideration of calibration in these risk models.
Objective: To compare multiple machine learning risk prediction models using an electronic health record (EHR)-derived data set standardized to a common data model.
Design, setting, and participants: This was a retrospective cohort study that developed risk prediction models for 30-day readmission among all inpatients discharged from Vanderbilt University Medical Center between January 1, 2007, and December 31, 2016, with a primary diagnosis of AMI who were not transferred from another facility. The model was externally validated at Dartmouth-Hitchcock Medical Center from April 2, 2011, to December 31, 2016. Data analysis occurred between January 4, 2019, and November 15, 2020.
Exposures: Acute myocardial infarction that required hospital admission.
Main outcomes and measures: The main outcome was thirty-day hospital readmission. A total of 141 candidate variables were considered from administrative codes, medication orders, and laboratory tests. Multiple risk prediction models were developed using parametric models (elastic net, least absolute shrinkage and selection operator, and ridge regression) and nonparametric models (random forest and gradient boosting). The models were assessed using holdout data with area under the receiver operating characteristic curve (AUROC), percentage of calibration, and calibration curve belts.
Results: The final Vanderbilt University Medical Center cohort included 6163 unique patients, among whom the mean (SD) age was 67 (13) years, 4137 were male (67.1%), 1019 (16.5%) were Black or other race, and 933 (15.1%) were rehospitalized within 30 days. The final Dartmouth-Hitchcock Medical Center cohort included 4024 unique patients, with mean (SD) age of 68 (12) years; 2584 (64.2%) were male, 412 (10.2%) were rehospitalized within 30 days, and most of the cohort were non-Hispanic and White. The final test set AUROC performance was between 0.686 to 0.695 for the parametric models and 0.686 to 0.704 for the nonparametric models. In the validation cohort, AUROC performance was between 0.558 to 0.655 for parametric models and 0.606 to 0.608 for nonparametric models.
Conclusions and relevance: In this study, 5 machine learning models were developed and externally validated to predict 30-day readmission AMI hospitalization. These models can be deployed within an EHR using routinely collected data.