Previous studies have shown promising results for machine learning (ML) models in predicting health outcomes. We developed and tested ML models for predicting Clostridioides difficile infection (CDI) in hospitalized patients. This was a retrospective cohort study conducted during 2015-2017. All inpatients tested for C. difficile were included. CDI was defined as positive glutamate dehydrogenase and toxin results. We restricted analyses to the first record of C. difficile testing per patient. Of 3514 patients tested, 136 (4%) had CDI. Age and antibiotic use within 90 days before C. difficile testing were associated with CDI (P < 0.01). We tested 10 ML methods with and without resampling. Logistic regression, random forest, and naïve Bayes models yielded the highest area under the receiver operating characteristic curve (AUC ROC): 0.6. Predicting CDI was difficult in our cohort of patients tested for C. difficile. Multiple ML models yielded only modest results in a real-world population of hospitalized patients tested for CDI.
Keywords: Clostridioides difficile; Diarrhea; Infection control; Machine learning.
Copyright © 2020 Elsevier Inc. All rights reserved.
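
To illustrate the evaluation described in the abstract, the following is a minimal sketch of how AUC ROC can be computed and how random oversampling balances a cohort with roughly 4% positives. It is not the authors' code: the abstract does not say which resampling technique or software was used, so random oversampling is an assumption, the function names `roc_auc` and `oversample_minority` are hypothetical, and the labels and scores are synthetic. Only the cohort size (3514) and case count (136) come from the abstract.

```python
import random


def roc_auc(labels, scores):
    """AUC ROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def oversample_minority(rows, labels, seed=0):
    """Random oversampling (an assumed technique): duplicate randomly chosen
    minority-class rows until both classes have the same count."""
    rng = random.Random(seed)
    n_pos = sum(labels)
    minority = 1 if n_pos * 2 < len(labels) else 0
    min_idx = [i for i, y in enumerate(labels) if y == minority]
    if not min_idx:
        return list(rows), list(labels)  # nothing to duplicate
    n_majority = len(labels) - len(min_idx)
    extra = [rng.choice(min_idx) for _ in range(n_majority - len(min_idx))]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Synthetic cohort with the class balance reported in the abstract.
labels = [1] * 136 + [0] * (3514 - 136)
rows = list(range(len(labels)))  # placeholder feature rows (hypothetical)

balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(len(balanced_labels), sum(balanced_labels))

# Uninformative random scores should give an AUC near 0.5, the chance level
# against which the reported 0.6 can be read.
rng = random.Random(1)
scores = [rng.random() for _ in labels]
print(round(roc_auc(labels, scores), 2))
```

In a real analysis, resampling would normally be applied to the training split only, so that the AUC is still measured on data with the original class balance.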