Modest Clostridioides difficile infection prediction using machine learning models in a tertiary care hospital

Diagn Microbiol Infect Dis. 2020 Oct;98(2):115104. doi: 10.1016/j.diagmicrobio.2020.115104. Epub 2020 Jun 8.

Abstract

Previous studies have shown promising results for machine learning (ML) models in predicting health outcomes. We developed and tested ML models for predicting Clostridioides difficile infection (CDI) in hospitalized patients. This retrospective cohort study was conducted during 2015-2017 and included all inpatients tested for C. difficile. CDI was defined as positive glutamate dehydrogenase and toxin results. We restricted analyses to the first record of C. difficile testing per patient. Of 3514 patients tested, 136 (4%) had CDI. Age and antibiotic use within 90 days before C. difficile testing were associated with CDI (P < 0.01). We tested 10 ML methods with and without resampling. Logistic regression, random forest, and naïve Bayes models yielded the highest area under the ROC curve (AUC): 0.6. Predicting CDI was difficult in our cohort: multiple ML models yielded only modest results in a real-world population of hospitalized patients tested for CDI.
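The two technical ingredients named in the abstract, AUC ROC and resampling of an imbalanced outcome (136 of 3514 patients, about 4%, had CDI), can be illustrated with a minimal Python sketch. The abstract does not state which resampling scheme or software was used, so random oversampling of the minority class is an assumption here, and the function names `auc_roc` and `oversample_minority` are hypothetical, chosen only for illustration. AUC is computed in its rank form: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
import random


def auc_roc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties contribute 0.5. A model with no discrimination gives 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def oversample_minority(rows, labels, seed=0):
    """Random oversampling (an assumed scheme, not necessarily the study's).

    Duplicates randomly chosen minority-class rows until both classes
    have the same count. Apply to training data only, never to test data.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    if len(pos_idx) < len(neg_idx):
        minority, majority = pos_idx, neg_idx
    else:
        minority, majority = neg_idx, pos_idx
    if not minority:
        return list(rows), list(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]
```

On this scale, an AUC of 0.6 means a model ranks a true CDI case above a non-case only about 60% of the time, which is why the results are described as modest. Resampling changes the class balance a model is trained on; it does not add information about the outcome.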

Keywords: Clostridioides difficile; Diarrhea; Infection control; Machine learning.

MeSH terms

  • Aged
  • Anti-Bacterial Agents / therapeutic use
  • Area Under Curve
  • Bayes Theorem
  • Clostridioides difficile*
  • Clostridium Infections / diagnosis*
  • Diarrhea / microbiology
  • Female
  • Forecasting / methods*
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Middle Aged
  • ROC Curve
  • Retrospective Studies
  • Tertiary Care Centers

Substances

  • Anti-Bacterial Agents