Soft phenotyping for sepsis via EHR time-aware soft clustering

J Biomed Inform. 2024 Apr:152:104615. doi: 10.1016/j.jbi.2024.104615. Epub 2024 Feb 27.

Abstract

Objective: Sepsis is one of the most serious hospital conditions associated with high mortality. Sepsis is the result of a dysregulated immune response to infection that can lead to multiple organ dysfunction and death. Due to the wide variability in the causes of sepsis, clinical presentation, and the recovery trajectories, identifying sepsis sub-phenotypes is crucial to advance our understanding of sepsis characterization, to choose targeted treatments and optimal timing of interventions, and to improve prognostication. Prior studies have described different sub-phenotypes of sepsis using organ-specific characteristics. These studies applied clustering algorithms to electronic health records (EHRs) to identify disease sub-phenotypes. However, prior approaches did not capture temporal information and made uncertain assumptions about the relationships among the sub-phenotypes for clustering procedures.

Methods: We developed a time-aware soft clustering algorithm guided by clinical variables to identify sepsis sub-phenotypes using data available in the EHR.

Results: We identified six novel sepsis hybrid sub-phenotypes and evaluated them for medical plausibility. In addition, we built an early-warning sepsis prediction model using logistic regression.

Conclusion: Our results suggest that these novel sepsis hybrid sub-phenotypes are promising to provide more accurate information on sepsis-related organ dysfunction and sepsis recovery trajectories which can be important to inform management decisions and sepsis prognosis.

Keywords: EHR; Semi-supervised learning; Sepsis sub-phenotyping; Soft clustering.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Electronic Health Records*
  • Humans
  • Phenotype
  • Sepsis* / diagnosis