Predicting patients with septic shock and sepsis through analyzing whole-blood expression of NK cell-related hub genes using an advanced machine learning framework

Front Immunol. 2024 Nov 28:15:1493895. doi: 10.3389/fimmu.2024.1493895. eCollection 2024.

Abstract

Background: Sepsis is a life-threatening condition that causes millions of deaths globally each year. The need for biomarkers to predict the progression of sepsis to septic shock remains critical, with rapid, reliable methods still lacking. Transcriptomics data has recently emerged as a valuable resource for disease phenotyping and endotyping, making it a promising tool for predicting disease stages. Therefore, we aimed to establish an advanced machine learning framework to predict sepsis and septic shock using transcriptomics datasets with rapid turnaround methods.

Methods: We retrieved four NCBI GEO transcriptomics datasets previously generated from peripheral blood samples of healthy individuals and patients with sepsis and septic shock. The datasets were processed for bioinformatic analysis and supplemented with a series of bench experiments, leading to the identification of a hub gene panel relevant to sepsis and septic shock. The hub gene panel was used to establish a novel prediction model to distinguish sepsis from septic shock through a multistage machine learning pipeline incorporating linear discriminant analysis, risk score analysis, and an ensemble method combined with Least Absolute Shrinkage and Selection Operator (LASSO) analysis. Finally, we validated the prediction model with the hub gene dataset generated by RT-qPCR using peripheral blood samples from newly recruited patients.

Results: Our analysis identified six hub genes (GZMB, PRF1, KLRD1, SH2D1A, LCK, and CD247), which are related to NK cell cytotoxicity and septic shock and are collectively termed 6-HubGss. Using this panel, we created SepxFindeR, a machine learning model that demonstrated high accuracy in predicting sepsis and septic shock and in distinguishing septic shock from sepsis in a cross-database context. Remarkably, the SepxFindeR model proved compatible with RT-qPCR datasets based on the 6-HubGss panel, facilitating the identification of newly recruited patients with sepsis and septic shock.

Conclusions: Our bioinformatic approach led to the discovery of the 6-HubGss biomarker panel and the development of the SepxFindeR machine learning model, enabling accurate prediction of septic shock and distinction from sepsis with rapid processing capabilities.
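
The abstract does not give the details of the risk score step, so the following is only a minimal sketch of how a linear risk score over the six-gene panel could be computed and dichotomized at the cohort median. The gene names come from the abstract; everything else is an assumption made for illustration: the weights in `WEIGHTS`, the expression values in `SAMPLES`, the median cutoff, and the function names `risk_score` and `classify`. It is not the SepxFindeR implementation.

```python
from statistics import median

# The six hub genes named in the abstract (the 6-HubGss panel).
HUB_GENES = ["GZMB", "PRF1", "KLRD1", "SH2D1A", "LCK", "CD247"]

# HYPOTHETICAL weights, not taken from the paper. In a real pipeline they
# would be fitted to data (for example by a LASSO-penalized regression).
# The sign and magnitude used here are arbitrary placeholders.
WEIGHTS = {gene: -1.0 for gene in HUB_GENES}

# SYNTHETIC expression values (arbitrary units), invented for this example.
SAMPLES = {
    "S1": {gene: 8.0 for gene in HUB_GENES},
    "S2": {gene: 6.0 for gene in HUB_GENES},
    "S3": {gene: 4.0 for gene in HUB_GENES},
    "S4": {gene: 2.0 for gene in HUB_GENES},
}


def risk_score(expression, weights=WEIGHTS):
    """Return the weighted sum of hub-gene expression for one sample."""
    return sum(weights[gene] * expression[gene] for gene in HUB_GENES)


def classify(samples, weights=WEIGHTS):
    """Score every sample and split the cohort at the median score.

    Returns a (labels, cutoff) pair, where labels maps each sample id
    to "high" (score above the cutoff) or "low" (score at or below it).
    """
    scores = {sid: risk_score(expr, weights) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    labels = {
        sid: ("high" if score > cutoff else "low")
        for sid, score in scores.items()
    }
    return labels, cutoff
```

Calling `classify(SAMPLES)` returns the per-sample labels together with the cutoff that was used. A median split is only one common convention; a cutoff chosen from a ROC curve on training data would be another reasonable option.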

Keywords: SepxFindeR model; biomarkers; machine learning for disease diagnosis; sepsis; septic shock; translational medicine.

MeSH terms

  • Biomarkers* / blood
  • Computational Biology / methods
  • Female
  • Gene Expression Profiling
  • Humans
  • Killer Cells, Natural* / immunology
  • Killer Cells, Natural* / metabolism
  • Machine Learning*
  • Male
  • Prognosis
  • Sepsis / blood
  • Sepsis / diagnosis
  • Sepsis / genetics
  • Sepsis / immunology
  • Shock, Septic* / blood
  • Shock, Septic* / diagnosis
  • Shock, Septic* / genetics
  • Shock, Septic* / immunology
  • Transcriptome

Substances

  • Biomarkers

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by the Dorothy M. and Edward E. Burwell Endorsement Professorship (X-DT). The Dorothy M. and Edward E. Burwell Professorship had no role in study design, data collection and analysis, interpretation of data, decision to publish, or preparation of the manuscript.