COVID-19 Risk Stratification and Mortality Prediction in Hospitalized Indian Patients: Harnessing clinical data for public health benefits

PLoS One. 2022 Mar 17;17(3):e0264785. doi: 10.1371/journal.pone.0264785. eCollection 2022.

Abstract

The variability in the clinical course and prognosis of COVID-19 highlights the need for patient sub-group risk stratification based on clinical data. In this study, clinical data from a cohort of hospitalized Indian COVID-19 patients are used to develop risk stratification and mortality prediction models. We analyzed a set of 70 clinical parameters, including physiological and hematological parameters, to develop machine learning models and identify biomarkers. We also compared the Indian and Wuhan cohorts and analyzed the role of steroids. A bootstrap-averaged ensemble of Bayesian networks was also learned to construct an explainable model for discovering actionable influences on mortality and days to outcome. We found blood parameters, diabetes, co-morbidity and SpO2 levels to be important risk stratification features, whereas mortality prediction depended only on blood parameters. XGBoost and logistic regression models yielded the best performance on risk stratification and mortality prediction, respectively (AUC 0.83 and 0.92). Blood coagulation parameters (ferritin, D-dimer and INR) and immune and inflammation parameters (IL6, LDH and neutrophil percentage) are common features for both risk and mortality prediction. Compared with Wuhan patients, Indian patients with extreme blood parameters showed a higher survival rate. Analyses of medications suggest that a higher proportion of survivors and mild patients who were administered steroids had extreme neutrophil and lymphocyte percentages. The ensemble-averaged Bayesian network structure revealed serum ferritin to be the most important predictor of mortality, and Vitamin D to influence severity independently of days to outcome. The findings are important for effective triage when healthcare infrastructure is under strain.
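The AUC scores reported above summarize how well a model ranks patients: the AUC is the probability that a randomly chosen positive case (for example, a non-survivor) receives a higher predicted score than a randomly chosen negative case. The following is a minimal sketch of that definition only. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event (e.g. death), 0 = no event.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted probabilities from some classifier.
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, which is fine for illustration; a rank-based computation would be preferable for large cohorts.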

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Bayes Theorem
  • COVID-19 / epidemiology
  • COVID-19 / etiology
  • COVID-19 / mortality*
  • Child
  • China / epidemiology
  • Female
  • Hospitalization / statistics & numerical data*
  • Humans
  • India / epidemiology
  • Machine Learning
  • Male
  • Middle Aged
  • Models, Statistical
  • Risk Assessment / methods
  • Risk Factors
  • Young Adult

Grants and funding

This work was funded by Intel Corp as part of its Pandemic Response Technology Initiative (PRTI) (https://newsroom.intel.com/news/intel-commits-technology-response-combat-coronavirus/#gs.97i8ts) under grant number CLP-0034, Council of Scientific and Industrial Research (https://www.csir.res.in/) under grant number MLP-2005, Fondation Botnar (https://www.fondationbotnar.org/) under grant number CLP-0031, Indo-U.S. Science and Technology Forum (https://www.iusstf.org/) under grant number CLP-0033, Bill & Melinda Gates Foundation (https://www.gatesfoundation.org/) under grant number CLP-0036, and Department of Science and Technology-Science and Engineering Research Board under grant number CVD/2020/000343. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.