Using machine learning of clinical data to diagnose COVID-19: a systematic review and meta-analysis

BMC Med Inform Decis Mak. 2020 Sep 29;20(1):247. doi: 10.1186/s12911-020-01266-z.

Abstract

Background: The recent Coronavirus Disease 2019 (COVID-19) pandemic has placed severe stress on healthcare systems worldwide, which is amplified by the critical shortage of COVID-19 tests.

Methods: In this study, we propose to generate a more accurate diagnostic model of COVID-19 based on patient symptoms and routine test results by applying machine learning to reanalyze COVID-19 data from 151 published studies. We aim to investigate correlations between clinical variables, cluster COVID-19 patients into subtypes, and generate a computational classification model for discriminating between COVID-19 patients and influenza patients based on clinical variables alone.

Results: We discovered several novel associations between clinical variables, including correlations between being male and having higher levels of serum lymphocytes and neutrophils. We found that COVID-19 patients could be clustered into subtypes based on serum levels of immune cells, gender, and reported symptoms. Finally, we trained an XGBoost model to achieve a sensitivity of 92.5% and a specificity of 97.9% in discriminating COVID-19 patients from influenza patients.
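
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how sensitivity and specificity are computed from binary predictions. It is an illustration only, not the authors' code: the function name and the toy labels are hypothetical, with 1 standing for COVID-19 and 0 for influenza.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives correctly ruled out
    return sensitivity, specificity


# Hypothetical toy labels, NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this framing, the reported sensitivity of 92.5% is the fraction of COVID-19 patients the model labeled as COVID-19, and the specificity of 97.9% is the fraction of influenza patients it labeled as influenza.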

Conclusions: We demonstrated that computational methods trained on large clinical datasets could yield increasingly accurate COVID-19 diagnostic models to mitigate the impact of limited testing. We also presented previously unknown COVID-19 clinical variable correlations and clinical subgroups.

Keywords: COVID-19; Diagnostic model; Machine learning.

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't
  • Systematic Review

MeSH terms

  • Betacoronavirus
  • COVID-19
  • COVID-19 Testing
  • Clinical Laboratory Techniques / methods*
  • Computer Simulation
  • Coronavirus Infections / classification
  • Coronavirus Infections / diagnosis*
  • Datasets as Topic
  • Diagnosis, Differential
  • Female
  • Humans
  • Influenza A virus
  • Influenza, Human / diagnosis*
  • Machine Learning*
  • Male
  • Pandemics / classification
  • Pneumonia, Viral / classification
  • Pneumonia, Viral / diagnosis*
  • SARS-CoV-2
  • Sensitivity and Specificity