Using machine learning of clinical data to diagnose COVID-19: a systematic review and meta-analysis

Wei Tse Li; Jiayan Ma; Neil Shende; Grant Castaneda; Jaideep Chakladar; Joseph C Tsai; Lauren Apostol; Christine O Honda; Jingyue Xu; Lindsay M Wong; Tianyi Zhang; Abby Lee; Aditi Gnanasekar; Thomas K Honda; Selena Z Kuo; Michael Andrew Yu; Eric Y Chang; Mahadevan Raj Rajasekaran; Weg M Ongkeko

doi:10.1186/s12911-020-01266-z

BMC Med Inform Decis Mak. 2020 Sep 29;20(1):247. doi: 10.1186/s12911-020-01266-z.

Authors

Wei Tse Li¹ ², Jiayan Ma¹ ², Neil Shende¹ ², Grant Castaneda¹ ², Jaideep Chakladar¹ ², Joseph C Tsai¹ ², Lauren Apostol¹ ², Christine O Honda¹ ², Jingyue Xu¹ ², Lindsay M Wong¹ ², Tianyi Zhang¹ ², Abby Lee¹ ², Aditi Gnanasekar¹ ², Thomas K Honda¹ ², Selena Z Kuo³, Michael Andrew Yu⁴, Eric Y Chang⁵ ⁶, Mahadevan Raj Rajasekaran⁷ ⁸, Weg M Ongkeko⁹ ¹⁰

Affiliations

¹ Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, UC San Diego School of Medicine, San Diego, CA, 92093, USA.
² Research Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA.
³ Department of Medicine, Columbia University Medical Center, New York, NY, 10032, USA.
⁴ Department of Internal Medicine, Emory University School of Medicine, Atlanta, GA, 30322, USA.
⁵ Department of Radiology, University of California San Diego, San Diego, CA, 92093, USA.
⁶ Radiology Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA.
⁷ Department of Urology, University of California San Diego, San Diego, CA, 92093, USA.
⁸ Urology Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA.
⁹ Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, UC San Diego School of Medicine, San Diego, CA, 92093, USA.
¹⁰ Research Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA.

Abstract

Background: The recent Coronavirus Disease 2019 (COVID-19) pandemic has placed severe stress on healthcare systems worldwide, which is amplified by the critical shortage of COVID-19 tests.

Methods: In this study, we set out to build a more accurate COVID-19 diagnostic model based on patient symptoms and routine test results by applying machine learning to reanalyze COVID-19 data from 151 published studies. We aimed to investigate correlations between clinical variables, cluster COVID-19 patients into subtypes, and generate a computational classification model that discriminates between COVID-19 patients and influenza patients based on clinical variables alone.

Results: We discovered several novel associations between clinical variables, including correlations between being male and having higher levels of serum lymphocytes and neutrophils. We found that COVID-19 patients could be clustered into subtypes based on serum levels of immune cells, gender, and reported symptoms. Finally, we trained an XGBoost model to achieve a sensitivity of 92.5% and a specificity of 97.9% in discriminating COVID-19 patients from influenza patients.
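For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from a binary classifier's predictions. It is not the authors' code: the function name `sensitivity_specificity` and the toy labels are assumptions made purely for illustration, and the numbers it prints are unrelated to the study's results.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for two equal-length label lists.

    Sensitivity = TP / (TP + FN): share of true positives that were detected.
    Specificity = TN / (TN + FP): share of true negatives that were detected.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy labels (assumption): 1 = COVID-19, 0 = influenza.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this setting, high sensitivity means few COVID-19 patients are misclassified as influenza, while high specificity means few influenza patients are misclassified as COVID-19.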

Conclusions: We demonstrated that computational methods trained on large clinical datasets could yield increasingly accurate COVID-19 diagnostic models to mitigate the impact of limited testing. We also presented previously unknown COVID-19 clinical variable correlations and clinical subgroups.

Keywords: COVID-19; Diagnostic model; Machine learning.

Publication types

Meta-Analysis
Research Support, Non-U.S. Gov't
Systematic Review

MeSH terms

Betacoronavirus
COVID-19
COVID-19 Testing
Clinical Laboratory Techniques / methods*
Computer Simulation
Coronavirus Infections / classification
Coronavirus Infections / diagnosis*
Datasets as Topic
Diagnosis, Differential
Female
Humans
Influenza A virus
Influenza, Human / diagnosis*
Machine Learning*
Male
Pandemics / classification
Pneumonia, Viral / classification
Pneumonia, Viral / diagnosis*
SARS-CoV-2
Sensitivity and Specificity

Grants and funding

R00RG2369/Office of the President, University of California/International