Monocytes and neutrophils play key roles in the cytokine storm triggered by SARS-CoV-2 infection, which changes their conformation and function. These changes are detectable at the cellular and molecular level and may differ from those observed in other respiratory infections. Here, we applied machine learning (ML) to develop and validate an algorithm to diagnose COVID-19 using blood parameters. In this retrospective single-center study, 49 hemogram parameters from 12,321 patients with clinical suspicion of COVID-19 who were tested by RT-PCR (4239 positive and 8082 negative) were analysed. The dataset was randomly divided into training and validation sets. Blood cell parameters and patient age were used to construct the predictive model with a support vector machine (SVM). The model constructed from the training set (5936 patients) achieved an accuracy for diagnosis of SARS-CoV-2 infection of 0.952 (95% CI: 0.875-0.892). Test sensitivity and specificity were 0.868 and 0.899, respectively, with positive (PPV) and negative (NPV) predictive values of 0.896 and 0.872, respectively (prevalence 0.50). In the validation set (4964 patients), the model achieved an accuracy of 0.894 (95% CI: 0.883-0.903). Test sensitivity and specificity were 0.8922 and 0.8951, respectively, with a PPV and NPV of 0.817 and 0.94, respectively (prevalence 0.34). The area under the receiver operating characteristic curve for the algorithm was 0.952. A negative result from this algorithm may therefore allow COVID-19 to be ruled out with a probability of 94%, the NPV observed in the validation set. This represents a substantial advance for early diagnostic orientation and for guiding clinical decisions.
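The predictive values quoted above depend on prevalence as well as on sensitivity and specificity. The following is a minimal sketch of that relationship using Bayes' rule, not the authors' code; the function name `predictive_values` is hypothetical, and the inputs are the rounded validation-set figures from the abstract, so the results only approximate the reported PPV, NPV and accuracy.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV, accuracy) for a binary test.

    Hypothetical helper for illustration: works on population
    fractions rather than on the study's patient-level counts.
    """
    tp = sensitivity * prevalence               # true positives
    fn = (1.0 - sensitivity) * prevalence       # false negatives
    tn = specificity * (1.0 - prevalence)       # true negatives
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positives

    ppv = tp / (tp + fp)   # P(infected | test positive)
    npv = tn / (tn + fn)   # P(not infected | test negative)
    accuracy = tp + tn     # fraction of all patients classified correctly
    return ppv, npv, accuracy


# Rounded validation-set values taken from the abstract.
ppv, npv, accuracy = predictive_values(
    sensitivity=0.8922, specificity=0.8951, prevalence=0.34
)
print(f"PPV={ppv:.3f} NPV={npv:.3f} accuracy={accuracy:.3f}")
```

With these rounded inputs the sketch gives an NPV of about 0.94 and an accuracy of about 0.89, in line with the reported validation figures; the PPV comes out near 0.81, slightly below the reported 0.817, presumably because of rounding in the published inputs.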
Keywords: COVID-19; SARS-CoV-2; cell morphological data; cell population data; hemogram; machine learning.
© 2023 The Authors. Journal of Cellular and Molecular Medicine published by Foundation for Cellular and Molecular Medicine and John Wiley & Sons Ltd.