This study aims to apply machine learning models to identify new biomarkers associated with the early diagnosis and prognosis of SARS-CoV-2 infection.

Plasma and serum samples from COVID-19 patients (mild, moderate, and severe), patients with other pneumonias who tested negative for COVID-19 by RT-PCR, and healthy volunteers (controls) from hospitals in four countries (China, Spain, France, and Italy) were analyzed by GC-MS, LC-MS, and NMR. Machine learning models (PCA and PLS-DA) were developed to predict the diagnosis and prognosis of COVID-19 and to identify biomarkers associated with these outcomes.

A total of 1410 patient samples were analyzed. The PLS-DA model achieved a diagnostic and prognostic accuracy of around 95% across all analyzed data. A total of 23 biomarkers (e.g., spermidine, taurine, L-aspartic acid, L-glutamic acid, L-phenylalanine, xanthine, ornithine, and ribothymidine) were identified as being associated with the diagnosis and prognosis of COVID-19. We also identified, for the first time, five new biomarkers (N-acetyl-4-O-acetylneuraminic acid, N-acetyl-L-alanine, N-acetyltryptophan, palmitoylcarnitine, and glycerol 1-myristate) that are associated with the severity and diagnosis of COVID-19. These five new biomarkers were elevated in severe COVID-19 patients compared to patients with mild disease or healthy volunteers.

The PLS-DA model predicted the diagnosis and prognosis of COVID-19 with an accuracy of around 95%. Additionally, our investigation pinpointed five novel potential biomarkers linked to the diagnosis and prognosis of COVID-19: N-acetyl-4-O-acetylneuraminic acid, N-acetyl-L-alanine, N-acetyltryptophan, palmitoylcarnitine, and glycerol 1-myristate. These biomarkers were elevated in severe COVID-19 patients compared to those with mild COVID-19 or healthy volunteers.
Keywords: Biomarker; COVID-19; Diagnosis; Machine learning; Prognosis.
© 2024. The Author(s), under exclusive licence to Società Italiana di Medicina Interna (SIMI).