Combining bioinformatics and machine learning algorithms to identify and analyze shared biomarkers and pathways in COVID-19 convalescence and diabetes mellitus

Front Endocrinol (Lausanne). 2023 Dec 19:14:1306325. doi: 10.3389/fendo.2023.1306325. eCollection 2023.

Abstract

Background: Most patients who had coronavirus disease 2019 (COVID-19) fully recovered, but many others experienced acute sequelae or persistent symptoms. It is possible that recovery from acute COVID-19 is just the beginning of a chronic condition. Even after recovery, COVID-19 may exacerbate hyperglycemia or lead to new-onset diabetes mellitus (DM). In this study, we used a combination of bioinformatics and machine learning algorithms to investigate shared pathways and biomarkers in DM and COVID-19 convalescence.

Methods: Gene transcriptome datasets of COVID-19 convalescence and diabetes mellitus from the Gene Expression Omnibus (GEO) were integrated using bioinformatics methods, and differentially expressed genes (DEGs) were identified using R. These genes were then subjected to Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis to find potential pathways. The hub DEGs were identified by combining protein-protein interaction (PPI) networks and machine learning algorithms. Transcription factors (TFs) and miRNAs were predicted for DM after COVID-19 convalescence. In addition, the inflammatory and immune status of diabetes after COVID-19 convalescence was assessed by single-sample gene set enrichment analysis (ssGSEA).
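
As an illustration only, the step of finding DEGs common to two conditions can be sketched as a threshold filter followed by a set intersection. This is a minimal sketch in Python, not the authors' R workflow: the function names, the gene labels, the cutoffs (|log2 fold change| >= 1, adjusted p < 0.05), and all numeric values are assumptions made up for the example and are not reported in the abstract.

```python
from typing import Dict, Set, Tuple

# Maps a gene symbol to (log2 fold change, adjusted p-value).
DegTable = Dict[str, Tuple[float, float]]


def select_degs(table: DegTable,
                lfc_cutoff: float = 1.0,
                padj_cutoff: float = 0.05) -> Set[str]:
    """Return genes passing both cutoffs (hypothetical thresholds)."""
    return {
        gene
        for gene, (lfc, padj) in table.items()
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff
    }


def shared_degs(a: DegTable, b: DegTable) -> Set[str]:
    """Return genes that are differentially expressed in both tables."""
    return select_degs(a) & select_degs(b)


# Made-up placeholder results for two conditions; not data from the study.
covid = {"GENE_A": (1.8, 0.001), "GENE_B": (-1.2, 0.03), "GENE_C": (0.4, 0.2)}
dm = {"GENE_A": (1.1, 0.01), "GENE_B": (-0.3, 0.5), "GENE_D": (2.0, 0.0001)}

print(sorted(shared_degs(covid, dm)))
```

In a real analysis the two tables would come from a differential expression tool run on each GEO dataset, and the shared genes would feed the enrichment, PPI, and machine learning steps described above.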

Results: We developed genetic diagnostic models based on 6 core DEGs between type 1 DM (T1DM) and COVID-19 convalescence and 2 core DEGs between type 2 DM (T2DM) and COVID-19 convalescence. The models showed statistically significant differences (p<0.05) and diagnostic validity in the validation set. Analysis of immune cell infiltration suggested that a variety of immune cells may be involved in the development of DM after COVID-19 convalescence.

Conclusion: We identified a genetic diagnostic model for COVID-19 convalescence and DM containing 8 core DEGs and constructed a nomogram for the diagnosis of DM after COVID-19 convalescence.

Keywords: COVID-19 convalescence; diabetes mellitus (DM); differentially expressed genes (DEGs); gene ontology (GO); hub gene; machine learning; protein-protein interaction (PPI).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers
  • COVID-19* / diagnosis
  • COVID-19* / genetics
  • Computational Biology
  • Convalescence
  • Diabetes Mellitus*
  • Humans
  • Machine Learning

Substances

  • Biomarkers

Associated data

  • GEO/GSE227116
  • GEO/GSE193273
  • GEO/GSE156993
  • GEO/GSE163980
  • GEO/GSE166253

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This study was supported by grants from the Beijing Xinyue Foundation (2022IIT037) and The First Affiliated Hospital of Harbin Medical University Foundation (2023M17) that were awarded to SS.