Revealing EXPH5 as a potential diagnostic gene biomarker of the late stage of COPD based on machine learning analysis

Comput Biol Med. 2023 Mar:154:106621. doi: 10.1016/j.compbiomed.2023.106621. Epub 2023 Jan 31.

Abstract

Chronic obstructive pulmonary disease (COPD) is a chronic lung disease characterized by persistent airflow obstruction and is the third leading cause of death in China. The incidence of COPD is steadily increasing, and it has become a severe disease worldwide. Accordingly, methods for the timely diagnosis and treatment of COPD are urgently needed. This study aims to find key genes that allow COPD to be diagnosed as early as possible, so that disease progression can be avoided, and to analyze differences in immune cell infiltration between early-stage and late-stage COPD. Two Gene Expression Omnibus (GEO) datasets were merged into a single dataset for analysis. A total of 157 differentially expressed genes (DEGs) were used for gene set enrichment analysis (GSEA) to find pathways that differ between early-stage and late-stage COPD. Most notably, screening by least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE) singled out EXPH5 as the most likely candidate diagnostic biomarker indicating late-stage COPD. Receiver operating characteristic (ROC) curves of EXPH5 were used to quantify its discriminatory ability through the area under the curve (AUC), a standard measure of diagnostic accuracy. The CIBERSORT algorithm was used to assess the distribution of tissue-infiltrating immune cells between the two COPD stages. The diagnostic biomarker EXPH5 correlated positively with resting NK cells, resting mast cells, and eosinophils, and negatively with gamma delta T cells and M1 macrophages, which underscores the link between this gene and immune cell infiltration. To make the results more reliable, we further analyzed EXPH5 expression in single-cell transcriptome data and again found that EXPH5 was significantly downregulated in late-stage COPD, especially in the main lung cell types, alveolar type 1 (AT1) and alveolar type 2 (AT2) cells. In summary, our study identified EXPH5 as a marker gene, which adds to the knowledge base for the clinical diagnosis of COPD and for pharmaceutical design.
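The ROC-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `roc_auc` is hypothetical, the expression values are made up for illustration and are not from the study, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than any particular library. Because the abstract reports EXPH5 as downregulated in the late stage, the sketch assumes that lower expression points to late-stage disease and negates the values to obtain a score.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical, made-up expression values (NOT data from the study).
expression = [7.9, 8.1, 7.6, 8.4, 6.2, 6.8, 7.7, 6.5]
late_stage = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = late stage, 0 = early stage

# Assumption: lower expression indicates late stage, so negate the
# values to turn them into a "late-stage score".
scores = [-x for x in expression]

auc = roc_auc(scores, late_stage)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two stages. On real data, the AUC would normally be estimated on samples that were not used for gene selection.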

Keywords: Bioinformatics analysis; COPD; Differentially expressed genes; EXPH5; Machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptor Proteins, Signal Transducing
  • Algorithms
  • Biomarkers
  • Drug Design
  • Humans
  • Machine Learning
  • Pulmonary Disease, Chronic Obstructive* / diagnosis
  • Pulmonary Disease, Chronic Obstructive* / genetics

Substances

  • Biomarkers
  • EXPH5 protein, human
  • Adaptor Proteins, Signal Transducing