Chronic obstructive pulmonary disease (COPD) is a chronic lung disease characterized by persistent airflow obstruction and has been the third leading cause of death in China. The incidence of COPD is steadily increasing, and it has become a severe disease globally. Accordingly, there is an urgent need to explore how to diagnose and treat COPD in a timely manner. This study aims to identify key genes for diagnosing COPD as early as possible, so that disease progression can be avoided, and to analyze immune cell infiltration in early-stage versus late-stage COPD. Two GEO datasets were merged for analysis. A total of 157 differentially expressed genes (DEGs) were used for gene set enrichment analysis (GSEA) to identify pathways that differ between early-stage and late-stage COPD. Most notably, EXPH5 stood out from the screen as the most likely candidate diagnostic biomarker indicating late-stage COPD, as selected by least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE). Receiver operating characteristic (ROC) curves for EXPH5 were used to represent its discriminatory ability through the area under the curve (AUC), which is the gold standard for evaluating the accuracy of diagnosis and survival rate prediction. The CIBERSORT algorithm was used to assess the distribution of tissue-infiltrating immune cells between the two COPD stages. The diagnostic biomarker EXPH5 correlated positively with resting NK cells, resting mast cells, and eosinophils, and negatively with gamma delta T cells and M1 macrophages, which underscores the relationship between this gene and immune cell infiltration. To make the results more reliable, we further analyzed EXPH5 expression in single-cell transcriptome data and again found that EXPH5 was significantly downregulated in late-stage COPD, especially in the main lung cell types AT1 and AT2. In summary, our study identified EXPH5 as a marker gene, adding to the knowledge available for the clinical diagnosis of COPD and for pharmaceutical design.
Keywords: Bioinformatics analysis; COPD; Differentially expressed genes; EXPH5; Machine learning.
Copyright © 2023 The Authors. Published by Elsevier Ltd. All rights reserved.