Background: Several studies have separately characterized the transcriptomic profiles of myelodysplastic syndromes (MDS). However, an integrated analysis of multiple datasets that examines transcriptional signatures, key signalling pathways, and their association with prognosis and diagnosis is still lacking.
Methods: We integrated the GSE4619, GSE19429, GSE30195, and GSE58831 microarray datasets of CD34+ cells to identify the differentially expressed genes (DEGs) in MDS. A series of bioinformatics methods was applied to identify the key hub genes, gene clusters, prognostic hub genes, and genes associated with diagnostic efficacy. Finally, we validated the expression differences of the hub genes in the GSE114922 dataset.
Results: We characterized the DEGs by gene ontology enrichment and KEGG pathway analysis. We identified significant hub genes in MDS, including 168 upregulated hub genes (such as STAT1, IFIH1, EPRS, GRB2, RAC2, MAPK14, CASP1, and SPI1) and 52 downregulated hub genes (such as CREBBP, HIF1A, PIK3CA, EZH2, PIK3R1, MDM2, IRF4, CXCR4, PCNA, and CD19). In addition, molecular complex detection (MCODE) yielded six significant upregulated gene clusters and one downregulated gene cluster. Moreover, higher expression of MX2, GBP2, PXN, IFI44, FDXR, PLCB2, ASS1, ERCC4, PML, and RRAGD and lower expression of the hub genes CD19, PAX5, TCF3, LEF1, NUSAP1, and TIMELESS were significantly correlated with shorter survival times of MDS patients. Furthermore, the area under the ROC curve (AUC) of the prognostic genes PXN, FDXR, PLCB2, PML, CD19, PAX5, and LEF1 exceeded 0.80, indicating that these genes may be useful diagnostic markers for MDS.
Conclusions: The key hub genes identified here, together with their association with prognosis and diagnostic efficacy, may provide substantial clues for the diagnosis and treatment of MDS patients.
Keywords: Bioinformatics; Diagnostic efficacy; Hub genes; Myelodysplastic syndrome; Survival times.
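The AUC criterion mentioned in the Results (AUC above 0.80) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `auc_from_scores` is hypothetical, and the expression values below are made up purely for illustration. The sketch uses the rank-based definition of the AUC, that is, the probability that a randomly chosen MDS sample has a higher expression value than a randomly chosen control, with ties counted as one half.

```python
def auc_from_scores(case_scores, control_scores):
    """Rank-based AUC: P(case > control) + 0.5 * P(case == control)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    # Compare every case sample with every control sample.
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up expression values for one gene (assumption, not data from the study).
mds_expr = [8.1, 7.9, 8.4, 7.2, 8.8]
control_expr = [6.9, 7.0, 7.5, 6.5]

auc = auc_from_scores(mds_expr, control_expr)
# For a downregulated gene the raw AUC falls below 0.5; reporting
# max(auc, 1 - auc) treats lower expression as the positive signal.
diagnostic_auc = max(auc, 1.0 - auc)
print(f"AUC = {diagnostic_auc:.2f}")
```

With real data, the two lists would hold the per-sample expression of a single hub gene in MDS and control CD34+ cells, and a gene would pass the threshold quoted above when `diagnostic_auc` exceeds 0.80.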