The molecular mechanism underlying tumorigenesis in hepatocellular carcinoma (HCC), a prevalent cancer, remains unclear. In this study, weighted gene coexpression network analysis (WGCNA) was used to construct a coexpression network from the top 25% most variable genes in the dataset GSE62232. Average linkage hierarchical clustering identified 24 modules, among which the pink module, associated with HCC prognosis, was selected for further analysis. Five candidate genes (PCNA, RFC4, PTTG1, H2AFZ, and RRM1) common to both networks in this module were identified by combining the protein-protein interaction network with the coexpression network. After progression and survival analyses, all five candidates were identified as real core genes. According to the Human Protein Atlas and the Oncomine database, these genes were dysregulated in HCC samples. Receiver operating characteristic curve analysis showed that the expression levels of the core genes had high diagnostic efficacy. Gene set enrichment analysis and functional enrichment analysis indicated the importance of cell cycle-related pathways in HCC progression and prognosis. In conclusion, the five real core genes and the cell cycle-related pathways identified in this study may improve the understanding of HCC progression and contribute to HCC treatment.
Keywords: GEO; TCGA; bioinformatics analysis; hepatocellular carcinoma; weighted gene co-expression network analysis (WGCNA).
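The first two steps named in the abstract, filtering to the top 25% most variable genes and grouping them by average linkage hierarchical clustering, can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy expression values, the function names, and the soft-threshold power `beta=6` are all assumptions made for illustration, and the dissimilarity used here is the simpler 1 - |r|^beta rather than the topological overlap measure that WGCNA typically uses. Modules are obtained by stopping at a fixed number of clusters instead of a dynamic tree cut.

```python
from math import sqrt
from statistics import mean, pvariance


def top_variable_genes(expr, fraction=0.25):
    """Return the names of the `fraction` of genes with the highest variance."""
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def pearson(x, y):
    """Pearson correlation; assumes neither vector has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(expr, genes, n_clusters, beta=6):
    """Agglomerative clustering with average linkage on 1 - |r|**beta.

    beta is an assumed soft-threshold power, not a value from the study.
    """
    dist = {
        (a, b): 1 - abs(pearson(expr[a], expr[b])) ** beta
        for a in genes for b in genes if a != b
    }
    clusters = [[g] for g in genes]

    def linkage(c1, c2):
        # Average linkage: mean pairwise dissimilarity between two clusters.
        return mean([dist[(a, b)] for a in c1 for b in c2])

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Hypothetical toy data: 8 genes measured in 6 samples.
toy = {
    "g1": [1, 5, 2, 8, 3, 9],
    "g2": [2, 10, 4, 16, 6, 18],
    "g3": [4, 4, 9, 9, 1, 1],
    "g4": [9, 9, 19, 19, 3, 3],
    "g5": [5, 5, 5, 6, 5, 5],
    "g6": [3, 3, 4, 3, 3, 3],
    "g7": [7, 7, 7, 7, 8, 7],
    "g8": [2, 3, 2, 2, 2, 2],
}

# The study kept the top 25%; 0.5 is used here only so that the toy
# example leaves enough genes to cluster.
selected = top_variable_genes(toy, fraction=0.5)
modules = average_linkage(toy, selected, n_clusters=2)
print(modules)
```

In this toy input, g1/g2 and g3/g4 are perfectly correlated pairs, so they end up in separate two-gene modules. On real data such as GSE62232, an established implementation would be preferable to this quadratic-time sketch.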