Integrating and analyzing pancancer data collected from different experiments is crucial for gaining insight into the common molecular-level mechanisms underlying the development and progression of cancers. Epigenetic studies of pancancer data can yield promising results in biomarker discovery. Genes that are epigenetically dysregulated across different cancers are powerful biomarkers for drug-related studies. This paper identifies genes whose expression is altered by aberrant methylation patterns, using differential analysis of TCGA pancancer data from 12 different cancers. We identified a comprehensive set of 115 epigenetic biomarker genes, of which 106 have pancancer properties. Correlation analysis, gene set enrichment, protein-protein interaction analysis, pancancer characteristics analysis, and diagnostic modeling were performed on these biomarkers to illustrate the power of this signature, and the genes were found to be important in different molecular operations related to cancer. An accuracy of 97.56% was obtained on the TCGA pancancer gene expression dataset for binary classification of samples as tumor or normal. The source code and dataset of this work are available at https://github.com/panchamisuneeth/EpiPanCan.git.
Keywords: Biomarker discovery; DNA methylation; Differential analysis; Epigenetics; Gene expression; Pancancer.
Copyright © 2024 Elsevier Ltd. All rights reserved.