Integrating omics data and protein interaction networks to prioritize driver genes in cancer

Oncotarget. 2017 Jul 22;8(35):58050-58060. doi: 10.18632/oncotarget.19481. eCollection 2017 Aug 29.

Abstract

Although numerous approaches have been proposed to distinguish driver from passenger mutations, identifying driver genes remains a critical challenge in cancer genomics. Driver genes with low mutation frequency tend to be filtered out in cancer research. In addition, the accumulation of different types of omics data necessitates algorithmic frameworks for nominating putative driver genes. In this study, we present a novel framework that identifies driver genes by integrating multi-omics data: somatic mutations, gene expression, and copy number alterations. We developed a computational approach that detects potential driver genes by virtue of their effect on their neighbors in a protein interaction network. We applied the method to three datasets from The Cancer Genome Atlas (TCGA): head and neck squamous cell carcinoma (HNSC), thyroid carcinoma (THCA), and kidney renal clear cell carcinoma (KIRC). Measured by precision, recall, and F1 score, our method outperformed DriverNet and MUFFINN on all three datasets. In addition, our method was less affected by protein length than DriverNet. Lastly, our method not only identified known cancer genes but also detected potential rare drivers (PTPN6 in THCA; SRC, GRB2, and PTPN6 in KIRC; MAPK1 and SMAD2 in HNSC).
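
As an illustration of the evaluation metrics named above, the following minimal sketch computes precision, recall, and F1 score for a list of top-ranked candidate genes against a reference set of known cancer genes. The function name `precision_recall_f1`, the cutoff, and the placeholder gene labels are assumptions made for this example; the abstract does not specify the reference gene set or the ranking cutoff used in the study.

```python
def precision_recall_f1(predicted, reference):
    """Score a set of predicted driver genes against a reference gene set.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    predicted = set(predicted)
    reference = set(reference)
    true_positives = len(predicted & reference)

    # Precision: fraction of predicted genes that are in the reference set.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference genes that were predicted.
    recall = true_positives / len(reference) if reference else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Placeholder labels, not real results: a ranked candidate list and a
# toy reference set of "known" cancer genes.
ranked_candidates = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_F"]
known_cancer_genes = {"GENE_A", "GENE_C", "GENE_E"}

top_n = 4  # assumed cutoff on the ranked list
p, r, f = precision_recall_f1(ranked_candidates[:top_n], known_cancer_genes)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Scoring the top-N prefix of a ranked list lets methods that output rankings of different lengths be compared at a common cutoff, which is one plausible way to set up a comparison such as the one against DriverNet and MUFFINN.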

Keywords: driver genes; integrative data; protein interaction network.