The growing volume of multi-omics profiles across many cancers promises to improve our understanding of the regulatory mechanisms underlying cancer. The main challenges are to integrate these multiple levels of omics profiles and to analyze them across many cancers. Here we present AMARETTO, an algorithm that addresses both challenges in three steps. First, AMARETTO identifies potential cancer driver genes by integrating copy number, DNA methylation and gene expression data. Second, AMARETTO connects these driver genes with the co-expressed target genes that they control, defining regulatory modules. Third, AMARETTO connects the modules identified in different cancer sites into a pancancer network to identify pancancer driver genes. We applied AMARETTO in a pancancer study comprising eleven cancer sites and confirmed that it captures hallmarks of cancer. We also demonstrated that AMARETTO enables the identification of novel pancancer driver genes. In particular, our analysis identified pancancer driver genes of smoking-induced cancers and of the 'antiviral' interferon-modulated innate immune response.
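
The three steps can be illustrated with a toy sketch. It is not the AMARETTO R package or its API: every function name, gene label, site label and cutoff below is hypothetical. Plain Pearson correlation with a hard cutoff stands in for whatever statistical models the actual method uses, and each module is assumed to have a single driver for simplicity.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def candidate_drivers(expr, cnv, meth, cutoff=0.7):
    """Step 1 (toy): keep genes whose expression rises with copy number
    or falls with DNA methylation. The cutoff is an arbitrary assumption."""
    drivers = []
    for gene, e in expr.items():
        by_cnv = gene in cnv and pearson(e, cnv[gene]) >= cutoff
        by_meth = gene in meth and pearson(e, meth[gene]) <= -cutoff
        if by_cnv or by_meth:
            drivers.append(gene)
    return drivers

def regulatory_modules(expr, drivers, cutoff=0.7):
    """Step 2 (toy): attach each non-driver gene to the driver it is most
    strongly co-expressed with, if that correlation reaches the cutoff."""
    modules = {d: [] for d in drivers}
    for gene, e in expr.items():
        if gene in modules:
            continue
        best = max(drivers, key=lambda d: pearson(e, expr[d]), default=None)
        if best is not None and pearson(e, expr[best]) >= cutoff:
            modules[best].append(gene)
    return modules

def pancancer_edges(modules_by_site, min_shared=1):
    """Step 3 (toy): link modules from different cancer sites that share
    at least min_shared target genes."""
    nodes = [(site, driver, set(targets))
             for site, modules in modules_by_site.items()
             for driver, targets in modules.items()]
    edges = []
    for (s1, d1, t1), (s2, d2, t2) in combinations(nodes, 2):
        shared = t1 & t2
        if s1 != s2 and len(shared) >= min_shared:
            edges.append(((s1, d1), (s2, d2), sorted(shared)))
    return edges

# Made-up profiles for five samples of one hypothetical cancer site.
expr = {"D1": [1, 2, 3, 4, 5], "T1": [2, 4, 6, 8, 10],
        "T2": [5, 4, 3, 2, 1], "N1": [1, -1, 0, -1, 1]}
cnv = {"D1": [0, 1, 2, 3, 4], "T1": [2, 2, 2, 2, 2]}
meth = {"T2": [0.5, 0.1, 0.9, 0.1, 0.5]}

drivers = candidate_drivers(expr, cnv, meth)
modules_a = regulatory_modules(expr, drivers)
# Modules for a second hypothetical site, written by hand for the example.
modules_b = {"D2": ["T1", "T5"], "D3": ["T9"]}
edges = pancancer_edges({"site_A": modules_a, "site_B": modules_b})
```

In this toy data only D1 has expression that follows its copy number, so it is the sole candidate driver; T1 joins its module, and the module is linked to the D2 module of the second site because both contain T1.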
Software availability: AMARETTO is available as an R package at https://bitbucket.org/gevaertlab/pancanceramaretto.
Keywords: Cancer driver gene discovery; Data fusion; Module network.
Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.