Simultaneous Integration of Multi-omics Data Improves the Identification of Cancer Driver Modules

Cell Syst. 2019 May 22;8(5):456-466.e5. doi: 10.1016/j.cels.2019.04.005. Epub 2019 May 15.

Abstract

The identification of molecular pathways driving cancer progression is a fundamental challenge in cancer research. Most approaches to address it are limited in the number of data types they employ and perform data integration in a sequential manner. Here, we describe ModulOmics, a method to de novo identify cancer driver pathways, or modules, by integrating protein-protein interactions, mutual exclusivity of mutations and copy number alterations, transcriptional coregulation, and RNA coexpression into a single probabilistic model. To efficiently search and score the large space of candidate modules, ModulOmics employs a two-step optimization procedure that combines integer linear programming with stochastic search. Applied across several cancer types, ModulOmics identifies highly functionally connected modules enriched with cancer driver genes, outperforming state-of-the-art methods and demonstrating the power of using multiple omics data types simultaneously. On breast cancer subtypes, ModulOmics proposes unexplored connections supported by an independent patient cohort and independent proteomic and phosphoproteomic datasets.
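To make the idea of scoring a candidate module on several data types at once more concrete, the following is a minimal sketch and not the ModulOmics model itself. It assumes three toy evidence sources (a binary alteration matrix, pairwise PPI confidences, and expression profiles) and combines them with a plain average, whereas the abstract describes a single probabilistic model that also includes transcriptional coregulation. All function names, the toy data, and the averaging rule are hypothetical. Exhaustive enumeration of gene pairs stands in for the two-step procedure (integer linear programming plus stochastic search) named in the abstract.

```python
from itertools import combinations
from math import sqrt


def exclusivity(alterations, module):
    """Fraction of altered patients that are hit in exactly one module gene.

    Illustrative measure only; the paper's mutual exclusivity score may differ.
    """
    n_patients = len(next(iter(alterations.values())))
    counts = [sum(alterations[g][p] for g in module) for p in range(n_patients)]
    covered = sum(1 for c in counts if c >= 1)
    exclusive = sum(1 for c in counts if c == 1)
    return exclusive / covered if covered else 0.0


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def mean_pairwise(module, pair_score):
    """Average a pairwise score over all gene pairs in the module."""
    pairs = list(combinations(sorted(module), 2))
    return sum(pair_score(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def module_score(module, alterations, ppi, expression):
    """Assumed combination rule: unweighted mean of three evidence scores."""
    me = exclusivity(alterations, module)
    ppi_s = mean_pairwise(module, lambda a, b: ppi.get(frozenset((a, b)), 0.0))
    coexp = mean_pairwise(
        module, lambda a, b: abs(pearson(expression[a], expression[b]))
    )
    return (me + ppi_s + coexp) / 3


# Toy data (fabricated for illustration; rows are genes, columns are patients).
alterations = {"A": [1, 0, 0, 0], "B": [0, 1, 0, 0], "C": [1, 1, 0, 0]}
ppi = {frozenset(("A", "B")): 0.9}  # hypothetical interaction confidence
expression = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}

# Exhaustive search over all gene pairs, used here in place of ILP + stochastic search.
best_module = max(
    combinations(sorted(alterations), 2),
    key=lambda m: module_score(m, alterations, ppi, expression),
)
```

In this toy setting the pair A and B ranks highest because all three evidence types agree: the two genes are altered in different patients, they interact, and their expression is correlated. Scoring all data types in a single objective, rather than filtering on one type and then the next, is the design choice the abstract refers to as simultaneous integration.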

Keywords: cancer; cancer drivers; cancer pathways; data integration; driver modules; integer linear programming; mutual exclusivity; simultaneous optimization.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Breast Neoplasms / genetics
  • Computational Biology / methods*
  • DNA Copy Number Variations
  • Gene Expression Profiling / methods
  • Gene Regulatory Networks
  • Genomics / methods
  • Humans
  • Models, Statistical
  • Mutation
  • Neoplasms / genetics*
  • Neoplasms / metabolism*
  • Proteomics / methods
  • Signal Transduction / genetics
  • Software