Cancer subtype classification and modeling by pathway attention and propagation

Bioinformatics. 2020 Jun 1;36(12):3818-3824. doi: 10.1093/bioinformatics/btaa203.

Abstract

Motivation: Biological pathways are an important form of curated knowledge about biological processes. Cancer subtype classification based on pathways can therefore help explain how biological mechanisms differ among cancer subtypes. However, pathways cover only a fraction of the entire gene set (about one-third of human genes in KEGG), and they are fragmented. For this reason, few computational methods use pathways for cancer subtype classification.

Results: We present an explainable deep-learning model that uses an attention mechanism and network propagation for cancer subtype classification. Each pathway is modeled by a graph convolutional network. A multi-attention-based ensemble model then combines several hundred pathways in an explainable manner. Lastly, network propagation on a pathway-gene network explains why gene expression profiles differ among subtypes. In experiments with five TCGA cancer datasets, our method achieved high classification accuracy and, additionally, identified subtype-specific pathways and biological functions.
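The last two steps described above (attention-weighted combination of per-pathway outputs, then propagation over a pathway-gene network) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the propagation algorithm, so random walk with restart is assumed here as one common choice, and the graph convolutional step is omitted entirely. All names, the toy network, and the numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(pathway_outputs, attention_scores):
    """Combine per-pathway class probabilities using softmax attention weights.

    Returns the combined class probabilities and the weights, which can be
    read as a per-pathway importance for this sample.
    """
    weights = softmax(attention_scores)
    n_classes = len(pathway_outputs[0])
    combined = [
        sum(w * out[c] for w, out in zip(weights, pathway_outputs))
        for c in range(n_classes)
    ]
    return combined, weights


def propagate(adjacency, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart (assumed propagation scheme).

    adjacency: dict mapping each node to the set of its neighbours (undirected).
    seed: dict mapping some nodes to non-negative starting mass.
    """
    nodes = sorted(adjacency)
    total = sum(seed.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("seed must place positive mass on at least one node")
    p0 = {n: seed.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fixed fraction of mass returns to the seed nodes.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Isolated node: its walking mass stays in place.
                nxt[n] += (1 - restart) * p[n]
                continue
            share = (1 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy example: two pathways (P1, P2), two subtypes, three genes.
pathway_outputs = [[0.9, 0.1], [0.3, 0.7]]   # hypothetical per-pathway predictions
attention_scores = [2.0, 0.5]                # hypothetical learned attention logits
combined, weights = attention_pool(pathway_outputs, attention_scores)

# Hypothetical pathway-gene network: P1 contains g1, g2; P2 contains g2, g3.
network = {
    "P1": {"g1", "g2"},
    "P2": {"g2", "g3"},
    "g1": {"P1"},
    "g2": {"P1", "P2"},
    "g3": {"P2"},
}
# Seed the walk on pathways in proportion to their attention weights.
scores = propagate(network, {"P1": weights[0], "P2": weights[1]})
```

In this toy setting, genes close to highly attended pathways receive more propagated mass, which is the kind of ranking one could inspect to interpret subtype-specific gene expression differences.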

Availability and implementation: The source code is available at http://biohealth.snu.ac.kr/software/GCN_MAE.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Attention
  • Humans
  • Neoplasms* / genetics
  • Software*
  • Transcriptome