Systematic pan-cancer analysis of somatic allele frequency

Sci Rep. 2018 May 16;8(1):7735. doi: 10.1038/s41598-018-25462-0.

Abstract

Imbalanced expression of somatic alleles in cancer can suggest functional and selective features, and can therefore indicate possible driving potential of the underlying genetic variants. To explore the correlation between the allele frequency of somatic variants and the total expression of the genes that harbor them, we used a unique data set of matched tumor and normal RNA and DNA sequencing data covering 5523 distinct single nucleotide variants in 381 individuals across 10 cancer types, obtained from The Cancer Genome Atlas (TCGA). We analyzed allele frequency in the context of variant and gene functional features and linked it with changes in total gene expression. We documented higher allele frequency of somatic variants in cancer-implicated genes (Cancer Gene Census, CGC). Furthermore, somatic alleles bearing premature terminating variants (PTVs), when positioned in CGC genes, appeared to be less frequently degraded via nonsense-mediated mRNA decay, indicating that the tumor transcriptome may favor truncated proteins. Among the genes with multiple PTVs with high allele frequency, ARID1, TP53 and NSD1 were known key cancer genes. Altogether, our analyses suggest that high allele frequency of tumor somatic variants can indicate driving functionality and can serve to identify potential cancer-implicated genes.
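The comparison of allele frequency between CGC and non-CGC genes can be illustrated with a minimal sketch. It assumes that allele frequency is computed as the fraction of reads supporting the variant allele, which the abstract does not spell out, and it uses a median per group as a simple summary. The `Variant` record, the `in_cgc` flag, the placeholder gene names `GENE_A` and `GENE_B`, and all read counts are hypothetical and made up for illustration; they are not data or methods from the study.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Variant:
    gene: str
    in_cgc: bool    # hypothetical flag: gene is listed in the Cancer Gene Census
    ref_reads: int  # tumor reads supporting the reference allele
    alt_reads: int  # tumor reads supporting the somatic (variant) allele


def allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the variant allele (assumed definition)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def median_vaf_by_group(variants):
    """Median variant allele frequency for CGC and non-CGC genes."""
    groups = {"CGC": [], "non-CGC": []}
    for v in variants:
        key = "CGC" if v.in_cgc else "non-CGC"
        groups[key].append(allele_frequency(v.ref_reads, v.alt_reads))
    # Skip empty groups so median() is never called on an empty list.
    return {name: median(vals) for name, vals in groups.items() if vals}


# Made-up read counts, for illustration only.
variants = [
    Variant("TP53", True, 20, 60),
    Variant("NSD1", True, 30, 30),
    Variant("GENE_A", False, 80, 20),
    Variant("GENE_B", False, 60, 40),
]

result = median_vaf_by_group(variants)
print(result)
```

In a real analysis the group difference would be assessed with a statistical test over many variants rather than by comparing two medians.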

MeSH terms

  • Alleles
  • Computational Biology / methods*
  • Gene Expression Regulation, Neoplastic*
  • Gene Frequency
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mutation*
  • Neoplasm Proteins / genetics*
  • Neoplasms / genetics*
  • Polymorphism, Single Nucleotide*
  • Transcriptome*

Substances

  • Neoplasm Proteins