A method for finding consensus breakpoints in the cancer genome from copy number data

Bioinformatics. 2013 Jul 15;29(14):1793-800. doi: 10.1093/bioinformatics/btt300. Epub 2013 May 28.

Abstract

Motivation: Recurrent DNA breakpoints in cancer genomes indicate the presence of critical functional elements for tumor development. Identifying them can help determine new therapeutic targets. High-dimensional DNA microarray experiments like arrayCGH afford the identification of DNA copy number breakpoints with high precision, offering a solid basis for computational estimation of recurrent breakpoint locations.

Results: We introduce a method for identification of recurrent breakpoints (consensus breakpoints) from copy number aberration datasets. The method is based on weighted kernel counting of breakpoints around genomic locations. Counts larger than expected by chance are considered significant. We show that the consensus breakpoints facilitate consensus segmentation of the samples. We apply our method to three arrayCGH datasets and show that consensus segmentation achieves significant dimension reduction, which is useful for predicting tumor phenotype from copy number data. We use our approach to classify neuroblastoma tumors from different age groups and confirm the recent recommendation of 18 months as the age cut-off for differential treatment. We also investigate the (epi)genetic properties at consensus breakpoint locations for seven datasets and show enrichment in overlap with important functional genomic regions.
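The idea of weighted kernel counting followed by a chance-based significance cut-off can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the Gaussian kernel, the uniform-placement null model, the max-statistic threshold, and all names and parameters (`kernel_counts`, `significant_positions`, `bandwidth`, `n_perm`, `alpha`) are assumptions chosen for illustration, and the toy data are synthetic.

```python
import numpy as np

def kernel_counts(breakpoints, weights, grid, bandwidth):
    """Weighted Gaussian-kernel count of breakpoints around each grid position.

    Assumption: a Gaussian kernel; the abstract does not specify the kernel.
    """
    bp = np.asarray(breakpoints, dtype=float)
    w = np.asarray(weights, dtype=float)
    g = np.asarray(grid, dtype=float)
    # Scaled distance between every grid position (rows) and breakpoint (columns).
    d = (g[:, None] - bp[None, :]) / bandwidth
    return (np.exp(-0.5 * d ** 2) * w[None, :]).sum(axis=1)

def significant_positions(breakpoints, weights, grid, bandwidth,
                          chrom_length, n_perm=200, alpha=0.05, seed=0):
    """Return grid positions whose count exceeds a permutation-based threshold.

    Assumption: the null model places the same number of breakpoints
    uniformly at random along the chromosome; the threshold is the
    (1 - alpha) quantile of the maximum null count over the grid.
    """
    rng = np.random.default_rng(seed)
    grid = np.asarray(grid, dtype=float)
    observed = kernel_counts(breakpoints, weights, grid, bandwidth)
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        random_bp = rng.uniform(0, chrom_length, size=len(breakpoints))
        null_max[i] = kernel_counts(random_bp, weights, grid, bandwidth).max()
    threshold = np.quantile(null_max, 1 - alpha)
    return observed, threshold, grid[observed > threshold]

# Synthetic toy data: 15 breakpoints clustered near 5 Mb plus 15 spread out.
chrom_length = 10_000_000
clustered = 5_000_000 + np.linspace(-70_000, 70_000, 15)
background = np.linspace(200_000, 9_800_000, 15)
breakpoints = np.concatenate([clustered, background])
weights = np.ones_like(breakpoints)  # hypothetical: equal weight per breakpoint
grid = np.arange(0, chrom_length + 1, 100_000, dtype=float)

observed, threshold, hits = significant_positions(
    breakpoints, weights, grid, bandwidth=100_000, chrom_length=chrom_length
)
print("threshold:", threshold)
print("candidate consensus positions:", hits)
```

In this sketch the weights are all equal; in practice they could reflect, for example, the confidence or amplitude of each breakpoint call, which is again an assumption rather than something stated in the abstract. Using the maximum count over the grid as the null statistic is one simple way to account for testing many positions at once.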

Availability: Implementation in R of our approach can be found at http://www.mpi-inf.mpg.de/~laura/FeatureGrouping.html.

Contact: [email protected].

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Chromosome Breakpoints*
  • DNA Copy Number Variations*
  • Genome, Human
  • Genomics / methods
  • Humans
  • Neoplasms / genetics*
  • Neuroblastoma / genetics
  • Oligonucleotide Array Sequence Analysis
  • Software