A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays

Cancer Res. 2005 Jul 15;65(14):6071-9. doi: 10.1158/0008-5472.CAN-05-0465.

Abstract

We have developed a robust algorithm for copy number analysis of the human genome using high-density oligonucleotide microarrays containing 116,204 single-nucleotide polymorphisms. The advantages of this algorithm include improved signal-to-noise (S/N) ratios and the use of an optimized reference. The raw S/N ratios were improved by accounting for the length and GC content of the PCR products using quadratic regressions. The use of constitutional DNA, when available, gives the lowest SD values (0.16 ± 0.03) and also enables allele-based copy number detection in cancer genomes, which can unmask otherwise concealed allelic imbalances. In the absence of constitutional DNA, optimized selection of multiple normal references with the highest S/N ratios, in combination with the data regressions, dramatically improves SD values from 0.67 ± 0.12 to 0.18 ± 0.03. These improvements allow for highly reliable comparison of data across different experimental conditions, detection of allele-based copy number changes, and more accurate estimation of the range and magnitude of copy number aberrations. This algorithm has been implemented in a software package called Copy Number Analyzer for Affymetrix GeneChip Mapping 100K arrays (CNAG). Overall, these enhancements make CNAG a useful tool for high-resolution detection of copy number alterations, which can help in understanding the pathogenesis of cancers and other diseases as well as in exploring the complexities of the human genome.
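The abstract names the two ingredients (quadratic regression on PCR product length and GC content, and selection of the best normal references) without giving their exact form. The following is a minimal sketch of how such a correction might look, not the CNAG implementation. It assumes per-SNP log2 signal ratios as input, fits both covariates jointly by least squares, and ranks references by the SD of the corrected ratios as a stand-in for S/N; the function names `quadratic_correct` and `rank_references` are hypothetical.

```python
import numpy as np


def _standardize(x):
    """Center and scale a covariate so the squared terms stay well conditioned."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def quadratic_correct(log_ratio, length, gc):
    """Subtract a fitted quadratic trend in fragment length and GC content.

    Assumption: the bias is additive on the log2 scale and both covariates
    are fitted jointly; the paper may fit them differently.
    """
    y = np.asarray(log_ratio, dtype=float)
    zl = _standardize(length)
    zg = _standardize(gc)
    # Design matrix: intercept plus linear and squared terms per covariate.
    X = np.column_stack([np.ones_like(zl), zl, zl**2, zg, zg**2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef


def rank_references(sample, references, length, gc):
    """Order candidate normal references from lowest to highest corrected SD.

    Assumption: a lower SD of the corrected log2 ratio is used here as a
    proxy for a higher S/N ratio.
    """
    sample = np.asarray(sample, dtype=float)
    sds = []
    for ref in references:
        ratio = np.log2(sample / np.asarray(ref, dtype=float))
        sds.append(float(np.std(quadratic_correct(ratio, length, gc))))
    order = [int(i) for i in np.argsort(sds)]
    return order, sds
```

In this sketch, the references at the front of the returned order would be the ones averaged to form the reference signal when no constitutional DNA is available.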

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Alleles
  • Cell Line, Tumor
  • Gene Dosage*
  • Genome, Human
  • Genotype
  • Humans
  • Loss of Heterozygosity
  • Lung Neoplasms / genetics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Polymorphism, Single Nucleotide
  • Reference Values
  • Reproducibility of Results
  • Signal Processing, Computer-Assisted