Detecting copy-number variations in whole-exome sequencing data using the eXome Hidden Markov Model: an 'exome-first' approach

J Hum Genet. 2015 Apr;60(4):175-82. doi: 10.1038/jhg.2014.124. Epub 2015 Jan 22.

Abstract

Whole-exome sequencing (WES) is becoming a standard tool for detecting nucleotide changes, and whether WES data can also be used to detect copy-number variations (CNVs) is of interest. Several algorithms have been developed for such analyses, but each needs to be verified against its intended purpose, because the algorithms differ in their characteristics. Here, we performed WES CNV analysis using the eXome Hidden Markov Model (XHMM). We validated its performance using 27 rare CNVs previously identified by microarray as positive controls. The detection rate was 59% overall and rose to 89% for CNVs spanning three or more targets. XHMM can be used effectively, especially for detecting CNVs larger than 200 kb, and may be useful for deletion breakpoint detection. Next, we applied XHMM to genetically unsolved patients and successfully identified pathogenic CNVs: 1.5-1.9-Mb deletions involving NSD1 in patients with an unknown overgrowth syndrome, leading to the diagnosis of Sotos syndrome, and a 6.4-Mb duplication involving MECP2 in affected brothers with late-onset spasm and progressive cerebral/cerebellar atrophy, confirming the clinical suspicion of MECP2 duplication syndrome. An 'exome-first' approach to clinical genetic investigation may be worth considering as a way to save the cost of multiple investigations.
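
The validation step described in the abstract, counting how many microarray-confirmed CNVs the caller recovers and then restricting to CNVs that span a minimum number of exome targets, can be illustrated with a minimal sketch. The record type, function name, and toy values below are hypothetical and invented for illustration. They are not the study's data and are not part of XHMM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlCNV:
    """A microarray-confirmed CNV used as a positive control (hypothetical record)."""
    name: str
    n_targets: int   # number of exome capture targets the CNV spans
    detected: bool   # whether the WES-based caller reported this CNV


def detection_rate(controls, min_targets=0):
    """Fraction of controls spanning at least `min_targets` targets that were detected."""
    eligible = [c for c in controls if c.n_targets >= min_targets]
    if not eligible:
        raise ValueError("no control CNVs satisfy the target threshold")
    return sum(c.detected for c in eligible) / len(eligible)


# Toy data, invented for illustration only (not the 27 CNVs from the study).
controls = [
    ControlCNV("cnv_a", n_targets=1, detected=False),
    ControlCNV("cnv_b", n_targets=2, detected=False),
    ControlCNV("cnv_c", n_targets=5, detected=True),
    ControlCNV("cnv_d", n_targets=40, detected=True),
]

overall = detection_rate(controls)
at_least_three = detection_rate(controls, min_targets=3)
print(f"all controls: {overall:.0%}; >=3 targets: {at_least_three:.0%}")
```

In this toy set the rate rises once CNVs covering fewer than three targets are excluded, which mirrors the direction of the effect reported in the abstract. The numbers themselves carry no meaning.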

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Atrophy
  • Brain Diseases / genetics
  • Brain Diseases / pathology
  • Chromosome Breakpoints
  • Chromosome Duplication
  • Computational Biology / methods
  • DNA Copy Number Variations*
  • Exome*
  • Female
  • Gigantism / genetics
  • High-Throughput Nucleotide Sequencing*
  • Histone Methyltransferases
  • Histone-Lysine N-Methyltransferase
  • Humans
  • Intellectual Disability / genetics
  • Intracellular Signaling Peptides and Proteins / genetics
  • Male
  • Markov Chains*
  • Methyl-CpG-Binding Protein 2 / genetics
  • Models, Genetic*
  • Nuclear Proteins / genetics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Sensitivity and Specificity
  • Sequence Deletion

Substances

  • Intracellular Signaling Peptides and Proteins
  • Methyl-CpG-Binding Protein 2
  • Nuclear Proteins
  • Histone Methyltransferases
  • Histone-Lysine N-Methyltransferase
  • NSD1 protein, human