PSCC: sensitive and reliable population-scale copy number variation detection method based on low coverage sequencing

PLoS One. 2014 Jan 21;9(1):e85096. doi: 10.1371/journal.pone.0085096. eCollection 2014.

Abstract

Background: Copy number variations (CNVs) are an important type of genetic variation that deeply impacts phenotypic polymorphisms and human diseases. The advent of high-throughput sequencing technologies provides an opportunity to revolutionize the discovery of CNVs and to explore their relationship with diseases. However, most existing methods depend on sequencing depth and become unstable at low sequence coverage. In this study, we developed an effective population-scale CNV calling (PSCC) method based on low coverage whole-genome sequencing (LCS).

Methodology/principal findings: PSCC uses a two-step correction to remove biases caused by local GC content and complex genomic characteristics. We chose a binary segmentation method to locate CNV segments and designed combined statistical tests to keep the false positive control stable. In simulations, PSCC achieved sensitivity/specificity of 99.7%/100% with LCS (∼2×) and 98.6%/100% with ultra LCS (∼0.2×) when calling CNVs larger than 300 kb. We then applied the method to 34 clinical samples sequenced at an average coverage of 2×, and it detected all 31 pathogenic CNVs identified by aCGH. In addition, a performance comparison revealed that our method had significant advantages over existing methods when using ultra LCS.
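To make the general idea concrete, the following is a minimal sketch of two generic building blocks that read-depth CNV callers of this kind rely on: a GC-stratified depth correction and a recursive binary segmentation of per-bin depth ratios. It is not the PSCC implementation. The abstract does not specify the correction formula, the second (population-based) correction step, or the combined statistical tests, so the function names (`gc_correct`, `segment`), the GC stratum count, the z-like split statistic, and the threshold below are all illustrative assumptions.

```python
import math
from statistics import median


def gc_correct(depths, gc_fracs, n_strata=20):
    """Rescale each bin's depth by (global median / median of its GC stratum).

    Assumption: a simple median-ratio correction over equal-width GC strata.
    """
    global_med = median(depths)
    keys = [min(int(g * n_strata), n_strata - 1) for g in gc_fracs]
    strata = {}
    for d, k in zip(depths, keys):
        strata.setdefault(k, []).append(d)
    stratum_med = {k: median(v) for k, v in strata.items()}
    # Leave a bin unchanged if its stratum median is zero (no usable signal).
    return [d * global_med / stratum_med[k] if stratum_med[k] > 0 else d
            for d, k in zip(depths, keys)]


def segment(values, sigma, threshold=4.0, min_size=3, offset=0):
    """Recursive binary segmentation of a non-empty list of depth ratios.

    At each level, pick the split that maximizes a z-like statistic for the
    difference between left and right means (sigma is the assumed per-bin
    noise level); recurse while the statistic reaches the threshold.
    Returns a list of (start, end, mean) tuples with half-open intervals.
    """
    n = len(values)
    total = sum(values)
    best_i, best_z = None, 0.0
    prefix = 0.0
    for i in range(1, n):
        prefix += values[i - 1]
        if i < min_size or n - i < min_size:
            continue
        diff = prefix / i - (total - prefix) / (n - i)
        z = abs(diff) / (sigma * math.sqrt(1.0 / i + 1.0 / (n - i)))
        if z > best_z:
            best_i, best_z = i, z
    if best_i is None or best_z < threshold:
        return [(offset, offset + n, total / n)]
    left = segment(values[:best_i], sigma, threshold, min_size, offset)
    right = segment(values[best_i:], sigma, threshold, min_size,
                    offset + best_i)
    return left + right


if __name__ == "__main__":
    # Synthetic ratios: a 20-bin gain (ratio 1.5) flanked by normal bins.
    ratios = [(1.5 if 40 <= i < 60 else 1.0) + (0.02 if i % 2 == 0 else -0.02)
              for i in range(100)]
    for start, end, seg_mean in segment(ratios, sigma=0.05):
        print(start, end, round(seg_mean, 3))
```

In this sketch, segments whose mean ratio departs from 1.0 would be candidate gains or losses. A real caller would still need a significance test per segment, which is the role the combined statistical tests play in the method described above.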

Conclusions/significance: Our study showed that PSCC can sensitively and reliably detect CNVs using low coverage or even ultra-low coverage data through population-scale sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • DNA Copy Number Variations*
  • Genome, Human*
  • Genome-Wide Association Study
  • High-Throughput Nucleotide Sequencing / statistics & numerical data*
  • Humans
  • Sensitivity and Specificity

Grants and funding

This study was funded by Key Laboratory Project in Shenzhen (CXB200903110066A and CXB201108250096A). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.