Dense and accurate whole-chromosome haplotyping of individual genomes

Nat Commun. 2017 Nov 3;8(1):1293. doi: 10.1038/s41467-017-01389-4.

Abstract

The diploid nature of the human genome is neglected in many current analyses, which treat a genome as a set of unphased variants relative to a reference genome. This lack of haplotype-level analysis can be explained by a lack of methods that produce dense and accurate chromosome-length haplotypes at reasonable cost. Here we introduce an integrative phasing strategy that combines global but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense yet local haplotype information available through long-read or linked-read sequencing. We provide comprehensive guidance on the required sequencing depths and reliably assign more than 95% of alleles (NA12878) to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different technologies represents an attractive solution to chart the genetic variation of diploid genomes.
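
The idea of combining a sparse, chromosome-wide haplotype with dense, local haplotype blocks can be illustrated with a small toy sketch. This is not the authors' implementation: it is a simplified majority-vote scheme, and the names `orient_block` and `integrate`, the dictionary representation of haplotypes, and the example positions are all hypothetical. It assumes each variant is biallelic, that a haplotype is recorded as the allele (0 or 1) carried on haplotype 1, and that each local block is internally correct but of unknown orientation relative to the rest of the chromosome.

```python
from typing import Dict, List, Optional

# Hypothetical representation: variant position -> allele (0 or 1) on haplotype 1.
Haplotype = Dict[int, int]


def orient_block(block: Haplotype, scaffold: Haplotype) -> Optional[Haplotype]:
    """Orient one dense local block against the sparse global scaffold.

    Returns the block as-is, the block with all alleles flipped, or None
    when the scaffold gives no majority (no shared variants, or a tie).
    """
    shared = [pos for pos in block if pos in scaffold]
    agree = sum(block[pos] == scaffold[pos] for pos in shared)
    disagree = len(shared) - agree
    if agree == disagree:
        return None
    if agree > disagree:
        return dict(block)
    # The block was phased in the opposite orientation: swap its two haplotypes.
    return {pos: 1 - allele for pos, allele in block.items()}


def integrate(blocks: List[Haplotype], scaffold: Haplotype) -> Haplotype:
    """Merge all orientable blocks into one chromosome-length haplotype."""
    merged = dict(scaffold)
    for block in blocks:
        oriented = orient_block(block, scaffold)
        if oriented is not None:
            # Dense block alleles take precedence at positions shared with the scaffold.
            merged.update(oriented)
    return merged


# Toy data (made up for illustration only).
scaffold = {100: 0, 500: 1, 900: 0, 1500: 1}   # sparse but chromosome-wide
blocks = [
    {100: 1, 200: 0, 300: 1, 500: 0},          # dense, opposite orientation
    {900: 0, 1000: 1, 1200: 0},                # dense, same orientation
    {2000: 1, 2100: 0},                        # no scaffold overlap: left unplaced
]
merged = integrate(blocks, scaffold)
```

In this toy case the first block is flipped, the second is kept, and the third cannot be placed, so `merged` contains the scaffold variants plus the variants at positions 200, 300, 1000 and 1200. The sketch only conveys why the two data types are complementary: the sparse scaffold fixes the orientation of each block, and the blocks fill in the variants between scaffold positions.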

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Chromosomes, Human / genetics*
  • Diploidy
  • Gene Library
  • Genetic Variation
  • Genome, Human*
  • Genomics / methods
  • Haplotypes*
  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / statistics & numerical data