NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION

Genome Biol. 2019 Nov 14;20(1):239. doi: 10.1186/s13059-019-1856-3.

Abstract

Technological limitations have hindered the large-scale genetic investigation of tandem repeats in disease. We show that long-read sequencing with a single Oxford Nanopore Technologies PromethION flow cell per individual achieves 30× human genome coverage and enables accurate assessment of tandem repeats including the 10,000-bp Alzheimer's disease-associated ABCA7 VNTR. The Guppy "flip-flop" base caller and tandem-genotypes tandem repeat caller are efficient for large-scale tandem repeat assessment, but base calling and alignment challenges persist. We present NanoSatellite, which analyzes tandem repeats directly on electric current data and improves calling of GC-rich tandem repeats, expanded alleles, and motif interruptions.

Keywords: ATP-binding cassette, sub-family A, member 7 (ABCA7); Alzheimer’s disease; Dynamic time warping (DTW); Long-read whole genome sequencing; Variable number tandem repeat (VNTR).
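
The abstract states that NanoSatellite works directly on electric current data, and the keywords name dynamic time warping (DTW). As a rough illustration of the general idea only, the sketch below implements the textbook DTW distance between two one-dimensional signals. It is not NanoSatellite's implementation; the function name `dtw_distance` and the toy current levels are assumptions made up for this example.

```python
import math


def dtw_distance(signal, reference):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Illustrative only: this is the classic O(n*m) recurrence, not the
    algorithm or parameters used by NanoSatellite.
    """
    n, m = len(signal), len(reference)
    # cost[i][j] holds the minimal cumulative cost of aligning
    # signal[:i] with reference[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(signal[i - 1] - reference[j - 1])
            # A point may match, or either sequence may be stretched.
            cost[i][j] = d + min(
                cost[i - 1][j],      # signal advances, reference dwells
                cost[i][j - 1],      # reference advances, signal dwells
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical current levels (arbitrary units), invented for illustration.
motif = [80.0, 95.0, 110.0, 90.0]
# Same levels with variable dwell times, as a time-stretched copy would show.
stretched = [80.0, 80.0, 95.0, 95.0, 110.0, 90.0, 90.0]
# A different pattern of levels.
unrelated = [60.0, 60.0, 70.0, 120.0, 120.0, 65.0, 65.0]

print(dtw_distance(stretched, motif))
print(dtw_distance(unrelated, motif))
```

Because DTW lets either sequence dwell on a level, a time-stretched copy of the reference scores zero while a different pattern scores higher. That tolerance to variable dwell time is the property that makes DTW a natural fit for comparing raw current traces against an expected repeat-unit signal.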

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • ATP-Binding Cassette Transporters / genetics
  • Algorithms
  • Feasibility Studies
  • Genome, Human*
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Minisatellite Repeats
  • Tandem Repeat Sequences*

Substances

  • ABCA7 protein, human
  • ATP-Binding Cassette Transporters