Characterization of repeat arrays in ultra-long nanopore reads reveals frequent origin of satellite DNA from retrotransposon-derived tandem repeats

Plant J. 2020 Jan;101(2):484-500. doi: 10.1111/tpj.14546. Epub 2019 Nov 3.

Abstract

Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle to its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing 11 major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73× genome coverage. We found surprising differences among the analyzed repeats: only two of them were predominantly organized in long arrays typical of satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favourable for satellite DNA accumulation.
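The idea of inferring array lengths directly from individual reads can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (per-read hit intervals for one satellite family), the function names, and the `max_gap` and `edge` thresholds are all assumptions made for illustration. The sketch merges neighbouring hits on a read into arrays and marks any array that reaches a read end as "truncated", since its measured length is then only a lower bound on the true array length.

```python
from collections import defaultdict


def merge_arrays(hits, max_gap=100):
    """Merge repeat hits on one read into contiguous arrays.

    hits: iterable of (start, end) tuples, 0-based half-open coordinates.
    Hits separated by at most max_gap bases are joined (assumed threshold).
    """
    arrays = []
    for start, end in sorted(hits):
        if arrays and start - arrays[-1][1] <= max_gap:
            # Close enough to the previous array: extend it.
            arrays[-1][1] = max(arrays[-1][1], end)
        else:
            arrays.append([start, end])
    return [tuple(a) for a in arrays]


def classify_array(start, end, read_length, edge=50):
    """Return 'truncated' if the array lies within `edge` bases of a read end."""
    if start <= edge or end >= read_length - edge:
        return "truncated"
    return "complete"


def array_lengths(read_hits, read_lengths, max_gap=100, edge=50):
    """Collect array lengths over all reads, grouped by completeness label.

    read_hits:    {read_id: [(start, end), ...]} for one satellite family
    read_lengths: {read_id: read length in bases}
    """
    out = defaultdict(list)
    for read_id, hits in read_hits.items():
        for start, end in merge_arrays(hits, max_gap):
            label = classify_array(start, end, read_lengths[read_id], edge)
            out[label].append(end - start)
    return dict(out)


# Toy input (made-up coordinates, not data from the study).
read_hits = {
    "r1": [(1000, 1500), (1550, 2000), (10000, 10400)],
    "r2": [(0, 30000)],
}
read_lengths = {"r1": 40000, "r2": 30000}
result = array_lengths(read_hits, read_lengths)
```

In this toy case, read `r1` carries two short arrays fully embedded in the read, while the array on `r2` spans the whole read and is therefore only a minimum length estimate. Keeping the two classes separate avoids underestimating the lengths of long arrays when summarizing the distribution.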

Keywords: Lathyrus sativus; centromeres; fluorescence in situ hybridization (FISH); heterochromatin; long-range organization; nanopore sequencing; satellite DNA; sequence evolution; technical advance.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Centromere
  • Chromosomes, Plant
  • DNA, Plant / genetics
  • DNA, Satellite*
  • Evolution, Molecular
  • Gene Frequency*
  • Genome, Plant
  • Heterochromatin
  • Lathyrus / genetics
  • Nanopores*
  • Retroelements*
  • Tandem Repeat Sequences*

Substances

  • DNA, Plant
  • DNA, Satellite
  • Heterochromatin
  • Retroelements