The variation and evolution of complete human centromeres

Nature. 2024 May;629(8010):136-145. doi: 10.1038/s41586-024-07278-3. Epub 2024 Apr 3.

Abstract

Human centromeres have traditionally been very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared them to those of the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Centromere Protein A / metabolism
  • Centromere* / genetics
  • Centromere* / metabolism
  • Chromatin / genetics
  • Chromatin / metabolism
  • Chromatin Immunoprecipitation
  • DNA Methylation / genetics
  • DNA, Satellite / genetics
  • Evolution, Molecular*
  • Female
  • Gene Amplification
  • Genetic Variation*
  • Haplotypes
  • Humans
  • Kinetochores / metabolism
  • Macaca / genetics
  • Male
  • Mutation
  • Pan troglodytes / genetics
  • Polymorphism, Single Nucleotide / genetics
  • Pongo / genetics
  • Reference Standards
  • Sequence Alignment
  • Species Specificity

Substances

  • CENPA protein, human
  • Centromere Protein A
  • DNA, Satellite
  • Chromatin