Cruciform-forming inverted repeats appear to have mediated many of the microinversions that distinguish the human and chimpanzee genomes

Chromosome Res. 2009;17(4):469-83. doi: 10.1007/s10577-009-9039-9. Epub 2009 May 28.

Abstract

Submicroscopic inversions have contributed significantly to the genomic divergence between humans and chimpanzees over evolutionary time. Those microinversions which are flanked by segmental duplications (SDs) are presumed to have originated via non-allelic homologous recombination between SDs arranged in inverted orientation. However, the nature of the mechanisms underlying those inversions which are not flanked by SDs remains unclear. We have investigated 35 such inversions, ranging in size from 51-nt to 22056-nt, with the goal of characterizing the DNA sequences in the breakpoint-flanking regions. Using the macaque genome as an outgroup, we determined the lineage specificity of these inversions and noted that the majority (N = 31; 89%) were associated with deletions (of length between 1-nt and 6754-nt) immediately adjacent to one or both inversion breakpoints. Overrepresentations of both direct and inverted repeats, ≥ 6-nt in length and capable of non-B DNA structure formation, were noted in the vicinity of breakpoint junctions suggesting that these repeats could have contributed to double strand breakage. Inverted repeats capable of cruciform structure formation were also found to be a common feature of the inversion breakpoint-flanking regions, consistent with these inversions having originated through the resolution of Holliday junction-like cruciforms. Sequences capable of non-B DNA structure formation have previously been implicated in promoting gross deletions and translocations causing human genetic disease. We conclude that non-B DNA forming sequences may also have promoted the occurrence of mutations in an evolutionary context, giving rise to at least some of the inversion/deletions which now serve to distinguish the human and chimpanzee genomes.
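As an illustration of what an inverted repeat of at least 6 nt looks like in sequence terms, the following is a minimal Python sketch of a naive scan for a window whose reverse complement recurs a short distance downstream, the arrangement that could in principle fold into a cruciform. It is not the method used in the study. The function names, the `max_spacer` limit, and the restriction to perfect matches of fixed arm length are all assumptions made for illustration.

```python
# Illustrative sketch only: names and parameters are hypothetical,
# not taken from the paper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_inverted_repeats(seq, min_arm=6, max_spacer=10):
    """Find perfect inverted repeats with arms of exactly min_arm nt.

    Returns a list of (left_start, right_start, arm_length) tuples using
    0-based coordinates. The right arm must equal the reverse complement
    of the left arm and begin at most max_spacer nt after the left arm
    ends (max_spacer is an assumed cutoff, not a value from the study).
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for i in range(n - 2 * min_arm + 1):
        left = seq[i:i + min_arm]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in COMPLEMENT for base in left):
            continue
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + min_arm + spacer
            if j + min_arm > n:
                break
            if seq[j:j + min_arm] == target:
                hits.append((i, j, min_arm))
    return hits


# Toy sequence: GATTAC, a 3-nt spacer, then its reverse complement GTAATC.
example = "AAGATTACTTTGTAATCAA"
print(find_inverted_repeats(example))  # [(2, 11, 6)]
```

In the toy sequence the arms GATTAC and GTAATC are separated by a 3-nt spacer, so the scan reports a single hit. A realistic analysis would also need to merge overlapping hits into longer arms, tolerate mismatches, and compare observed counts against a background expectation before calling an overrepresentation.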

MeSH terms

  • Animals
  • Base Sequence
  • Chromosome Breakage
  • Chromosome Inversion*
  • Computational Biology / methods
  • DNA, Cruciform / genetics*
  • Evolution, Molecular
  • Genome, Human*
  • Humans
  • Inverted Repeat Sequences / genetics*
  • Models, Genetic
  • Molecular Sequence Data
  • Pan troglodytes / genetics*
  • Recombination, Genetic
  • Reproducibility of Results
  • Sequence Analysis, DNA

Substances

  • DNA, Cruciform