Prospective estimation of recombination signal efficiency and identification of functional cryptic signals in the genome by statistical modeling

J Exp Med. 2003 Jan 20;197(2):207-20. doi: 10.1084/jem.20020250.

Abstract

The recombination signals (RS) that guide V(D)J recombination are phylogenetically conserved but retain a surprising degree of sequence variability, especially in the nonamer and spacer. To characterize RS variability, we computed the position-wise information, a measure correlated with sequence conservation, for each nucleotide position in an RS alignment and found that most position-wise information is present in the RS heptamers and nonamers. We have previously demonstrated significant correlations between RS positions. Here we show that statistical models of the correlation structure that underlies RS variability efficiently identify physiologic and cryptic RS and accurately predict the recombination efficiencies of natural and synthetic RS. In scans of mouse and human genomes, these models identify a highly conserved family of repetitive DNA as an unexpected source of frequent, cryptic RS that rearrange both in extrachromosomal substrates and in their genomic context.
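
As a rough illustration of what a position-wise information profile looks like, the sketch below computes a per-column information content for an ungapped DNA alignment as 2 bits minus the Shannon entropy of the column. This is an assumed, textbook definition; the abstract does not give the exact formula used in the paper, which may include background frequencies or small-sample corrections. The function name `positionwise_information` and the toy alignment are hypothetical: the first sequence is the consensus heptamer CACAGTG, and the variants are made up for the example, not taken from the paper's data.

```python
import math
from collections import Counter


def positionwise_information(alignment):
    """Return the information content (in bits) of each alignment column.

    Assumed definition: I_j = 2 - H_j, where H_j is the Shannon entropy
    of the nucleotide frequencies observed in column j. Fully conserved
    columns score 2 bits; uniformly variable columns score 0 bits.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")

    info = []
    for column in zip(*alignment):
        counts = Counter(column)
        n = len(column)
        # Shannon entropy of the observed nucleotide frequencies.
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        info.append(2.0 - entropy)
    return info


# Toy alignment: consensus heptamer plus made-up variants (illustrative only).
toy_alignment = ["CACAGTG", "CACAGTG", "CACTGTG", "CACAATG"]
info = positionwise_information(toy_alignment)

for position, bits in enumerate(info, start=1):
    print(f"position {position}: {bits:.3f} bits")
```

In this toy input the invariant columns carry the maximum 2 bits, while the two columns with a single substitution carry less, which is the sense in which position-wise information tracks conservation.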

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • Conserved Sequence
  • DNA / genetics
  • Gene Rearrangement
  • Genome*
  • Genome, Human
  • Humans
  • Mice
  • Models, Genetic*
  • Models, Statistical
  • Molecular Sequence Data
  • Recombination, Genetic*
  • Sequence Homology, Nucleic Acid

Substances

  • DNA