Centripetal modules and ancient introns

Gene. 1999 Sep 30;238(1):85-91. doi: 10.1016/s0378-1119(99)00292-9.

Abstract

We have created an algorithm that instantiates the centripetal definition of modules, compact regions of protein structure, as introduced by Go and Nosaka (M. Go and M. Nosaka, 1987. Protein architecture and the origin of introns. Cold Spring Harbor Symp. Quant. Biol. 52, 915-924). That definition seeks the minima of a function that sums the squares of C-alpha distances over a window around each amino acid residue in a three-dimensional protein structure and identifies such minima with module boundaries. We analyze a set of 44 ancient conserved proteins with known three-dimensional structures, which have intronless homologues in bacteria and intron-containing homologues in the eukaryotes, together with a corresponding set of 988 intron positions. We show that the phase zero intron positions are significantly correlated with the module boundaries (p = 0.0002), while the intron positions that lie within codons, in phase one and phase two, are not correlated with these 'centripetal' module boundaries. Furthermore, we analyze the phylogenetic distribution of intron positions and identify a subset of putatively 'ancient' intron positions: phase zero positions in one phylogenetic kingdom that have an associated intron either in an identical position or within three codons in another phylogenetic kingdom (a notion of intron sliding). This subset of 120 'ancient' introns lies significantly closer to the module boundaries than does the full set of phase zero introns (p = 0.008). We conclude that the behavior of this set of introns supports the prediction of a mixed theory: that some introns are very old and were used for exon shuffling in the progenote, while many introns have been lost and added since.
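The windowed sum-of-squares function described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the half_window parameter, the choice to measure distances from the central residue to every residue in its window, and the decision to leave residues with truncated windows unscored are all assumptions made for illustration; the abstract does not specify the window size or how chain ends are handled. Following the abstract's wording, local minima of the profile are reported as candidate module boundaries.

```python
from typing import List, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]  # (x, y, z) of one C-alpha atom


def centripetal_profile(ca: Sequence[Coord], half_window: int) -> List[Optional[float]]:
    """For each residue i, sum the squared C-alpha distances from i to every
    residue j with |i - j| <= half_window.

    Assumption: residues too close to either chain end to have a full window
    are left unscored (None) so that truncated sums do not look like minima.
    """
    n = len(ca)
    profile: List[Optional[float]] = [None] * n
    for i in range(half_window, n - half_window):
        total = 0.0
        for j in range(i - half_window, i + half_window + 1):
            dx = ca[i][0] - ca[j][0]
            dy = ca[i][1] - ca[j][1]
            dz = ca[i][2] - ca[j][2]
            total += dx * dx + dy * dy + dz * dz
        profile[i] = total
    return profile


def local_minima(profile: Sequence[Optional[float]]) -> List[int]:
    """Return indices whose value is strictly lower than both scored neighbours.

    Assumption: a strict three-point comparison is used; the source does not
    say how minima are detected or whether the profile is smoothed first.
    """
    minima = []
    for i in range(1, len(profile) - 1):
        left, mid, right = profile[i - 1], profile[i], profile[i + 1]
        if left is None or mid is None or right is None:
            continue
        if mid < left and mid < right:
            minima.append(i)
    return minima


if __name__ == "__main__":
    # Toy coordinates only: a hypothetical chain, not a real protein structure.
    toy_chain = [(float(i), 0.0, 0.0) for i in range(10)]
    scores = centripetal_profile(toy_chain, half_window=2)
    print(scores)
    print(local_minima(scores))
```

In practice the coordinates would come from the C-alpha records of a solved structure, and the resulting residue indices could then be compared with intron positions mapped onto the same sequence.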

MeSH terms

  • Algorithms
  • Animals
  • Eukaryota / genetics
  • Fungi / genetics
  • Introns*
  • Phylogeny
  • Plants / genetics
  • Protein Conformation*