Modern origin of numerous alternatively spliced human introns from tandem arrays

Proc Natl Acad Sci U S A. 2007 Jan 16;104(3):882-6. doi: 10.1073/pnas.0604777104. Epub 2007 Jan 8.

Abstract

Despite the widespread occurrence of spliceosomal introns in the genomes of higher eukaryotes, their origin remains controversial. One model proposes that the duplication of small genomic segments could have provided the boundaries for new introns. If this mechanism has operated recently, the 5' and 3' boundaries of each resulting intron should display distinctive sequence similarity. Here, we report that the human genome contains an excess of introns with perfectly matching sequences at their boundaries. One-third of these introns interrupt the protein-coding sequences of known genes. Introns with the best-matching boundaries are invariably found in tandem arrays of direct repeats. Sequence analysis of the arrays indicates that many intron-breeding repeats have disseminated into several genes at different times during human evolution. A comparison with orthologous regions in mouse and chimpanzee suggests a young age for the human introns with the most-similar boundaries. Finally, we show that these human introns are alternatively spliced with exceptionally high frequency. Our study indicates that genomic duplication has been an important mode of intron gain in mammals. The alternative splicing of transcripts containing these intron-breeding repeats may provide the plasticity required for the rapid evolution of new human proteins.
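The boundary-similarity prediction of the duplication model can be illustrated with a toy calculation. The sketch below is not the authors' pipeline, which the abstract does not describe. It assumes a hypothetical function name (`boundary_match_length`), 0-based half-open intron coordinates, and an invented 10-base repeat unit. Under the model, a tandem copy of a segment containing AGGT supplies a GT donor in the first copy and an AG acceptor in the second, so the sequence around the 5' junction should match the sequence around the 3' junction base for base.

```python
def boundary_match_length(seq, intron_start, intron_end):
    """Count identical bases shared by the two splice junctions.

    seq          -- genomic sequence (string)
    intron_start -- 0-based index of the first intron base (the G of GT)
    intron_end   -- 0-based index of the first base after the intron
    Returns (left, right): perfect-match length extending upstream and
    downstream of the junctions, respectively.
    """
    seq = seq.upper()

    # Walk upstream: compare the exon end at the 5' junction with the
    # intron end at the 3' junction, one base at a time.
    left = 0
    while (intron_start - 1 - left >= 0
           and seq[intron_start - 1 - left] == seq[intron_end - 1 - left]):
        left += 1

    # Walk downstream: compare the intron start at the 5' junction with
    # the exon start at the 3' junction.
    right = 0
    while (intron_end + right < len(seq)
           and seq[intron_start + right] == seq[intron_end + right]):
        right += 1

    return left, right


# Toy example (invented data): a 10-base unit duplicated in tandem,
# embedded in unrelated flanking sequence.
unit = "ACAGGTCATT"
toy = "TTG" + unit + unit + "CGA"

# The intron runs from the GT of the first copy to the AG of the second.
intron = toy[7:17]
print(intron)                              # GTCATTACAG
print(boundary_match_length(toy, 7, 17))   # (4, 6)
```

In this toy case the upstream and downstream matches sum to 10, the length of the duplicated unit, which is the signature expected for an intron bounded by two copies of a direct repeat. An intron in non-repetitive sequence would typically return values at or near zero.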

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing / genetics*
  • Animals
  • Evolution, Molecular*
  • Gene Duplication
  • Humans
  • Introns / genetics*
  • Mice
  • Models, Genetic
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis
  • Pan troglodytes / genetics
  • Sequence Homology, Nucleic Acid
  • Tandem Repeat Sequences / genetics