Computational discovery of internal micro-exons

Genome Res. 2003 Jun;13(6A):1216-21. doi: 10.1101/gr.677503.

Abstract

Very short exons, also known as micro-exons, occur in large numbers in some eukaryotic genomes. Existing annotation tools have a limited ability to recognize these short sequences, which are at most 25 bp long. Here, we describe a computational method for identifying micro-exons using near-perfect alignments between cDNA and genomic DNA sequences. Using this method, we detected 319 micro-exons in four complete genomes, of which 224 were previously unknown: 170 in human, 4 in the nematode Caenorhabditis elegans, 14 in the fruit fly Drosophila melanogaster, and 36 in the mustard plant Arabidopsis thaliana. Comparison of our computational method with popular cDNA alignment programs shows that the new algorithm is both efficient and accurate. The algorithm also aids in the discovery of micro-exon-skipping events and cross-species micro-exon conservation.
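
The abstract does not give the algorithm itself, so the following is only a minimal illustrative sketch of one way such a search could work, not the authors' method. It assumes that a cDNA-to-genome alignment has already placed two flanking exons and left a short stretch of cDNA unaligned between them, and it then scans the intervening intron for an exact copy of that stretch bounded by the canonical AG acceptor and GT donor dinucleotides. The function name, its parameters, and the exact-match requirement are all hypothetical choices made for illustration.

```python
def find_micro_exon_candidates(intron, missing, acceptor="AG", donor="GT"):
    """Return 0-based offsets in `intron` where `missing` occurs exactly,
    preceded by the acceptor signal and followed by the donor signal.

    Assumptions (not from the source): exact matching only, canonical
    GT-AG splice signals, and sequences given as uppercase strings.
    """
    hits = []
    start = intron.find(missing)
    while start != -1:
        end = start + len(missing)
        # The two bases immediately upstream must be the acceptor (e.g. AG).
        has_acceptor = start >= 2 and intron[start - 2:start] == acceptor
        # The two bases immediately downstream must be the donor (e.g. GT).
        has_donor = intron[end:end + 2] == donor
        if has_acceptor and has_donor:
            hits.append(start)
        # Continue from the next position so overlapping matches are seen.
        start = intron.find(missing, start + 1)
    return hits
```

Requiring splice signals on both sides matters because a sequence of 25 bp or less can occur by chance in a long intron; a realistic tool would also need to tolerate mismatches and non-canonical splice sites, which this sketch does not attempt.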

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Alternative Splicing / genetics
  • Animals
  • Arabidopsis / genetics
  • Base Sequence / genetics
  • Caenorhabditis elegans / genetics
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • DNA / genetics
  • DNA, Complementary / genetics
  • DNA, Helminth / genetics
  • DNA, Plant / genetics
  • Drosophila melanogaster / genetics
  • Exons / genetics*
  • Humans
  • Molecular Sequence Data
  • Sequence Alignment / methods
  • Sequence Alignment / statistics & numerical data

Substances

  • DNA, Complementary
  • DNA, Helminth
  • DNA, Plant
  • DNA