Selecting for functional alternative splices in ESTs

Genome Res. 2002 Dec;12(12):1837-45. doi: 10.1101/gr.764102.

Abstract

The expressed sequence tag (EST) collection in dbEST provides an extensive resource for detecting alternative splicing on a genomic scale. Using genomically aligned ESTs, a computational tool (TAP) was used to identify alternative splice patterns for 6400 known human genes from the RefSeq database. With sufficient EST coverage, one or more alternatively spliced forms could be detected for nearly all genes examined. To identify high (>95%) confidence observations of alternative splicing, splice variants were clustered on the basis of having mutually exclusive structures, and sample statistics were then applied. Through this selection, alternative splices expected at a frequency of >5% within their respective clusters were seen for only 17%-28% of genes. Although intron retention events (potentially unspliced messages) were seen for 36% of the genes overall, the same statistical selection yielded reliable cases of intron retention for <5% of genes. For high-confidence alternative splices in the human ESTs, we also noted significantly higher rates both of cross-species conservation in mouse ESTs and of validation in the GenBank mRNA collection. We suggest that quantitative analytical approaches such as these can aid in selecting useful targets for further experimental characterization and in so doing may help elucidate the mechanisms and biological implications of alternative splicing.
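
The abstract does not say which sample statistic was applied, so the following is only an illustrative sketch of the kind of selection described. It assumes a one-sided exact binomial test: within a cluster of mutually exclusive splice forms, a variant is kept if its EST count would be unlikely (probability below 5%) were its true frequency no higher than 5%. The function names, the cluster representation (a mapping from variant label to EST count), and the example counts are all hypothetical and are not taken from the paper or from TAP.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_high_confidence(k, n, min_freq=0.05, alpha=0.05):
    """Assumed criterion: reject 'true frequency <= min_freq' at level alpha.

    k: ESTs supporting the variant; n: ESTs in the whole cluster.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n and n > 0")
    return binomial_tail(k, n, min_freq) < alpha


def select_variants(cluster_counts):
    """Return the variants in one cluster that pass the assumed test.

    cluster_counts: dict mapping a (hypothetical) variant label to the
    number of ESTs supporting it; the forms are mutually exclusive.
    """
    total = sum(cluster_counts.values())
    return [v for v, k in cluster_counts.items() if is_high_confidence(k, total)]


# Hypothetical cluster: 20 ESTs spanning one region, three exclusive forms.
example_cluster = {"constitutive": 15, "exon_skip": 4, "intron_retained": 1}
selected = select_variants(example_cluster)
```

Under this assumed test, a form seen in a single EST out of 20 is not selected, which mirrors the abstract's point that many observed variants, intron retention in particular, do not survive statistical selection.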

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.
  • Validation Study

MeSH terms

  • Alternative Splicing / genetics*
  • Alternative Splicing / physiology
  • Animals
  • Computational Biology / methods
  • Computational Biology / statistics & numerical data
  • Conserved Sequence / genetics
  • Databases, Genetic
  • Expressed Sequence Tags*
  • Gene Frequency / genetics
  • Genetic Variation / genetics
  • Genome
  • Genome, Human
  • Humans
  • Mice
  • RNA, Messenger / genetics
  • Sequence Alignment / methods

Substances

  • RNA, Messenger