FREP: a database of functional repeats in mouse cDNAs

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D471-5. doi: 10.1093/nar/gkh123.

Abstract

The FREP database (http://facts.gsc.riken.go.jp/FREP/) contains 31,396 RepeatMasker-identified non-redundant variant repeat sequences derived from 16,527 mouse cDNAs with protein-coding potential. The repeats were computationally associated with potential effects on transcriptional variation, translation, protein function or involvement in disease to identify Functional REPeats (FREPs). FREPs are defined by (i) the occurrence of exon-exon boundaries in repeats, (ii) the presence of polyadenylation sites in 3'UTR-located repeats, (iii) an effect on translation, (iv) a position in the protein-coding region or in protein domains or (v) a conditional association with disease MeSH terms. Currently the database contains 9261 (29.5%) inferred FREPs derived from 6861 (41.5%) mouse cDNAs. Integrated evidence for the functional assignments and dynamically generated sequence similarity search results support the exploration and annotation of functional, ancestral or taxon-specific repeats. Keyword and pre-selected feature searches (e.g. coding sequence-repeat or splice site-repeat relations) support intuitive database querying as well as the retrieval of repeat sequences. Integrated sequence search and alignment tools allow the analysis of known functional repeat candidates and the identification of new ones. FREP is a unique resource for illuminating the role of transposons and repetitive sequences in shaping the coding part of the mouse transcriptome and for selecting the appropriate experimental model to study diseases in which repeats are suspected to contribute to the etiology.
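The five defining criteria can be read as a rule in which a repeat counts as a FREP when at least one line of evidence applies. The Python sketch below illustrates that reading only; it is not the database's actual pipeline. The class RepeatAnnotation, its field names, the functions is_frep and percent, and the any-of-five rule are assumptions introduced for illustration. It also recomputes the two percentages quoted in the abstract from the reported counts.

```python
from dataclasses import dataclass


@dataclass
class RepeatAnnotation:
    """Hypothetical evidence flags for one repeat (names are illustrative)."""
    spans_exon_exon_boundary: bool = False    # criterion (i)
    polya_site_in_3utr_repeat: bool = False   # criterion (ii)
    affects_translation: bool = False         # criterion (iii)
    in_cds_or_protein_domain: bool = False    # criterion (iv)
    disease_mesh_association: bool = False    # criterion (v)


def is_frep(repeat: RepeatAnnotation) -> bool:
    """Assumed rule: a repeat is a FREP if any one criterion holds."""
    return any((
        repeat.spans_exon_exon_boundary,
        repeat.polya_site_in_3utr_repeat,
        repeat.affects_translation,
        repeat.in_cds_or_protein_domain,
        repeat.disease_mesh_association,
    ))


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * part / total, 1)


# Counts taken from the abstract.
frep_share = percent(9261, 31396)   # FREPs among all repeats
cdna_share = percent(6861, 16527)   # cDNAs that carry at least one FREP

print(is_frep(RepeatAnnotation(polya_site_in_3utr_repeat=True)))
print(frep_share, cdna_share)
```

Both recomputed shares agree with the figures given in the abstract (29.5% of repeats and 41.5% of cDNAs).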

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computational Biology*
  • DNA, Complementary / genetics*
  • Databases, Nucleic Acid*
  • Disease
  • Information Storage and Retrieval
  • Internet
  • Mice
  • Protein Biosynthesis / genetics
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism
  • RNA Splice Sites / genetics
  • Repetitive Sequences, Nucleic Acid / genetics*
  • Repetitive Sequences, Nucleic Acid / physiology*
  • Sequence Alignment
  • Transcription, Genetic / genetics

Substances

  • DNA, Complementary
  • Proteins
  • RNA Splice Sites