trEST, trGEN and Hits: access to databases of predicted protein sequences

M Pagni; C Iseli; T Junier; L Falquet; V Jongeneel; P Bucher

doi:10.1093/nar/29.1.148

Nucleic Acids Res. 2001 Jan 1;29(1):148-51. doi: 10.1093/nar/29.1.148.

Authors

M Pagni¹, C Iseli, T Junier, L Falquet, V Jongeneel, P Bucher

Affiliation

¹ Swiss Institute of Bioinformatics, Ludwig Institute for Cancer Research, Chemin des Boveresses 155, CH-1066, Epalinges s/Lausanne, Switzerland.

Abstract

High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Amino Acid Sequence*
Animals
Databases, Factual
Expressed Sequence Tags*
Humans
Information Services
Internet
Markov Chains*
Molecular Sequence Data
Proteins / genetics
Sequence Alignment
Sequence Homology, Amino Acid

Substances

Proteins