ProtEST: protein multiple sequence alignments from expressed sequence tags

Bioinformatics. 2000 Feb;16(2):111-6. doi: 10.1093/bioinformatics/16.2.111.

Abstract

Motivation: An automatic sequence searching method (ProtEST) is described which constructs multiple protein sequence alignments from protein sequences and translated expressed sequence tags (ESTs). ProtEST is more effective than a simple TBLASTN search of the query against the EST database, as the sequences are automatically clustered, assembled, made non-redundant, checked for sequence errors, translated into protein and then aligned and displayed.
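As a rough illustration of two of the steps named above (translating ESTs into protein and making the set non-redundant), the following is a minimal sketch. It is not the ProtEST implementation: the function names are hypothetical, translation uses a naive six-frame approach with the standard genetic code, and redundancy removal is a simple exact-substring filter. No clustering, assembly, or error correction is attempted.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from the first base; unknown codons (e.g. with N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Hypothetical helper: translate an EST in all six reading frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def remove_redundant(proteins):
    """Hypothetical helper: drop duplicates and sequences that are exact
    substrings of a longer kept sequence (a simplification of real
    redundancy filtering, which would tolerate mismatches)."""
    kept = []
    for p in sorted(set(proteins), key=len, reverse=True):
        if not any(p in k for k in kept):
            kept.append(p)
    return kept
```

For example, `six_frame_translations("ATGGCCTAA")["+1"]` gives the forward frame-1 translation of a short sequence, and `remove_redundant` can then be applied to the translations collected from many ESTs.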

Results: A ProtEST search found a non-redundant, translated, error- and length-corrected EST sequence for > 58% of sequences when single sequences from 1407 Pfam-A seed alignments were used as the probe. The resulting alignments of translated EST sequences contained, on average, > 10 sequences. In a cross-validated test of protein secondary structure prediction, alignments from the new procedure led to an improvement of 3.4% in average Q3 prediction accuracy over single sequences.
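For readers unfamiliar with the Q3 measure, it is the percentage of residues whose three-state secondary structure (conventionally helix H, strand E, coil C) is predicted correctly. The sketch below shows the per-sequence calculation only; the function name is hypothetical, and the prediction method and cross-validation protocol used in the study are not reproduced here.

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted three-state label
    (e.g. 'H', 'E', 'C') matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Illustrative strings only, not data from the study: 7 of 8 positions agree.
print(q3_accuracy("HHHEECCC", "HHHEEECC"))  # 87.5
```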

Availability: The ProtEST method is available as an Internet World Wide Web service at http://barton.ebi.ac.uk/servers/protest.html. The Wise2 package for protein and genomic comparisons and the ProtESTWise script can be found at http://www.sanger.ac.uk/Software/Wise2

Contact: [email protected]

MeSH terms

  • Amino Acid Sequence
  • Expressed Sequence Tags*
  • Molecular Sequence Data
  • Protein Biosynthesis
  • Proteins / analysis*
  • Sequence Alignment / methods*

Substances

  • Proteins