Using evolutionary information for the query and target improves fold recognition

Proteins. 2004 Feb 1;54(2):342-50. doi: 10.1002/prot.10565.

Abstract

In this study, we show that it is possible to increase the performance over PSI-BLAST by using evolutionary information for both query and target sequences. This information can be used in three different ways: by sequence linking, profile-profile alignments, and by combining sequence-profile and profile-sequence searches. If only PSI-BLAST is used, 16% of superfamily-related protein domains can be detected at 90% specificity, but if a sequence-profile and a profile-sequence search are combined, this is increased to 20%, profile-profile searches detects 19%, whereas a linking procedure identifies 22% of these proteins. All three methods show equal performance, but the best combination of speed and accuracy seems to be obtained by the combined searches, because this method shows a good performance even at high specificity and the lowest computational cost. In addition, we show that the E-values reported by all these methods, including PSI-BLAST, underestimate the true rate of false positives. This behavior is seen even if a very strict E-value cutoff and a limited number of iterations are used. However, the difference is more pronounced with a looser E-value cutoff and more iterations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Evolution, Molecular*
  • Protein Folding*
  • Proteins / chemistry*
  • Proteins / classification
  • Proteins / metabolism*
  • Sensitivity and Specificity
  • Software*
  • Time Factors

Substances

  • Proteins