Optimization of a new score function for the detection of remote homologs

Proteins. 2000 Dec 1;41(4):498-503. doi: 10.1002/1097-0134(20001201)41:4<498::aid-prot70>3.0.co;2-3.

Abstract

The growth in protein sequence data has placed a premium on ways to infer the structure and function of newly sequenced proteins. One of the most effective ways is to identify a homologous relationship with a protein about which more is known. While close evolutionary relationships can be confidently determined with standard methods, the difficulty increases as the relationships become more distant. All of these methods rely on a score function to measure sequence similarity, and the choice of score function is especially critical for distant relationships. We describe a new method of determining a score function, one that optimizes the ability to discriminate between homologs and non-homologs. We find that this new score function performs better than standard score functions for the identification of distant homologies.
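
To make "the ability to discriminate between homologs and non-homologs" concrete, here is a minimal Python sketch of how a candidate score function could be evaluated. It is not the authors' method: the abstract does not state the objective or the optimization procedure. The names `pair_score`, `identity_score`, and `discrimination`, the toy sequences, and the rank-based measure (the fraction of homolog/non-homolog pair combinations ranked correctly) are all assumptions chosen for illustration, and the sequences are treated as already aligned without gaps.

```python
from itertools import product


def pair_score(a, b, score):
    """Sum per-position scores for two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(score(x, y) for x, y in zip(a, b))


def identity_score(x, y):
    """Toy score function (assumption): +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1


def discrimination(homolog_pairs, nonhomolog_pairs, score):
    """Fraction of (homolog, non-homolog) combinations in which the homolog
    pair scores strictly higher; ties count as half."""
    pos = [pair_score(a, b, score) for a, b in homolog_pairs]
    neg = [pair_score(a, b, score) for a, b in nonhomolog_pairs]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up sequences, labeled by hand purely for illustration.
homologs = [("ACDEF", "ACDEY"), ("KLMNP", "KLMQP")]
nonhomologs = [("ACDEF", "KLMNP"), ("ACDEY", "WLMQP")]

result = discrimination(homologs, nonhomologs, identity_score)
print(result)
```

Under these assumptions, optimizing a score function would mean adjusting the values returned by `score` so that `discrimination` increases on a labeled training set. A real evaluation would use gapped alignments and a curated database of known relationships rather than hand-picked pairs.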

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Artificial Intelligence
  • Databases, Factual
  • Models, Chemical
  • Proteins / chemistry
  • Sequence Alignment / methods*
  • Sequence Alignment / standards
  • Sequence Homology, Amino Acid*

Substances

  • Proteins