The conservation pattern of short linear motifs is highly correlated with the function of interacting protein domains

BMC Genomics. 2008 Oct 1:9:452. doi: 10.1186/1471-2164-9-452.

Abstract

Background: Many well-represented domains recognize short primary sequences called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrices derived from peptide libraries with conservation analysis to identify protein classes enriched in functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains.

Results: Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extracellular side. We also found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved.

Conclusion: The interactome of protein domains is reflected in the evolutionary conservation of the SLiMs recognized by these domains. By combining scoring matrices derived from peptide libraries with conservation analysis, it is possible to identify the protein groups that are more likely to interact with specific domains.
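
The combination of matrix scoring and conservation analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the matrix values, the threshold, the sequences, and the function names (`score_window`, `find_motif_sites`, `conserved_fraction`) are all invented for illustration, and the ortholog sequences are assumed to be pre-aligned, gap-free, and of equal length.

```python
from typing import Dict, List

# Hypothetical toy scoring matrix (values invented for illustration only):
# one dict per motif position, mapping residue -> weight.
TOY_MATRIX: List[Dict[str, float]] = [
    {"Y": 2.0},                        # position 0: tyrosine favoured
    {},                                # position 1: any residue
    {},                                # position 2: any residue
    {"I": 1.0, "L": 1.0, "V": 1.0},    # position 3: hydrophobic favoured
]


def score_window(window: str, matrix: List[Dict[str, float]]) -> float:
    """Sum the per-position weights; residues absent from the matrix score 0."""
    return sum(weights.get(res, 0.0) for res, weights in zip(window, matrix))


def find_motif_sites(seq: str, matrix: List[Dict[str, float]],
                     threshold: float) -> List[int]:
    """Return start indices of windows whose score reaches the threshold."""
    width = len(matrix)
    return [
        i for i in range(len(seq) - width + 1)
        if score_window(seq[i:i + width], matrix) >= threshold
    ]


def conserved_fraction(site_start: int, aligned_orthologs: List[str],
                       matrix: List[Dict[str, float]],
                       threshold: float) -> float:
    """Fraction of orthologs whose window at the same aligned position
    also reaches the threshold (assumes gap-free, equal-length alignment)."""
    if not aligned_orthologs:
        return 0.0
    width = len(matrix)
    hits = sum(
        score_window(seq[site_start:site_start + width], matrix) >= threshold
        for seq in aligned_orthologs
    )
    return hits / len(aligned_orthologs)


# Made-up reference sequence and aligned orthologs.
reference = "AAYEEIAA"
orthologs = ["AAYEELAA", "AAFEEIAA", "AAYDEVAA"]

sites = find_motif_sites(reference, TOY_MATRIX, threshold=3.0)
for start in sites:
    frac = conserved_fraction(start, orthologs, TOY_MATRIX, threshold=3.0)
    print(f"site at {start}: conserved in {frac:.2f} of orthologs")
```

In a sketch like this, averaging the conserved fraction over all predicted sites in each functional class would give a per-class conservation value that could then be compared across classes, which is the kind of comparison the abstract reports.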

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs / genetics*
  • Amino Acid Sequence
  • Animals
  • Conserved Sequence
  • Databases, Protein
  • Humans
  • Molecular Sequence Data
  • PDZ Domains
  • Protein Interaction Domains and Motifs*
  • Protein-Tyrosine Kinases / chemistry
  • Protein-Tyrosine Kinases / genetics
  • Sequence Alignment
  • src Homology Domains

Substances

  • Protein-Tyrosine Kinases