Background: Protein phosphorylation is a central mechanism of cellular regulation. A large-scale study of phosphoproteins in a whole-cell lysate of Saccharomyces cerevisiae previously identified 383 phosphorylation sites in 216 peptide sequences. However, the protein kinases responsible for phosphorylating the identified proteins have not yet been assigned.
Results: We used Predikin, in combination with other bioinformatics tools, to predict which of 116 unique protein kinases in yeast phosphorylates each experimentally determined site in the phosphoproteome. The prediction was based on the match between the phosphorylated 7-residue sequence and the predicted substrate specificity of each kinase, with the highest weight given to the residues or positions that contribute most to substrate specificity. We estimated the reliability of the predictions by running a parallel prediction on phosphopeptides for which the kinase has been experimentally determined.
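The weighted matching described above can be illustrated with a minimal sketch. It is not Predikin's actual scoring function, which this abstract does not specify. It assumes that each kinase's predicted specificity is a set of preferred residues at each of the seven positions, and that the score is the weighted fraction of positions that match. The kinase names, preferred residues, weights and peptide are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical specificity model: for each of the 7 positions, the set of
# residues a kinase is assumed to prefer. An empty set means no preference.
Specificity = List[Set[str]]


def score_site(peptide: str, preferred: Specificity, weights: List[float]) -> float:
    """Return the weighted fraction of positions where the residue matches."""
    if not (len(peptide) == len(preferred) == len(weights) == 7):
        raise ValueError("peptide, preferred and weights must all have length 7")
    total = sum(weights)
    matched = sum(w for aa, pref, w in zip(peptide, preferred, weights) if aa in pref)
    return matched / total


def rank_kinases(
    peptide: str, models: Dict[str, Specificity], weights: List[float]
) -> List[Tuple[str, float]]:
    """Score every kinase model against one site and sort best-first."""
    scores = [(name, score_site(peptide, pref, weights)) for name, pref in models.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Illustrative weights: larger values mark positions assumed to contribute
# more to substrate specificity. These are invented, not taken from the study.
weights = [3.0, 2.0, 1.0, 1.0, 2.0, 1.0, 1.0]

# Two made-up kinase models (KIN_A and KIN_B are placeholders, not yeast genes).
models: Dict[str, Specificity] = {
    "KIN_A": [{"R", "K"}, {"R", "K"}, set(), {"S", "T"}, {"L", "I"}, set(), set()],
    "KIN_B": [set(), set(), set(), {"S", "T"}, {"P"}, set(), {"K", "R"}],
}

# Hypothetical 7-residue site with the phosphorylated serine in the centre.
ranking = rank_kinases("RRASLGE", models, weights)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example KIN_A ranks first because the peptide matches it at the most heavily weighted positions. A reliability estimate like the one described above could then be obtained by applying the same ranking to sites whose kinase is already known and counting how often the known kinase ranks highest.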
Conclusion: The results reveal that the functions of the protein kinases and their predicted phosphoprotein substrates are often correlated, for example in endocytosis, cytokinesis, transcription, replication, carbohydrate metabolism and stress response. The predictions link phosphoproteins of unknown function with protein kinases of known function and vice versa, suggesting functions for the uncharacterized proteins. The study indicates that the phosphoproteins and the associated protein kinases represented in our dataset have housekeeping cellular roles; certain kinases are not represented, possibly because they are activated only during specific cellular responses. Our results demonstrate the utility of our previously reported protein kinase substrate prediction approach (Predikin) as a tool for establishing links between kinases and phosphoproteins that can subsequently be tested experimentally.