Motivation: Phosphoproteomic experiments are increasingly used to study changes in signaling across different conditions. It has been proposed that changes in the phosphorylation of kinase target sites can be used to infer when the activity of a kinase is regulated. However, these approaches have not yet been benchmarked due to a lack of appropriate benchmarking strategies.
Results: Using curated phosphoproteomic experiments and a gold standard dataset of 184 kinase-condition pairs in which regulation is expected to occur, we benchmarked and compared different kinase activity inference strategies: Z-test, Kolmogorov-Smirnov test, Wilcoxon rank sum test, gene set enrichment analysis (GSEA), and a multiple linear regression model. We also tested weighted variants of the Z-test and GSEA that include information on kinase sequence specificity as a proxy for affinity. Finally, we tested how the number of known substrates and the type of evidence (in vivo, in vitro or in silico) supporting them influence the predictions.
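As an illustration of the simplest of these strategies, the following is a minimal sketch of a Z-test for one kinase. It assumes the statistic compares the mean log fold change of the kinase's known substrate sites with the mean over all quantified sites, scaled by the background standard deviation; this formulation, the function names and the site identifiers are assumptions made for the example and are not taken from the KSEA implementation.

```python
import math
from statistics import mean, pstdev


def kinase_z_score(log_fold_changes, substrate_sites):
    """Z-score for one kinase (assumed formulation, see lead-in).

    log_fold_changes: dict mapping a phosphosite id to its log fold change.
    substrate_sites: iterable of site ids annotated as targets of the kinase.
    """
    # Keep only the substrates that were actually quantified.
    substrate_values = [
        log_fold_changes[site]
        for site in substrate_sites
        if site in log_fold_changes
    ]
    if not substrate_values:
        raise ValueError("no quantified substrates for this kinase")

    background = list(log_fold_changes.values())
    sigma = pstdev(background)  # population SD of all quantified sites
    if sigma == 0:
        raise ValueError("background has zero variance")

    m = len(substrate_values)
    return (mean(substrate_values) - mean(background)) * math.sqrt(m) / sigma


def two_sided_p_value(z):
    """Two-sided p-value of a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data: eight quantified sites, two of them known substrates.
log_fc = {
    "siteA": 2.0, "siteB": 1.5, "siteC": -0.5, "siteD": 0.0,
    "siteE": -1.0, "siteF": 0.5, "siteG": -0.5, "siteH": 0.0,
}
z = kinase_z_score(log_fc, ["siteA", "siteB"])
p = two_sided_p_value(z)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive score suggests that the substrates are, on average, more phosphorylated than the background, which would be read as increased kinase activity under this sketch. With few known substrates the statistic rests on very few values, in line with the dependence on substrate number examined in this work.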
Conclusions: Most models performed well, with the Z-test and GSEA performing best as determined by the area under the ROC curve (mean AUC = 0.722). Weighting kinase targets by the kinase's sequence preference improves the results only marginally. However, the number of known substrates and the evidence supporting the interactions have a strong effect on the predictions.
Availability and implementation: The KSEA implementation is available at https://github.com/evocellnet/ksea. Additional data are available at http://phosfate.com.
Contact: [email protected] or [email protected].
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2017. Published by Oxford University Press.