Large-scale discovery and characterization of protein regulatory motifs in eukaryotes

Daniel S Lieber; Olivier Elemento; Saeed Tavazoie

doi:10.1371/journal.pone.0014444

PLoS One. 2010 Dec 29;5(12):e14444. doi: 10.1371/journal.pone.0014444.

Authors

Daniel S Lieber¹, Olivier Elemento, Saeed Tavazoie

Affiliation

¹ Department of Molecular Biology, Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America.

Abstract

The increasing ability to generate large-scale, quantitative proteomic data has brought with it the challenge of analyzing such data to discover the sequence elements that underlie systems-level protein behavior. Here we show that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information theoretic approach. Using this approach, we have identified many known protein motifs, such as phosphorylation sites and localization signals, and discovered a large number of candidate elements. We estimate that ~80% of these are novel predictions in that they do not match a known motif in both sequence and biological context, suggesting that post-translational regulation of protein behavior is still largely unexplored. These predicted motifs, many of which display preferential association with specific biological pathways and non-random positioning in the linear protein sequence, provide focused hypotheses for experimental validation.
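
The abstract does not spell out how the information-theoretic scoring works. As a minimal sketch only, and assuming that mutual information between motif presence and a discrete protein-level annotation is a reasonable stand-in, the idea can be illustrated as follows. The function name `mutual_information`, the toy sequences, and the toy localization labels are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def mutual_information(sequences, labels, motif):
    """Mutual information (in bits) between motif presence and a discrete label.

    `motif` is treated as a regular expression over amino-acid letters.
    This is an illustrative score, not the authors' implementation.
    """
    pattern = re.compile(motif)
    # 1 if the motif occurs anywhere in the sequence, else 0.
    presence = [1 if pattern.search(seq) else 0 for seq in sequences]
    n = len(sequences)

    joint = Counter(zip(presence, labels))
    presence_counts = Counter(presence)
    label_counts = Counter(labels)

    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = presence_counts[x] / n
        p_y = label_counts[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Made-up toy data: two sequences ending in KDEL labeled "ER", two labeled "cytosol".
toy_sequences = ["AAKDEL", "MMKDEL", "AAAAAA", "MMMMMM"]
toy_labels = ["ER", "ER", "cytosol", "cytosol"]

informative = mutual_information(toy_sequences, toy_labels, "KDEL")
uninformative = mutual_information(toy_sequences, toy_labels, "A")
```

In this toy set, the presence of `KDEL` separates the two labels perfectly, so its score is higher than that of the single-residue pattern `A`, which occurs equally often under both labels. A real analysis would also need a significance test (for example, label shuffling) and a search over candidate motifs, neither of which is shown here.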

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Algorithms
Amino Acid Motifs / genetics*
Computational Biology / methods
Databases, Protein
Eukaryota
Humans
Mitochondria / metabolism
Phosphorylation
Protein Processing, Post-Translational
Protein Structure, Tertiary
Proteins / chemistry
Proteome
Proteomics / methods*
Saccharomyces cerevisiae / metabolism
Schizosaccharomyces / metabolism

Substances

Proteins
Proteome
