Integrating shotgun proteomics and mRNA expression data to improve protein identification

Bioinformatics. 2009 Jun 1;25(11):1397-403. doi: 10.1093/bioinformatics/btp168. Epub 2009 Mar 24.

Abstract

Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. A typical shotgun proteomics analysis assumes that all proteins are equally likely to be present. However, other information is often available; for example, the probability of a protein's presence is likely to correlate with its mRNA concentration.

Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate; in yeast, for example, it increases the number of proteins identified by approximately 40%. We apply MSpresso to data from different MS/MS instruments, experimental conditions and organisms (Escherichia coli, human), and predict 19-63% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun proteomics experiments can substantially improve protein identification scores.
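
The general idea of combining MS/MS evidence with an mRNA-informed prior can be illustrated with a minimal sketch of Bayes' rule. This is not the MSpresso implementation, and the abstract does not give the actual score; the function names, the logistic form of the prior, and the parameters `midpoint` and `slope` are all hypothetical choices made for illustration.

```python
import math


def mrna_prior(mrna_abundance, midpoint=1.0, slope=1.0):
    """Hypothetical prior P(present | mRNA).

    Assumption: a logistic curve in log10 mRNA abundance, so that
    higher mRNA levels give a higher prior probability of presence.
    The functional form and parameters are illustrative only.
    """
    x = math.log10(mrna_abundance + 1e-9) - math.log10(midpoint)
    return 1.0 / (1.0 + math.exp(-slope * x))


def posterior_presence(likelihood_ratio, prior):
    """Bayes' rule for a binary 'protein present' hypothesis.

    likelihood_ratio: P(MS/MS evidence | present) / P(MS/MS evidence | absent)
    prior:            P(present) before seeing the MS/MS evidence
    """
    if likelihood_ratio < 0 or not 0.0 <= prior <= 1.0:
        raise ValueError("need likelihood_ratio >= 0 and 0 <= prior <= 1")
    numerator = likelihood_ratio * prior
    return numerator / (numerator + (1.0 - prior))


if __name__ == "__main__":
    # Same MS/MS evidence, different mRNA levels (arbitrary units).
    for abundance in (0.1, 1.0, 10.0):
        prior = mrna_prior(abundance)
        print(abundance, round(posterior_presence(4.0, prior), 3))
```

With a uniform prior of 0.5 the posterior depends only on the MS/MS evidence; in this sketch, replacing it with an mRNA-dependent prior raises the posterior for proteins with abundant transcripts and lowers it for proteins with scarce ones.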

Availability and implementation: Software is available upon request from the authors. Mass spectrometry datasets and supplementary information are available from http://www.marcottelab.org/MSpresso/.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bayes Theorem
  • Databases, Protein
  • Humans
  • Proteins / chemistry*
  • Proteome / analysis
  • Proteome / genetics
  • Proteome / metabolism
  • Proteomics / methods*
  • RNA, Messenger / metabolism*
  • Software
  • Tandem Mass Spectrometry / methods
  • User-Computer Interface

Substances

  • Proteins
  • Proteome
  • RNA, Messenger