A statistical framework for protein quantitation in bottom-up MS-based proteomics

Bioinformatics. 2009 Aug 15;25(16):2028-34. doi: 10.1093/bioinformatics/btp362. Epub 2009 Jun 17.

Abstract

Motivation: Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include low-quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level.

Results: We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based algorithms for filtering proteins and peptides and for imputing missing values. Two LC/MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives.
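
To illustrate why informative missingness matters, the following is a minimal sketch and not the model from the paper, whose exact form the abstract does not give. It assumes that a peak goes missing whenever its log-intensity falls below a known detection threshold (left-censoring), that intensities are normally distributed, and that the standard deviation is known. The names TRUE_MU, SIGMA, THRESHOLD, censored_loglik and fit_mu, and all numeric values, are hypothetical. The sketch compares the naive mean of the observed peaks with a maximum-likelihood estimate that also uses the count of missing peaks.

```python
import math
import random
from statistics import NormalDist, mean

# Hypothetical simulation settings (log2-intensity scale); not taken from the paper.
TRUE_MU, SIGMA, THRESHOLD = 20.0, 1.0, 20.0


def censored_loglik(mu, sigma, observed, n_censored, threshold):
    """Log-likelihood of a normal model with left-censoring at `threshold`."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    # Observed peaks contribute the normal log-density.
    ll = sum(const - (y - mu) ** 2 / (2.0 * sigma ** 2) for y in observed)
    # Each missing peak contributes log P(intensity < threshold).
    p_below = NormalDist(mu, sigma).cdf(threshold)
    ll += n_censored * math.log(max(p_below, 1e-300))
    return ll


def fit_mu(observed, n_censored, sigma, threshold, iters=100):
    """Maximize the log-likelihood over mu by ternary search.

    With sigma fixed, the log-likelihood is concave in mu, so the
    search converges to the unique maximum inside the bracket.
    """
    lo, hi = min(observed) - 5.0 * sigma, max(observed)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        ll1 = censored_loglik(m1, sigma, observed, n_censored, threshold)
        ll2 = censored_loglik(m2, sigma, observed, n_censored, threshold)
        if ll1 < ll2:
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


rng = random.Random(0)
intensities = [rng.gauss(TRUE_MU, SIGMA) for _ in range(2000)]
observed = [y for y in intensities if y >= THRESHOLD]   # peaks that were detected
n_censored = len(intensities) - len(observed)           # peaks that went missing

naive = mean(observed)  # ignores the missing peaks, so it is biased upward
mle = fit_mu(observed, n_censored, SIGMA, THRESHOLD)

print(f"true mean {TRUE_MU:.2f}, naive mean {naive:.2f}, censored MLE {mle:.2f}")
```

Under these assumptions, dropping the missing peaks overstates the mean because only the high-intensity peaks survive, while the censored likelihood recovers a value close to the simulated truth. A realistic analysis would also need to estimate the variance, handle peptide-level effects, and allow for peaks that are missing at random.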

Availability: The software has been made available in the open-source proteomics platform DAnTE (http://omics.pnl.gov/software/).

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Mass Spectrometry / methods*
  • Models, Statistical
  • Proteins / analysis*
  • Proteome / analysis
  • Proteomics / methods*

Substances

  • Proteins
  • Proteome