A model for random sampling and estimation of relative protein abundance in shotgun proteomics

Hongbin Liu; Rovshan G Sadygov; John R Yates 3rd

doi:10.1021/ac0498563

Anal Chem. 2004 Jul 15;76(14):4193-201. doi: 10.1021/ac0498563.

Authors

Hongbin Liu¹, Rovshan G Sadygov, John R Yates 3rd

Affiliation

¹ Department of Cell Biology, The Scripps Research Institute, La Jolla, California 92037, USA.

PMID: 15253663
DOI: 10.1021/ac0498563

Abstract

Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition samples only a subset of the peptide ions present rather than acquiring tandem mass spectra for all available ions. We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. Based on our model, 10 analyses are required to reach a 95% saturation level of protein identifications for a soluble yeast cell lysate. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein.
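
The two ideas in the abstract (incomplete sampling that saturates over repeated analyses, and spectral counts that track relative abundance) can be illustrated with a toy simulation. This is only a sketch and not the statistical model developed in the paper: it assumes that every tandem mass spectrum is an independent draw from the mixture with probability proportional to protein abundance. The function names, the mixture, and all parameter values are hypothetical.

```python
import random
from collections import Counter


def simulate_run(abundances, n_spectra, rng):
    """Simulate one analysis and return spectral counts per protein.

    Assumption (not from the paper): each spectrum is an independent draw,
    with probability proportional to the protein's relative abundance.
    """
    proteins = list(abundances)
    weights = [abundances[p] for p in proteins]
    return Counter(rng.choices(proteins, weights=weights, k=n_spectra))


def runs_to_saturation(abundances, n_spectra, target, rng, max_runs=1000):
    """Repeat analyses until the cumulative set of identified proteins
    covers the fraction `target` of the mixture; return the run count,
    or None if max_runs is exhausted first."""
    seen = set()
    for run in range(1, max_runs + 1):
        # Only proteins with at least one spectrum appear as Counter keys.
        seen.update(simulate_run(abundances, n_spectra, rng))
        if len(seen) >= target * len(abundances):
            return run
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical mixture: 500 proteins, abundances spanning 2 orders
    # of magnitude (1 to 100, arbitrary units).
    mixture = {f"P{i}": 10 ** (2 * i / 499) for i in range(500)}

    counts = simulate_run(mixture, 20000, rng)
    for name in ("P0", "P249", "P499"):
        print(name, round(mixture[name], 1), counts[name])

    print("runs to 95% saturation:",
          runs_to_saturation(mixture, 300, 0.95, rng))
```

Under this assumption the expected spectral count of a protein is proportional to its abundance, which is the intuition behind using spectral sampling as a relative abundance estimate. A real data-dependent acquisition is not a simple weighted draw, so the numbers printed here should not be compared with the values reported in the abstract.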

Publication types

Comparative Study
Evaluation Study
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Data Collection / methods
Models, Statistical*
Proteins / analysis*
Proteomics / methods*
Proteomics / statistics & numerical data

Substances

Proteins
