EBP, a program for protein identification using multiple tandem mass spectrometry datasets

Mol Cell Proteomics. 2007 Mar;6(3):527-36. doi: 10.1074/mcp.T600049-MCP200. Epub 2006 Dec 12.

Abstract

MS/MS combined with database search methods can identify the proteins present in complex mixtures. High-throughput methods that infer probable peptide sequences from enzymatically digested protein samples create a challenge in how best to aggregate the evidence for candidate proteins. Typically, the results of multiple technical and/or biological replicate experiments must be combined to maximize sensitivity. We present a statistical method for estimating probabilities of protein expression that integrates peptide sequence identifications from multiple search algorithms and replicate experimental runs. The method was applied to create a repository of 797 non-homologous zebrafish (Danio rerio) proteins, at an empirically validated false identification rate under 1%, as a resource for the development of targeted quantitative proteomics assays. We have implemented this statistical method as an analytic module that can be integrated with an existing suite of open-source proteomics software.
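
The aggregation problem described in the abstract can be illustrated with a deliberately naive sketch. This is not the EBP model, which the abstract does not specify. It assumes each peptide identification carries a probability of being correct, keeps the best probability per distinct peptide across search engines and replicate runs, and then combines peptides under an independence assumption (a "noisy-OR"). The function name `protein_probabilities` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def protein_probabilities(identifications):
    """Naive protein-level aggregation (illustrative only, not the EBP method).

    identifications: iterable of (protein, peptide, engine, run, probability)
    tuples, where probability is the assumed chance that the peptide
    identification is correct.
    """
    # protein -> peptide -> best probability seen across engines and runs
    best = defaultdict(dict)
    for protein, peptide, _engine, _run, prob in identifications:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range: {prob}")
        best[protein][peptide] = max(best[protein].get(peptide, 0.0), prob)

    # Assumption: distinct peptides are independent evidence, so the protein
    # is absent only if every one of its peptide identifications is wrong.
    result = {}
    for protein, peptides in best.items():
        all_wrong = 1.0
        for prob in peptides.values():
            all_wrong *= 1.0 - prob
        result[protein] = 1.0 - all_wrong
    return result


example = [
    ("protA", "PEPTIDEX", "engine1", "run1", 0.9),
    ("protA", "PEPTIDEX", "engine2", "run1", 0.5),  # same peptide, second engine
    ("protA", "PEPTIDEY", "engine1", "run2", 0.5),
    ("protB", "PEPTIDEZ", "engine1", "run1", 0.2),
]
probs = protein_probabilities(example)
```

Taking the maximum per peptide avoids counting the same peptide several times when more than one engine or replicate reports it. A real method would also need to handle peptides shared between homologous proteins and to calibrate the input probabilities, neither of which this sketch attempts.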

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Chromatography, Liquid
  • Embryo, Nonmammalian
  • Fish Proteins / analysis*
  • Models, Statistical*
  • Proteomics / methods*
  • Tandem Mass Spectrometry
  • Zebrafish / embryology
  • Zebrafish / metabolism

Substances

  • Fish Proteins