Elimination of redundant protein identifications in high throughput proteomics

Conf Proc IEEE Eng Med Biol Soc. 2005:2005:4803-6. doi: 10.1109/IEMBS.2005.1615546.

Abstract

Tandem mass spectrometry followed by database search is the preferred method for protein identification in high-throughput proteomics. However, standard analysis methods give rise to highly redundant lists of proteins, with many proteins identified by the same sets of peptides. In essence, such a list contains every protein that might be present in the sample. Here we present an algorithm that eliminates this redundancy and determines the minimum number of proteins needed to explain the observed peptides. We demonstrate that applying the algorithm yields a significantly smaller set of proteins and greatly reduces the number of "shared" peptides.
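The abstract does not describe the algorithm itself. Finding the smallest set of proteins that explains all observed peptides can be framed as a set-cover problem, and the sketch below illustrates that framing with a standard greedy heuristic. It is not the authors' method, and the greedy choice is only an approximation that is not guaranteed to find the true minimum. The function name `minimal_protein_set`, the input mapping, and the protein and peptide labels are all hypothetical.

```python
from typing import Dict, List, Set


def minimal_protein_set(protein_to_peptides: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: choose proteins until every observed peptide is explained.

    Illustrative only; this is an assumed stand-in, not the published algorithm.
    """
    # Every peptide matched to at least one candidate protein.
    unexplained: Set[str] = set()
    for peptides in protein_to_peptides.values():
        unexplained |= peptides

    chosen: List[str] = []
    while unexplained:
        # Pick the protein that explains the most still-unexplained peptides.
        # Iterating in sorted order makes ties resolve to the first name.
        best = max(
            sorted(protein_to_peptides),
            key=lambda name: len(protein_to_peptides[name] & unexplained),
        )
        chosen.append(best)
        unexplained -= protein_to_peptides[best]
    return chosen


# Hypothetical search result: P2 is a subset of P1, P5 is indistinguishable
# from P1, and P4 is a subset of P3.
hits = {
    "P1": {"a", "b", "c"},
    "P2": {"a", "b"},
    "P3": {"c", "d"},
    "P4": {"d"},
    "P5": {"a", "b", "c"},
}
result = minimal_protein_set(hits)
print(result)  # ['P1', 'P3']
```

In this toy input, five candidate proteins collapse to two. Proteins identified by identical peptide sets (here P1 and P5) cannot be told apart by the peptide evidence alone, so the tie-break by name is arbitrary; a real pipeline would more likely report them together as one group.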