Rationale: Molecular phylogenetics is the study of the evolution and relatedness of organisms or genes. Mass spectrometry is used routinely for bacterial identification and has also been used for phylogenetic analysis, for instance from bone material. Unfortunately, only a small fraction of the acquired tandem mass spectra can be interpreted directly.
Methods: We describe a new algorithm and software for molecular phylogenetics based on pairwise comparisons of tandem mass spectra from enzymatically digested proteins. The spectra need not be annotated, and all acquired data are used in the analysis. To demonstrate the method, we analyzed tryptic digests of sera from four great apes and two other primates.
Results: The distribution of spectral dot products for thousands of tandem mass spectra collected from two samples provides a measure of the fraction of peptides shared between the two samples. When inverted, this measure becomes a distance metric. By pairwise comparison between species and averaging over four individuals per species, it was possible to reconstruct the unique correct phylogenetic tree for the great apes and other primates.
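The idea of turning spectral dot products into a distance can be illustrated with a minimal Python sketch. It is not the published software, and several details are assumptions made only for illustration: spectra are lists of (m/z, intensity) peaks, they are binned at a hypothetical width of 1.0 m/z and scaled to unit length, a spectrum counts as "shared" when its best dot product against the other sample reaches a hypothetical threshold of 0.8, and "inverted" is read as the reciprocal of the shared fraction. The function names are likewise hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def bin_spectrum(peaks, bin_width=1.0):
    """Bin (m/z, intensity) peaks and scale the result to unit length."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in binned.values()))
    return {k: v / norm for k, v in binned.items()} if norm > 0 else {}


def dot_product(a, b):
    """Dot product of two binned, unit-length spectra (sparse dicts)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def shared_fraction(sample_a, sample_b, threshold=0.8):
    """Fraction of spectra in sample_a whose best match in sample_b
    reaches the (assumed) dot-product threshold."""
    if not sample_a:
        return 0.0
    binned_b = [bin_spectrum(s) for s in sample_b]
    hits = 0
    for spectrum in sample_a:
        a = bin_spectrum(spectrum)
        if any(dot_product(a, b) >= threshold for b in binned_b):
            hits += 1
    return hits / len(sample_a)


def distance(sample_a, sample_b, threshold=0.8):
    """Reciprocal of the symmetrized shared fraction (assumed inversion)."""
    f = 0.5 * (shared_fraction(sample_a, sample_b, threshold)
               + shared_fraction(sample_b, sample_a, threshold))
    return 1.0 / f if f > 0 else math.inf


def species_distance(individuals_a, individuals_b, threshold=0.8):
    """Mean distance over all pairs of individuals from two species."""
    pairs = list(product(individuals_a, individuals_b))
    return sum(distance(a, b, threshold) for a, b in pairs) / len(pairs)
```

With the reciprocal, two identical samples have distance 1 rather than 0; an alternative such as one minus the shared fraction would be an equally plausible reading of "inverted". The resulting matrix of species distances could then be passed to any standard distance-based tree-building method.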
Conclusions: The new method described here has several attractive features compared with existing methods, among them simplicity, the unbiased use of all acquired data rather than a small subset of spectra, and the potential use of heavily degraded proteins or proteins with a priori unknown modifications.
Copyright © 2012 John Wiley & Sons, Ltd.