Computing fragmentation trees from metabolite multiple mass spectrometry data

J Comput Biol. 2011 Nov;18(11):1383-97. doi: 10.1089/cmb.2011.0168. Epub 2011 Oct 28.

Abstract

Since metabolites cannot be predicted from the genome sequence, high-throughput de novo identification of small molecules is highly sought. Mass spectrometry (MS) in combination with a fragmentation technique is commonly used for this task. Unfortunately, automated analysis of such data is in its infancy. Recently, fragmentation trees have been proposed as an analysis tool for such data. Additional fragmentation steps (MSⁿ) reveal more information about the molecule. We propose to use MSⁿ data for the computation of fragmentation trees, and present the Colorful Subtree Closure problem to formalize this task: There, we search for a colorful subtree inside a vertex-colored graph, such that the weight of the transitive closure of the subtree is maximal. We give several negative results regarding the tractability and approximability of this and related problems. We then present an exact dynamic programming algorithm, which is parameterized by the number of colors in the graph and is swift in practice. Evaluation of our method on a dataset of 45 reference compounds showed that the quality of constructed fragmentation trees is improved by using MSⁿ instead of MS² measurements.
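To make the objective in the abstract concrete, the following is a minimal sketch of the Colorful Subtree Closure problem on a toy graph. It is an exhaustive search for illustration only, not the dynamic programming algorithm of the paper. The function names, the toy graph, and the convention that weights are given per ancestor-descendant pair (with missing pairs counting as 0) are assumptions made for this example.

```python
def closure_weight(parent, weight):
    """Sum the weights of all ancestor-descendant pairs of a rooted tree.

    `parent` maps each tree vertex to its parent (the root maps to None).
    `weight` maps (ancestor, descendant) pairs to numbers; absent pairs
    count as 0 (an assumption of this sketch).
    """
    total = 0.0
    for v in parent:
        a = parent[v]
        while a is not None:          # walk up to the root
            total += weight.get((a, v), 0.0)
            a = parent[a]
    return total


def best_colorful_subtree(root, edges, color, weight):
    """Exhaustively search rooted subtrees that use each color at most once.

    Returns (score, parent_map) for the subtree whose transitive closure
    has maximal weight. Exponential time; only meant for tiny inputs.
    """
    start = {root: None}
    best = (closure_weight(start, weight), start)
    seen = set()
    stack = [start]
    while stack:
        parent = stack.pop()
        key = frozenset(parent.items())
        if key in seen:               # same tree reached in another order
            continue
        seen.add(key)
        score = closure_weight(parent, weight)
        if score > best[0]:
            best = (score, parent)
        used = {color[v] for v in parent}
        for u, v in edges:
            # grow the tree by one vertex with a fresh color
            if u in parent and v not in parent and color[v] not in used:
                stack.append({**parent, v: u})
    return best


# Toy instance (hypothetical): vertices a and b share a color,
# so at most one of them may appear in the subtree.
color = {"r": 0, "a": 1, "b": 1, "c": 2}
edges = [("r", "a"), ("r", "b"), ("a", "c"), ("b", "c"), ("r", "c")]
weight = {("r", "a"): 2, ("r", "b"): 3, ("a", "c"): 4,
          ("b", "c"): 1, ("r", "c"): -1}

score, tree = best_colorful_subtree("r", edges, color, weight)
print(score, tree)
```

In this toy instance the path r, a, c is scored with the pair (r, c) included even though the tree itself has no edge from r to c. That is the difference between weighting the transitive closure and weighting only the tree edges.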

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Data Interpretation, Statistical*
  • Mass Spectrometry / methods*
  • Mass Spectrometry / standards
  • Metabolome*
  • Models, Chemical
  • Molecular Weight
  • Reference Standards