Deep Neural Networks for Classification of LC-MS Spectral Peaks

Anal Chem. 2019 Oct 1;91(19):12407-12413. doi: 10.1021/acs.analchem.9b02983. Epub 2019 Sep 19.

Abstract

Liquid chromatography-mass spectrometry (LC-MS)-based metabolomics has emerged as a valuable tool for biological discovery, capable of assaying thousands of diverse chemical entities in a single biospecimen. Processing of nontargeted LC-MS spectral data requires identifying and isolating true spectral features from the false noise peaks that comprise a significant portion of total signals, a task that relies on inexact peak selection algorithms and time-consuming visual inspection of data. To increase the fidelity and speed of data processing, herein we establish, optimize, and evaluate a machine learning pipeline employing deep neural networks, as well as a simpler multiple logistic regression model, for classification of spectral features from nontargeted LC-MS metabolomics data. Machine learning-based approaches were found to remove up to 90% of false peaks from complex nontargeted LC-MS data sets without reducing true positive signals, and they exhibited excellent reproducibility across multiple data sets. Application of machine learning to nontargeted LC-MS-based peak selection provides robust and scalable peak classification and data filtering, enabling handling and processing of large-scale, complex metabolomics data sets.
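
The abstract does not give implementation details, so the following is only a minimal sketch of the simpler of the two model types it names: a logistic regression classifier that scores candidate peaks and filters out likely noise. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two per-peak features (`log_snr`, `fit_corr`), the synthetic data generator, the plain gradient-descent fit, and the 0.3 probability threshold.

```python
import numpy as np


def make_synthetic_peaks(n, rng):
    """Build a hypothetical feature table with one row per candidate peak."""
    labels = rng.integers(0, 2, size=n)  # 1 = true peak, 0 = noise peak
    # Assumed features (illustrative only): log signal-to-noise ratio and
    # correlation of the peak shape with a fitted Gaussian.
    log_snr = rng.normal(loc=np.where(labels == 1, 2.0, 0.5), scale=0.5)
    fit_corr = rng.normal(loc=np.where(labels == 1, 0.9, 0.5), scale=0.1)
    return np.column_stack([log_snr, fit_corr]), labels


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit logistic regression by batch gradient descent on standardized features."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Xs = np.column_stack([np.ones(len(X)), (X - mu) / sd])  # prepend intercept
    w = np.zeros(Xs.shape[1])
    for _ in range(steps):
        p = sigmoid(Xs @ w)
        w -= lr * Xs.T @ (p - y) / len(y)  # gradient of mean log-loss
    return w, mu, sd


def predict_proba(X, model):
    """Return the estimated probability that each row is a true peak."""
    w, mu, sd = model
    Xs = np.column_stack([np.ones(len(X)), (X - mu) / sd])
    return sigmoid(Xs @ w)


rng = np.random.default_rng(0)
X_train, y_train = make_synthetic_peaks(2000, rng)
X_test, y_test = make_synthetic_peaks(1000, rng)

model = fit_logistic(X_train, y_train)
prob = predict_proba(X_test, model)

# A threshold below 0.5 favors keeping true peaks over removing noise.
keep = prob >= 0.3
recall = (keep & (y_test == 1)).sum() / (y_test == 1).sum()
removed = (~keep & (y_test == 0)).sum() / (y_test == 0).sum()
print(f"true peaks retained: {recall:.3f}, noise peaks removed: {removed:.3f}")
```

The two printed fractions correspond to the quantities the abstract reports: how many true positive signals survive filtering and what share of false peaks is removed. On real data the threshold would have to be tuned against manually curated peaks, and the numbers from this synthetic example say nothing about the performance reported in the paper.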

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Chromatography, Liquid*
  • Data Analysis*
  • Deep Learning*
  • Mass Spectrometry*
  • Metabolomics*