Principal component analysis of normalized full spectrum mass spectrometry data in multiMS-toolbox: An effective tool to identify important factors for classification of different metabolic patterns and bacterial strains

Rapid Commun Mass Spectrom. 2018 Jun 15;32(11):871-881. doi: 10.1002/rcm.8110.

Abstract

Rationale: Explorative statistical analysis of mass spectrometry data is still a time-consuming step. We analyzed critical factors for the application of principal component analysis (PCA) in mass spectrometry, focusing on two whole-spectrum-based normalization techniques and their use in the analysis of registered peak data and, for comparison, in full spectrum data analysis. We used this approach to identify different metabolic patterns in bacterial cultures of Cronobacter sakazakii, an important foodborne pathogen.

Methods: Two software utilities were implemented: ms-alone, a Python-based utility for mass spectrometry data preprocessing and peak extraction, and multiMS-toolbox, an R software tool for advanced peak registration and detailed explorative statistical analysis. Cronobacter sakazakii was cultivated on Enterobacter sakazakii Isolation Agar, Blood Agar Base and Tryptone Soya Agar for 24 h and 48 h, and the cultures were applied by the smear method and measured on an Autoflex speed MALDI-TOF mass spectrometer.

Results: Across the three tested cultivation media, PCA applied to data normalized by two different normalization techniques identified only two different metabolic patterns of Cronobacter sakazakii. Results from matched peak data and from the subsequent detailed full spectrum analysis agreed: cultivation on Enterobacter sakazakii Isolation Agar differed significantly from cultivation on the other two tested media. The metabolic patterns for all tested cultivation media also depended on cultivation time.

Conclusions: Both whole-spectrum-based normalization techniques, together with full spectrum PCA, allow identification of important discriminative factors in experiments with several variable condition factors, avoiding problems with improper identification of peaks or emphasis on below-threshold peak data. The amount of processed data remains manageable. Both implemented software utilities are available free of charge from http://uprt.vscht.cz/ms.
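
Illustrative sketch: The general idea of normalizing whole spectra and then applying PCA can be sketched in a few lines of Python. This is not the multiMS-toolbox or ms-alone code. The abstract does not name the two normalization techniques, so total ion current (TIC) scaling is used here purely as an assumed stand-in. The function names (tic_normalize, center_columns, first_principal_component) and the toy intensities are hypothetical, and only the first principal component is computed, by power iteration, to keep the example free of external libraries.

```python
import math


def tic_normalize(spectrum):
    """Scale one spectrum so its intensities sum to 1 (assumed TIC normalization)."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum has no positive signal")
    return [x / total for x in spectrum]


def center_columns(rows):
    """Subtract the per-bin mean so every m/z bin has zero mean across spectra."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(centered, iterations=200):
    """Return the unit loading vector of PC1 via power iteration on X^T X."""
    n_features = len(centered[0])
    # Start from the centered row with the largest norm. A uniform start
    # vector would fail here: TIC-normalized, centered rows sum to zero and
    # are therefore orthogonal to it.
    v = max(centered, key=lambda r: sum(x * x for x in r))
    for _ in range(iterations):
        proj = [sum(x * w for x, w in zip(row, v)) for row in centered]  # X v
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(n_features)]                                 # X^T (X v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("spectra are identical after normalization")
        v = [x / norm for x in w]
    return v


# Hypothetical toy data: four spectra over five m/z bins. Spectra 0-1 share
# one pattern and spectra 2-3 another; within each pair only the overall
# intensity differs, which normalization is meant to remove.
spectra = [
    [10, 80, 5, 5, 0],
    [20, 160, 10, 10, 0],
    [10, 5, 5, 80, 0],
    [30, 15, 15, 240, 0],
]

normalized = [tic_normalize(s) for s in spectra]
centered = center_columns(normalized)
pc1 = first_principal_component(centered)

# Score of each spectrum along PC1 (projection onto the loading vector).
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
for i, s in enumerate(scores):
    print(f"spectrum {i}: PC1 score = {s:+.3f}")
```

In this toy case the two spectra within each pair receive the same PC1 score and the two pairs fall on opposite sides of zero, which is the kind of separation of patterns that the abstract describes. For real full spectrum data, an established PCA implementation would normally be preferred over hand-written power iteration.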

MeSH terms

  • Bacteriological Techniques
  • Cronobacter sakazakii / growth & development
  • Cronobacter sakazakii / metabolism*
  • Culture Media
  • Principal Component Analysis*
  • Software*
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / standards
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / statistics & numerical data*
  • Time Factors

Substances

  • Culture Media