Background: Untargeted metabolomics requires robust and reliable data processing strategies to extract relevant information from the underlying raw data. Multiple platforms for data processing are available, but the choice of software tool can affect the outcome of the analysis. This study provides a comprehensive evaluation of four workflows based on commonly used metabolomics software tools: XCMS, Compound Discoverer, MS-DIAL, and MZmine. These tools were applied to a dataset derived from bovine saliva samples spiked with small polar molecules and analyzed by anion exchange chromatography coupled to high-resolution mass spectrometry.
Results: The analysis revealed significant differences in the number and overlap of detected features, with only approximately 8 % of the features included in all four peak tables. Among the overlapping features, MS-DIAL demonstrated the greatest similarity to manual integration, while XCMS and MZmine also performed well. In contrast, Compound Discoverer had difficulty reliably integrating peaks with a high baseline. This study also explores various post-processing strategies, including missing value imputation, transformation, scaling, and filtering. The assessment of missing values indicated that they arose primarily from low abundance, making imputation with small values the most effective approach. No clear evidence suggested that transformation is necessary for downstream statistical analyses. Auto scaling emerged as the most suitable strategy for data scaling. Low thresholds for blank filtering were found to be the most effective in enhancing data quality. The optimization of filtering thresholds required a careful balance between removing unnecessary information and retaining vital data.
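The post-processing steps named above (small-value imputation, auto scaling, blank filtering) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the `fraction` and `max_ratio` parameters and their default values, and the definition of the blank-filtering threshold as a blank-to-sample mean intensity ratio are all assumptions made for illustration. Rows are taken to be samples and columns features, and every feature is assumed to have at least one observed value.

```python
import numpy as np


def impute_small_value(X, fraction=0.2):
    """Fill each missing value with a fraction of that feature's smallest
    observed intensity. The fraction of 0.2 is an illustrative assumption."""
    X = np.array(X, dtype=float)          # copy so the input is not modified
    fill = fraction * np.nanmin(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = fill[cols]
    return X


def auto_scale(X):
    """Auto scaling: mean-center each feature and divide by its sample
    standard deviation, so every feature ends up with unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                   # leave constant features at zero
    return (X - mean) / std


def blank_filter(samples, blanks, max_ratio=0.1):
    """Return a boolean mask of features to keep: those whose mean blank
    intensity is at most max_ratio times the mean sample intensity.
    This ratio-based rule and its default are assumptions."""
    sample_mean = np.nanmean(np.asarray(samples, dtype=float), axis=0)
    # Treat a feature that was not detected in a blank as zero intensity.
    blank_mean = np.nan_to_num(np.asarray(blanks, dtype=float), nan=0.0).mean(axis=0)
    return blank_mean <= max_ratio * sample_mean
```

In this sketch the blank filter would be applied to the raw peak table first, followed by imputation and then scaling of the retained features. That ordering is itself a design choice made here and is not taken from the study.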
Significance and novelty: This work provides an overview of commonly applied strategies in untargeted metabolomics analysis, emphasizing the importance of careful workflow selection and optimization. It serves as a resource for refining data processing strategies to achieve accurate and reliable results, while also offering fresh insights into the challenges encountered throughout the untargeted metabolomics processing pipeline.
Keywords: Anion exchange chromatography; Data treatment; Mass spectrometry; Metabolomics; Processing.
Copyright © 2024 The Authors. Published by Elsevier B.V. All rights reserved.