A Software Tool for Rapid and Automated Preprocessing of Large-Scale Serum Metabolomic Data by Multisegment Injection-Capillary Electrophoresis-Mass Spectrometry

Anal Chem. 2024 Dec 27. doi: 10.1021/acs.analchem.4c03513. Online ahead of print.

Abstract

Mass spectrometry (MS)-based metabolomics often relies on separation techniques when analyzing complex biological specimens to improve method resolution, metabolome coverage, quantitative performance, and/or unknown identification. However, low sample throughput and complicated data preprocessing procedures remain major barriers to affordable metabolomic studies that are scalable to large populations. Herein, we introduce PeakMeister as a new software tool in the R statistical environment to enable standardized processing of serum metabolomic data acquired by multisegment injection-capillary electrophoresis-mass spectrometry (MSI-CE-MS), a high-throughput separation platform (<4 min/sample) that uses a serial injection format of 13 samples within a single analytical run. We performed a rigorous validation of PeakMeister by analyzing 47 cationic metabolites consistently measured in 5,000 serum and 420 quality control samples from the Brazilian National Survey on Child Nutrition (ENANI-2019), comprising a total of 224,983 metabolite peaks acquired in 40 days across three batches over an eight-month period. A migration time index using a panel of 11 internal standards was introduced to correct for large variations in migration times, which allowed for reliable peak annotation, peak integration, and sample position assignment for serum metabolites having two flanking internal standards or a single comigrating stable-isotope internal standard. PeakMeister accelerated data preprocessing by 30-fold compared to manual processing of MSI-CE-MS data by an experienced analyst using vendor software, while also achieving excellent peak annotation fidelity (median accuracy >99.9%), acceptable intermediate precision (median CV = 16.0%), consistent metabolite peak integration (mean bias = -2.1%), and good mutual agreement when quantifying 16 plasma metabolites from NIST SRM-1950 (mean bias = -1.3%).
Reference ranges are also reported for 40 serum metabolites in a national nutritional survey of Brazilian children under 5 years of age from the ENANI-2019 study. MSI-CE-MS in conjunction with PeakMeister allows for rapid and automated data processing in large-scale metabolomic studies, and it tolerates nonlinear migration time shifts without complicated dynamic time warping or effective mobility scale transformations.
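
The abstract does not define how the migration time index is computed. The Python sketch below shows one plausible reading for a metabolite with two flanking internal standards: linear interpolation between the two standards, much like a retention index in chromatography. The function name `migration_time_index`, the integer index scale (standard k is assigned the value k), and the example migration times are all hypothetical and are not taken from PeakMeister.

```python
from bisect import bisect_right


def migration_time_index(t, standard_times):
    """Map a migration time onto a run-independent index scale.

    standard_times holds the migration times of the internal standards
    observed in ONE run, sorted ascending. Standard k is assigned index k
    (a hypothetical scale chosen for this sketch). A peak between
    standards k and k+1 gets k plus its fractional position between them.
    """
    if len(standard_times) < 2:
        raise ValueError("need at least two internal standards")
    if any(b <= a for a, b in zip(standard_times, standard_times[1:])):
        raise ValueError("standard times must be strictly increasing")
    if not standard_times[0] <= t <= standard_times[-1]:
        raise ValueError("peak is not flanked by two internal standards")

    # Position of the standard at or just before t, clamped so that a peak
    # exactly on the last standard still uses the final interval.
    k = min(bisect_right(standard_times, t) - 1, len(standard_times) - 2)
    lower, upper = standard_times[k], standard_times[k + 1]
    return k + (t - lower) / (upper - lower)


# Two runs with different, nonlinearly shifted migration times (made-up values).
run_a_standards = [10.0, 12.0, 16.0]
run_b_standards = [11.0, 14.0, 22.0]

# The same hypothetical metabolite, seen at 13.0 min in run A and 16.0 min in run B.
index_a = migration_time_index(13.0, run_a_standards)
index_b = migration_time_index(16.0, run_b_standards)
```

In this toy example the peak sits a quarter of the way between the second and third standards in both runs, so both calls return the same index despite the different raw migration times. Because each interval between standards is rescaled separately, a piecewise-linear correction of this kind can absorb shifts that are not uniform across the run. Metabolites matched to a single comigrating stable-isotope internal standard would need a separate rule, which is not sketched here.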