Statistical analysis in metabolic phenotyping

Benjamin J Blaise; Gonçalo D S Correia; Gordon A Haggart; Izabella Surowiec; Caroline Sands; Matthew R Lewis; Jake T M Pearce; Johan Trygg; Jeremy K Nicholson; Elaine Holmes; Timothy M D Ebbels

doi:10.1038/s41596-021-00579-1

Nat Protoc. 2021 Sep;16(9):4299-4326. doi: 10.1038/s41596-021-00579-1. Epub 2021 Jul 28.

Authors

Benjamin J Blaise^# ¹,²,³; Gonçalo D S Correia^# ¹,⁴; Gordon A Haggart^# ¹,⁴; Izabella Surowiec⁵,⁶; Caroline Sands¹,⁴; Matthew R Lewis¹,⁴; Jake T M Pearce¹,⁴; Johan Trygg⁵,⁶; Jeremy K Nicholson⁷,⁸; Elaine Holmes¹,⁹; Timothy M D Ebbels¹⁰

Affiliations

¹ Division of Systems Medicine, Department of Metabolism, Digestion & Reproduction, Faculty of Medicine, Imperial College London, London, UK.
² Department of Paediatric Anaesthetics, Evelina London Children's Hospital, Guy's and St Thomas' NHS Foundation Trust, London, UK.
³ Centre for the Developing Brain, King's College London, London, UK.
⁴ National Phenome Centre, Department of Metabolism, Digestion & Reproduction, Faculty of Medicine, Imperial College London, London, UK.
⁵ Computational Life Science Cluster, Department of Chemistry, Umeå University, Umeå, Sweden.
⁶ Sartorius Corporate Research, Sartorius Stedim Data Analytics, Umeå, Sweden.
⁷ Australian National Phenome Centre, Health Futures Institute, Murdoch University, Perth, Western Australia, Australia.
⁸ Institute of Global Health Innovation, Imperial College London, London, UK.
⁹ Centre for Computational & Systems Medicine Institute of Health Futures, Murdoch University, Perth, Western Australia, Australia.
¹⁰ Division of Systems Medicine, Department of Metabolism, Digestion & Reproduction, Faculty of Medicine, Imperial College London, London, UK.

^# Contributed equally.

PMID: 34321638
DOI: 10.1038/s41596-021-00579-1

Abstract

Metabolic phenotyping is an important tool in translational biomedical research. The advanced analytical technologies commonly used for phenotyping, including mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy, generate complex data requiring tailored statistical analysis methods. Detailed protocols have been published for data acquisition by liquid NMR, solid-state NMR, ultra-performance liquid chromatography (LC-)MS and gas chromatography (GC-)MS on biofluids or tissues and their preprocessing. Here we propose an efficient protocol (guidelines and software) for statistical analysis of metabolic data generated by these methods. Code for all steps is provided, and no prior coding skill is necessary. We offer efficient solutions for the different steps required within the complete phenotyping data analytics workflow: scaling, normalization, outlier detection, multivariate analysis to explore and model study-related effects, selection of candidate biomarkers, validation, multiple testing correction and performance evaluation of statistical models. We also provide a statistical power calculation algorithm and safeguards to ensure robust and meaningful experimental designs that deliver reliable results. We exemplify the protocol with a two-group classification study and data from an epidemiological cohort; however, the protocol can be easily modified to cover a wider range of experimental designs or incorporate different modeling approaches. This protocol describes a minimal set of analyses needed to rigorously investigate typical datasets encountered in metabolic phenotyping.

Publication types

Evaluation Study
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Genetic Techniques*
Humans
Metabolism
Metabolomics / methods*
Phenotype*
Software*
Statistics as Topic*
