Utilizing Aggregated Molecular Phenotype (AMP) Scores to Visualize Simultaneous Molecular Changes in Mass Spectrometry Imaging Data

bioRxiv [Preprint]. 2023 Jun 5:2023.06.01.543306. doi: 10.1101/2023.06.01.543306.

Abstract

Mass spectrometry imaging (MSI) has gained increasing popularity for tissue-based diagnostics due to its ability to identify and visualize molecular characteristics unique to different phenotypes within heterogeneous samples. Data from MSI experiments are often visualized using single ion images and further analyzed using machine learning and multivariate statistics to identify m/z features of interest and create predictive models for phenotypic classification. However, often only a single molecule or m/z feature is visualized per ion image, and mainly categorical classifications are provided from the predictive models. As an alternative approach, we developed an aggregated molecular phenotype (AMP) scoring system. AMP scores are generated using an ensemble machine learning approach to first select features differentiating phenotypes, weight the features using logistic regression, and combine the weights and feature abundances. AMP scores are then scaled between 0 and 1, with lower values generally corresponding to class 1 phenotypes (typically control) and higher scores relating to class 2 phenotypes. AMP scores therefore allow the evaluation of multiple features simultaneously and showcase the degree to which these features correlate with various phenotypes, leading to high diagnostic accuracy and interpretability of predictive models. Here, AMP score performance was evaluated using metabolomic data collected from desorption electrospray ionization (DESI) MSI. Initial comparisons of cancerous human tissues to normal or benign counterparts illustrated that AMP scores distinguished phenotypes with high accuracy, sensitivity, and specificity. Furthermore, when combined with spatial coordinates, AMP scores allow visualization of tissue sections in one map with distinguished phenotypic borders, highlighting their diagnostic utility.
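The scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes feature selection has already been done, fits a plain logistic regression by gradient descent in place of the ensemble approach, and assumes min-max scaling as the way to bring scores into the 0 to 1 range. All function names, parameters, and the toy abundances are hypothetical.

```python
import math

def fit_logistic_weights(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent (assumed stand-in
    for the weighting step); returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob. minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def amp_scores(X, weights):
    """Combine weights with feature abundances, then min-max scale to [0, 1]
    (the scaling method is an assumption)."""
    raw = [sum(w * x for w, x in zip(weights, row)) for row in X]
    lo, hi = min(raw), max(raw)
    if hi == lo:  # degenerate case: every pixel has the same raw score
        return [0.5] * len(raw)
    return [(r - lo) / (hi - lo) for r in raw]

def score_map(coords, scores):
    """Place per-pixel scores on a 2D grid indexed as grid[y][x]."""
    width = max(x for x, _ in coords) + 1
    height = max(y for _, y in coords) + 1
    grid = [[None] * width for _ in range(height)]
    for (x, y), s in zip(coords, scores):
        grid[y][x] = s
    return grid

# Toy data: four pixels, two pre-selected m/z features (made-up abundances).
abundances = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.2], [0.8, 0.1]]
labels = [0, 0, 1, 1]                      # 0 = class 1 (control), 1 = class 2
coords = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (x, y) position of each pixel

weights, bias = fit_logistic_weights(abundances, labels)
scores = amp_scores(abundances, weights)
grid = score_map(coords, scores)
```

In this toy case the control pixels land near 0 and the class 2 pixels near 1, and `grid` holds one score per pixel that could be passed to any heatmap routine to draw a single map of the tissue section.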

Keywords: feature selection; logistic regression; machine learning; mass spectrometry imaging; visualization.

Publication types

  • Preprint