A machine learning one-class logistic regression model to predict stemness for single cell transcriptomics and spatial omics

BMC Genomics. 2023 Nov 28;24(1):717. doi: 10.1186/s12864-023-09722-6.

Abstract

Cell annotation is a crucial methodological component of interpreting single cell and spatial omics data. Existing annotation approaches were developed for single cell analysis but are often biased, manually curated, and as yet unproven in spatial omics. Here we apply a stemness model for assessing oncogenic states to single cell and spatial omics cancer datasets. This one-class logistic regression machine learning algorithm extracts transcriptomic features from non-transformed stem cells and uses them to identify dedifferentiated cell states in tumors. We found that this method identifies single cell states in metastatic tumor cell populations without requiring cell annotation. The model identified stem-like cell populations that existing methods did not identify in single cell or spatial transcriptomic analysis. For the first time, we demonstrate the application of an ML tool across five emerging spatial transcriptomic and proteomic technologies to identify oncogenic stem-like cell types in the tumor microenvironment.
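To make the idea of a one-class logistic regression stemness score concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a model trained only on reference stem cell profiles by maximizing an L2-penalized log-likelihood with plain gradient ascent, and it assumes that query cells are scored by the linear signature score rescaled to [0, 1]. The function names `fit_oclr` and `stemness_index`, the hyperparameters, the centering step, and the synthetic data are all illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np

def fit_oclr(X, l2=1.0, lr=0.1, n_iter=500):
    """Fit one-class logistic regression weights on positive-only samples.

    X is (n_samples, n_genes), holding only reference stem cell profiles.
    Maximizes mean(log sigmoid(X @ w)) - (l2 / 2) * ||w||^2 by gradient ascent.
    """
    n_samples, n_genes = X.shape
    w = np.zeros(n_genes)
    for _ in range(n_iter):
        z = X @ w
        # Gradient of log sigmoid(z) with respect to w is (1 - sigmoid(z)) * x.
        resid = 1.0 - 1.0 / (1.0 + np.exp(-z))
        grad = X.T @ resid / n_samples - l2 * w
        w += lr * grad
    return w

def stemness_index(w, X):
    """Score cells by the linear signature w, min-max scaled to [0, 1]."""
    raw = X @ w
    return (raw - raw.min()) / (raw.max() - raw.min())

# Synthetic data (assumption): the first 5 of 20 genes are elevated in stem cells.
rng = np.random.default_rng(0)
n_genes = 20
shift = np.zeros(n_genes)
shift[:5] = 2.0
stem_ref = rng.normal(size=(50, n_genes)) + shift    # training set: stem cells only
stem_like = rng.normal(size=(30, n_genes)) + shift   # query cells resembling stem cells
differentiated = rng.normal(size=(30, n_genes))      # query cells without the signature
query = np.vstack([stem_like, differentiated])

# Center each gene on the pooled mean so the stem signature is a deviation from background.
gene_means = np.vstack([stem_ref, query]).mean(axis=0)
w = fit_oclr(stem_ref - gene_means)
scores = stemness_index(w, query - gene_means)

print("mean score, stem-like cells:     ", scores[:30].mean())
print("mean score, differentiated cells:", scores[30:].mean())
```

Because the model sees only the positive class, no annotated negative examples are needed at training time, which is consistent with the abstract's point that the method does not require cell annotation. In this toy setting the stem-like query cells should receive higher scores on average than the differentiated ones.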

Keywords: Cancer stem; Machine learning; Proteomic; Single cell; Spatial; Transcriptomic.

MeSH terms

  • Gene Expression Profiling
  • Logistic Models
  • Machine Learning
  • Proteomics*
  • Transcriptome*