A joint analysis of single cell transcriptomics and proteomics using transformer

NPJ Syst Biol Appl. 2025 Jan 2;11(1):1. doi: 10.1038/s41540-024-00484-9.

Abstract

CITE-seq provides a powerful method for simultaneously measuring RNA and protein expression at the single-cell level. The integrated analysis of RNA and protein expression in the same cells is crucial for revealing cellular heterogeneity. However, the high experimental costs associated with CITE-seq limit its widespread application. In this paper, we propose scTEL, a deep learning framework based on Transformer encoder layers, to establish a mapping from sequenced RNA expression to unobserved protein expression in the same cells. This computational approach significantly reduces the experimental cost of measuring protein expression: protein expression can be predicted from single-cell RNA sequencing (scRNA-seq) data, which is well established and available at a lower cost. Moreover, scTEL offers a unified framework for integrating multiple CITE-seq datasets, addressing the challenge posed by the partial overlap of protein panels across different datasets. Empirical validation on public CITE-seq datasets demonstrates that scTEL significantly outperforms existing methods.
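To make the partial-overlap problem concrete, the following is a minimal sketch of one common way to train a single predictor on datasets whose protein panels differ: align every cell to the union of all panels and compute the loss only over proteins that were actually measured. The abstract does not describe how scTEL handles this internally, so this is an illustrative assumption and not the authors' implementation. The dataset names, protein panels, values, and the helpers `build_union_panel`, `align_to_union`, and `masked_mse` are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def build_union_panel(panels: Dict[str, Sequence[str]]) -> List[str]:
    """Return the sorted union of protein names across all datasets."""
    union = set()
    for proteins in panels.values():
        union.update(proteins)
    return sorted(union)


def align_to_union(
    values: Dict[str, float], union: Sequence[str]
) -> List[Optional[float]]:
    """Put one cell's measured proteins in union order; None = not measured."""
    return [values.get(name) for name in union]


def masked_mse(
    predicted: Sequence[float], target: Sequence[Optional[float]]
) -> float:
    """Mean squared error over measured proteins only (assumed loss choice)."""
    errors = [(p - t) ** 2 for p, t in zip(predicted, target) if t is not None]
    if not errors:
        return 0.0  # nothing measured for this cell, so no training signal
    return sum(errors) / len(errors)


# Hypothetical panels from two CITE-seq datasets that only share CD4.
panels = {
    "dataset_a": ["CD3", "CD4", "CD8"],
    "dataset_b": ["CD4", "CD19"],
}
union = build_union_panel(panels)

# One cell from dataset_b: CD3 and CD8 were never measured for it.
cell_b = {"CD4": 1.0, "CD19": 2.0}
target = align_to_union(cell_b, union)

# Hypothetical model output over the full union panel.
predicted = [1.0, 5.0, 1.0, 5.0]
loss = masked_mse(predicted, target)
```

With this layout, predictions for unmeasured proteins (here CD3 and CD8) do not contribute to the loss for that cell, so a model with one output per union protein can be trained on all datasets at once.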

MeSH terms

  • Algorithms
  • Computational Biology* / methods
  • Deep Learning
  • Gene Expression Profiling / methods
  • Humans
  • Proteomics* / methods
  • Sequence Analysis, RNA* / methods
  • Single-Cell Analysis* / methods
  • Transcriptome* / genetics