SpaDiT: diffusion transformer for spatial gene expression prediction using scRNA-seq

Brief Bioinform. 2024 Sep 23;25(6):bbae571. doi: 10.1093/bib/bbae571.

Abstract

The rapid development of spatially resolved transcriptomics (SRT) technologies has provided unprecedented opportunities for exploring the structure of specific organs or tissues. However, although techniques such as image-based SRT can achieve single-cell resolution, they capture the expression levels of only tens to hundreds of genes. Such spatial transcriptomics (ST) data leave a large number of genes undetected, which limits their application value. To address this challenge, we develop SpaDiT, a deep learning framework for spatial reconstruction and gene expression prediction using scRNA-seq data. SpaDiT employs scRNA-seq data as an a priori condition and uses the genes shared between ST and scRNA-seq data as latent representations to construct inputs, thereby facilitating accurate prediction of gene expression in ST data. SpaDiT improves the accuracy of spatial gene expression prediction across a variety of spatial transcriptomics datasets. We demonstrated the effectiveness of SpaDiT through extensive experiments on both seq-based and image-based ST data. We compared SpaDiT with eight highly effective baseline methods and found that our proposed method achieved an 8%-12% improvement in performance across multiple metrics. Source code and all datasets used in this paper are available at https://github.com/wenwenmin/SpaDiT and https://zenodo.org/records/12792074.
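
As a rough illustration of the "shared genes" idea described above, the sketch below splits a scRNA-seq gene panel into genes that are also measured in the ST data (usable for conditioning) and genes that are missing from ST (the prediction targets). It is not taken from the SpaDiT code base: the helper names `split_genes` and `select_columns`, the toy gene panels, and the random count matrices are all assumptions made for illustration only.

```python
import numpy as np

def split_genes(sc_genes, st_genes):
    """Return (shared, sc_only) gene lists, preserving scRNA-seq order.

    Hypothetical helper, not part of SpaDiT.
    """
    st_set = set(st_genes)
    shared = [g for g in sc_genes if g in st_set]
    sc_only = [g for g in sc_genes if g not in st_set]
    return shared, sc_only

def select_columns(matrix, genes, wanted):
    """Subset a cells-by-genes matrix to the columns named in `wanted`."""
    index = {g: i for i, g in enumerate(genes)}
    return matrix[:, [index[g] for g in wanted]]

# Toy data (assumed): 4 scRNA-seq cells x 5 genes, 3 ST spots x 3 genes.
sc_genes = ["Gad1", "Slc17a7", "Pvalb", "Sst", "Vip"]
st_genes = ["Sst", "Gad1", "Cux2"]
rng = np.random.default_rng(0)
sc_expr = rng.poisson(2.0, size=(4, len(sc_genes))).astype(float)
st_expr = rng.poisson(2.0, size=(3, len(st_genes))).astype(float)

# Genes present in both modalities, and genes only scRNA-seq measured.
shared, to_predict = split_genes(sc_genes, st_genes)

# Align both matrices on the shared genes, in the same column order.
sc_shared = select_columns(sc_expr, sc_genes, shared)
st_shared = select_columns(st_expr, st_genes, shared)
```

Here `shared` holds the genes available in both modalities and `to_predict` holds the genes whose spatial expression would have to be inferred; how SpaDiT actually encodes these inputs for its diffusion transformer is described in the full paper and repository, not in this sketch.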

Keywords: diffusion model; gene expression prediction; scRNA-seq data; spatial transcriptomics data; transformer.

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Deep Learning
  • Gene Expression Profiling / methods
  • Humans
  • RNA-Seq / methods
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Single-Cell Gene Expression Analysis
  • Software
  • Transcriptome