Deep clustering representation of spatially resolved transcriptomics data using multi-view variational graph auto-encoders with consensus clustering

Comput Struct Biotechnol J. 2024 Dec 2:23:4369-4383. doi: 10.1016/j.csbj.2024.11.041. eCollection 2024 Dec.

Abstract

The rapid development of spatial transcriptomics (ST) technology has provided unprecedented opportunities to understand tissue relationships and functions within specific spatial contexts. Accurate identification of spatial domains is crucial for downstream spatial transcriptomics analysis. However, effectively combining gene expression data, histological images, and spatial coordinate data to identify spatial domains remains a challenge. To this end, we propose STMVGAE, a novel spatial transcriptomics analysis tool that combines a multi-view variational graph autoencoder with a consensus clustering framework. STMVGAE begins by extracting histological image features using a pre-trained convolutional neural network (CNN) and integrates these features with gene expression data to generate augmented gene expression profiles. Subsequently, multiple graphs (views) are constructed using various similarity measures, capturing different aspects of the spatial and transcriptional relationships. These views, combined with the augmented gene expression data, are then processed through variational graph autoencoders (VGAEs) to learn multiple low-dimensional latent embeddings. Finally, the model employs a consensus clustering method to integrate the clustering results derived from these embeddings, significantly improving clustering accuracy and stability. We applied STMVGAE to five real datasets and compared it with five state-of-the-art methods, showing that STMVGAE consistently achieves competitive results. We assessed its capabilities in spatial domain identification and evaluated its performance across various downstream tasks, including UMAP visualization, PAGA trajectory inference, spatially variable gene (SVG) identification, denoising, batch integration, and other analyses. All code and public datasets used in this paper are available at https://github.com/wenwenmin/STMVGAE and https://zenodo.org/records/13119867.
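
The final consensus step can be illustrated with a minimal sketch. The abstract does not specify which consensus algorithm STMVGAE uses, so the code below assumes a generic co-association approach: count how often each pair of spots lands in the same cluster across the per-view clusterings, then group spots that co-cluster in more than a chosen fraction of them. The function names `co_association` and `consensus_labels` and the `threshold` parameter are hypothetical and are not taken from the STMVGAE code base.

```python
import numpy as np


def co_association(labelings):
    """Fraction of clusterings in which each pair of spots shares a cluster.

    `labelings` has shape (n_views, n_spots); each row is one clustering,
    e.g. the cluster labels obtained from one view's latent embedding.
    """
    labelings = np.asarray(labelings)
    n_views, n_spots = labelings.shape
    co = np.zeros((n_spots, n_spots))
    for labels in labelings:
        # Boolean matrix: True where spots i and j have the same label.
        co += labels[:, None] == labels[None, :]
    return co / n_views


def consensus_labels(labelings, threshold=0.5):
    """Assumed consensus rule (not necessarily the paper's method).

    Link two spots when they co-cluster in more than `threshold` of the
    clusterings, then return the connected components of that graph.
    """
    co = co_association(labelings)
    n_spots = co.shape[0]
    labels = np.full(n_spots, -1, dtype=int)
    current = 0
    for start in range(n_spots):
        if labels[start] != -1:
            continue  # already assigned to a component
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of the thresholded graph
            i = stack.pop()
            for j in np.flatnonzero(co[i] > threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy input: three clusterings of five spots with arbitrary label names.
example = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 2, 2, 1],
]
print(co_association(example))
print(consensus_labels(example))
```

In the toy input, spots 0 and 1 share a cluster in every clustering, as do spots 2 and 3, while spot 4 joins spots 2 and 3 in only one of the three. Because the co-association matrix compares labels only within each clustering, it is unaffected by the different label names used across views.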

Keywords: Consensus clustering; Deep learning; Multi-view variational graph autoencoders; Spatially resolved transcriptomics.