GraphCVAE: Uncovering cell heterogeneity and therapeutic target discovery through residual and contrastive learning

Life Sci. 2024 Oct 31:359:123208. doi: 10.1016/j.lfs.2024.123208. Online ahead of print.

Abstract

Recent advances in spatial transcriptomics (ST) technologies have transformed the analysis of tissue structure and function within spatial contexts. However, accurately identifying spatial domains remains challenging due to data sparsity and noise. Traditional clustering methods often fail to capture spatial dependencies, while spatial clustering methods struggle with batch effects and data integration. We introduce GraphCVAE, a model designed to enhance spatial domain identification by integrating spatial and morphological information, correcting batch effects, and managing heterogeneous data. GraphCVAE employs a multi-layer Graph Convolutional Network (GCN) and a variational autoencoder to improve the representation and integration of spatial information. Through contrastive learning, the model captures subtle differences between cell types and states. Extensive testing on various ST datasets demonstrates GraphCVAE's robustness and biological contributions. In the dorsolateral prefrontal cortex (DLPFC) dataset, it accurately delineates cortical layer boundaries. In glioblastoma, GraphCVAE reveals critical therapeutic targets such as TF and NFIB. In colorectal cancer, it is used to explore the role of the extracellular matrix. Across performance metrics, the model consistently surpasses existing methods, validating its effectiveness. GraphCVAE's advanced visualization capabilities further highlight its precision in resolving spatial structures, making it a powerful tool for spatial transcriptomics analysis and offering new insights into disease studies.
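
To make the graph-convolution idea mentioned in the abstract concrete, the following is a minimal sketch of one GCN propagation step with a residual (skip) connection over a spatial neighbour graph. It is not the authors' implementation: the abstract does not state how the spot graph is built or where the residual connections sit, so the k-nearest-neighbour graph, the form H' = ReLU(Â H W) + H, and all function names (`knn_adjacency`, `normalize`, `gcn_residual_layer`) are assumptions chosen for illustration. The variational and contrastive components are omitted.

```python
import math

def knn_adjacency(coords, k):
    """Symmetric k-nearest-neighbour adjacency with self-loops (assumed graph construction)."""
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other spots by Euclidean distance to spot i.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(coords[i], coords[j]))
        for j in order[:k]:
            adj[i][j] = adj[j][i] = 1.0
        adj[i][i] = 1.0  # self-loop so each spot keeps its own signal
    return adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} commonly used in GCNs."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_residual_layer(a_hat, h, w):
    """One step H' = ReLU(A_hat H W) + H; W must be square so the skip connection fits."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, zv) + hv for zv, hv in zip(z_row, h_row)]
            for z_row, h_row in zip(z, h)]

# Toy data: two pairs of neighbouring spots, two expression features per spot.
coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned weight matrix

a_hat = normalize(knn_adjacency(coords, k=1))
smoothed = gcn_residual_layer(a_hat, features, identity)
```

In this toy example each spot's features are averaged with those of its nearest spatial neighbour and then added back to the original features, so neighbouring spots become more similar while distant ones are unaffected. In a trained model the weight matrix would be learned rather than fixed to the identity.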

Keywords: Contrastive learning; Deep learning; Gene expression; Spatial clustering; Spatial transcriptomics; Variational graph autoencoder.