GraphCVAE: Uncovering cell heterogeneity and therapeutic target discovery through residual and contrastive learning

Zhiwei Zhang; Mengqiu Wang; Ruoyan Dai; Zhenghui Wang; Lixin Lei; Xudong Zhao; Kaitai Han; Chaojing Shi; Qianjin Guo

doi:10.1016/j.lfs.2024.123208

Life Sci. 2024 Oct 31:359:123208. doi: 10.1016/j.lfs.2024.123208. Online ahead of print.

Authors

Zhiwei Zhang¹, Mengqiu Wang¹, Ruoyan Dai¹, Zhenghui Wang¹, Lixin Lei¹, Xudong Zhao¹, Kaitai Han¹, Chaojing Shi¹, Qianjin Guo²

Affiliations

¹ Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China.
² Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China. Electronic address: [email protected].

PMID: 39488267
DOI: 10.1016/j.lfs.2024.123208

Abstract

Recent advances in Spatial Transcriptomics (ST) technologies have transformed the analysis of tissue structure and function within spatial contexts. However, accurately identifying spatial domains remains challenging due to data sparsity and noise. Traditional clustering methods often fail to capture spatial dependencies, while spatial clustering methods struggle with batch effects and data integration. We introduce GraphCVAE, a model designed to enhance spatial domain identification by integrating spatial and morphological information, correcting batch effects, and managing heterogeneous data. GraphCVAE employs a multi-layer Graph Convolutional Network (GCN) and a variational autoencoder to improve the representation and integration of spatial information. Through contrastive learning, the model captures subtle differences between cell types and states. Extensive testing on various ST datasets demonstrates GraphCVAE's robustness and biological contributions. In the dorsolateral prefrontal cortex (DLPFC) dataset, it accurately delineates cortical layer boundaries. In glioblastoma, it reveals critical therapeutic targets such as TF and NFIB. In colorectal cancer, it explores the role of the extracellular matrix. The model's performance metrics consistently surpass those of existing methods, validating its effectiveness. GraphCVAE's visualization capabilities further highlight its precision in resolving spatial structures, making it a powerful tool for spatial transcriptomics analysis and offering new insights into disease studies.
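The abstract names three ingredients (a multi-layer GCN with residual connections, a variational autoencoder, and contrastive learning) without giving layer sizes, loss terms, or augmentation details. The following is a minimal NumPy sketch of how such pieces are commonly combined, not the authors' implementation. Every function name, the two-layer depth, the weight shapes, and the use of an InfoNCE-style loss as the contrastive objective are assumptions made for illustration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a spot graph."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w, residual=False):
    """One graph convolution with ReLU; optionally adds a residual skip."""
    out = np.maximum(a_norm @ h @ w, 0.0)
    return out + h if residual else out     # skip needs matching shapes

def init_params(n_genes, hidden, latent, rng):
    """Hypothetical small random weights for the sketch below."""
    scale = 0.1
    return {
        "w_in": scale * rng.standard_normal((n_genes, hidden)),
        "w_hid": scale * rng.standard_normal((hidden, hidden)),
        "w_mu": scale * rng.standard_normal((hidden, latent)),
        "w_logvar": scale * rng.standard_normal((hidden, latent)),
    }

def encode(a_norm, x, params, rng):
    """Variational graph encoder: GCN layers, then mean and log-variance heads."""
    h = gcn_layer(a_norm, x, params["w_in"])
    h = gcn_layer(a_norm, h, params["w_hid"], residual=True)
    mu = a_norm @ h @ params["w_mu"]
    log_var = a_norm @ h @ params["w_logvar"]
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    return z, mu, log_var

def kl_divergence(mu, log_var):
    """KL(q(z|x) || N(0, I)), averaged over spots."""
    return -0.5 * np.mean(np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: row i of z1 and row i of z2 form the positive pair."""
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + 1e-12)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + 1e-12)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

In a sketch like this, two embeddings of the same spots (for example, from the original and a perturbed expression matrix) would be passed to `info_nce`, and the result would be added to a reconstruction term and `kl_divergence` to form a training objective. The weighting of those terms is left open here because the abstract does not state it.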

Keywords: Contrastive learning; Deep learning; Gene expression; Spatial clustering; Spatial transcriptomics; Variational graph autoencoder.