ScGSLC: An unsupervised graph similarity learning framework for single-cell RNA-seq data clustering

Comput Biol Chem. 2021 Feb:90:107415. doi: 10.1016/j.compbiolchem.2020.107415. Epub 2020 Nov 18.

Abstract

Accurate clustering of cells from single-cell RNA sequencing (scRNA-seq) data is an essential step in biological analyses such as putative cell type identification. However, scRNA-seq data are high-dimensional and highly sparse, which makes traditional clustering methods less effective at reflecting the similarity between cells. Because the genetic network fundamentally defines the functions of a cell, and deep learning shows strong advantages in network representation learning, we propose ScGSLC, a novel scRNA-seq clustering framework based on graph similarity learning. ScGSLC integrates scRNA-seq data and a protein-protein interaction network into a graph. It then employs a graph convolution network to embed the graph and clusters the cells by the calculated similarity between graphs. Unsupervised clustering results on nine public data sets demonstrate that ScGSLC performs better than state-of-the-art methods.
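The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the four-gene toy network, the expression values, the function names, and the 0.8 similarity threshold. Each cell is treated as a graph whose structure is the shared protein-protein interaction network and whose node features are that cell's expression values. In place of a trained graph convolution network, the sketch applies a single normalized neighborhood-propagation step (the aggregation part of a graph convolution layer, with learned weights omitted), compares cells by cosine similarity, and groups them by connected components of the thresholded similarity matrix.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the symmetric normalization used in GCN layers."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def embed_cell(expression, norm_adj):
    """One propagation step: each gene's value is mixed with its network neighbors.
    Assumption: no learned weights, so this is only the aggregation half of a GCN layer."""
    return [sum(w * x for w, x in zip(row, expression)) for row in norm_adj]

def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster_by_threshold(sim, threshold):
    """Label cells by connected components of the graph linking cells with sim >= threshold."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical PPI network over 4 genes: edges 0-1 and 2-3 (two modules).
ppi = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical expression matrix: rows are cells, columns are genes.
cells = [
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
]

norm_adj = normalize_adjacency(ppi)
embeddings = [embed_cell(cell, norm_adj) for cell in cells]
similarity = [[cosine(u, v) for v in embeddings] for u in embeddings]
labels = cluster_by_threshold(similarity, threshold=0.8)
print(labels)  # [0, 0, 1, 1]
```

In this toy example the first two cells express the first network module and the last two express the second, so the propagated embeddings separate them cleanly. A trained model would replace the fixed propagation step with learned layers and would likely use a more robust clustering step than a fixed threshold.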

Keywords: Graph convolution network; Graph embedding; Graph similarity; Single-cell RNA sequencing data; Unsupervised clustering.

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Gene Regulatory Networks*
  • Humans
  • Protein Interaction Maps
  • RNA-Seq*
  • Single-Cell Analysis*
  • Software