Complex hierarchical structures analysis in single-cell data with Poincaré deep manifold transformation

Brief Bioinform. 2024 Nov 22;26(1):bbae687. doi: 10.1093/bib/bbae687.

Abstract

Single-cell RNA sequencing (scRNA-seq) offers remarkable insights into cellular development and differentiation by capturing the gene expression profiles of individual cells. Dimensionality reduction and visualization are widely accepted as essential to the interpretation of scRNA-seq data. However, current methods face several challenges, including incomplete structure-preserving strategies and high embedding distortion, and therefore fail to model complex cell trajectories with multiple branches effectively. To address these issues, we propose the Poincaré deep manifold transformation (PoincaréDMT) method, which maps high-dimensional scRNA-seq data to a hyperbolic Poincaré disk. This approach preserves global structure derived from a graph Laplacian matrix while correcting local structure through a structure module combined with data augmentation. Additionally, PoincaréDMT alleviates batch effects by integrating a batch graph, which accounts for batch labels, into the low-dimensional embeddings during network training. Furthermore, PoincaréDMT applies the Shapley additive explanations method to the trained model to identify important marker genes in specific clusters and cell differentiation processes. PoincaréDMT thus provides a unified framework for multiple key tasks essential to scRNA-seq analysis, including trajectory inference, pseudotime inference, batch correction, and marker gene selection. We validate PoincaréDMT through extensive evaluations on both simulated and real scRNA-seq datasets, demonstrating its superior performance in preserving global and local data structures compared with existing methods.
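
To illustrate why a hyperbolic Poincaré disk suits branching trajectories, the following minimal sketch computes the standard geodesic distance on the unit Poincaré disk (curvature -1). It is not taken from the PoincaréDMT implementation: the function name `poincare_distance` and the plain-tuple point representation are assumptions made for illustration, and the abstract does not specify the exact distance or loss the method uses.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré disk.

    Hypothetical helper for illustration only; uses the textbook formula
    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# The same Euclidean gap (0.1) is a longer hyperbolic distance near the boundary.
near_center = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.8, 0.0), (0.9, 0.0))
```

In this metric, distances grow rapidly toward the boundary of the disk, so an embedding has room to keep many diverging branches well separated near the rim while early, shared states sit near the center. For reference, the distance from the origin to a point at Euclidean radius 0.5 is ln 3.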

Keywords: batch correction; data visualization; deep manifold learning; dimensionality reduction; pseudotime inference.

MeSH terms

  • Algorithms
  • Cell Differentiation
  • Computational Biology / methods
  • Gene Expression Profiling / methods
  • Humans
  • RNA-Seq / methods
  • Sequence Analysis, RNA* / methods
  • Single-Cell Analysis* / methods