Decomposing Cell Identity for Transfer Learning across Cellular Measurements, Platforms, Tissues, and Species

Cell Syst. 2019 May 22;8(5):395-411.e8. doi: 10.1016/j.cels.2019.04.004.

Abstract

Analysis of gene expression in single cells allows for decomposition of cellular states as low-dimensional latent spaces. However, the interpretation and validation of these spaces remain a challenge. Here, we present scCoGAPS, which defines latent spaces from a source single-cell RNA-sequencing (scRNA-seq) dataset, and projectR, which evaluates these latent spaces in independent target datasets via transfer learning. Applying these tools to scRNA-seq data from developing mouse retina reveals intrinsic relationships across biological contexts and assays while avoiding batch effects and other technical features. We compare the dimensions learned in this source dataset to adult mouse retina, a time course of human retinal development, select scRNA-seq datasets from developing brain, chromatin accessibility data, and a murine cell-type atlas to identify shared biological features. These tools lay the groundwork for exploratory analysis of scRNA-seq data via latent space representations, enabling a shift in how we compare and identify cells beyond reliance on marker genes or ensemble molecular identity.
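
The two-step workflow described in the abstract (learn latent dimensions in a source dataset, then evaluate them in a target dataset) can be illustrated with a minimal sketch. This is not the scCoGAPS or projectR implementation: scCoGAPS is a Bayesian NMF method, whereas the sketch below swaps in plain multiplicative-update NMF, and the projection step is an assumed non-negative fit with the gene loadings held fixed. The function names (`nmf`, `project`) and the toy matrices are hypothetical and exist only for illustration.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def update(M, num, den, eps=1e-9):
    """Element-wise multiplicative update M * num / den; entries stay non-negative."""
    return [[m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]

def frobenius_error(D, A, P):
    """Frobenius norm of the residual D - A P."""
    R = matmul(A, P)
    return sum((d - r) ** 2 for dr, rr in zip(D, R) for d, r in zip(dr, rr)) ** 0.5

def nmf(D, k, iters=200, seed=0):
    """Factor D (genes x cells) into A (genes x k) and P (k x cells)."""
    rng = random.Random(seed)
    A = [[rng.random() + 0.1 for _ in range(k)] for _ in D]
    P = [[rng.random() + 0.1 for _ in D[0]] for _ in range(k)]
    for _ in range(iters):
        At = transpose(A)
        P = update(P, matmul(At, D), matmul(matmul(At, A), P))
        Pt = transpose(P)
        A = update(A, matmul(D, Pt), matmul(A, matmul(P, Pt)))
    return A, P

def project(D_target, A, iters=200, seed=1):
    """Hold gene loadings A fixed and fit only the pattern weights of target cells."""
    rng = random.Random(seed)
    k = len(A[0])
    P = [[rng.random() + 0.1 for _ in D_target[0]] for _ in range(k)]
    At = transpose(A)
    AtA, AtD = matmul(At, A), matmul(At, D_target)
    for _ in range(iters):
        P = update(P, AtD, matmul(AtA, P))
    return P

# Toy matrices (invented): 4 genes; genes 0-1 and genes 2-3 form two programs.
source = [[5.0, 4.0, 0.0, 0.0],
          [4.0, 5.0, 0.0, 1.0],
          [0.0, 0.0, 6.0, 5.0],
          [0.0, 1.0, 5.0, 6.0]]
target = [[3.0, 0.0, 3.0],
          [3.0, 0.0, 2.0],
          [0.0, 4.0, 3.0],
          [0.0, 4.0, 3.0]]

A, P_source = nmf(source, k=2)   # latent dimensions learned in the source data
P_target = project(target, A)    # the same dimensions evaluated in the target data
```

Because `A` is never refit on the target, each row of `P_target` reports how strongly a source-derived dimension is used by each target cell, which is the sense in which the learned dimensions are transferred rather than relearned.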

Keywords: NMF; developmental biology; dimension reduction; integrated analysis; latent spaces; retina; scRNA-seq; single cells; transfer learning.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Databases, Genetic
  • Exome Sequencing / methods
  • Female
  • Gene Expression Profiling / methods*
  • Humans
  • Machine Learning
  • Male
  • Mice
  • Mice, Transgenic
  • Retina / embryology
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods*
  • Software
  • Transcriptome / genetics