Putative cell type discovery from single-cell gene expression data

Nat Methods. 2020 Jun;17(6):621-628. doi: 10.1038/s41592-020-0825-9. Epub 2020 May 18.

Abstract

We present the Single-Cell Clustering Assessment Framework, a method for the automated identification of putative cell types from single-cell RNA sequencing (scRNA-seq) data. By iteratively applying a machine learning approach to a given set of cells, we simultaneously identify distinct cell groups and a weighted list of feature genes for each group. The differentially expressed feature genes discriminate the given cell group from other cells. Each such group of cells corresponds to a putative cell type or state, characterized by the feature genes as markers. Benchmarking using expert-annotated scRNA-seq datasets shows that our method automatically identifies the 'ground truth' cell assignments with high accuracy.
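
The iterative idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the classifier, the stopping rule, or how groups are revised between iterations. The sketch assumes a nearest-centroid classifier as a stand-in for the machine learning step, a held-out accuracy threshold (`min_accuracy`) as the stopping rule, and merging of the two most-confused groups as the revision step. Feature-gene weights are approximated by each group's mean expression minus the overall mean. All function names, parameters, and the synthetic three-gene dataset are hypothetical.

```python
import random

def mean_profile(rows):
    """Per-gene mean expression over a non-empty list of cells."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def closest_group(cell, centroids):
    """Label of the centroid nearest to `cell` (squared Euclidean distance)."""
    return min(centroids,
               key=lambda g: sum((x - c) ** 2 for x, c in zip(cell, centroids[g])))

def stratified_split(labels, rng):
    """Split cell indices into train/test parts within each group."""
    train, test = [], []
    for group in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == group]
        rng.shuffle(members)
        half = max(1, len(members) // 2)  # every group keeps >= 1 training cell
        train += members[:half]
        test += members[half:]
    return train, test

def refine_groups(cells, labels, min_accuracy=0.9, seed=0):
    """Merge groups until a classifier trained on half of the cells
    recovers the labels of the other half (assumed stopping rule)."""
    rng = random.Random(seed)
    labels = list(labels)
    while True:
        train, test = stratified_split(labels, rng)
        centroids = {g: mean_profile([cells[i] for i in train if labels[i] == g])
                     for g in sorted(set(labels))}
        confusion, correct = {}, 0
        for i in test:
            predicted = closest_group(cells[i], centroids)
            if predicted == labels[i]:
                correct += 1
            else:
                pair = tuple(sorted((labels[i], predicted)))
                confusion[pair] = confusion.get(pair, 0) + 1
        # Stop when nothing is confused or held-out accuracy is high enough.
        if not confusion or correct / len(test) >= min_accuracy:
            break
        # Otherwise merge the two groups the classifier confuses most often.
        keep, drop = max(confusion, key=confusion.get)
        labels = [keep if lab == drop else lab for lab in labels]
    # Stand-in feature weights: group mean minus overall mean, per gene.
    overall = mean_profile(cells)
    weights = {}
    for g in sorted(set(labels)):
        profile = mean_profile([c for c, lab in zip(cells, labels) if lab == g])
        weights[g] = [p - o for p, o in zip(profile, overall)]
    return labels, weights

# Synthetic data: two true cell types over three genes, with the second
# type deliberately over-split into "b1" and "b2" in the starting labels.
rng = random.Random(1)
type_a = [[rng.gauss(10, 1), rng.gauss(0, 1), rng.gauss(5, 1)] for _ in range(40)]
type_b = [[rng.gauss(0, 1), rng.gauss(10, 1), rng.gauss(5, 1)] for _ in range(80)]
cells = type_a + type_b
start_labels = ["a"] * 40 + ["b1"] * 40 + ["b2"] * 40

final_labels, weights = refine_groups(cells, start_labels)
for group, w in weights.items():
    top_gene = max(range(len(w)), key=w.__getitem__)
    print(group, "highest-weighted gene index:", top_gene)
```

In this toy setting the arbitrary split of the second cell type cannot be learned by the classifier, so the two halves are merged, while the well-separated first type is left alone. A real analysis would use a proper classifier and a clustering of normalized expression data as the starting point.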

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cluster Analysis
  • Datasets as Topic
  • Gene Expression*
  • Humans
  • Machine Learning*
  • RNA-Seq / methods*
  • Reproducibility of Results
  • Single-Cell Analysis / methods*
  • Software