Automatic cell-type harmonization and integration across Human Cell Atlas datasets

Cell. 2023 Dec 21;186(26):5876-5891.e20. doi: 10.1016/j.cell.2023.11.026.

Abstract

Harmonizing cell types across the single-cell community and assembling them into a common framework is central to building a standardized Human Cell Atlas. Here, we present CellHint, a predictive clustering tree-based tool to resolve cell-type differences in annotation resolution and technical biases across datasets. CellHint accurately quantifies cell-cell transcriptomic similarities and places cell types into a relationship graph that hierarchically defines shared and unique cell subtypes. Application to multiple immune datasets recapitulates expert-curated annotations. CellHint also reveals underexplored relationships between healthy and diseased lung cell states in eight diseases. Furthermore, we present a workflow for fast cross-dataset integration guided by harmonized cell types and cell hierarchy, which uncovers underappreciated cell types in adult human hippocampus. Finally, we apply CellHint to 12 tissues from 38 datasets, providing a deeply curated cross-tissue database with ∼3.7 million cells and various machine learning models for automatic cell annotation across human tissues.

Keywords: Human Cell Atlas; cell hierarchy; cell-type harmonization; data integration; harmonization graph; machine learning; organ atlas; predictive clustering tree; single cell.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Gene Expression Profiling*
  • Humans
  • Single-Cell Analysis
  • Transcriptome*