Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender

Nat Methods. 2023 Sep;20(9):1323-1335. doi: 10.1038/s41592-023-01943-7. Epub 2023 Aug 7.

Abstract

Droplet-based single-cell assays, including single-cell RNA sequencing (scRNA-seq), single-nucleus RNA sequencing (snRNA-seq) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), generate considerable background noise counts, the hallmark of which is nonzero counts in cell-free droplets and off-target gene expression in unexpected cell types. Such systematic background noise can lead to batch effects and spurious differential gene expression results. Here we develop a deep generative model based on the phenomenology of noise generation in droplet-based assays. The proposed model accurately distinguishes cell-containing droplets from cell-free droplets, learns the background noise profile and provides noise-free quantification in an end-to-end fashion. We implement this approach in the scalable and robust open-source software package CellBender. Analysis of simulated data demonstrates that CellBender operates near the theoretically optimal denoising limit. Extensive evaluations using real datasets and experimental benchmarks highlight enhanced concordance between droplet-based single-cell data and established gene expression patterns, while the learned background noise profile provides evidence of degraded or uncaptured cell types.
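
To make the notion of a learned background noise profile concrete, the toy sketch below estimates an ambient profile from low-count droplets and subtracts the expected background from the remaining droplets. It is not CellBender's deep generative model and does not use the CellBender API. It is a deliberately naive stand-in that assumes a fixed UMI threshold separates cell-free from cell-containing droplets and that every droplet carries about as much ambient RNA as an average cell-free droplet. The function names, the threshold, and the count matrix are all hypothetical.

```python
import numpy as np


def estimate_ambient_profile(counts, empty_mask):
    """Pool counts over droplets flagged as cell-free and normalize to gene frequencies."""
    pooled = counts[empty_mask].sum(axis=0).astype(float)
    return pooled / pooled.sum()


def naive_denoise(counts, umi_threshold):
    """Threshold-based background subtraction (illustrative assumption, not CellBender)."""
    totals = counts.sum(axis=1)
    # Assumption: droplets below the threshold are cell-free.
    empty_mask = totals < umi_threshold
    profile = estimate_ambient_profile(counts, empty_mask)
    # Assumption: each droplet holds as much ambient RNA as an average cell-free droplet.
    expected_noise = totals[empty_mask].mean() * profile
    # Subtract the expected background, round, and clip negative values to zero.
    denoised = np.clip(np.rint(counts - expected_noise), 0, None).astype(int)
    # Droplets called cell-free are reported as containing no cell-derived counts.
    denoised[empty_mask] = 0
    return denoised, profile, ~empty_mask


# Hypothetical droplets-by-genes count matrix: three cell-free droplets, two cells.
counts = np.array([
    [2, 1, 1],
    [1, 2, 1],
    [3, 1, 0],
    [100, 5, 3],
    [4, 90, 2],
])

denoised, profile, is_cell = naive_denoise(counts, umi_threshold=10)
print("ambient profile:", profile)
print("cell calls:", is_cell)
print("denoised counts:")
print(denoised)
```

In this toy matrix, the low counts of the second and third genes in the first cell, and of the first and third genes in the second cell, stand in for the off-target expression mentioned above. Subtracting the pooled profile reduces these counts but cannot tell true low expression from background. According to the abstract, the proposed model avoids both the hard threshold and the fixed-subtraction assumption by learning cell calls and the noise profile jointly.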

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling / methods
  • RNA, Small Nuclear*
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis / methods
  • Software*

Substances

  • RNA, Small Nuclear