Quantile normalization of single-cell RNA-seq read counts without unique molecular identifiers

Genome Biol. 2020 Jul 3;21(1):160. doi: 10.1186/s13059-020-02078-0.

Abstract

Single-cell RNA-seq (scRNA-seq) profiles gene expression of individual cells. Unique molecular identifiers (UMIs) make it possible to remove the duplicate reads produced by polymerase chain reaction (PCR) amplification, a major source of noise in read counts. For scRNA-seq data lacking UMIs, we propose quasi-UMIs: quantile normalization of read counts to a compound Poisson distribution empirically derived from UMI datasets. When applied to ground-truth datasets having both reads and UMIs, quasi-UMI normalization has higher accuracy than competing methods. Using quasi-UMIs enables methods designed specifically for UMI data to be applied to non-UMI scRNA-seq datasets.
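
The quantile-normalization idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a zero-truncated Poisson-lognormal as one example of a compound Poisson target, approximates that target by Monte Carlo draws, and leaves zero counts at zero. The function name `quasi_umi_sketch` and the parameters `shape`, `n_draws`, and `seed` are hypothetical placeholders, and the default values are not taken from the paper.

```python
import numpy as np


def quasi_umi_sketch(reads, shape=1.0, n_draws=100_000, seed=0):
    """Quantile-match the nonzero read counts of one cell to a UMI-like target.

    reads : 1-D array of non-negative read counts for a single cell.
    shape : ASSUMED log-scale standard deviation of the Poisson-lognormal
            target (illustrative value, not from the paper).
    """
    reads = np.asarray(reads)
    out = np.zeros(reads.shape, dtype=np.int64)
    nonzero = reads > 0
    n = int(nonzero.sum())
    if n == 0:
        return out  # nothing to normalize

    # Approximate the target distribution by simulation: Poisson counts whose
    # rates are lognormal, with zeros discarded (zero truncation).
    rng = np.random.default_rng(seed)
    rates = rng.lognormal(mean=0.0, sigma=shape, size=n_draws)
    draws = rng.poisson(rates)
    draws = np.sort(draws[draws > 0])

    # Empirical mid-quantile of each distinct read count, so tied read
    # counts receive the same quasi-UMI value.
    values, inverse, counts = np.unique(
        reads[nonzero], return_inverse=True, return_counts=True
    )
    mid_quantile = (np.cumsum(counts) - 0.5 * counts) / n

    # Inverse empirical CDF of the simulated target at those quantiles.
    idx = np.ceil(mid_quantile * draws.size).astype(int) - 1
    idx = np.clip(idx, 0, draws.size - 1)
    out[nonzero] = draws[idx][inverse]
    return out


# Usage: read counts for one hypothetical cell.
example_reads = np.array([0, 0, 5, 5, 120, 3000, 17])
example_quasi_umis = quasi_umi_sketch(example_reads)
```

Because the mapping is monotone, the rank order of genes within the cell is preserved while the heavy right tail typical of PCR-amplified read counts is compressed onto the scale of the target distribution.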

Keywords: Gene expression; Normalization; Quasi-UMI; RNA-seq; Single cell.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Humans
  • Normal Distribution
  • Poisson Distribution
  • Sequence Analysis, RNA*
  • Single-Cell Analysis*