Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data

Nucleic Acids Res. 2017 Nov 2;45(19):10978-10988. doi: 10.1093/nar/gkx754.

Abstract

Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, thus paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells, which can bias downstream analysis without proper adjustment. To account for cell-to-cell technical differences, we propose a statistical framework, TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes approach to reliably model the cell-specific dropout rates and amplification bias by use of external RNA spike-ins. TASC incorporates the technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and detect differentially expressed genes. More importantly, TASC is able to adjust for covariates to further eliminate confounding that may originate from cell size and cell cycle differences. In simulation and real scRNA-seq data, TASC achieves accurate Type I error control and displays competitive sensitivity and improved robustness to batch effects in differential expression analysis, compared to existing methods. TASC is programmed to be computationally efficient, taking advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Reproducibility of Results
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods*
  • Software