Single-cell RNA-seq data contain a large proportion of zeros for expressed genes. Such dropout events present a fundamental challenge for various types of data analyses. Here, we describe the SCRABBLE algorithm to address this problem. SCRABBLE leverages bulk data as a constraint and reduces unwanted bias towards expressed genes during imputation. Using both simulation and several types of experimental data, we demonstrate that SCRABBLE outperforms the existing methods in recovering dropout events, capturing true distribution of gene expression across cells, and preserving gene-gene relationship and cell-cell relationship in the data.
Keywords: Imputation; Matrix regularization; Optimization; Single-cell RNA-seq.