Single-cell RNA-sequencing (scRNA-seq) provides unprecedented insights into cellular heterogeneity. Although scRNA-seq reads from most prevalent and popular tagged-end protocols are expected to arise from the 3' end of polyadenylated RNAs, recent studies have shown that "off-target" reads can constitute a substantial portion of the read population. In this work, we introduced scCensus, a comprehensive analysis workflow for systematically evaluating and categorizing off-target reads in scRNA-seq. We applied scCensus to seven scRNA-seq datasets. Our analysis of intergenic reads shows that these off-target reads contain information about chromatin structure and can be used to identify similar cells across modalities. Our analysis of antisense reads suggests that these reads can be used to improve gene detection and capture interesting transcriptional activities like antisense transcription. Furthermore, using splice-aware quantification, we find that spliced and unspliced reads provide distinct information about cell clusters and biomarkers, suggesting the utility of integrating signals from reads with different splicing statuses. Overall, our results suggest that off-target scRNA-seq reads contain underappreciated information about various transcriptional activities. These observations about yet-unexploited information in existing scRNA-seq data will help guide and motivate the community to improve current algorithms and analysis methods, and to develop novel approaches that utilize off-target reads to extend the reach and accuracy of single-cell data analysis pipelines.
Keywords: antisense transcription; chromatin structure; enhancer RNA; off-target priming; scRNA-seq; single-cell multiomics.