Target-Oriented Reference Construction for supervised cell-type identification in scRNA-seq

Res Sq [Preprint]. 2024 Jun 26:rs.3.rs-4559348. doi: 10.21203/rs.3.rs-4559348/v1.

Abstract

Cell-type identification is the most crucial step in single-cell RNA-seq (scRNA-seq) data analysis, and supervised cell-type identification methods are a desirable solution because of their accuracy and efficiency. The performance of such methods depends heavily on the quality of the reference data. Although many supervised cell-type identification tools exist, there is no method for selecting and constructing reference data. Here we develop Target-Oriented Reference Construction (TORC), a widely applicable strategy for constructing a reference for a given target dataset in supervised scRNA-seq cell-type identification. TORC alleviates the differences in data distribution and cell-type composition between the reference and the target. Extensive benchmarks on simulated and real data demonstrate consistent improvements in cell-type identification with TORC. TORC is freely available at https://github.com/weix21/TORC.

Keywords: Cell-type identification; Reference construction; Supervised learning; scRNA-seq.

Publication types

  • Preprint