SQANTI-SIM: a simulator of controlled transcript novelty for lrRNA-seq benchmark

bioRxiv [Preprint]. 2023 Aug 24:2023.08.23.554392. doi: 10.1101/2023.08.23.554392.

Abstract

Long-read RNA-seq (lrRNA-seq) has emerged as a powerful tool for transcript discovery, even in well-annotated organisms. However, assessing how accurately different methods identify annotated and novel transcripts remains a challenge. Here, we present SQANTI-SIM, a versatile utility that wraps around popular long-read simulators to allow precise control of transcript novelty based on the structural categories defined by SQANTI3. By selectively excluding specific transcripts from the reference dataset, SQANTI-SIM emulates scenarios involving unannotated transcripts. Furthermore, the tool provides customizable features and supports the simulation of additional types of data, making it the first multi-omics simulation tool for the lrRNA-seq field. We demonstrate the effectiveness of SQANTI-SIM by benchmarking five transcriptome reconstruction pipelines using the simulated data.
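
The idea of excluding transcripts from the reference to emulate novelty can be illustrated with a minimal sketch. It assumes the reference is a GTF-style annotation with a `transcript_id` attribute; the function name `drop_transcripts` and the toy records are hypothetical and are not part of SQANTI-SIM's interface.

```python
import re

# Hypothetical helper for illustration only; this is not SQANTI-SIM's API.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def drop_transcripts(gtf_lines, hidden_ids):
    """Return the GTF lines whose transcript_id is not in hidden_ids."""
    kept = []
    for line in gtf_lines:
        match = _TRANSCRIPT_ID.search(line)
        # Records without a transcript_id (e.g. gene records) are kept.
        if match and match.group(1) in hidden_ids:
            continue
        kept.append(line)
    return kept


# Toy annotation (made-up coordinates): one gene with two transcripts.
toy_gtf = [
    'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1";',
    'chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t700\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    'chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    'chr1\ttoy\texon\t500\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
]

# Hide T2 from the reduced reference.
reduced = drop_transcripts(toy_gtf, {"T2"})
```

In this sketch, reads would be simulated from the full annotation while the reconstruction pipeline under test is given only `reduced`, so any correctly recovered T2 model counts as a novel-transcript call with known ground truth.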

Keywords: SQANTI; isoform discovery; long-read transcriptomics; transcript simulation; transcriptome reconstruction.

Publication types

  • Preprint