Characterization of the subcellular distribution of RNA is essential for understanding the molecular basis of biological processes. Here, subcellular nanopore direct RNA-sequencing (DRS) of four lung cancer cell lines (A549, H1975, H358, and HCC4006) is performed and coupled with a computational pipeline, Low-abundance Aware Full-length Isoform clusTEr (LAFITE), to comprehensively analyze the full-length cytoplasmic and nuclear transcriptomes. Using additional DRS and orthogonal data sets, it is shown that LAFITE outperforms current methods for detecting full-length transcripts, particularly low-abundance isoforms that are usually overlooked because of poor read coverage. Experimental validation of six novel isoforms identified exclusively by LAFITE further confirms the reliability of this pipeline. Applying LAFITE to the subcellular DRS data reveals the complexity of the nuclear transcriptome in terms of isoform diversity, 3'-UTR usage, m6A modification patterns, and intron retention. Overall, LAFITE provides enhanced full-length isoform identification and enables a high-resolution view of the RNA landscape at the isoform level.
Keywords: direct RNA-sequencing; full-length transcripts; long read; nanopore; subcellular fraction.
© 2022 The Authors. Advanced Science published by Wiley-VCH GmbH.