Expanding and improving analyses of nucleotide recoding RNA-seq experiments with the EZbakR suite

Isaac W Vock; Justin W Mabin; Martin Machyna; Alexandra Zhang; J Robert Hogg; Matthew D Simon

doi:10.1101/2024.10.14.617411

bioRxiv [Preprint]. 2024 Oct 17:2024.10.14.617411. doi: 10.1101/2024.10.14.617411.

Authors

Isaac W Vock¹,², Justin W Mabin³, Martin Machyna¹,²,⁴, Alexandra Zhang¹,², J Robert Hogg³, Matthew D Simon¹,²

Affiliations

¹ Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA.
² Institute of Biomolecular Design and Discovery, Yale University, West Haven, Connecticut 06516, USA.
³ Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA.
⁴ Present address: Paul-Ehrlich-Institut, Host-Pathogen-Interactions, 63225 Langen, Germany.

Abstract

Nucleotide recoding RNA sequencing methods (NR-seq; TimeLapse-seq, SLAM-seq, TUC-seq, etc.) are powerful approaches for assaying transcript population dynamics. In addition, these methods have been extended to probe a host of regulated steps in the RNA life cycle. Current bioinformatic tools significantly constrain analyses of NR-seq data. To address this limitation, we developed EZbakR, an R package to facilitate a more comprehensive set of NR-seq analyses, and fastq2EZbakR, a Snakemake pipeline for flexible preprocessing of NR-seq datasets, collectively referred to as the EZbakR suite. Together, these tools generalize many aspects of the NR-seq analysis workflow. The fastq2EZbakR pipeline can assign reads to a diverse set of genomic features (e.g., genes, exons, and splice junctions), and EZbakR can perform analyses on any combination of these features. EZbakR extends standard NR-seq mutational modeling to support multi-label analyses (e.g., s⁴U and s⁶G dual labeling), and implements an improved hierarchical model to better account for transcript-to-transcript variance in metabolic label incorporation. EZbakR also generalizes dynamical systems modeling of NR-seq data to support analyses of premature mRNA processing and flow between subcellular compartments. Finally, EZbakR implements flexible and well-powered comparative analyses of all estimated parameters via design matrix-specified generalized linear modeling. The EZbakR suite will thus allow researchers to make full, effective use of NR-seq data.
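
For readers unfamiliar with the "standard NR-seq mutational modeling" mentioned above, the following is a minimal Python sketch of one common formulation, a two-component binomial mixture over per-read mutation counts. It is an illustration under stated assumptions, not EZbakR's implementation or API (EZbakR is an R package). The function names, the fixed mutation rates `p_new` and `p_old`, and the simulated data are all hypothetical. Each read is assumed to come from either labeled ("new") RNA with a high T-to-C mutation rate or unlabeled ("old") RNA with a low background rate, and the fraction of new reads is estimated by expectation-maximization.

```python
import math
import random


def binom_logpmf(k, n, p):
    """Log probability of k mutations among n mutable positions at rate p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def estimate_fraction_new(reads, p_new, p_old, n_iter=200):
    """Estimate the mixing weight of a two-component binomial mixture by EM.

    reads: list of (n_mutations, n_mutable_positions) tuples, one per read.
    p_new, p_old: assumed (hypothetical) mutation rates in new and old RNA,
    treated as known here; in practice they would also be estimated.
    """
    # Per-read likelihoods under each component do not change across iterations.
    lik = [(math.exp(binom_logpmf(k, n, p_new)),
            math.exp(binom_logpmf(k, n, p_old))) for k, n in reads]
    theta = 0.5  # initial guess for the fraction of reads from new RNA
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from new RNA.
        resp = [theta * a / (theta * a + (1.0 - theta) * b) for a, b in lik]
        # M-step: the updated mixing weight is the mean responsibility.
        theta = sum(resp) / len(resp)
    return theta


def simulate_reads(n_reads, fraction_new, p_new, p_old, n_t=30, seed=0):
    """Simulate per-read mutation counts for a single feature (toy data)."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        p = p_new if rng.random() < fraction_new else p_old
        k = sum(rng.random() < p for _ in range(n_t))
        reads.append((k, n_t))
    return reads
```

As a usage example, `estimate_fraction_new(simulate_reads(5000, 0.3, 0.05, 0.002), 0.05, 0.002)` should return a value close to the simulated fraction of 0.3. The multi-label and hierarchical extensions described in the abstract go beyond this single-label, fixed-rate sketch.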

Publication types

Preprint
