Fusion InPipe, an integrative pipeline for gene fusion detection from RNA-seq data in acute pediatric leukemia

Front Mol Biosci. 2023 Jun 9:10:1141310. doi: 10.3389/fmolb.2023.1141310. eCollection 2023.

Abstract

RNA sequencing (RNA-seq) is a reliable tool for detecting gene fusions in acute leukemia. Multiple bioinformatics pipelines have been developed to analyze RNA-seq data, but no gold standard has been agreed upon. This study aimed to compare the applicability of five fusion-calling pipelines (Arriba, deFuse, CICERO, FusionCatcher, and STAR-Fusion) and to define and develop an integrative bioinformatics pipeline (Fusion InPipe) to detect clinically relevant gene fusions in acute pediatric leukemia. We analyzed RNA-seq data with each pipeline individually and with Fusion InPipe. Each algorithm individually called most of the fusions, with similar sensitivity and precision. However, not all rearrangements were called, suggesting that relying on a single pipeline could miss important fusions. To address this, we integrated the results of the five algorithms into a single pipeline, Fusion InPipe, and compared the output obtained when requiring agreement of 5/5, 4/5, and 3/5 algorithms. Maximum sensitivity was achieved with agreement of 3/5 algorithms, with a global sensitivity of 95%, reaching 100% in patient data. Furthermore, we showed that filtering steps are necessary to reduce the false-positive detection rate. Here, we demonstrate that Fusion InPipe is an excellent tool for fusion detection in pediatric acute leukemia, performing best when selecting fusions called by at least 3/5 pipelines.
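
The agreement rule described above (keep a fusion only if at least 3/5, 4/5, or 5/5 callers report it) can be illustrated with a minimal sketch. This is not the Fusion InPipe implementation: the function name `consensus_fusions`, the representation of a fusion as a (5' gene, 3' gene) pair, and the toy input are assumptions made for illustration only, and the sketch omits the filtering steps the abstract mentions.

```python
from collections import defaultdict

# The five callers named in the abstract.
CALLERS = ("Arriba", "deFuse", "CICERO", "FusionCatcher", "STAR-Fusion")


def consensus_fusions(calls_by_caller, min_agreement=3):
    """Return fusions reported by at least `min_agreement` callers.

    `calls_by_caller` maps a caller name to a set of (gene5, gene3) tuples.
    This input shape is an assumption for the sketch; the real callers each
    emit their own output format that would need parsing first.
    """
    support = defaultdict(set)
    for caller, fusions in calls_by_caller.items():
        for fusion in fusions:
            support[fusion].add(caller)
    return {
        fusion: sorted(callers)
        for fusion, callers in support.items()
        if len(callers) >= min_agreement
    }


# Toy input, invented for illustration (not data from the study).
example_calls = {
    "Arriba": {("ETV6", "RUNX1"), ("BCR", "ABL1")},
    "deFuse": {("ETV6", "RUNX1")},
    "CICERO": {("ETV6", "RUNX1"), ("BCR", "ABL1")},
    "FusionCatcher": {("ETV6", "RUNX1"), ("GENE_A", "GENE_B")},
    "STAR-Fusion": {("ETV6", "RUNX1"), ("BCR", "ABL1")},
}

for threshold in (5, 4, 3):
    kept = consensus_fusions(example_calls, min_agreement=threshold)
    print(f"{threshold}/5 agreement: {sorted(kept)}")
```

Lowering the threshold trades precision for sensitivity: in the toy input, the pair supported by three callers is recovered only at 3/5, while the pair reported by a single caller is discarded at every threshold.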

Keywords: RNA-sequencing; bioinformatic pipelines; fusion detection; molecular diagnostics; pediatric acute leukemia.

Grants and funding

This study has been supported by a grant from Instituto de Salud Carlos III (PI21/00213) and by associations of parents and families of children with cancer, in coordination with the Obra Social Hospital Sant Joan de Deu.