DriverDetector: An R package providing multiple statistical methods for cancer driver genes detection and tools for downstream analysis

Heliyon. 2024 Jul 1;10(14):e33582. doi: 10.1016/j.heliyon.2024.e33582. eCollection 2024 Jul 30.

Abstract

Identifying driver genes in cancer is difficult because of the heterogeneity of cancer and the complex interactions among genes. As sequencing data become more readily available, there is a growing need to detect cancer driver genes using statistical and mathematical modeling methods. Many driver gene identification algorithms have been published, but they fail to produce consistent results. To obtain high-confidence gene sets, we present DriverDetector, an R package providing a convenient workflow for cancer driver gene detection and downstream analysis. We develop a background mutation rate calculation module based on the distance between genes in covariate space and a binomial test, followed by a driver gene selection module that integrates 11 methods, including two already recognized approaches, a de novo method, and five variants of Fisher's method that are applied to driver gene identification for the first time. In verification on 12 TCGA datasets, each method identifies a set of confirmed driver genes, while the number of resulting genes varies significantly across methods. For robust driver gene detection, a voting strategy based on 10 of the statistical methods is further applied. Results show that the collective prediction based on the voting strategy is superior in prediction consistency while ensuring a reasonable number of predicted genes and confirmed drivers. By comparing the results across cancer datasets, we also find that sample size strongly affects the number of predicted genes. For downstream analysis, DriverDetector automatically generates plots and tables that present the results in detail. We propose DriverDetector as a user-friendly tool to promote early diagnosis of cancer and the development of targeted drugs.
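
The three statistical ingredients named in the abstract (a binomial test against a background mutation rate, Fisher's method for combining p-values, and a voting step across methods) can be illustrated with a minimal Python sketch. This is not DriverDetector's code or API: the function names, the one-sided form of the binomial test, the use of plain Fisher's method rather than any of the package's variants, and the vote threshold are all assumptions made for illustration.

```python
import math

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    Assumed use: k observed mutations among n covered positions,
    with p the gene's background mutation rate (hypothetical inputs).
    """
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

def fisher_combine(pvalues):
    """Plain Fisher's method for m independent p-values.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2m degrees of freedom under the null. For even degrees of freedom the
    survival function has a closed form, so no external library is needed.
    """
    m = len(pvalues)
    # Half of the Fisher statistic; tiny floor avoids log(0).
    half = -sum(math.log(max(p, 1e-300)) for p in pvalues)
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(m))

def vote(calls_by_method, min_votes):
    """Keep genes called significant by at least min_votes methods.

    calls_by_method maps a method name to the set of genes it selected;
    min_votes is an illustrative threshold, not the package's setting.
    """
    counts = {}
    for genes in calls_by_method.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return sorted(g for g, c in counts.items() if c >= min_votes)
```

For example, `vote({"a": {"TP53", "KRAS"}, "b": {"TP53"}, "c": {"TP53", "PTEN"}}, 2)` keeps only the gene selected by at least two of the three hypothetical methods.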

Keywords: Background mutation rate; Cancer driver genes; Genome analysis software.