Diffsig: Associating Risk Factors With Mutational Signatures

bioRxiv [Preprint]. 2023 Feb 10:2023.02.09.527740. doi: 10.1101/2023.02.09.527740.

Abstract

Somatic mutational signatures elucidate molecular vulnerabilities to therapy and therefore detecting signatures and classifying tumors with respect to signatures has clinical value. However, identifying the etiology of the mutational signatures remains a statistical challenge, with both small sample sizes and high variability in classification algorithms posing barriers. As a result, few signatures have been strongly linked to particular risk factors. Here we present Diffsig, a model and R package for estimating the association of risk factors with mutational signatures, suggesting etiologies for the pre-defined mutational signatures. Diffsig is a Bayesian Dirichlet-multinomial hierarchical model that allows testing of any type of risk factor while taking into account the uncertainty associated with samples with a low number of observations. In simulation, we found that our method can accurately estimate risk factor-mutational signal associations. We applied Diffsig to breast cancer data to assess relationships between five established breast-relevant mutational signatures and etiologic variables, confirming known mechanisms of cancer development. Diffsig is implemented as an R package available at: https://github.com/jennprk/diffsig.

Publication types

  • Preprint