The GENDULF algorithm: mining transcriptomics to uncover modifier genes for monogenic diseases

Mol Syst Biol. 2020 Dec;16(12):e9701. doi: 10.15252/msb.20209701.

Abstract

Modifier genes are believed to account for the clinical variability observed in many Mendelian disorders, but their identification remains challenging because genomics data from large patient cohorts are scarce. Here, we present GENDULF (GENetic moDULators identiFication), one of the first methods to facilitate prediction of disease modifiers using gene expression data from healthy and diseased tissue. GENDULF is designed for monogenic diseases in which the mechanism is loss of function leading to reduced expression of the mutated gene. When applied to cystic fibrosis, GENDULF successfully identifies multiple previously established disease modifiers, including EHF, SLC6A14, and CLCA1. Applied next to spinal muscular atrophy (SMA), it predicts U2AF1 as a modifier whose low expression correlates with higher SMN2 pre-mRNA exon 7 retention. Indeed, knockdown of U2AF1 in SMA patient-derived cells leads to increased full-length SMN2 transcript and SMN protein expression. Taking advantage of the increasing availability of transcriptomic data, GENDULF is a novel addition to existing strategies for prediction of genetic disease modifiers, providing insights into disease pathogenesis and uncovering novel therapeutic targets.
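
The abstract does not spell out the procedure, so the following is only an illustrative sketch of the general idea of screening expression data for candidate modifiers. It is not the published GENDULF implementation. It assumes that a candidate can be flagged when its low expression co-occurs with low expression of the causal gene in healthy samples more often than its overall low-expression rate would suggest. The function names, the 25% quantile cutoff, and the ratio score are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def low_mask(values: List[float], quantile: float = 0.25) -> List[bool]:
    """Mark samples whose expression is at or below the chosen quantile.

    The quantile cutoff is an assumption of this sketch.
    """
    ordered = sorted(values)
    # Position of the cutoff value, clamped to a valid index.
    idx = max(0, min(len(ordered) - 1, int(quantile * len(ordered)) - 1))
    cutoff = ordered[idx]
    return [v <= cutoff for v in values]


def low_cooccurrence_ratio(
    expr: Dict[str, List[float]],
    causal_gene: str,
    candidate: str,
    quantile: float = 0.25,
) -> float:
    """Ratio of P(candidate low | causal low) to P(candidate low).

    Values above 1 mean the candidate is low more often than expected
    in samples where the causal gene is also low.
    """
    causal_low = low_mask(expr[causal_gene], quantile)
    cand_low = low_mask(expr[candidate], quantile)
    both_low = sum(c and m for c, m in zip(causal_low, cand_low))
    frac_given_causal_low = both_low / sum(causal_low)
    frac_overall = sum(cand_low) / len(cand_low)
    return frac_given_causal_low / frac_overall


def rank_candidates(
    expr: Dict[str, List[float]],
    causal_gene: str,
    quantile: float = 0.25,
) -> List[Tuple[str, float]]:
    """Score every non-causal gene and sort by descending ratio."""
    scores = [
        (gene, low_cooccurrence_ratio(expr, causal_gene, gene, quantile))
        for gene in expr
        if gene != causal_gene
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A full analysis would also need a statistical test with multiple-testing correction and a comparison against expression in diseased tissue, both of which this sketch omits.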

Keywords: cystic fibrosis; digenic inheritance; gene expression; modifier gene; spinal muscular atrophy.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Algorithms*
  • Data Mining*
  • Disease / genetics*
  • Genes, Modifier*
  • Genetic Association Studies
  • Genetic Linkage
  • HEK293 Cells
  • Humans
  • Reproducibility of Results
  • Transcriptome / genetics*

Associated data

  • GEO/GSE159642