Genome-wide identification of directed gene networks using large-scale population genomics data

Nat Commun. 2018 Aug 6;9(1):3097. doi: 10.1038/s41467-018-05452-6.

Abstract

Identification of causal drivers behind regulatory gene networks is crucial in understanding gene function. Here, we develop a method for the large-scale inference of gene-gene interactions in observational population genomics data that are both directed (using local genetic instruments as causal anchors, akin to Mendelian Randomization) and specific (by controlling for linkage disequilibrium and pleiotropy). Analysis of genotype and whole-blood RNA-sequencing data from 3072 individuals identified 49 genes as drivers of downstream transcriptional changes (Wald P < 7 × 10⁻¹⁰), among which transcription factors were overrepresented (Fisher's P = 3.3 × 10⁻⁷). Our analysis suggests new gene functions and targets, including for SENP7 (zinc-finger genes involved in retroviral repression) and BCL2A1 (target genes possibly involved in auditory dysfunction). Our work highlights the utility of population genomics data in deriving directed gene expression networks. A resource of trans-effects for all 6600 genes with a genetic instrument can be explored individually using a web-based browser.
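
The idea of using a local genetic instrument as a causal anchor can be illustrated with a minimal sketch on simulated data. This is not the authors' pipeline: it is a generic two-stage least-squares estimate, and it omits the corrections for linkage disequilibrium and pleiotropy that the abstract describes. All names (`simulate`, `two_stage_effect`, `ols_slope`) and all simulation parameters (allele frequency, effect sizes, a single unobserved confounder) are assumptions made for illustration; only the sample size of 3072 is taken from the abstract.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of y regressed on x, with an intercept term."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def two_stage_effect(instrument, exposure, outcome):
    """Two-stage least squares with a single instrument.

    Stage 1 predicts the exposure (expression of gene A) from the
    instrument (genotype); stage 2 regresses the outcome (expression of
    gene B) on that genetically predicted exposure.
    """
    design = np.column_stack([np.ones(len(instrument)), instrument])
    coef, *_ = np.linalg.lstsq(design, exposure, rcond=None)
    predicted_exposure = design @ coef
    return ols_slope(predicted_exposure, outcome)


def simulate(n=3072, true_effect=0.5, seed=0):
    """Hypothetical data: genotype -> gene A -> gene B, plus a shared confounder."""
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, 0.3, size=n).astype(float)  # allele dosage 0/1/2
    confounder = rng.normal(size=n)  # unobserved, affects both genes
    gene_a = 1.0 * genotype + confounder + rng.normal(size=n)
    gene_b = true_effect * gene_a + confounder + rng.normal(size=n)
    return genotype, gene_a, gene_b


genotype, gene_a, gene_b = simulate()
naive = ols_slope(gene_a, gene_b)  # inflated by the shared confounder
directed = two_stage_effect(genotype, gene_a, gene_b)  # anchored on genotype
print(f"naive estimate: {naive:.2f}, instrument-based estimate: {directed:.2f}")
```

In this simulation the plain regression of gene B on gene A overstates the effect because both genes share a confounder, whereas the genotype-anchored estimate should land near the simulated value of 0.5. The estimate is only valid if the genotype influences gene B solely through gene A, which is the assumption that controlling for linkage disequilibrium and pleiotropy is meant to protect.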

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cohort Studies
  • Endopeptidases / genetics
  • Epistasis, Genetic
  • Gene Expression
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Gene Regulatory Networks*
  • Genetics, Population*
  • Genotype
  • Humans
  • Linkage Disequilibrium
  • Metagenomics*
  • Minor Histocompatibility Antigens / genetics
  • Phenotype
  • Proto-Oncogene Proteins c-bcl-2 / genetics
  • Sequence Analysis, RNA
  • Transcription Factors / genetics
  • Transcription, Genetic
  • Transcriptome
  • Zinc Fingers

Substances

  • BCL2-related protein A1
  • Minor Histocompatibility Antigens
  • Proto-Oncogene Proteins c-bcl-2
  • Transcription Factors
  • Endopeptidases
  • SENP7 protein, human