Convolutional neural network model to predict causal risk factors that share complex regulatory features

Nucleic Acids Res. 2019 Dec 16;47(22):e146. doi: 10.1093/nar/gkz868.

Abstract

Major progress in disease genetics has been made through genome-wide association studies (GWASs). One of the key tasks for post-GWAS analyses is to identify causal noncoding variants with regulatory function. Here, on the basis of >2000 functional features, we developed a convolutional neural network framework for combinatorial, nonlinear modeling of complex patterns shared by risk variants scattered among multiple associated loci. When applied to major psychiatric disorders and autoimmune diseases, neural and immune features, respectively, exhibited high explanatory power while reflecting the pathophysiology of the relevant disease. The predicted causal variants were concentrated in active regulatory regions of relevant cell types and tended to be in physical contact with transcription factors while residing in evolutionarily conserved regions and resulting in expression changes of genes related to the given disease. We present examples of novel candidate causal variants and associated genes. Our method is expected to contribute to the identification and functional interpretation of potential causal noncoding variants in post-GWAS analyses.
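
The abstract does not describe the network architecture, so the following is only an illustrative sketch of the general idea: a convolutional model that maps a variant's functional-feature vector to a score between 0 and 1. It is not the authors' implementation. The function names, the feature count of 2000, the kernel width, the number of filters, and the random (untrained) weights are all assumptions made for this example.

```python
import math
import random


def conv1d(x, kernel, bias):
    """Valid 1D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(x) - k + 1):
        s = bias + sum(x[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score_variant(features, kernels, biases, out_weights, out_bias):
    """Score one variant: conv + ReLU, global max pooling, logistic output."""
    # One pooled activation per filter; each filter responds to a local
    # combination of adjacent features.
    pooled = [max(conv1d(features, k, b)) for k, b in zip(kernels, biases)]
    z = out_bias + sum(w * p for w, p in zip(out_weights, pooled))
    return sigmoid(z)


# Hypothetical input: 2000 binary annotations for a single variant.
rng = random.Random(0)
features = [rng.randint(0, 1) for _ in range(2000)]

# Hypothetical, untrained parameters: 4 filters of width 5.
kernels = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
biases = [0.0] * 4
out_weights = [rng.uniform(-1, 1) for _ in range(4)]

score = score_variant(features, kernels, biases, out_weights, 0.0)
print(f"score for this variant: {score:.3f}")
```

In a real setting the weights would be learned from labeled risk and non-risk variants, and the ordering of features along the convolution axis would matter, since each filter only combines neighboring entries.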

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Autoimmune Diseases / genetics*
  • Genetic Predisposition to Disease / genetics*
  • Genome-Wide Association Study / methods*
  • Humans
  • Mental Disorders / genetics*
  • Neural Networks, Computer*
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci / genetics
  • Regulatory Sequences, Nucleic Acid / genetics
  • Risk Factors