A Bayesian model for detection of high-order interactions among genetic variants in genome-wide association studies

BMC Genomics. 2015 Nov 25:16:1011. doi: 10.1186/s12864-015-2217-6.

Abstract

Background: A central question for disease studies and crop improvement is how genetic variants drive phenotypes. Genome-wide association studies (GWAS) provide a powerful tool for characterizing genotype-phenotype relationships in complex traits and diseases. Epistasis (gene-gene interaction), including high-order interaction among more than two genes, often plays an important role in complex traits and diseases, but current GWAS analysis usually focuses only on the additive effects of single nucleotide polymorphisms (SNPs). The lack of effective computational modelling of high-order functional interactions often leads to significant under-utilization of GWAS data.

Results: We developed a novel Bayesian computational method with a Markov Chain Monte Carlo (MCMC) search and implemented it as the Bayesian High-order Interaction Toolkit (BHIT) for detecting epistatic interactions among SNPs. BHIT builds a Bayesian model on both continuous and discrete data and can detect high-order interactions among SNPs related to case-control or quantitative phenotypes. We also developed a pipeline that enables users to apply BHIT to different species in different use cases.
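To make the idea of an MCMC search over SNP combinations concrete, the following is a minimal illustrative sketch and not BHIT's actual model or code. It assumes a binary case-control phenotype, genotypes coded 0/1/2, a fixed interaction order, and a Beta-Binomial marginal likelihood for each joint genotype cell. All function names, the prior parameters, and the simulated data are hypothetical.

```python
import math
import random
from collections import Counter

def log_beta_binomial(cases, controls, a=0.5, b=0.5):
    """Log marginal likelihood of a case/control sequence under a Beta(a, b) prior."""
    return (math.lgamma(a + cases) + math.lgamma(b + controls)
            - math.lgamma(a + b + cases + controls)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def subset_score(genotypes, phenotype, subset):
    """Log Bayes factor: phenotype depends on the joint genotype of `subset` vs. no dependence."""
    cols = sorted(subset)
    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, phenotype):
        key = tuple(row[j] for j in cols)  # joint genotype cell for this sample
        if y == 1:
            cases[key] += 1
        else:
            controls[key] += 1
    joint = sum(log_beta_binomial(cases[k], controls[k])
                for k in set(cases) | set(controls))
    null = log_beta_binomial(sum(cases.values()), sum(controls.values()))
    return joint - null

def mcmc_search(genotypes, phenotype, order, n_iter=2000, seed=0):
    """Metropolis search over SNP subsets of fixed size `order` (assumed toy sampler)."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    current = frozenset(rng.sample(range(n_snps), order))
    current_score = subset_score(genotypes, phenotype, current)
    best, best_score = current, current_score
    for _ in range(n_iter):
        # Symmetric proposal: swap one SNP in the subset for one outside it.
        out = rng.choice(sorted(current))
        into = rng.choice([j for j in range(n_snps) if j not in current])
        proposal = (current - {out}) | {into}
        score = subset_score(genotypes, phenotype, proposal)
        # Metropolis acceptance on the log scale.
        if score >= current_score or rng.random() < math.exp(score - current_score):
            current, current_score = proposal, score
            if score > best_score:
                best, best_score = proposal, score
    return best, best_score

# Simulated toy data: 300 samples, 8 SNPs; phenotype set by SNPs 2 and 5 jointly.
data_rng = random.Random(1)
genotypes = [[data_rng.randint(0, 2) for _ in range(8)] for _ in range(300)]
phenotype = [(row[2] + row[5]) % 2 for row in genotypes]

best_subset, best_score = mcmc_search(genotypes, phenotype, order=2, n_iter=5000)
print(sorted(best_subset), round(best_score, 1))
```

In this sketch the subset score plays the role of an unnormalized log posterior, so the chain spends most of its time on SNP combinations whose joint genotype explains the phenotype far better than the null. A quantitative phenotype would need a different marginal likelihood, which is not shown here.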

Conclusions: In tests on both simulated data and soybean seed composition studies of oil and protein content, BHIT detected high-order interactions associated with the phenotypes and outperformed a number of other available tools. BHIT is freely available for academic users at http://digbio.missouri.edu/BHIT/.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem*
  • Computational Biology / methods
  • Computer Simulation
  • Epistasis, Genetic*
  • Genetic Variation*
  • Genome-Wide Association Study*
  • Genotype
  • Glycine max / genetics
  • Markov Chains
  • Models, Genetic
  • Monte Carlo Method
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Protein Interaction Mapping
  • Quantitative Trait, Heritable
  • Software