A powerful and data-adaptive test for rare-variant-based gene-environment interaction analysis

Stat Med. 2019 Mar 30;38(7):1230-1244. doi: 10.1002/sim.8037. Epub 2018 Nov 20.

Abstract

As whole-exome/genome sequencing data become increasingly available in genetic epidemiology research consortia, there is emerging interest in testing the interactions between rare genetic variants and environmental exposures that modify the risk of complex diseases. However, testing rare-variant-based gene-by-environment interactions (GxE) is more challenging than testing the genetic main effects, both because the genetic main effects are difficult to estimate correctly under the null hypothesis of no GxE effects and because neutral variants are present. In response, we have developed a family of powerful and data-adaptive GxE tests, called "aGE" tests, in the framework of the adaptive powered score test, originally proposed for testing the genetic main effects. Using extensive simulations, we show that aGE tests can control the type I error rate in the presence of a large number of neutral variants or a nonlinear environmental main effect, and that their power is more resilient to the inclusion of neutral variants than that of existing methods. We demonstrate the performance of the proposed aGE tests using Pancreatic Cancer Case-Control Consortium Exome Chip data. An R package "aGE" is available at http://github.com/ytzhong/projects/.
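
The adaptive powered score framework mentioned above can be illustrated with a minimal Python sketch. It is not the aGE R package and does not reproduce the authors' procedure: the function names (`spu_statistics`, `adaptive_spu_pvalue`), the grid of powers, and the use of pre-computed null score draws are all assumptions made for illustration. The sketch assumes that a vector of per-variant interaction scores and a matrix of draws from their null distribution are already available, and leaves open how those are obtained. It sums the scores raised to each power, converts each sum to a p-value against the null draws, and takes the smallest p-value as the data-adaptive statistic.

```python
import numpy as np


def spu_statistics(scores, gammas):
    """Sum of powered scores for each power in `gammas`.

    A power of np.inf is treated as the maximum absolute score.
    """
    stats = []
    for g in gammas:
        if np.isinf(g):
            stats.append(np.max(np.abs(scores)))
        else:
            stats.append(np.sum(scores ** g))
    return np.array(stats, dtype=float)


def adaptive_spu_pvalue(obs_scores, null_scores,
                        gammas=(1, 2, 3, 4, 5, 6, np.inf)):
    """Adaptive p-value: minimum p-value over powers, calibrated on null draws.

    obs_scores  : (m,) observed per-variant scores (assumed given).
    null_scores : (B, m) draws of the score vector under the null (assumed given).
    """
    obs = np.abs(spu_statistics(obs_scores, gammas))
    null = np.abs(np.array([spu_statistics(u, gammas) for u in null_scores]))
    n_draws = null.shape[0]

    # Per-power p-values for the observed statistics.
    p_obs = (np.sum(null >= obs, axis=0) + 1) / (n_draws + 1)

    # Per-power p-values for each null draw, ranked within the null draws.
    # at_least[a, b, k] is True when draw a is at least as extreme as draw b.
    at_least = null[:, None, :] >= null[None, :, :]
    p_null = at_least.sum(axis=0) / n_draws

    # Compare the observed minimum p-value with the null minimum p-values.
    min_obs = p_obs.min()
    min_null = p_null.min(axis=1)
    return (np.sum(min_null <= min_obs) + 1) / (n_draws + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    null_draws = rng.standard_normal((500, 10))  # hypothetical null scores
    observed = rng.standard_normal(10)           # hypothetical observed scores
    print(adaptive_spu_pvalue(observed, null_draws))
```

Small powers weight all variants roughly equally, while large powers are dominated by the few variants with the largest scores. Taking the minimum p-value over the grid lets the data choose, which is consistent with the abstract's claim of resilience to neutral variants.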

Keywords: data-adaptive hypothesis testing; gene-environment interaction; model misspecification; rare variant.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Case-Control Studies
  • Computer Simulation
  • Confounding Factors, Epidemiologic*
  • Gene-Environment Interaction*
  • Genetic Association Studies / methods*
  • Humans
  • Models, Genetic
  • Pancreatic Neoplasms / genetics