A non-linear regression method for estimation of gene-environment heritability

Bioinformatics. 2021 Apr 5;36(24):5632-5639. doi: 10.1093/bioinformatics/btaa1079.

Abstract

Motivation: Gene-environment (GxE) interactions are one of the least studied aspects of the genetic architecture of human traits and diseases. The environment of an individual is inherently high-dimensional, evolves through time and can be expensive and time-consuming to measure. The UK Biobank study, in which all 500 000 participants completed an extensive baseline questionnaire, represents a unique opportunity to assess GxE heritability for many traits and diseases in a well-powered setting.

Results: We have developed a randomized Haseman-Elston non-linear regression method applicable when many environmental variables have been measured on each individual. The method (GPLEMMA) simultaneously estimates a linear environmental score (ES) and its GxE heritability. We compare the method via simulation to a whole-genome regression approach (LEMMA) for estimating GxE heritability. We show that GPLEMMA is more computationally efficient than LEMMA on large datasets, and produces results highly correlated with those from LEMMA when applied to simulated data and real data from the UK Biobank.
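To illustrate the kind of estimator referred to above, the following is a minimal sketch of a randomized Haseman-Elston regression for ordinary additive heritability. It is not the GPLEMMA method: it omits the environmental score and the non-linear GxE term, and serves only to show how the method-of-moments equations can be solved with random probe vectors instead of an explicit kinship matrix. The function names (simulate, randomized_he), the simulated genotypes and all parameter values are assumptions made for illustration and do not come from the paper or its software.

```python
import numpy as np


def simulate(n=2000, m=500, h2=0.5, seed=1):
    """Hypothetical simulator: synthetic genotypes and a phenotype with heritability h2."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=m)            # assumed allele frequencies
    X = rng.binomial(2, maf, size=(n, m))          # genotype dosages 0/1/2
    G = (X - X.mean(axis=0)) / X.std(axis=0)       # standardized genotypes
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    y = G @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return X, y


def randomized_he(X, y, n_probes=50, seed=0):
    """Randomized Haseman-Elston estimate of additive heritability (illustrative)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = (y - y.mean()) / y.std()

    def K_mul(v):
        # Multiply by K = X X^T / m without ever forming the n x n matrix.
        return X @ (X.T @ v) / m

    # Hutchinson-style trace estimate: tr(K^2) ~ mean over probes of ||K z||^2.
    Z = rng.standard_normal((n, n_probes))
    KZ = K_mul(Z)
    tr_K2 = np.sum(KZ * KZ) / n_probes
    tr_K = np.sum(X * X) / m                       # exact trace of K

    # Method-of-moments normal equations for cov(y) = s_g * K + s_e * I.
    A = np.array([[tr_K2, tr_K], [tr_K, float(n)]])
    b = np.array([y @ K_mul(y), y @ y])
    s_g, s_e = np.linalg.solve(A, b)
    return s_g / (s_g + s_e)


if __name__ == "__main__":
    X, y = simulate()
    print("estimated heritability:", randomized_he(X, y))
```

The cost of each probe is two matrix-vector products with the genotype matrix, so the sketch scales linearly in the number of individuals and variants. Increasing n_probes reduces the noise in the trace estimate at proportional cost.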

Availability and implementation: Software implementing the GPLEMMA method is available from https://jmarchini.org/gplemma/.

Supplementary information: Supplementary data are available at Bioinformatics online.