By adapting a well-known affected-relative-pair linkage model that can incorporate covariate or sub-phenotype information [Olson, 1999: Am J Hum Genet 65:1760-1769], we have developed a recursive-partitioning (RP) algorithm (a tree-based model) for identifying phenotype and covariate groupings that interact with the evidence for linkage. This strategy is designed to identify subgroups of affected relative pairs that show increased evidence for linkage, where subgroups are defined by pair-level or family-level covariates. After growing a full tree, we identified the optimal tree size through a form of tree pruning and chose the best covariate at each split by using bootstrap algorithms. Simulation studies showed that power to detect linkage can increase in the presence of gene-environment interactions, depending on the magnitude of the interaction. As expected, however, power can decrease when more covariates are examined, despite the pruning to optimize tree size. The RP model correctly identified the tree structure in a large proportion of simulations. We applied the RP model to a dataset of families with bipolar affective disorder (BPAD), in which linkage regions on chromosome 18 have been previously identified. NPL tests based on the all-pairs score in Genehunter showed no regions with strong linkage evidence on chromosome 18. The RP model, however, identified several suggestive regions on chromosome 18. Two covariates appeared to influence the degree of linkage: the type II BPAD subtype and a pattern of displaying mania before or after a depressive episode. The RP model has the potential to identify previously unknown gene-environment interactions; here we have demonstrated the practical utility of this new methodology.
© 2005 Wiley-Liss, Inc.
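
The tree-growing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the subgroup score is a toy stand-in (the sum of hypothetical per-pair linkage contributions, floored at zero) rather than the Olson conditional-logistic LOD score, a fixed `min_gain` threshold stands in for the bootstrap-based covariate selection and the pruning step, and the covariate names `cov_a` and `cov_b`, the `lod` field, and all data values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    pairs: list                      # affected relative pairs in this subgroup
    covariate: Optional[str] = None  # binary covariate split on (None for a leaf)
    left: Optional["Node"] = None    # subgroup with covariate == 0
    right: Optional["Node"] = None   # subgroup with covariate == 1


def subgroup_score(pairs):
    """Toy linkage score: summed per-pair contributions, floored at zero.

    Assumption: a stand-in for a real subgroup LOD score, not the Olson model.
    """
    return max(0.0, sum(p["lod"] for p in pairs))


def best_split(pairs, covariates):
    """Return (gain, covariate) for the binary split that most increases the score."""
    parent = subgroup_score(pairs)
    best_gain, best_cov = 0.0, None
    for cov in covariates:
        lo = [p for p in pairs if p[cov] == 0]
        hi = [p for p in pairs if p[cov] == 1]
        if not lo or not hi:
            continue  # this covariate does not separate the pairs
        gain = subgroup_score(lo) + subgroup_score(hi) - parent
        if gain > best_gain:
            best_gain, best_cov = gain, cov
    return best_gain, best_cov


def grow(pairs, covariates, min_gain=1.0):
    """Recursively partition pairs; stop when no split gains at least min_gain."""
    node = Node(pairs)
    gain, cov = best_split(pairs, covariates)
    if cov is None or gain < min_gain:
        return node
    remaining = [c for c in covariates if c != cov]
    node.covariate = cov
    node.left = grow([p for p in pairs if p[cov] == 0], remaining, min_gain)
    node.right = grow([p for p in pairs if p[cov] == 1], remaining, min_gain)
    return node


# Invented toy data: linkage evidence is concentrated in pairs with cov_a == 1,
# while cov_b is unrelated to the per-pair contributions.
toy_pairs = [
    {"cov_a": a, "cov_b": i % 2, "lod": 0.5 if a == 1 else -0.5}
    for a in (0, 1)
    for i in range(4)
]
tree = grow(toy_pairs, ["cov_a", "cov_b"])
print("root split:", tree.covariate)
```

In this toy setting the pooled sample shows no net evidence for linkage, yet the subgroup defined by one covariate does, which is the kind of heterogeneity the RP strategy is meant to expose. A faithful implementation would replace `subgroup_score` with a model-based LOD score and replace the fixed threshold with resampling-based selection and pruning.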