Summary: Genome-wide association studies (GWAS) have led to the identification of hundreds of genomic regions associated with complex diseases. Nevertheless, a large fraction of the heritability of these diseases remains unexplained. Interaction between genetic variants is one of several putative explanations for the 'case of missing heritability' and is therefore a compelling next analysis step. However, genome-wide interaction analysis (GWIA) of all pairs of SNPs from a standard marker panel is computationally infeasible without massive parallelization, and GWIA of all SNP triples is out of reach altogether. To overcome these computational constraints, we present a GWIA approach that selects combinations of SNPs for interaction analysis based on a priori information. The sources of information are statistical evidence (single-marker association at a moderate level), genetic relevance (genomic location) and biological relevance (SNP function class and pathway information). We introduce the software package INTERSNP, which implements a logistic regression framework as well as log-linear models for the joint analysis of multiple SNPs. It handles SNP annotation and pathways from the KEGG database automatically and implements Monte Carlo simulations to judge genome-wide significance. A variety of meaningful GWIA strategies can be conducted using INTERSNP. Typical examples are the analysis of all pairs of non-synonymous SNPs, or the analysis of all combinations of three SNPs that lie in a common pathway and are among the top 50,000 single-marker results. We demonstrate the feasibility of these and other GWIA strategies by application to a GWAS dataset and discuss promising results.
Availability: The software is available at http://intersnp.meb.uni-bonn.de
Contact: [email protected]; [email protected].
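
Illustration: The selection idea described in the Summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not INTERSNP's implementation or interface: the function names `count_tests` and `select_candidate_sets`, the input layout (SNP name mapped to a single-marker p-value and a set of pathway labels), the toy data and the panel size of 500,000 SNPs are all hypothetical. The sketch keeps the top-ranked SNPs by single-marker p-value and then enumerates only those combinations whose members share at least one pathway.

```python
from itertools import combinations
from math import comb


def count_tests(n_snps, order):
    """Number of distinct SNP combinations of the given order (pairs: 2, triples: 3)."""
    return comb(n_snps, order)


def select_candidate_sets(snps, order, top_n, require_shared_pathway=True):
    """Yield SNP combinations chosen by a priori information.

    snps: dict mapping a SNP name to (single_marker_p_value, set_of_pathways).
          This layout is an assumption made for illustration only.
    order: size of each combination (2 for pairs, 3 for triples).
    top_n: keep only the top_n SNPs ranked by single-marker p-value.
    """
    # Statistical evidence: rank by p-value and keep the best top_n SNPs.
    ranked = sorted(snps, key=lambda name: snps[name][0])[:top_n]
    for combo in combinations(ranked, order):
        if require_shared_pathway:
            # Biological relevance: all members must share at least one pathway.
            shared = set.intersection(*(snps[name][1] for name in combo))
            if not shared:
                continue
        yield combo


# Hypothetical toy data: name -> (p-value, pathways).
toy_snps = {
    "rs1": (1e-6, {"pathA"}),
    "rs2": (0.5, {"pathA"}),
    "rs3": (1e-4, {"pathA", "pathB"}),
    "rs4": (1e-3, {"pathB"}),
    "rs5": (2e-3, {"pathA"}),
}

pairs = list(select_candidate_sets(toy_snps, order=2, top_n=4))
triples = list(select_candidate_sets(toy_snps, order=3, top_n=4))
```

On the toy data, 4 of the 6 possible pairs and 1 of the 4 possible triples among the top four SNPs survive the pathway filter. For scale, `count_tests(500_000, 2)` gives 124,999,750,000 pairs for the assumed panel size, which is why restricting the search space before fitting any interaction model is attractive.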