Integration of data from genome-wide single nucleotide polymorphism (SNP) association studies of different traits should allow researchers to disentangle the genetics of potentially related traits within individually associated regions. Formal statistical colocalisation testing of individual regions requires selection of a set of SNPs summarising the association in a region. We show that the SNP selection method greatly affects type 1 error rates, with published studies having used methods expected to result in substantially inflated type 1 error rates. We show that either avoiding variable selection and instead testing the most informative principal components or integrating over variable selection using Bayesian model averaging can help control type 1 error rates. Application to data from Graves' disease and Hashimoto's thyroiditis reveals a common genetic signature across seven regions shared between the diseases, and indicates that in five of six regions associated with Graves' disease and not Hashimoto's thyroiditis, this more likely reflects genuine absence of association with the latter rather than lack of power. Our examination, by simulation, of the performance of colocalisation tests and associated software will foster more widespread adoption of formal colocalisation testing. Given the increasing availability of large expression and genetic association datasets from disease-relevant tissue and purified cell populations, coupled with identification of regulatory sequences by projects such as ENCODE, colocalisation analysis has the potential to reveal both shared genetic signatures of related traits and causal disease genes and tissues.
Keywords: GWAS; causal variants; colocalisation; genetic association.
© 2013 WILEY PERIODICALS, INC.