In genetic association studies, quality control (QC) filters are applied to remove potentially problematic markers before the markers are tested for statistical associations. However, spurious associations can still occur after QC. We introduce the Post-Association Cleaning (PAC) approach, which complements QC by capturing spurious associations using the information in the post-association results. Specifically, we propose a PAC filter based on linkage disequilibrium (LD) information. The intuition is that if an association is caused by a true genetic effect, neighboring markers in LD should show comparably significant P-values. If they do not, this may be evidence of a spurious association. Previous studies have applied the same idea, but only manually and without a formal statistical framework. Our proposed method, LD-PAC, provides a systematic framework to quantitatively measure the evidence of spurious associations based on the likelihood ratio. Simulations show that LD-PAC can detect spurious associations with a high detection rate (84%). In addition to detecting spurious associations, our method can also be used to "rescue" candidate associations from supposedly unclean data, such as the markers excluded by QC. Although these additional associations must be treated with care, they can suggest interesting regions. The application of our method to the Wellcome Trust Case Control Consortium (WTCCC) data led to the discovery of an additional candidate association for type 1 diabetes among the QC-excluded markers. This locus turns out to be in a region recently identified as significant by a meta-analysis performed after the WTCCC study was published.
© 2010 Wiley-Liss, Inc.
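The LD intuition in the abstract can be illustrated with a toy calculation. The sketch below is not the LD-PAC statistic, whose exact form is not given here. It assumes a simplified model in which the candidate marker's z-score, given a single neighbor's z-score and their LD correlation r, is normal with mean r * z_neighbor and variance 1 - r^2 (an assumption that holds exactly only when no effect is present or when the signal reaches the candidate through the neighbor). It then compares that model against an unrestricted mean for the candidate. The function name and its arguments are hypothetical.

```python
def ld_consistency_log_lr(z_candidate, z_neighbor, r):
    """Toy log generalized likelihood ratio: 'unrestricted' vs 'LD-consistent'.

    Assumed model (illustrative only, not the published LD-PAC method):
      LD-consistent: z_candidate | z_neighbor ~ N(r * z_neighbor, 1 - r^2)
      unrestricted:  z_candidate ~ N(mu, 1 - r^2), with mu fitted freely
    Larger values mean the candidate agrees less with its neighbor.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    residual_variance = 1.0 - r * r
    expected = r * z_neighbor  # z-score predicted from the neighbor via LD
    # With mu fitted to the observed z-score, the log ratio reduces to the
    # squared standardized residual divided by two.
    return (z_candidate - expected) ** 2 / (2.0 * residual_variance)


# A strong candidate supported by its neighbor vs. one that stands alone.
supported = ld_consistency_log_lr(z_candidate=5.0, z_neighbor=4.0, r=0.8)
isolated = ld_consistency_log_lr(z_candidate=5.0, z_neighbor=0.0, r=0.8)
print(supported, isolated)
```

Under these assumptions, a candidate whose neighbor in strong LD shows no signal yields a much larger value than one whose neighbor is comparably significant, which mirrors the manual check described above. A realistic filter would use several neighbors and a properly specified alternative.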