HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics

Bioinformatics. 2017 Jan 1;33(1):79-86. doi: 10.1093/bioinformatics/btw565. Epub 2016 Sep 1.

Abstract

Motivation: Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual-level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel.
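As a small illustration of why haplotypes carry more information than pairwise correlations, the following Python sketch computes the pairwise LD correlation matrix from phased haplotypes and compares two toy reference panels. It is not the HAPRAP algorithm; the function names (allele_freq, pairwise_r, ld_matrix) and both panels are hypothetical and chosen only for illustration. Haplotypes are assumed to be tuples of 0/1 alleles, one entry per SNP.

```python
from itertools import combinations, product


def allele_freq(haplotypes, j):
    """Frequency of the allele coded 1 at SNP index j."""
    return sum(h[j] for h in haplotypes) / len(haplotypes)


def pairwise_r(haplotypes, i, j):
    """Signed LD correlation r between SNPs i and j, from phased haplotypes."""
    n = len(haplotypes)
    p_i = allele_freq(haplotypes, i)
    p_j = allele_freq(haplotypes, j)
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    d = p_ij - p_i * p_j  # LD coefficient D
    denom = (p_i * (1 - p_i) * p_j * (1 - p_j)) ** 0.5
    if denom == 0:
        raise ValueError("r is undefined for a monomorphic SNP")
    return d / denom


def ld_matrix(haplotypes):
    """Symmetric matrix of pairwise r values (square each entry for r^2)."""
    m = len(haplotypes[0])
    r = [[1.0] * m for _ in range(m)]
    for i, j in combinations(range(m), 2):
        r[i][j] = r[j][i] = pairwise_r(haplotypes, i, j)
    return r


# Hypothetical panel A: only four of the eight possible 3-SNP haplotypes occur.
panel_a = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Hypothetical panel B: all eight haplotypes occur with equal frequency.
panel_b = list(product((0, 1), repeat=3))

ld_a = ld_matrix(panel_a)
ld_b = ld_matrix(panel_b)
```

In this toy case both panels give the same pairwise LD matrix (every off-diagonal r is zero), yet panel A contains only half of the haplotypes present in panel B. The pairwise summary therefore cannot distinguish two different multi-locus structures, which is the kind of information a haplotype-based reference panel retains.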

Results: Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulations with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits (GIANT) height data, HAPRAP performs well with a small training sample size (N < 2000), while other methods become suboptimal. Moreover, HAPRAP's performance is not substantially affected by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated, and two additional variants were independently associated with human height. Given the growing availability of summary-level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization).

Availability and implementation: The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/

Contact: [email protected] or [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Chromosome Mapping / methods*
  • Gene Frequency
  • Genome-Wide Association Study
  • Genotype
  • Haplotypes*
  • Humans
  • Linkage Disequilibrium
  • Polymorphism, Single Nucleotide*
  • Quantitative Trait, Heritable
  • Sample Size
  • Software*