Universal false discovery rate estimation methodology for genome-wide association studies

Hum Hered. 2008;65(4):183-94. doi: 10.1159/000112365. Epub 2007 Dec 11.

Abstract

Genome-wide case-control association studies aim to identify markers that differ significantly between sick and healthy populations. With the development of large-scale technologies allowing the genotyping of thousands of single nucleotide polymorphisms (SNPs) comes the multiple testing problem and the practical issue of selecting the most probable set of associated markers. Several False Discovery Rate (FDR) estimation methods have been developed and tuned mainly for differential gene expression studies. However, they are based on hypotheses and designs that are not necessarily relevant in genetic association studies. In this article we present a universal methodology to estimate the FDR of genome-wide association results. It uses a single global probability value per SNP and is applicable in practice to any study design and any statistic. We benchmarked this algorithm on simulated data and showed that it outperforms previous methods in cases requiring non-parametric estimation. We illustrate the usefulness of the method by applying it to experimental genotyping data from three Multiple Sclerosis case-control association studies.
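
The abstract does not describe the algorithm itself, so the following is not the authors' method. As a point of reference only, it sketches how an FDR-adjusted value can be computed from a single p-value per SNP using the standard Benjamini-Hochberg step-up procedure. The function names `bh_qvalues` and `select_snps` and the example p-values are illustrative assumptions, not taken from the paper.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Illustrative stand-in only: this is the classical BH procedure,
    not the methodology presented in the article.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def select_snps(pvalues, fdr_level=0.05):
    """Return indices of SNPs whose adjusted value is at most fdr_level."""
    q = bh_qvalues(pvalues)
    return [i for i, qi in enumerate(q) if qi <= fdr_level]


if __name__ == "__main__":
    # Hypothetical p-values for four SNPs.
    example = [0.01, 0.04, 0.03, 0.5]
    print(bh_qvalues(example))
    print(select_snps(example, fdr_level=0.05))
```

Because the procedure needs only one p-value per marker, it can be applied whatever test statistic produced those p-values, which is the same input the article's methodology is described as using.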

MeSH terms

  • Case-Control Studies
  • Data Interpretation, Statistical
  • False Positive Reactions
  • Female
  • Genetic Predisposition to Disease
  • Genome, Human
  • Genotype
  • Humans
  • Male
  • Models, Genetic*
  • Multiple Sclerosis / epidemiology
  • Multiple Sclerosis / genetics*
  • Polymorphism, Single Nucleotide*
  • Risk