A fast and noise-resilient approach to detect rare-variant associations with deep sequencing data for complex disorders

Genet Epidemiol. 2012 Nov;36(7):675-85. doi: 10.1002/gepi.21662. Epub 2012 Aug 3.

Abstract

Next-generation sequencing technology has enabled a paradigm shift in genetic association studies from the common disease/common variant hypothesis to the common disease/rare variant hypothesis. Analyzing individual rare variants is known to be underpowered; association methods have therefore been developed that aggregate variants across a genetic region, which for exome sequencing is usually a gene. The foreseeable widespread use of whole-genome sequencing poses new challenges in statistical analysis. It calls for new rare-variant association methods that are statistically powerful, robust against high levels of noise due to the inclusion of noncausal variants, and yet computationally efficient. We propose a simple and powerful statistic that combines the disease-association P-values of individual variants using a weight that is the inverse of the expected standard deviation of the allele frequencies under the null. This approach, dubbed the Sigma-P method, is extremely robust to the inclusion of a high proportion of noncausal variants and is also powerful when both detrimental and protective variants are present within a genetic region. The performance of the Sigma-P method was tested using simulated data based on realistic population demographic and disease models, and its power was compared to that of several previously published methods. The results demonstrate that this method generally outperforms other rare-variant association methods over a wide range of models. Additionally, sequence data on the ANGPTL family of genes from the Dallas Heart Study were tested for associations with nine metabolic traits, and both known and novel putative associations were uncovered using the Sigma-P method.
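The abstract does not give the exact form of the statistic, so the following is only an illustrative sketch of the general idea (per-variant P-values combined with weights inversely proportional to a null standard deviation of the allele frequency), not the authors' published formula. It assumes a binomial null in which the standard deviation of a variant's estimated allele frequency is sqrt(q(1 - q) / 2n) for pooled frequency q and n diploid individuals, and it assumes the P-values are combined as a weighted sum of -log(p). The function names `null_sd_weights` and `weighted_p_statistic` are hypothetical.

```python
import math


def null_sd_weights(allele_freqs, n_individuals):
    """Return one weight per variant: 1 / SD of the allele frequency under the null.

    Assumption (not from the abstract): the null SD is the binomial value
    sqrt(q * (1 - q) / (2 * n)) for pooled allele frequency q and n diploid samples.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    weights = []
    for q in allele_freqs:
        if not 0.0 < q < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
        sd = math.sqrt(q * (1.0 - q) / (2.0 * n_individuals))
        weights.append(1.0 / sd)
    return weights


def weighted_p_statistic(p_values, weights):
    """Combine per-variant P-values as a weighted sum of -log(p).

    Assumption (not from the abstract): -log(p) is the per-variant score.
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    total = 0.0
    for p, w in zip(p_values, weights):
        if not 0.0 < p <= 1.0:
            raise ValueError("P-values must lie in (0, 1]")
        total += w * -math.log(p)
    return total


# Hypothetical region with three variants in 500 individuals.
freqs = [0.002, 0.01, 0.05]
pvals = [0.03, 0.40, 0.75]
region_score = weighted_p_statistic(pvals, null_sd_weights(freqs, 500))
```

Under these assumptions rarer variants receive larger weights, so a small P-value at a very rare variant contributes more to the region score than the same P-value at a more common one. The score has no closed-form null distribution in this sketch; its significance would have to be calibrated separately, for example by permuting case/control labels.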

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Angiopoietin-Like Protein 3
  • Angiopoietin-Like Protein 4
  • Angiopoietin-Like Protein 6
  • Angiopoietin-like Proteins
  • Angiopoietins / genetics
  • Case-Control Studies
  • Data Interpretation, Statistical*
  • Gene Frequency
  • Genetic Association Studies
  • Genetic Predisposition to Disease
  • Genetic Variation*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Metabolism / genetics
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, DNA / statistics & numerical data*
  • Texas / ethnology
  • Triglycerides / blood
  • Triglycerides / genetics

Substances

  • ANGPTL3 protein, human
  • ANGPTL4 protein, human
  • ANGPTL6 protein, human
  • Angiopoietin-Like Protein 3
  • Angiopoietin-Like Protein 4
  • Angiopoietin-Like Protein 6
  • Angiopoietin-like Proteins
  • Angiopoietins
  • Triglycerides