Are rare variants really independent?

Genet Epidemiol. 2017 May;41(4):363-371. doi: 10.1002/gepi.22039. Epub 2017 Mar 16.

Abstract

Recent advances in genotyping with high-density markers give researchers access to genomic variants, including rare ones. Linkage disequilibrium (LD) is widely used to provide insight into evolutionary history. It is also the basis for association mapping in humans and other species. A better understanding of genomic LD structure may lead to better-informed statistical tests that can improve the power of association studies. Although rare variant associations with common diseases (RVCD) have been studied extensively in recent years, the LD structure among rare variants, and between rare and common variants, remains poorly understood and even controversial. In fact, many popular RVCD tests assume that rare variants are independent. In this report, we show that two commonly used LD measures are not capable of detecting LD when rare variants are involved. We present this argument from two perspectives: the properties of the LD measures themselves and the computational issues associated with them. To address these issues, we propose an alternative LD measure, the polychoric correlation, which was originally designed for detecting associations among categorical variables. Using simulated data as well as the 1000 Genomes data, we explore the performance of the LD measures in detail and discuss the implications for association studies.
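
The abstract does not name the two LD measures it examines. As a minimal sketch, assuming the standard coefficients D' and r² are representative of commonly used measures, the Python below illustrates one well-known property that may be relevant: when allele frequencies differ sharply, r² is capped well below 1 even if the rare allele always occurs on the same haplotype as the common one. The function name `ld_stats` and the frequencies 0.01 and 0.30 are hypothetical values chosen for illustration, not taken from the paper.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (D, D', r^2) for two biallelic loci.

    p_a, p_b : frequencies of allele A at locus 1 and allele B at locus 2
    p_ab     : frequency of the haplotype carrying both A and B
    """
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest magnitude it can reach given p_a and p_b.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), p_b * (1 - p_a))
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two allele indicators.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies (assumption): a rare allele (1%) that is
# always carried on the same haplotype as a common allele (30%).
p_rare, p_common = 0.01, 0.30
d, d_prime, r2 = ld_stats(p_rare, p_common, p_ab=p_rare)

# Upper bound on r^2 for p_rare <= p_common and D > 0.
r2_bound = p_rare * (1 - p_common) / ((1 - p_rare) * p_common)

print(f"D = {d:.4f}, D' = {d_prime:.3f}, r^2 = {r2:.4f}, bound = {r2_bound:.4f}")
```

In this hypothetical case the two alleles are as dependent as their frequencies allow, so D' is 1, yet r² is only about 0.024. A reader relying on r² alone could mistake such a pair for nearly independent, which is the kind of concern the abstract raises about assuming independence among rare variants.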

Keywords: 1000 Genomes data; GWAS; linkage disequilibrium; next-generation sequencing data; polychoric correlation.

MeSH terms

  • Chromosomes, Human, Pair 21 / genetics
  • Computer Simulation
  • Gene Frequency / genetics
  • Genetic Variation*
  • Genome-Wide Association Study*
  • Genotype
  • Humans
  • Linkage Disequilibrium / genetics
  • Polymorphism, Single Nucleotide / genetics