Closed-form two-locus sampling distributions: accuracy and universality

Genetics. 2009 Nov;183(3):1087-103. doi: 10.1534/genetics.109.107995. Epub 2009 Sep 7.

Abstract

Sampling distributions play an important role in population genetics analyses, but closed-form sampling formulas are generally intractable to obtain. In the presence of recombination, there is no known closed-form sampling formula that holds for an arbitrary recombination rate. However, we recently showed that it is possible to obtain useful closed-form sampling formulas when the population-scaled recombination rate rho is large. Specifically, in the case of the two-locus infinite-alleles model, we considered an asymptotic expansion of the sampling formula in inverse powers of rho and obtained closed-form expressions for the first few terms in the expansion. In this article, we generalize this result to an arbitrary finite-alleles mutation model and show that, up to the first few terms in the expansion that we are able to compute analytically, the functional form of the asymptotic sampling formula is common to all mutation models. We carry out an extensive study of the accuracy of the asymptotic formula for the two-locus parent-independent mutation model and discuss in detail a concrete application in the context of the composite-likelihood method. Furthermore, using our asymptotic sampling formula, we establish a simple sufficient condition for a given two-locus sample configuration to have a finite maximum-likelihood estimate (MLE) of rho. This condition is the first analytic result on the classification of the MLE of rho and is instantaneous to check in practice, provided that one-locus probabilities are known.
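As a rough illustration of the quantities the abstract refers to, the following Python sketch evaluates a one-locus sampling probability under parent-independent mutation and combines two such probabilities into a leading-order two-locus term. It rests on several assumptions that the abstract does not state. The one-locus probability is taken to be the standard Dirichlet-multinomial form for an ordered sample. The zeroth-order term (called q0 here) is taken to be the product of the one-locus probabilities of the two marginal configurations, which is what independence of the loci in the limit of infinite rho would give. The expansion is written as q0 + q1/rho + q2/rho^2. The function names, argument names, and the count-matrix layout are hypothetical. The closed forms for the higher-order coefficients are given in the article and are not reproduced here, so `truncated_expansion` accepts them as inputs.

```python
from math import prod


def rising_factorial(x: float, n: int) -> float:
    """Return x * (x + 1) * ... * (x + n - 1); equals 1.0 when n == 0."""
    result = 1.0
    for k in range(n):
        result *= x + k
    return result


def one_locus_pim(counts, theta, pi):
    """Probability of an ordered one-locus sample with allele counts `counts`.

    Assumes parent-independent mutation with scaled rate `theta` and
    mutation target distribution `pi` (Dirichlet-multinomial form).
    """
    n = sum(counts)
    numerator = prod(rising_factorial(theta * p, c) for c, p in zip(counts, pi))
    return numerator / rising_factorial(theta, n)


def leading_term(c, theta_a, pi_a, theta_b, pi_b):
    """Assumed zeroth-order term q0 for a fully typed two-locus sample.

    `c[i][j]` is the number of haplotypes carrying allele i at locus A and
    allele j at locus B. q0 is taken to be the product of the one-locus
    probabilities of the row and column marginals.
    """
    counts_a = [sum(row) for row in c]        # marginal counts at locus A
    counts_b = [sum(col) for col in zip(*c)]  # marginal counts at locus B
    return one_locus_pim(counts_a, theta_a, pi_a) * one_locus_pim(counts_b, theta_b, pi_b)


def truncated_expansion(q0, q1, q2, rho):
    """Evaluate q0 + q1/rho + q2/rho^2 for user-supplied coefficients."""
    return q0 + q1 / rho + q2 / rho**2


if __name__ == "__main__":
    # Hypothetical sample: two alleles per locus, four haplotypes.
    sample = [[2, 0], [1, 1]]
    q0 = leading_term(sample, 1.0, [0.5, 0.5], 1.0, [0.5, 0.5])
    print("leading-order term q0:", q0)
```

Under this form, the truncated series tends to q0 as rho grows, so a positive first-order coefficient would place the likelihood above its infinite-rho limit for large finite rho. Whether that is exactly the sufficient condition established in the article is not stated in the abstract.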

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Alleles
  • Gene Frequency
  • Genetic Variation
  • Genetics, Population / methods*
  • Genetics, Population / standards
  • Likelihood Functions
  • Models, Genetic*
  • Recombination, Genetic
  • Reproducibility of Results
  • Sample Size