Empirical Bayes Estimation of Coalescence Times from Nucleotide Sequence Data

Genetics. 2016 Sep;204(1):249-57. doi: 10.1534/genetics.115.185751. Epub 2016 Jul 20.

Abstract

We demonstrate the advantages of using information at many unlinked loci to better calibrate estimates of the time to the most recent common ancestor (TMRCA) at a given locus. To this end, we apply a simple empirical Bayes method to estimate the TMRCA. This method is asymptotically optimal, in the sense that the estimator converges to the true value when the number of unlinked loci for which we have information is large, and it has the advantage of not making any assumptions about demographic history. The algorithm works as follows: we first split the sample at each locus into inferred left and right clades to obtain many estimates of the TMRCA, which we can average to obtain an initial estimate of the TMRCA. We then use nucleotide sequence data from other unlinked loci to form an empirical distribution that we can use to improve this initial estimate.
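
The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the clade split is taken as given, the per-locus summary is the mean number of nucleotide differences between sequences in the left and right clades, and the second step uses the classic Robbins formula for Poisson-distributed counts, E[theta | X = x] = (x + 1) N(x + 1) / N(x), where N(k) is the number of loci with observed count k. The function names are hypothetical, and how the averaged estimate is turned into an integer count or rescaled into time units is not specified in the abstract and is left out.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_cross_clade_differences(left, right):
    """Average the pairwise differences over all (left, right) sequence pairs.

    Each cross-clade pair gives one estimate of divergence through the root;
    averaging them gives an initial summary for the locus (assumed statistic).
    """
    total = sum(hamming(a, b) for a in left for b in right)
    return total / (len(left) * len(right))


def robbins_estimate(observed_counts, x):
    """Classic Robbins empirical Bayes estimate for Poisson counts.

    observed_counts: integer counts observed at many unlinked loci.
    x: the integer count observed at the locus of interest.
    Assumes counts are Poisson given a locus-specific mean; no prior on
    that mean is specified, the empirical frequencies stand in for it.
    """
    freq = Counter(observed_counts)
    if freq[x] == 0:
        raise ValueError("count x was not observed at any locus")
    return (x + 1) * freq[x + 1] / freq[x]


# Toy data, invented purely for illustration.
left_clade = ["AAAA", "AAAT"]
right_clade = ["TTAA", "TTAT"]
initial = mean_cross_clade_differences(left_clade, right_clade)

counts_at_other_loci = [0, 0, 1, 1, 1, 2, 2, 3]
improved = robbins_estimate(counts_at_other_loci, 2)
```

With few loci the empirical frequencies are noisy, so the Robbins ratio can be unstable or zero; the estimate is only expected to behave well when many unlinked loci are available, in line with the asymptotic statement in the abstract.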

Keywords: Robbins’ method; TMRCA; coalescent; empirical Bayes.

MeSH terms

  • Algorithms*
  • Base Sequence*
  • Bayes Theorem
  • Computer Simulation
  • Evolution, Molecular
  • Genetics, Population / methods
  • Humans
  • Models, Genetic*
  • Phylogeny