When long-lasting, balancing selection can lead to "trans-species" polymorphisms that are shared identical by descent by two or more species. In such cases, the gene genealogy at the selected site clusters by allele instead of by species, and nearby neutral sites also have unusual genealogies because of linkage. While this scenario is expected to leave discernible footprints in genetic variation data, the specific patterns remain poorly characterized. Motivated by recent findings in primates, we focus on the case of a biallelic polymorphism under ancient balancing selection and derive approximations for summaries of the polymorphism data from two species. Specifically, we characterize the length of the segment that carries most of the footprints, the expected number of shared neutral single nucleotide polymorphisms (SNPs), and the patterns of allelic associations among them. We confirm the accuracy of our approximations by coalescent simulations. We further show that for humans and chimpanzees (and, more generally, for pairs of species with low levels of genetic diversity), these patterns are highly unlikely to be generated by neutral recurrent mutations. We discuss the implications for the design and interpretation of genome scans for ancient balanced polymorphisms in primates and other taxa.
Keywords: Ancient genetic variation; balancing selection; genome scan for selection; trans-species polymorphism.
© 2014 The Authors. Evolution published by Wiley Periodicals, Inc. on behalf of The Society for the Study of Evolution.