Models for Similarity Distributions of Syntenic Homologs and Applications to Phylogenomics

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jul 31. doi: 10.1109/TCBB.2018.2849377. Online ahead of print.

Abstract

We outline an integrated approach to speciation and whole-genome duplication (WGD) to resolve the occurrence of these events in phylogenetic analysis. We propose a more principled way of estimating the parameters of gene divergence and fractionation than the standard mixture-of-normals analysis. We formulate an algorithm for resolving data on local peaks in the distributions of duplicate-gene similarities for a number of related genomes. We illustrate the approach with a comprehensive analysis of WGD-origin duplicate-gene data from the family Brassicaceae.