Phylogenetic Reconstruction Based on Synteny Block and Gene Adjacencies

Mol Biol Evol. 2020 Sep 1;37(9):2747-2762. doi: 10.1093/molbev/msaa114.

Abstract

Gene order can be used as an informative character to reconstruct phylogenetic relationships between species, independently of the local information present in gene and protein sequences. PhyChro is a reconstruction method based on chromosomal rearrangements, applicable to a wide range of eukaryotic genomes with different gene contents and levels of synteny conservation. For each synteny breakpoint identified in a pairwise genome comparison, the algorithm defines two disjoint sets of genomes, called partial splits, each supporting one of the two block adjacencies that define the breakpoint. Taking all partial splits from all pairwise comparisons, the distance between two genomes is computed from the number of partial splits that separate them. The tree is reconstructed bottom-up by iteratively grouping the sister genomes that minimize genome distances. PhyChro estimates branch lengths from the number of synteny breakpoints and provides confidence scores for the branches. Its performance is evaluated on two data sets, of 13 vertebrate and 21 yeast genomes, using up to 130,000 and 179,000 breakpoints, respectively, a scale of genomic markers that has been out of reach until now. PhyChro reconstructs very accurate tree topologies, even at known problematic branching positions. Its robustness has been benchmarked against different synteny block reconstruction methods. On simulated data, PhyChro reconstructs phylogenies perfectly in almost all cases and shows the highest accuracy among the existing tools compared. PhyChro is also very fast, reconstructing the vertebrate and yeast phylogenies in <15 min.
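The distance idea described in the abstract can be illustrated with a minimal Python sketch. This is not PhyChro's implementation: the representation of a partial split as a pair of frozensets, the function names (`separates`, `split_distance`, `closest_pair`), and the toy data are all assumptions made for illustration. It only counts how many partial splits place two genomes on opposite sides and picks the pair with the smallest count as a candidate first grouping. The actual method's handling of merged groups, branch lengths, and confidence scores is not reproduced here.

```python
from itertools import combinations

# Hypothetical representation: a partial split is a pair of disjoint
# genome sets (left, right); genomes absent from both sides are uninformative.

def separates(split, g1, g2):
    """True if the split places g1 and g2 on opposite sides."""
    left, right = split
    return (g1 in left and g2 in right) or (g1 in right and g2 in left)

def split_distance(splits, g1, g2):
    """Number of partial splits separating g1 from g2."""
    return sum(separates(s, g1, g2) for s in splits)

def closest_pair(genomes, splits):
    """Pair of genomes separated by the fewest partial splits
    (ties broken by alphabetical order; an assumption of this sketch)."""
    return min(
        combinations(sorted(genomes), 2),
        key=lambda pair: split_distance(splits, *pair),
    )

# Toy data, invented for illustration only.
toy_splits = [
    (frozenset({"A", "B"}), frozenset({"C", "D"})),
    (frozenset({"A", "B"}), frozenset({"D"})),
    (frozenset({"A"}), frozenset({"C"})),
    (frozenset({"C", "D"}), frozenset({"A"})),
    (frozenset({"C"}), frozenset({"D"})),
]
toy_genomes = {"A", "B", "C", "D"}

if __name__ == "__main__":
    for g1, g2 in combinations(sorted(toy_genomes), 2):
        print(g1, g2, split_distance(toy_splits, g1, g2))
    print("first grouping candidate:", closest_pair(toy_genomes, toy_splits))
```

In this toy set no split separates A from B, so that pair would be the first candidate for grouping. A full bottom-up reconstruction would also need a rule for recomputing distances once genomes are merged, which the abstract does not specify.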

Keywords: adjacency; breakpoint; chromosomal rearrangement; distance; parsimony; phylogenetic tree; split; synteny block; vertebrate; yeast.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Gene Order
  • Genetic Techniques*
  • Genome
  • Models, Genetic*
  • Phylogeny*
  • Software*
  • Synteny*
  • Vertebrates / genetics
  • Yeasts / genetics