A Collinearity-Incorporating Homology Inference Strategy for Connecting Emerging Assemblies in the Triticeae Tribe as a Pilot Practice in the Plant Pangenomic Era

Mol Plant. 2020 Dec 7;13(12):1694-1708. doi: 10.1016/j.molp.2020.09.019. Epub 2020 Sep 23.

Abstract

Plant genome sequencing has dramatically increased, and some species even have multiple high-quality reference versions. Demands for clade-specific homology inference and analysis have increased in the pangenomic era. Here we present a novel method, GeneTribe (https://chenym1.github.io/genetribe/), for homology inference among genetically similar genomes that incorporates gene collinearity and shows better performance than traditional sequence-similarity-based methods in terms of accuracy and scalability. The Triticeae tribe is a typical allopolyploid-rich clade with complex species relationships that includes many important crops, such as wheat, barley, and rye. We built Triticeae-GeneTribe (http://wheat.cau.edu.cn/TGT/), a homology database, by integrating 12 Triticeae genomes and 3 outgroup model genomes and implemented versatile analysis and visualization functions. With macrocollinearity analysis, we were able to construct a refined model illustrating the structural rearrangements of the 4A-5A-7B chromosomes in wheat as two major translocation events. With collinearity analysis at both the macro- and microscale, we illustrated the complex evolutionary history of homologs of the wheat vernalization gene Vrn2, which evolved as a combined result of genome translocation, duplication, and polyploidization and gene loss events. Our work provides a useful practice for connecting emerging genome assemblies, with awareness of the extensive polyploidy in plants, and will help researchers efficiently exploit genome sequence resources.

Keywords: Triticeae tribe; collinearity; database; homology inference; pangenome; polyploid.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Chromosomes, Plant / genetics
  • Databases, Genetic
  • Diploidy
  • Evolution, Molecular
  • Genome, Plant
  • Genomics*
  • Poaceae / genetics*
  • Software