The Evolutionary History of R2R3-MYB Proteins Across 50 Eukaryotes: New Insights Into Subfamily Classification and Expansion

Sci Rep. 2015 Jun 5;5:11037. doi: 10.1038/srep11037.

Abstract

R2R3-MYB proteins (2R-MYBs) are one of the main transcription factor families in higher plants. Since the evolutionary history of this gene family across the eukaryotic kingdom remains unknown, we performed a comparative analysis of 2R-MYBs from 50 major eukaryotic lineages, with particular emphasis on land plants. A total of 1548 candidates were identified among diverse taxonomic groups, which allowed for an updated classification of 73 highly conserved subfamilies, including many newly identified subfamilies. Our results revealed that the protein architectures, intron patterns, and sequence characteristics were remarkably conserved in each subfamily. At least four subfamilies were derived from early land plants, 10 evolved from spermatophytes, and 19 from angiosperms, demonstrating the diversity and preferential expansion of this gene family in land plants. Moreover, we determined that their remarkable expansion was mainly attributed to whole genome and segmental duplication, where duplicates were preferentially retained within certain subfamilies that shared three homologous intron patterns (a, b, and c) even though up to 12 types of patterns existed. Through our integrated distributions, sequence characteristics, and phylogenetic tree analyses, we confirm that 2R-MYBs are old and postulate that 3R-MYBs may be evolutionarily derived from 2R-MYBs via intragenic domain duplication.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Databases, Factual
  • Eukaryota / genetics
  • Eukaryota / metabolism*
  • Evolution, Molecular
  • Introns
  • Molecular Sequence Data
  • Multigene Family
  • Phylogeny
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Transcription Factors / classification
  • Transcription Factors / genetics
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors