In model eukaryotes, the C-terminal domain (CTD) of the largest subunit of DNA-dependent RNA polymerase II (RNAP II) is composed of tandemly repeated heptads with the consensus sequence YSPTSPS. The core motif and tandem structure are generally conserved across model taxa, including animals, yeasts and higher plants. Broader investigations have revealed that the CTDs of many organisms deviate substantially from this canonical structure; however, limited sampling has made it difficult to determine whether disordered sequences reflect the CTD's ancestral state or degeneration from ancestral repetitive structures. We therefore undertook what is, to our knowledge, the broadest investigation to date of the evolution of the RNAP II CTD across eukaryotic diversity. Our results indicate that a tandemly repeated CTD existed in the ancestors of each major taxon, and that degeneration and reinvention of this ordered structure are common features of CTD evolution. Lineage-specific CTD modifications appear to be associated with greater developmental complexity in multicellular organisms, a pattern taken to an extreme in fungi and red algae, in which the CTD has undergone dramatic to complete alteration during the transition from unicellular to developmentally complex forms. Overall, loss and reinvention of repeats have punctuated CTD evolution, occurring independently and sometimes repeatedly in various groups.
Keywords: development; parasitism; splicing; transcription.