Recurrent loss of specific introns during angiosperm evolution

PLoS Genet. 2014 Dec 4;10(12):e1004843. doi: 10.1371/journal.pgen.1004843. eCollection 2014 Dec.

Abstract

Numerous instances of presence/absence variations for introns have been documented in eukaryotes, and some cases of recurrent loss of the same intron have been suggested. However, there has been no comprehensive or phylogenetically deep analysis of recurrent intron loss. Of 883 cases of intron presence/absence variation that we detected in five sequenced grass genomes, 93 were confirmed as recurrent losses and the rest could be explained by single losses (652) or single gains (118). No case of recurrent intron gain was observed. Deep phylogenetic analysis often indicated that apparent intron gains were actually numerous independent losses of the same intron. Recurrent loss exhibited extreme non-randomness, in that some introns were removed independently in many lineages. The two larger genomes, maize and sorghum, were found to have a higher rate of both recurrent loss and overall loss and/or gain than foxtail millet, rice or Brachypodium. Adjacent introns and small introns were found to be preferentially lost. Intron loss genes exhibited a high frequency of germ line or early embryogenesis expression. In addition, flanking exon A+T-richness and intron TG/CG ratios were higher in retained introns. This last result suggests that epigenetic status, as evidenced by a loss of methylated CG dinucleotides, may play a role in the process of intron loss. This study provides the first comprehensive analysis of recurrent intron loss, makes a series of novel findings on the patterns of recurrent intron loss during the evolution of the grass family, and provides insight into the molecular mechanism(s) underlying intron loss.
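As an illustration of the intron TG/CG ratio mentioned above, the following is a minimal sketch. The abstract does not state how the ratio was calculated, so the definition used here (overlapping TG dinucleotide count divided by overlapping CG dinucleotide count on the given strand) is an assumption, and the function names are hypothetical. The rationale for the measure is that methylated cytosines in CG dinucleotides tend to deaminate to thymine, so CG sites that were methylated over evolutionary time decay into TG.

```python
from typing import Optional


def dinucleotide_count(seq: str, dinuc: str) -> int:
    """Count overlapping occurrences of a dinucleotide in seq (case-insensitive)."""
    seq = seq.upper()
    dinuc = dinuc.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def tg_cg_ratio(seq: str) -> Optional[float]:
    """Return the TG/CG dinucleotide ratio of a sequence.

    Assumed definition (not taken from the paper): TG count divided by
    CG count on the given strand. Returns None when the sequence has no
    CG dinucleotides, since the ratio is undefined in that case.
    """
    tg = dinucleotide_count(seq, "TG")
    cg = dinucleotide_count(seq, "CG")
    if cg == 0:
        return None
    return tg / cg


if __name__ == "__main__":
    # Toy sequence for illustration only, not real intron data.
    print(tg_cg_ratio("TGTGCG"))
```

Under this definition, a sequence with many TG and few CG dinucleotides gets a high ratio, which could be compared between retained and lost introns.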

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Evolution, Molecular*
  • Gene Deletion*
  • Gene Frequency
  • Genome, Plant / genetics
  • Introns*
  • Magnoliopsida / genetics*
  • Molecular Sequence Data
  • Phylogeny
  • Sequence Analysis, DNA

Grants and funding

This study was supported in part by resources and technical expertise from the Georgia Advanced Computing Resource Center, a partnership between the University of Georgia's Office of the Vice President for Research and Office of the Vice President for Information Technology. Additional support was provided by the NSF Plant Genome Program (grants #0607123 and #043707-01), a 1000 Talents grant from the Chinese Academy of Sciences, and the endowment for the Giles Professorship at the University of Georgia. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.