Genome-wide evolution analysis reveals low CpG contents of fast-evolving genes and identifies antiviral microRNAs

J Genet Genomics. 2020 Jan 20;47(1):49-60. doi: 10.1016/j.jgg.2019.12.001. Epub 2019 Dec 23.

Abstract

Noncoding RNAs (ncRNAs) play important roles in many biological processes and, beyond protein-coding genes, provide material for evolutionary adaptation, such as in the arms race between host and pathogen. However, a comprehensive high-resolution analysis of primate genomes that includes the latest annotated ncRNAs is not currently available. Here, we developed a computational pipeline to estimate the selection acting on noncoding regions based on comparisons with a large number of reference sequences in introns adjacent to the regions of interest. Our method yields results comparable with those of the established codon-based method and the phyloP method for coding genes; thus, it provides a holistic framework for estimating selection across the entire genome. We further showed that fast-evolving protein-coding genes and their corresponding 5' UTRs have a significantly lower frequency of CpG dinucleotides than those evolving at an average pace, and that these fast-evolving genes are enriched in immunity and host defense processes. We also identified fast-evolving miRNAs with antiviral functions in cells. Our results provide a resource for high-resolution evolutionary analysis of primate genomes.
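
The CpG comparison described above depends on a per-sequence measure of CpG dinucleotide frequency. The abstract does not state which metric the authors used, so the following is only a minimal sketch under assumed definitions: the raw CpG frequency (CpG count divided by the number of dinucleotide positions) and the commonly used observed/expected ratio (CpG count × length / (C count × G count)). The function name `cpg_stats` and the returned keys are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def cpg_stats(seq: str) -> dict:
    """Return CpG count, CpG frequency, and observed/expected CpG ratio.

    Assumed definitions (not confirmed by the abstract):
      frequency = CpG count / (sequence length - 1)
      obs/exp   = CpG count * length / (C count * G count)
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    base_counts = Counter(seq)
    # Count every position where C is immediately followed by G.
    cpg_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")

    frequency = cpg_count / (n - 1)
    expected = base_counts["C"] * base_counts["G"] / n
    # The ratio is undefined when the sequence has no C or no G.
    obs_exp = cpg_count / expected if expected > 0 else math.nan

    return {"cpg_count": cpg_count, "frequency": frequency, "obs_exp": obs_exp}


if __name__ == "__main__":
    # Toy sequence for illustration only; not data from the study.
    print(cpg_stats("ACGTCGAA"))
```

Applied to the coding sequence or 5' UTR of each gene, a measure of this kind would allow fast-evolving genes to be compared with those evolving at an average pace, although the statistical test the authors used is not described here.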

Keywords: CpG dinucleotide; Host-virus interaction; MicroRNA; Noncoding RNA; Positive selection.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Antiviral Agents / pharmacology
  • Evolution, Molecular*
  • Genome / genetics
  • Glycine / analogs & derivatives
  • Glycine / genetics
  • Humans
  • MicroRNAs / genetics*
  • Primates / genetics
  • RNA, Untranslated / genetics*

Substances

  • Antiviral Agents
  • MicroRNAs
  • RNA, Untranslated
  • carboxyphenylglycine
  • Glycine