GC skew is a conserved property of unmethylated CpG island promoters across vertebrates

Nucleic Acids Res. 2015 Nov 16;43(20):9729-41. doi: 10.1093/nar/gkv811. Epub 2015 Aug 7.

Abstract

GC skew is a measure of the strand asymmetry in the distribution of guanines and cytosines. GC skew favors R-loops, a type of three-stranded nucleic acid structure that forms upon annealing of an RNA strand to one strand of DNA, creating a persistent RNA:DNA hybrid. Previous studies show that GC skew is prevalent at thousands of human CpG island (CGI) promoters and transcription termination regions, which correspond to hotspots of R-loop formation. Here, we investigated the conservation of GC skew patterns in 60 sequenced chordate genomes. We report that GC skew is a conserved sequence characteristic of the CGI promoter class in vertebrates. Furthermore, we reveal that promoter GC skew peaks at the exon 1/intron 1 junction and that it is highly correlated with gene age and CGI promoter strength. Our data also show that GC skew is predictive of unmethylated CGI promoters in a range of vertebrate species and that it imparts significant DNA hypomethylation for promoters with intermediate CpG densities. Finally, we observed that terminal GC skew is conserved for a subset of vertebrate genes that tend to be located significantly closer to their downstream neighbors, consistent with a role for R-loop formation in transcription termination.
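
The abstract does not state the formula used for GC skew. A commonly used definition is (G − C)/(G + C), computed on one strand within a window; the minimal sketch below assumes that definition. The function names, the window and step sizes, and the choice to return 0.0 for a window with no G or C are illustrative assumptions, not the authors' method.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one strand of DNA.

    Assumption: a sequence with no G or C is reported as 0.0 (no skew).
    """
    s = seq.upper()
    g = s.count("G")
    c = s.count("C")
    total = g + c
    return (g - c) / total if total else 0.0


def windowed_gc_skew(seq: str, window: int = 100, step: int = 10):
    """Return a list of (start, skew) pairs for windows sliding along seq.

    The window and step defaults are arbitrary illustrative values.
    Only full-length windows are reported.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: G-rich first half, C-rich second half.
    toy = "GGGAGGGTGG" + "CCCTCCCACC"
    for start, skew in windowed_gc_skew(toy, window=10, step=10):
        print(start, skew)
```

Positive values indicate an excess of G over C on the strand analyzed, and negative values the reverse. Which strand is scored (for example, the non-template strand relative to the direction of transcription) has to be chosen by the caller.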

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Composition
  • Base Sequence
  • Conserved Sequence
  • CpG Islands*
  • DNA / chemistry
  • Exons
  • Genes
  • Genomics
  • Humans
  • Introns
  • Mice
  • Promoter Regions, Genetic*
  • Terminator Regions, Genetic
  • Vertebrates / genetics*

Substances

  • DNA