Integrative analysis reveals RNA G-quadruplexes in UTRs are selectively constrained and enriched for functional associations

Nat Commun. 2020 Jan 27;11(1):527. doi: 10.1038/s41467-020-14404-y.

Abstract

G-quadruplex (G4) sequences are abundant in untranslated regions (UTRs) of human messenger RNAs, but their functional importance remains unclear. By integrating multiple sources of genetic and genomic data, we show that putative G-quadruplex-forming sequences (pG4s) in 5' and 3' UTRs are selectively constrained and enriched for cis-eQTLs and RNA-binding protein (RBP) interactions. Using over 15,000 whole-genome sequences, we find that negative selection acting on central guanines of UTR pG4s is comparable to that acting on missense variation in protein-coding sequences. At multiple GWAS-implicated SNPs within UTR pG4 sequences, we find robust allelic imbalance in gene expression across diverse tissue contexts in GTEx, suggesting that variants affecting G-quadruplex formation within UTRs may also contribute to phenotypic variation. Our results establish UTR G4s as important cis-regulatory elements and point to a link between disruption of UTR pG4s and disease.
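
To make the notion of a putative G-quadruplex-forming sequence concrete, the following is a minimal sketch of a motif scan. It assumes the widely used canonical pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the abstract does not state which pG4 definition the study used, so this pattern and the helper names `find_pg4` and `variant_disrupts_pg4` are illustrative assumptions rather than the authors' method.

```python
import re

# Assumed canonical motif (not taken from the abstract): four G-runs of
# length >= 3, separated by three loops of 1-7 nucleotides each.
PG4_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def find_pg4(seq):
    """Return (start, end, motif) for each non-overlapping pG4 match.

    Coordinates are 0-based and half-open. DNA input is accepted and
    converted to RNA by replacing T with U.
    """
    rna = seq.upper().replace("T", "U")
    return [(m.start(), m.end(), m.group()) for m in PG4_PATTERN.finditer(rna)]


def variant_disrupts_pg4(seq, pos, alt):
    """Return True if a single-nucleotide substitution at 0-based `pos`
    lies inside a pG4 in the reference sequence and no pG4 covers that
    position after the substitution."""
    if not any(start <= pos < end for start, end, _ in find_pg4(seq)):
        return False
    mutated = seq[:pos] + alt + seq[pos + 1:]
    return not any(start <= pos < end for start, end, _ in find_pg4(mutated))


if __name__ == "__main__":
    utr = "AAGGGAGGGAGGGAGGGAA"  # hypothetical UTR fragment
    print(find_pg4(utr))
    # Substituting the central guanine of the first G-run
    print(variant_disrupts_pg4(utr, 3, "A"))
    # Substituting a loop nucleotide
    print(variant_disrupts_pg4(utr, 5, "C"))
```

In this toy sequence, changing the central guanine of a G-run removes the motif while changing a loop nucleotide does not, which illustrates why central guanines are the natural positions at which to measure constraint.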

MeSH terms

  • G-Quadruplexes*
  • Genetic Association Studies
  • Genetic Variation
  • Humans
  • Nucleotide Motifs
  • RNA Folding
  • RNA-Binding Proteins / metabolism*
  • RNA-Binding Proteins / physiology
  • Untranslated Regions*

Substances

  • RNA-Binding Proteins
  • Untranslated Regions