Interplay between coding and exonic splicing regulatory sequences

Genome Res. 2019 May;29(5):711-722. doi: 10.1101/gr.241315.118. Epub 2019 Apr 8.

Abstract

The inclusion of exons during the splicing process depends on the binding of splicing factors to short low-complexity regulatory sequences. The relationship between exonic splicing regulatory sequences and coding sequences is still poorly understood. We demonstrate that exons coregulated by a given splicing factor share a similar nucleotide composition bias and, because the genetic code is nonrandom, preferentially code for amino acids with similar physicochemical properties. Indeed, amino acids sharing similar physicochemical properties correspond to codons that have the same nucleotide composition bias. In particular, we find that the TRA2A and TRA2B splicing factors, which bind adenine-rich motifs, promote the inclusion of adenine-rich exons coding preferentially for hydrophilic amino acids, which correspond to adenine-rich codons. SRSF2, which binds guanine/cytosine-rich motifs, promotes the inclusion of GC-rich exons coding preferentially for small amino acids, whereas SRSF3, which binds cytosine-rich motifs, promotes the inclusion of exons coding preferentially for uncharged amino acids, such as serine and threonine, which can be phosphorylated. Finally, coregulated exons encoding amino acids with similar physicochemical properties correspond to specific protein features. In conclusion, the regulation of an exon by a splicing factor, which relies on the affinity of this factor for specific nucleotide(s), is tightly interconnected with the physicochemical properties encoded by the exon. We therefore uncover an unanticipated bidirectional interplay between the splicing regulatory process and its biological functional outcome.
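The link between codon nucleotide composition and amino acid properties can be checked directly against the standard genetic code. The following is a minimal sketch, not the authors' analysis: the helper names (`base_fraction`, `mean_fraction`) are hypothetical, and the two amino acid groups are illustrative choices made here, not the classification used in the paper.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G and the first position varying slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def base_fraction(amino_acid: str, base: str) -> float:
    """Fraction of `base` among all positions of the codons for `amino_acid`."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    return sum(c.count(base) for c in codons) / (3 * len(codons))


def mean_fraction(group: str, base: str) -> float:
    """Unweighted mean of base_fraction over a group of amino acids."""
    return sum(base_fraction(aa, base) for aa in group) / len(group)


# Assumption: illustrative groups chosen for this sketch only.
HYDROPHILIC = "KNEDQR"  # charged or polar side chains
HYDROPHOBIC = "FLIVMW"  # nonpolar side chains

if __name__ == "__main__":
    for name, group in (("hydrophilic", HYDROPHILIC), ("hydrophobic", HYDROPHOBIC)):
        print(f"{name}: mean adenine fraction = {mean_fraction(group, 'A'):.3f}")
```

Under these assumptions, the codons of the hydrophilic group carry a higher mean adenine fraction than those of the hydrophobic group, in line with the adenine-rich codon bias described for TRA2A- and TRA2B-regulated exons. The same helpers can be called with `"G"` or `"C"` to explore the GC-rich and cytosine-rich cases.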

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing*
  • Amino Acids / chemistry
  • Base Composition / genetics
  • Cell Line
  • Exons / genetics*
  • Genetic Code
  • Heterogeneous-Nuclear Ribonucleoproteins / metabolism
  • Humans
  • Introns / genetics
  • Nucleotide Motifs / genetics
  • RNA Splice Sites / genetics*
  • RNA Splicing Factors / metabolism*
  • Sequence Analysis, Protein
  • Sequence Analysis, RNA
  • Serine-Arginine Splicing Factors / metabolism

Substances

  • Amino Acids
  • Heterogeneous-Nuclear Ribonucleoproteins
  • RNA Splice Sites
  • RNA Splicing Factors
  • Serine-Arginine Splicing Factors