Long-range RNA structures in the human transcriptome beyond evolutionarily conserved regions

PeerJ. 2023 Nov 28;11:e16414. doi: 10.7717/peerj.16414. eCollection 2023.

Abstract

RNA structure has been increasingly recognized as a critical player in the biogenesis and turnover of many transcript classes. In eukaryotes, the prediction of RNA structure by thermodynamic modeling meets fundamental limitations due to the large sizes and complex, discontinuous organization of eukaryotic genes. Signatures of functional RNA structures can be found by detecting compensatory substitutions in homologous sequences, but a comparative approach is applicable only within conserved sequence blocks. Here, we developed a computational pipeline called PHRIC, which is not limited to conserved regions and relies on RNA contacts derived from RNA in situ conformation sequencing (RIC-seq) experiments. It extracts pairs of short RNA fragments surrounded by nested clusters of RNA contacts and predicts long, nearly perfect complementary base pairings formed between these fragments. Applied to a panel of RIC-seq experiments in seven human cell lines, PHRIC predicted ~12,000 stable long-range RNA structures with equilibrium free energy below -15 kcal/mol, the vast majority of which fall outside of regions annotated as conserved among vertebrates. These structures nevertheless show some level of sequence conservation and remarkable compensatory substitution patterns in other clades. Furthermore, we found that introns have a higher propensity to form stable long-range RNA structures between each other, and that RNA structures tend to concentrate within the same intron rather than connect adjacent introns. These results extend, for the first time, the application of proximity ligation assays to RNA structure prediction beyond conserved regions.
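
The idea of predicting "long, nearly perfect complementary base pairings" between two fragments can be illustrated with a toy sketch. The code below is not the PHRIC implementation and makes no claim about how PHRIC scores structures. It is a minimal example, under stated assumptions, that finds the longest gap-free antiparallel stretch of Watson-Crick pairs between two short RNA fragments. The function name `longest_complementary_stem` is hypothetical, wobble (G-U) pairs and mismatches are ignored, and no free energy is computed.

```python
from __future__ import annotations

# Toy illustration only; NOT the PHRIC pipeline.
# Assumption: only Watson-Crick pairs count (no G-U wobble, no mismatches).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_complementary_stem(left: str, right: str) -> tuple[int, int, int]:
    """Return (length, start_in_left, start_in_right) of the longest
    gap-free antiparallel stretch of Watson-Crick pairs.

    Both fragments are given 5'->3'. Positions are 0-based offsets into
    the original strings. Returns (0, 0, 0) if no base can pair.
    """
    # Antiparallel pairing: read the right fragment 3'->5'.
    rev = right[::-1]
    best = (0, 0, 0)
    prev = [0] * (len(rev) + 1)
    for i in range(1, len(left) + 1):
        cur = [0] * (len(rev) + 1)
        for j in range(1, len(rev) + 1):
            if (left[i - 1], rev[j - 1]) in WATSON_CRICK:
                # Extend the run of consecutive pairs ending at (i, j).
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    # Map the run back to coordinates in the original strings.
                    best = (cur[j], i - cur[j], len(right) - j)
        prev = cur
    return best


if __name__ == "__main__":
    # "CGGGAC" (left, offset 2) pairs with "GUCCCG" (right, offset 2).
    print(longest_complementary_stem("CACGGGACUC", "UUGUCCCGAA"))  # (6, 2, 2)
```

A realistic predictor would additionally tolerate short mismatches or bulges and rank candidates by a thermodynamic energy model, which this sketch deliberately leaves out.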

Keywords: Alternative splicing; Long-range; Proximity ligation; RIC-seq; RNA structure; RNAcontacts.

MeSH terms

  • Animals
  • Base Sequence
  • Humans
  • Introns
  • RNA Splicing
  • RNA* / genetics
  • Transcriptome* / genetics

Substances

  • RNA

Grants and funding

This work was supported by a research grant from the Russian Ministry of Science and Education (075-10-2021-116) and a research grant from the National Key Research and Development Program of China (2021YFE0114900). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.