RNA profiling 2.0: Enhanced cluster analysis of structural ensembles

ArXiv [Preprint]. 2023 Mar 27:arXiv:2303.15552v1.

Abstract

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.

Publication types

  • Preprint