Quantifying sequence and structural features of protein-RNA interactions

Nucleic Acids Res. 2014 Sep;42(15):10086-98. doi: 10.1093/nar/gku681. Epub 2014 Jul 25.

Abstract

Increasing awareness of the importance of protein-RNA interactions has motivated many approaches to predict residue-level RNA binding sites in proteins based on sequence or structural characteristics. Sequence-based predictors are usually high in sensitivity but low in specificity; conversely, structure-based predictors tend to have high specificity but lower sensitivity. Here we quantified the contribution of both sequence- and structure-based features as indicators of RNA-binding propensity using a machine-learning approach. To capture structural information for proteins without a known structure, we used homology modeling to extract the relevant structural features. Several novel and modified features enhanced the accuracy of residue-level RNA-binding propensity prediction beyond what has been reported previously, including by meta-prediction servers. These features include hidden Markov model-based evolutionary conservation, surface deformations based on the Laplacian norm formalism, and relative solvent accessibility partitioned into backbone and side chain contributions. We constructed a web server called aaRNA that implements the proposed method and demonstrated its use in identifying putative RNA binding sites.
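
The sensitivity/specificity trade-off mentioned in the abstract can be made concrete with a small sketch. The code below is only an illustration, not the evaluation code used for aaRNA: the function name `sensitivity_specificity` and the toy labels are hypothetical, and it assumes each residue carries a binary label (1 = RNA-binding, 0 = non-binding).

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary per-residue labels.

    Assumption for illustration: 1 marks an RNA-binding residue, 0 a non-binding one.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # binding, predicted binding
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # binding, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-binding, predicted non-binding
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-binding, falsely flagged

    # Sensitivity: fraction of true binding residues that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-binding residues correctly left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy example: a 10-residue protein with 3 true binding residues.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(true_labels, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this toy case the predictor recovers two of the three binding residues and wrongly flags one of the seven non-binding residues. A predictor that flags more residues tends to raise sensitivity at the cost of specificity, which is the pattern the abstract attributes to sequence-based methods.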

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry
  • Artificial Intelligence
  • Binding Sites
  • Models, Molecular
  • Protein Binding
  • Protein Structure, Secondary
  • RNA / chemistry
  • RNA / metabolism
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / metabolism
  • Sequence Analysis, Protein
  • Software
  • Structural Homology, Protein

Substances

  • Amino Acids
  • RNA-Binding Proteins
  • RNA