Prediction of RNA binding proteins comes of age from low resolution to high resolution

Mol Biosyst. 2013 Oct;9(10):2417-25. doi: 10.1039/c3mb70167k.

Abstract

Networks of protein-RNA interactions are likely to be larger than protein-protein and protein-DNA interaction networks because the genome encodes tens of times more RNA transcripts than proteins (e.g. only 3% of the human genome codes for proteins), and because RNA transcripts have diverse functions and localizations and are controlled by proteins from birth (transcription) to death (degradation). This massive network is evidenced by several recent experimental discoveries of large numbers of previously unknown RNA-binding proteins (RBPs). Meanwhile, more than 400 non-redundant protein-RNA complex structures (at 25% sequence identity or less) have been deposited in the Protein Data Bank. These sequence and structural resources for RBPs provide ample data for developing computational techniques dedicated to RBP prediction, which are needed because determining RNA-binding functions experimentally is time-consuming and expensive. This review compares traditional machine-learning-based approaches with emerging template-based methods at several levels of prediction resolution, ranging from two-state binding/non-binding prediction to binding-residue prediction and protein-RNA complex structure prediction. The analysis indicates that the two approaches are complementary and that combining them may lead to further improvements.

Publication types

  • Review

MeSH terms

  • Artificial Intelligence*
  • Binding Sites
  • Humans
  • Models, Molecular*
  • Protein Binding
  • Quantitative Structure-Activity Relationship*
  • RNA / chemistry
  • RNA / metabolism
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / metabolism

Substances

  • RNA-Binding Proteins
  • RNA