Identifying molecular features that are associated with biological function of intrinsically disordered protein regions

eLife. 2021 Feb 22:10:e60220. doi: 10.7554/eLife.60220.

Abstract

In previous work, we showed that intrinsically disordered regions (IDRs) of proteins contain sequence-distributed molecular features that are conserved over evolution, even though alignments detect little sequence similarity (Zarin et al., 2019). Here, we aim to use these molecular features to predict specific biological functions for individual IDRs and to identify the molecular features within them that are associated with these functions. We find that the predictable functions are diverse. Examining the associated molecular features, we note some that are consistent with previous reports and identify others that were previously unknown. We experimentally confirm that elevated isoelectric point and hydrophobicity, features that are positively associated with mitochondrial localization, are necessary for mitochondrial targeting function. Remarkably, increasing the isoelectric point of a synthetic IDR restores weak mitochondrial targeting. We believe feature analysis represents a new systematic approach to understanding how biological functions of IDRs are specified by their protein sequences.
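
As a rough illustration of the two features the abstract singles out, the Python sketch below estimates a sequence's mean hydropathy and its isoelectric point. It is not the authors' pipeline. The Kyte-Doolittle hydropathy scale, the approximate side-chain and terminal pKa values, the bisection solver, and all function names are assumptions chosen for illustration only.

```python
# Kyte-Doolittle hydropathy scale (assumed here as the hydrophobicity measure).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Approximate pKa values (an assumption; published sets differ slightly).
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}            # groups that carry +1 when protonated
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # groups that carry -1 when deprotonated
PKA_NTERM, PKA_CTERM = 8.6, 3.6


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues."""
    return sum(KD[aa] for aa in seq) / len(seq)


def net_charge(seq: str, ph: float) -> float:
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))   # N-terminal amine
    negative = 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))   # C-terminal carboxyl
    for aa in seq:
        if aa in PKA_POS:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
        elif aa in PKA_NEG:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
    return positive - negative


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Bisection search for the pH at which the net charge is zero.

    Net charge decreases monotonically with pH, so bisection on [0, 14] converges.
    """
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the study.
    for name, seq in [("basic", "MLSLRQSIRFFKPATRTL"), ("acidic", "MDEEDSDLEEDG")]:
        print(name, round(mean_hydropathy(seq), 2), round(isoelectric_point(seq), 2))
```

In a feature-based analysis of the kind described, values like these would be computed for each IDR and used as inputs to a statistical model. This sketch covers only the feature computation, not the evolutionary conservation analysis or the regression.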

Keywords: Expectation-Maximization; IDRs; IRLS; S. cerevisiae; cell biology; computational biology; evolutionary signatures; lasso; logistic regression; systems biology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Hydrophobic and Hydrophilic Interactions
  • Intrinsically Disordered Proteins / chemistry
  • Intrinsically Disordered Proteins / metabolism*
  • Isoelectric Point
  • Mitochondria / metabolism
  • Models, Statistical
  • Proteome / chemistry
  • Proteome / metabolism*
  • Saccharomyces cerevisiae / metabolism

Substances

  • Intrinsically Disordered Proteins
  • Proteome