Random protein sequences can form defined secondary structures and are well-tolerated in vivo

Sci Rep. 2017 Nov 13;7(1):15449. doi: 10.1038/s41598-017-15635-8.

Abstract

The protein sequences found in nature represent a tiny fraction of the potential sequences that could be constructed from the 20-amino-acid alphabet. To help define the properties that make natural proteins stand out from the space of possible alternatives, we conducted a systematic computational and experimental exploration of random (unevolved) sequences in comparison with biological proteins. We combined secondary structure, disorder, and aggregation predictions with experimental characterization of selected proteins. We found that the overall secondary structure and physicochemical properties of random and biological sequences are very similar. Moreover, random sequences can be well tolerated by living cells. Contrary to early hypotheses about the toxicity of random and disordered proteins, we found that random sequences with high disorder have low aggregation propensity (unlike random sequences with high structural content) and are particularly well tolerated. This direct dependence of aggregation propensity on structural content distinguishes random proteins from biological ones. Our study indicates that while random sequences can be both structured and disordered, the properties of the disordered ones make them better suited as progenitors (in both in vivo and in vitro settings) for further evolution of complex, soluble, three-dimensional scaffolds that can perform specific biochemical tasks.
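To give a sense of scale for the sequence space mentioned above, the following is a minimal sketch in Python, not the authors' pipeline. It counts the possible sequences of a given length and draws a few random ones. Several things here are assumptions made only for illustration: residues are sampled uniformly, the length of 100 is arbitrary, and the helper names (`sequence_space_size`, `random_sequence`, `net_charge`, `hydrophobic_fraction`) as well as the hydrophobic residue set are hypothetical. The two descriptors are crude stand-ins for the physicochemical properties the abstract refers to, and they are no substitute for the secondary structure, disorder, or aggregation predictors used in the study.

```python
import random

# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_space_size(length):
    """Number of distinct sequences of a given length over 20 residues."""
    return len(AMINO_ACIDS) ** length


def random_sequence(length, rng):
    """Draw one sequence with every residue equally likely (an assumption)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def net_charge(seq):
    """Crude net charge near neutral pH: K/R count +1, D/E count -1."""
    return sum(seq.count(r) for r in "KR") - sum(seq.count(r) for r in "DE")


def hydrophobic_fraction(seq):
    """Fraction of residues in an illustrative hydrophobic set (assumed)."""
    hydrophobic = set("AVILMFWC")
    return sum(r in hydrophobic for r in seq) / len(seq)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    print(f"{sequence_space_size(100):.3e} possible 100-residue sequences")
    for _ in range(3):
        seq = random_sequence(100, rng)
        print(seq[:20], net_charge(seq), round(hydrophobic_fraction(seq), 2))
```

Even at 100 residues the count is 20^100, which is why any natural or experimental library can sample only a vanishing fraction of the space.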

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Circular Dichroism
  • Computational Biology
  • Databases, Protein
  • Datasets as Topic
  • Models, Molecular*
  • Nuclear Magnetic Resonance, Biomolecular
  • Peptide Library*
  • Protein Aggregates
  • Protein Folding
  • Protein Structure, Secondary*
  • Recombinant Proteins / chemistry*
  • Recombinant Proteins / isolation & purification
  • Recombinant Proteins / toxicity
  • Solubility

Substances

  • Peptide Library
  • Protein Aggregates
  • Recombinant Proteins