Natural protein sequences are more intrinsically disordered than random sequences

Jia-Feng Yu; Zanxia Cao; Yuedong Yang; Chun-Ling Wang; Zhen-Dong Su; Ya-Wei Zhao; Ji-Hua Wang; Yaoqi Zhou

doi:10.1007/s00018-016-2138-9

Cell Mol Life Sci. 2016 Aug;73(15):2949-57. doi: 10.1007/s00018-016-2138-9. Epub 2016 Jan 22.

Authors

Jia-Feng Yu¹, Zanxia Cao¹,², Yuedong Yang³, Chun-Ling Wang², Zhen-Dong Su¹, Ya-Wei Zhao¹, Ji-Hua Wang¹,², Yaoqi Zhou⁴,⁵

Affiliations

¹ Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China.
² College of Physics and Electronic Information, Dezhou University, Dezhou, 253023, China.
³ Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD, 4222, Australia.
⁴ Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, 253023, China.
⁵ Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD, 4222, Australia.

Abstract

Most natural protein sequences have resulted from millions or even billions of years of evolution. How they differ from random sequences is not fully understood. Previous computational and experimental studies of random proteins generated from noncoding regions yielded inconclusive results due to species-dependent codon biases and GC contents. Here, we approach this problem by investigating 10,000 sequences randomized at the amino acid level. Using well-established predictors for protein intrinsic disorder, we found that natural sequences have more long disordered regions than random sequences, even when random and natural sequences have the same overall composition of amino acid residues. We also showed that random sequences are as structured as natural sequences according to contents and length distributions of predicted secondary structure, although the structures from random sequences may be in a molten globular-like state, according to molecular dynamics simulations. The bias of natural sequences toward more intrinsic disorder suggests that natural sequences are created and evolved to avoid protein aggregation and increase functional diversity.

Keywords: Molecular dynamics simulation; Molten globule; Protein intrinsic disorder; Random sequence; Secondary structure.
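
The abstract compares natural sequences with sequences randomized at the amino acid level that keep the same overall residue composition, but it does not say how the randomization was carried out. One simple way to obtain such a sequence is to permute the residues of a natural one. The sketch below illustrates that idea only; the function name `shuffle_sequence` and the example sequence are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def shuffle_sequence(seq, rng=None):
    """Return a random permutation of the residues in seq.

    Assumption: randomization is a plain residue shuffle, which keeps
    the amino acid composition and the length of the input unchanged.
    """
    rng = rng or random.Random()
    residues = list(seq)
    rng.shuffle(residues)  # in-place shuffle of the residue list
    return "".join(residues)


# Illustrative input only, not a sequence from the study.
natural = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
randomized = shuffle_sequence(natural, random.Random(0))

# The shuffled sequence has the same length and residue counts as the input.
assert len(randomized) == len(natural)
assert Counter(randomized) == Counter(natural)
```

Passing a seeded `random.Random` instance makes the shuffle reproducible, which helps when the same randomized set has to be fed to several predictors.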

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Amino Acids / chemistry
Computational Biology
Databases, Protein
Intrinsically Disordered Proteins / chemistry*
Protein Aggregates
Protein Conformation
Protein Structure, Secondary
Proteins / chemistry*
Sequence Analysis, Protein

Substances

Amino Acids
Intrinsically Disordered Proteins
Protein Aggregates
Proteins
