Predictions of one-dimensional protein structures such as secondary structures and contact numbers are useful for predicting three-dimensional structure and important for understanding the sequence-structure relationship. Here we present a new machine-learning method, critical random networks (CRNs), for predicting one-dimensional structures, and apply it, with position-specific scoring matrices, to the prediction of secondary structures (SS), contact numbers (CN), and residue-wise contact orders (RWCO). The present method achieves, on average, Q3 accuracy of 77.8% for SS, and correlation coefficients of 0.726 and 0.601 for CN and RWCO, respectively. The accuracy of the SS prediction is comparable to that obtained with other state-of-the-art methods, and accuracy of the CN prediction is a significant improvement over that with previous methods. We give a detailed formulation of the critical random networks-based prediction scheme, and examine the context-dependence of prediction accuracies. In order to study the nonlinear and multi-body effects, we compare the CRNs-based method with a purely linear method based on position-specific scoring matrices. Although not superior to the CRNs-based method, the surprisingly good accuracy achieved by the linear method highlights the difficulty in extracting structural features of higher order from an amino acid sequence beyond the information provided by the position-specific scoring matrices.
Keywords: critical random network; one-dimensional structure; position-specific scoring matrix; protein structure prediction.