Single amino acid substitutions are the type of protein alteration most closely associated with human disease. Current studies seek primarily to distinguish neutral mutations from harmful ones, and very few methods explain the final prediction in terms of the probable structural or functional effect on the protein. In this study, we describe the use of three novel parameters to identify experimentally verified critical residues of the TP53 protein (p53). The first two parameters use a surface clustering method to calculate the protein surface area of highly conserved regions or of regions with a high nonlocal atomic interaction energy (ANOLEA) score. These parameters help identify important functional regions on the surface of a protein. The third parameter uses a new method for pseudobinding free-energy estimation to probe specifically the importance of residue side chains to the stability of the protein fold. A decision tree was designed to combine these three parameters optimally. The result was compared with the functional data stored in the International Agency for Research on Cancer (IARC) TP53 mutation database. The final prediction achieved an accuracy of 70% and a Matthews correlation coefficient of 0.45, with a high specificity of 91.8%. Mutations in the 85 correctly identified important residues represented 81.7% of the total mutations recorded in the database. In addition, the method was able to correctly assign a probable functional or structural role to the residues. Such information could be critical for the interpretation and prediction of the effect of missense mutations, as it not only provides a fundamental explanation of the observed effect but also helps in designing the most appropriate laboratory experiment to verify the prediction.
(c) 2006 Wiley-Liss, Inc.
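
The abstract reports accuracy, specificity, and a Matthews correlation coefficient for a binary classification of residues (important versus not important). As a reading aid, the minimal sketch below shows how these three figures are conventionally derived from a confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and chosen only for illustration; they are not the study's data, and the sketch does not reproduce the authors' decision tree.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: important residues predicted as important / as unimportant.
    tn/fp: unimportant residues predicted as unimportant / as important.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Specificity: fraction of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # Sensitivity: fraction of true positives among all actual positives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")

    # Matthews correlation coefficient; conventionally set to 0 when any
    # marginal total is zero and the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (NOT taken from the study).
example = binary_metrics(tp=50, tn=30, fp=10, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting the Matthews correlation coefficient alongside accuracy is useful when the two classes are unequal in size, because it takes all four cells of the confusion matrix into account.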