Protein stability, a critical aspect of molecular biology and biochemistry, hinges on an intricate balance of thermodynamic and structural factors. Determining protein stability is crucial for understanding and manipulating biological machinery, as it correlates directly with protein function. This study therefore examines protein stability and its dependence on thermodynamics, thermal conditions, and structural properties. Particular focus is placed on four parameters that are critical to understanding protein stability: the free energy change of unfolding (ΔGunfolding), the change in heat capacity (ΔCp) accompanying the protein structural transition, the melting temperature (Tm), and the number of disulfide bonds. A machine learning (ML) predictive model was developed to estimate these four parameters from the primary sequence of the protein. The study was motivated by the lack of available tools that predict protein stability from multiple parameters. A multilayer Convolutional Neural Network (CNN) was adopted to develop a more reliable ML model. An individual predictive model was prepared for each property, and all of the models showed high accuracy. The R² (coefficient of determination) values of these models were 0.79, 0.78, 0.92 and 0.92 for ΔG, ΔCp, Tm and disulfide bonds, respectively. A case study on the stability of two homologous proteins is presented to validate the predictions of the developed model. The case study included in silico analysis of protein stability using molecular docking and molecular dynamics simulations. This validation study confirmed the accuracy of each model in predicting the stability-associated properties.
Aligning physics-based principles with ML models provides an opportunity to develop a fast machine learning solution that replaces the computationally demanding physics-based calculations used to determine protein stability. Furthermore, this work provides valuable insights into the impact of mutation on protein stability, which has implications for the field of protein engineering. The source code is available at https://github.com/Growdeatechnology.
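As an illustration of the two ingredients named above, a primary sequence as model input and R² as the accuracy measure, the following is a minimal sketch and not the authors' implementation. It assumes a one-hot encoding over the 20 standard amino acids, which is a common way to feed sequences to a CNN; the abstract does not state which encoding was used. The names `one_hot_encode`, `r_squared`, and `max_len` are hypothetical.

```python
# Hypothetical sketch: the encoding scheme is an assumption, not taken from the study.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode a protein sequence as a max_len x 20 matrix of 0/1 values.

    Sequences longer than max_len are truncated, shorter ones are
    zero-padded, and non-standard residues are left as all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(sequence.upper()[:max_len]):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A matrix produced this way could serve as the input to a one-dimensional convolutional model, with one regression model per property, and `r_squared` could then be computed on held-out measured versus predicted values for each property.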
Keywords: Convolutional Neural Networks; Disulfide bonds; Gibbs Free Energy of Unfolding; Heat capacity; Machine Learning; Melting Temperature; Protein Folding; Protein Stability; Thermodynamics of Proteins.
Copyright © 2024 Elsevier Ltd. All rights reserved.