While our ability to predict changes in protein stability due to amino acid substitutions has improved substantially, methods to predict the absolute stability of a protein have progressed more slowly. Here, we show how a generative model for protein sequences can be leveraged to predict absolute protein stability. We benchmark our predictions on a broad set of natural, small- to medium-sized proteins of up to ca. 150 amino acid residues and find a mean error of 1.5 kcal/mol and a correlation coefficient of 0.7 for the absolute stability. We analyze current limitations and future directions, including how such a model may be useful for predicting conformational free energies. Our approach is simple to use, and an implementation is freely available online via https://github.com/KULL-Centre/_2024_cagiada_stability.
Keywords: machine learning; protein folding; protein stability; thermodynamics.
© 2024 The Protein Society.