Laccases are multicopper oxidases that oxidize a wide variety of substrates and contribute to numerous biological functions and industrial processes. However, their industrial adoption has been held back by limited thermostability. This study used machine learning models, namely random forest (RF) regressors and convolutional neural networks (CNNs), to predict laccase thermostability and to guide its enhancement. The RF model estimated melting temperatures with a training mean squared error (MSE) of 13.98 and a high training accuracy (93.01%), but its markedly higher test and validation MSEs of 48.81 and 58.42, respectively, pointed to overfitting and room for further model optimization. The CNN model refined these predictions, achieving lower training and validation MSEs than the RF model and thus showing a stronger ability to discern complex patterns in genomic sequences that are indicative of thermostability. Integrating the two models improved prediction accuracy and provided insight into the critical determinants of enzyme stability, supporting broader industrial application of laccases. Our findings underscore the potential of machine learning for enzyme engineering, particularly for improving the stability of industrial enzymes.
Keywords: computational genomics; convolutional neural network; enzyme engineering; laccase thermostability; machine learning; random forest regression.
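As an aid to reading the reported errors, the short sketch below converts the RF model's MSE values into root mean squared errors (RMSE), which are expressed in the same temperature units as the melting temperatures, and compares each split against the training error. It uses only the three MSE figures stated in the abstract; the variable names and the choice of RMSE and an MSE ratio as summaries are illustrative assumptions, not part of the study's pipeline.

```python
import math

# MSE values reported in the abstract for the random forest (RF) regressor.
rf_mse = {"train": 13.98, "test": 48.81, "validation": 58.42}

# RMSE is the square root of the MSE, so it has the same units as the target
# (here, melting temperature).
rf_rmse = {split: math.sqrt(mse) for split, mse in rf_mse.items()}

# Ratio of each split's MSE to the training MSE. Values well above 1 on the
# held-out splits suggest the model fits the training data more closely than
# unseen data.
mse_ratio = {split: mse / rf_mse["train"] for split, mse in rf_mse.items()}

for split in rf_mse:
    print(
        f"{split:>10}: MSE={rf_mse[split]:.2f}  "
        f"RMSE={rf_rmse[split]:.2f}  "
        f"MSE/train MSE={mse_ratio[split]:.2f}"
    )
```

Read this way, the RF model's typical error is roughly 3.7 temperature units on the training data and roughly 7.0 and 7.6 units on the test and validation data, with held-out MSEs about 3.5 and 4.2 times the training MSE.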