Despite the promise of deep learning-accelerated protein engineering, examples of proteins improved in this way are scarce. Here we report that a 3D convolutional neural network trained to associate amino acids with their neighboring chemical microenvironments can guide the identification of novel gain-of-function mutations that are not predicted by energetics-based approaches. Combining these mutations improved protein function in vivo by at least 5-fold across three diverse proteins. Furthermore, this model provides a means to interrogate the chemical space within protein microenvironments and to identify the specific chemical interactions that contribute to the gain-of-function phenotypes resulting from individual mutations.
Keywords: computational protein design; machine learning; neural networks; protein engineering.