Motivation: Mutations in protein-protein interactions can affect the corresponding complexes, impacting function and potentially leading to disease. Given the abundance of membrane proteins, it is crucial to assess the impact of mutations on the binding affinity of these proteins. Although several methods exist to predict the binding free energy change due to mutations in protein-protein complexes, most require structural information of the protein complex and are primarily trained on the SKEMPI database, which is composed mainly of soluble proteins.
Results: A novel sequence-based method (SAAMBE-MEM) for predicting binding free energy changes (ΔΔG) in membrane protein-protein complexes due to mutations has been developed. This method utilized the MPAD database, which contains binding affinities for wild-type and mutant membrane protein complexes. A machine learning model was developed to predict ΔΔG by leveraging features such as amino acid indices and position-specific scoring matrices (PSSM). Through extensive dataset curation and feature extraction, SAAMBE-MEM was trained and validated using the XGBoost regression algorithm. The optimal feature set, including PSSM-related features, achieved a Pearson correlation coefficient of 0.64, outperforming existing methods trained on the SKEMPI database. Furthermore, it was demonstrated that SAAMBE-MEM performs much better when utilizing evolution-based features in contrast to physicochemical features.
Availability and implementation: The method is accessible via a web server and standalone code at http://compbio.clemson.edu/SAAMBE-MEM/. The cleaned MPAD database is available at the website.
© The Author(s) 2024. Published by Oxford University Press.