Machine Learning for Prioritization of Thermostabilizing Mutations for G-Protein Coupled Receptors

Biophys J. 2019 Dec 3;117(11):2228-2239. doi: 10.1016/j.bpj.2019.10.023. Epub 2019 Oct 24.

Abstract

Although the three-dimensional structures of G-protein coupled receptors (GPCRs), the largest superfamily of drug targets, have enabled structure-based drug design, no structures are available for 87% of GPCRs. This is due to the difficulty of purifying the inherently flexible GPCRs. Identifying thermostabilizing mutations by systematic alanine scanning has been a successful strategy for stabilizing GPCRs, but it remains a daunting task to repeat for each GPCR. We developed a computational method that combines sequence-, structure-, and dynamics-based molecular properties of GPCRs that recapitulate GPCR stability with four different machine learning methods to predict thermostable mutations ahead of experiments. The method was trained on thermostability data for 1231 mutants, the largest publicly available data set. A blind prediction of thermostable mutations for the complement factor C5a receptor 1 retrieved 36% of the thermostable mutants in the top 50 prioritized mutants, compared with 3% in the first 50 attempts using systematic alanine scanning.
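
The comparison reported in the abstract (36% versus 3% within 50 mutants) can be read as a recall-at-k measurement: of all thermostable mutants, what fraction appears among the first k candidates tested. The sketch below illustrates that reading only. The function name recall_at_k, the scores, and the labels are hypothetical and are not taken from the paper, which does not state how the figure was computed.

```python
from typing import Sequence


def recall_at_k(scores: Sequence[float], is_thermostable: Sequence[bool], k: int) -> float:
    """Fraction of all thermostable mutants found among the k highest-scored mutants.

    Assumption: a higher score means the mutant is prioritized earlier.
    """
    if len(scores) != len(is_thermostable):
        raise ValueError("scores and labels must have the same length")
    total_positive = sum(is_thermostable)
    if total_positive == 0:
        return 0.0
    # Indices of mutants ordered from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(1 for i in ranked[:k] if is_thermostable[i])
    return hits / total_positive


# Toy data, invented for illustration (not from the paper).
toy_scores = [0.9, 0.1, 0.8, 0.4, 0.7, 0.2]
toy_labels = [True, False, False, True, True, False]

# Model-based ordering: test the three highest-scored mutants first.
model_recall = recall_at_k(toy_scores, toy_labels, k=3)

# Sequential baseline: test mutants in their original order, as in a
# position-by-position alanine scan. Earlier positions get higher scores.
sequential_scores = [-float(i) for i in range(len(toy_labels))]
baseline_recall = recall_at_k(sequential_scores, toy_labels, k=3)

print(f"model: {model_recall:.2f}, sequential baseline: {baseline_recall:.2f}")
```

With real data, k would be 50, the scores would come from the trained models, and the labels from the measured thermostability of each mutant.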

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alanine / chemistry
  • Alanine / genetics
  • Amino Acid Substitution
  • HEK293 Cells
  • Humans
  • Machine Learning
  • Molecular Dynamics Simulation*
  • Mutation*
  • Protein Domains
  • Protein Stability
  • Receptor, Anaphylatoxin C5a / chemistry*
  • Receptor, Anaphylatoxin C5a / genetics
  • Sequence Analysis / methods*

Substances

  • Receptor, Anaphylatoxin C5a
  • Alanine