Lipid discovery enabled by sequence statistics and machine learning

Elife. 2024 Dec 10;13:RP94929. doi: 10.7554/eLife.94929.

Abstract

Bacterial membranes are complex and dynamic, arising from an array of evolutionary pressures. One enzyme that alters membrane composition through covalent lipid modification is MprF. We recently showed that Streptococcus agalactiae MprF synthesizes lysyl-phosphatidylglycerol (Lys-PG) from anionic PG, and a novel cationic lipid, lysyl-glucosyl-diacylglycerol (Lys-Glc-DAG), from the neutral glycolipid Glc-DAG. This unexpected result prompted us to investigate whether Lys-Glc-DAG occurs in other MprF-containing bacteria, and whether other novel MprF products exist. Here, we studied the protein sequence features that determine MprF substrate specificity. First, pairwise analyses identified several streptococcal MprFs that synthesize Lys-Glc-DAG. Second, a restricted Boltzmann machine-guided approach led us to discover an entirely new substrate for MprF in Enterococcus, diglucosyl-diacylglycerol (Glc2-DAG), and an expanded set of organisms that modify glycolipid substrates using MprF. Overall, we combined the wealth of available sequence data with machine learning to model evolutionary constraints on MprF sequences across the bacterial domain, thereby identifying a novel cationic lipid.
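
As a rough illustration of how a restricted Boltzmann machine can score aligned protein sequences, the sketch below one-hot encodes a sequence and computes the standard RBM free energy, F(v) = -sum_i a_i v_i - sum_j log(1 + exp(b_j + sum_i W_ij v_i)). This is not the authors' model: the abstract does not specify the architecture, so binary hidden units are assumed here, the weights are random rather than trained, and the sequences, alphabet ordering, and function names are all hypothetical.

```python
import math
import random

# Assumed alphabet: 20 standard amino acids plus an alignment gap symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot(seq):
    """Flatten an aligned sequence into a 0/1 vector of length len(seq) * 21."""
    vec = []
    for ch in seq:
        col = [0] * len(ALPHABET)
        col[ALPHABET.index(ch)] = 1
        vec.extend(col)
    return vec


def hidden_inputs(v, W, b):
    """Total input to each hidden unit: b[j] + sum_i W[i][j] * v[i]."""
    return [
        b[j] + sum(W[i][j] * v[i] for i in range(len(v)))
        for j in range(len(b))
    ]


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(v, W, a, b):
    """RBM free energy with binary hidden units (an assumption of this sketch).

    Lower free energy corresponds to higher model probability.
    """
    visible_term = sum(a[i] * v[i] for i in range(len(v)))
    hidden_term = sum(softplus(x) for x in hidden_inputs(v, W, b))
    return -visible_term - hidden_term


# Toy, made-up aligned sequences (not real MprF sequences).
toy_seqs = ["MKV-L", "MKVAL", "MRVAL"]
n_visible = len(toy_seqs[0]) * len(ALPHABET)
n_hidden = 3

# Untrained parameters: small random weights, zero biases.
rng = random.Random(0)
W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_visible
b = [0.0] * n_hidden

scores = {s: free_energy(one_hot(s), W, a, b) for s in toy_seqs}
for s, f in scores.items():
    print(s, round(f, 4))
```

In a trained model, the per-sequence hidden-unit inputs returned by `hidden_inputs` are the quantities one would inspect to look for sequence features that separate enzymes with different substrate preferences; with random weights, as here, the scores carry no biological meaning.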

Keywords: Enterococcus faecalis; MprF; Streptococcus agalactiae; computational biology; enzyme specificity; infectious disease; lipid membrane; microbiology; restricted Boltzmann machine; systems biology.

MeSH terms

  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Enterococcus / genetics
  • Glycolipids / chemistry
  • Glycolipids / metabolism
  • Lysine
  • Machine Learning*
  • Phosphatidylglycerols / chemistry
  • Phosphatidylglycerols / metabolism
  • Streptococcus agalactiae / enzymology
  • Streptococcus agalactiae / genetics
  • Substrate Specificity

Substances

  • Bacterial Proteins
  • Phosphatidylglycerols
  • lysylphosphatidylglycerol
  • Glycolipids
  • Lysine