Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction

Nat Comput Sci. 2022 Dec;2(12):845-865. doi: 10.1038/s43588-022-00372-4. Epub 2022 Dec 19.

Abstract

Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and its benchmarking: the lack of a unified ML formalization of immunological antibody-specificity prediction problems, and the unavailability of large-scale synthetic datasets for benchmarking real-world-relevant ML methods and dataset design. Here we developed the Absolut! software suite, which enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that, for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world-relevant development and benchmarking of ML strategies for biotherapeutics design.
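
The claim that accuracy-based rankings of ML methods carry over from experimental to synthetic data can be made concrete with a rank correlation. The sketch below is only an illustration of that idea and is not taken from the Absolut! software: it computes a Spearman rank correlation between two lists of per-method accuracies. The function names, the method names and the accuracy values are all hypothetical placeholders, not results from the paper.

```python
def ranks(scores):
    """Rank scores from 1 (highest) downward; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical method names and made-up placeholder accuracies (NOT from the paper).
methods = ["method_A", "method_B", "method_C"]
accuracy_experimental = [0.9, 0.8, 0.7]
accuracy_synthetic = [0.6, 0.5, 0.4]

rho = spearman(accuracy_experimental, accuracy_synthetic)
print(f"Spearman rank correlation across {len(methods)} methods: {rho:.2f}")
```

A correlation close to 1 would indicate that the ordering of methods is preserved between the two data sources, even when the absolute accuracies differ.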

MeSH terms

  • Antibodies*
  • Antibody Specificity
  • Antigen-Antibody Reactions*
  • Epitopes / chemistry
  • Machine Learning

Substances

  • Antibodies
  • Epitopes