Linguistics-based formalization of the antibody language as a basis for antibody language models

Nat Comput Sci. 2024 Jun;4(6):412-422. doi: 10.1038/s43588-024-00642-3. Epub 2024 Jun 14.

Abstract

Apparent parallels between natural language and antibody sequences have led to a surge in deep language models applied to antibody sequences for predicting cognate antigen recognition. However, a formal linguistic definition of the antibody language does not exist, and how antibody language models capture antibody-specific binding features remains largely uninterpretable. Here we describe how a linguistic formalization of the antibody language, by characterizing its tokens and grammar, could address current challenges in antibody language model rule mining.

Publication types

  • Review

MeSH terms

  • Antibodies* / immunology
  • Humans
  • Linguistics*

Substances

  • Antibodies