Machine learning-enhanced immunopeptidomics applied to T-cell epitope discovery for COVID-19 vaccines

Nat Commun. 2024 Nov 28;15(1):10316. doi: 10.1038/s41467-024-54734-9.

Abstract

Next-generation T-cell-directed vaccines for COVID-19 focus on establishing lasting T-cell immunity against current and emerging SARS-CoV-2 variants. Precise identification of conserved T-cell epitopes is critical for designing effective vaccines. Here we introduce a comprehensive computational framework incorporating a machine learning algorithm, MHCvalidator, to enhance the sensitivity of mass spectrometry-based immunopeptidomics. MHCvalidator identifies unique T-cell epitopes presented by the B7 supertype, including an epitope from a +1 frameshift in a truncated Spike antigen, supported by ribosome profiling. Analysis of 100,512 COVID-19 patient proteomes shows Spike antigen truncation in 0.85% of cases, revealing frameshifted viral antigens at the population level. Our EpiTrack pipeline tracks global mutations of MHCvalidator-identified CD8+ T-cell epitopes from the BNT162b4 vaccine. While most vaccine epitopes remain globally conserved, an immunodominant A*01-associated epitope mutates in Delta and Omicron variants. This work highlights SARS-CoV-2 antigenic features and emphasizes the importance of continuous adaptation in T-cell vaccine development.
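
As an illustration of what tracking epitope conservation across variants can look like in practice, the following minimal Python sketch computes, for each variant label, the fraction of protein sequences that still contain an exact copy of a given epitope. It is not the EpiTrack implementation, which the abstract does not describe; the function name `epitope_conservation`, the exact-match criterion, and the toy sequences (which are made up and are not SARS-CoV-2 sequences) are all assumptions made for illustration.

```python
def epitope_conservation(epitope, sequences_by_variant):
    """Return, per variant label, the fraction of sequences that
    contain the epitope as an exact substring.

    Hypothetical helper: exact substring matching is an assumed
    conservation criterion, not the published method.
    """
    result = {}
    for variant, sequences in sequences_by_variant.items():
        if not sequences:
            # Skip variants with no sequences to avoid dividing by zero.
            continue
        hits = sum(1 for seq in sequences if epitope in seq)
        result[variant] = hits / len(sequences)
    return result


# Toy, made-up protein fragments (NOT real viral sequences).
toy_sequences = {
    "ancestral": ["MKTAYLQPEPTIDEGG", "MKTAYLQPEPTIDEGG"],
    "variant_x": ["MKTAYLQPEPTIDEGG", "MKTAYLQPEPSIDEGG"],  # one T->S substitution
}

conservation = epitope_conservation("LQPEPTIDE", toy_sequences)
for variant, fraction in conservation.items():
    print(f"{variant}: {fraction:.0%} of sequences retain the epitope")
```

In this toy example the epitope is fully conserved in the "ancestral" group and retained in half of the "variant_x" sequences. A real analysis would likely need sequence alignment and handling of ambiguous residues, neither of which is sketched here.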

MeSH terms

  • CD8-Positive T-Lymphocytes / immunology
  • COVID-19 Vaccines* / immunology
  • COVID-19* / immunology
  • COVID-19* / prevention & control
  • COVID-19* / virology
  • Epitopes, T-Lymphocyte* / immunology
  • Humans
  • Machine Learning*
  • Proteomics / methods
  • SARS-CoV-2* / genetics
  • SARS-CoV-2* / immunology
  • Spike Glycoprotein, Coronavirus* / genetics
  • Spike Glycoprotein, Coronavirus* / immunology

Substances

  • Epitopes, T-Lymphocyte
  • COVID-19 Vaccines
  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2

Supplementary concepts

  • SARS-CoV-2 variants