The recognition of pathogen- or cancer-specific epitopes by CD8+ T cells is crucial for the clearance of infections and the response to cancer immunotherapy. This process requires epitopes to be presented on class I human leukocyte antigen (HLA-I) molecules and recognized by the T-cell receptor (TCR). Machine learning models capturing these two aspects of immune recognition are key to improving epitope predictions. Here, we assembled a high-quality dataset of naturally presented HLA-I ligands and experimentally verified neo-epitopes. We then integrated these data into a refined computational framework to predict antigen presentation (MixMHCpred2.2) and TCR recognition (PRIME2.0). The depth of our training data and the algorithmic developments resulted in improved predictions of HLA-I ligands and neo-epitopes. Prospectively applying our tools to SARS-CoV-2 proteins revealed several epitopes. TCR sequencing identified a monoclonal response in effector/memory CD8+ T cells against one of these epitopes, as well as cross-reactivity with the homologous peptides from other coronaviruses.
Keywords: CD8(+) T cell epitopes; HLA-I peptidomics; antigen presentation; computational biology; epitope predictions; immunology; machine learning.
Copyright © 2022 Elsevier Inc. All rights reserved.