Introduction: The identification of peptides eluted from HLA complexes by mass spectrometry (MS) can provide critical data for deep learning models that predict antigen presentation and can promote neoantigen vaccine design. A major challenge remains: determining which HLA allele each eluted peptide corresponds to.
Methods: To address this, we present LRMAHpan, a tool for predicting multiple allele (MA) presentation that integrates an LSTM network and a ResNet_CA network for antigen processing and presentation prediction. We trained and tested the LRMAHpan BA (binding affinity) and LRMAHpan AP (antigen processing) models using mass spectrometry data, and subsequently combined them into the LRMAHpan PS (presentation score) model. Our approach is based on a novel pHLA encoding method that allows neoantigen prediction tasks to be addressed with computer vision methods. This method aggregates MA data into a multichannel matrix and incorporates peptide sequences to efficiently capture binding signals.
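To make the idea of a multichannel pHLA matrix concrete, the following is a minimal illustrative sketch and not the published LRMAHpan encoding, whose exact layout is not given here. It assumes one channel per allele, one-hot residue encoding, peptide rows stacked above allele rows, and zero-filled channels for missing alleles. The function names, the channel count of six, the padded peptide length, and the allele sequence length are all hypothetical choices.

```python
# Illustrative sketch only: every layout choice below is an assumption,
# not the encoding used by LRMAHpan.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(seq, length):
    """One-hot encode `seq` into a length x 20 grid, zero-padded on the right.

    Residues outside the 20 standard amino acids are left as all-zero rows.
    """
    grid = [[0.0] * len(AMINO_ACIDS) for _ in range(length)]
    for pos, aa in enumerate(seq[:length]):
        col = AA_INDEX.get(aa)
        if col is not None:
            grid[pos][col] = 1.0
    return grid


def encode_sample(peptide, allele_seqs, max_alleles=6, max_pep_len=15, allele_len=34):
    """Build a [max_alleles][max_pep_len + allele_len][20] nested list.

    Each channel pairs the same peptide with one allele sequence from the
    sample; samples with fewer alleles get all-zero channels.
    """
    rows = max_pep_len + allele_len
    channels = []
    for i in range(max_alleles):
        if i < len(allele_seqs):
            channel = one_hot(peptide, max_pep_len) + one_hot(allele_seqs[i], allele_len)
        else:
            channel = [[0.0] * len(AMINO_ACIDS) for _ in range(rows)]
        channels.append(channel)
    return channels
```

Under these assumptions, a sample with a single known allele would yield one populated channel and five empty ones, which an image-style network such as a ResNet could consume in the same way as a multichannel image.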
Results: LRMAHpan outperforms standard predictors such as NetMHCpan 4.1, MHCflurry 2.0, and TransPHLA in terms of positive predictive value (PPV) when applied to MA data. Additionally, it can accommodate peptides of variable lengths and predict HLA class I and II presentation. We also predicted neoantigens in a cohort of metastatic melanoma patients, identifying several shared neoantigens.
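For readers unfamiliar with the metric, the sketch below shows one common way PPV is computed for presentation predictors: rank all candidate peptides by predicted score and report the fraction of true MS-detected peptides among the top n, where n is the number of true positives. Whether the comparison above uses exactly this convention is an assumption; the function name `ppv_at_top_n` is hypothetical.

```python
def ppv_at_top_n(scores, labels):
    """Fraction of true positives among the n highest-scoring peptides,
    where n is the total number of positives (labels are 0 or 1).

    Assumed convention; other PPV definitions use a fixed score threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Sort by score, highest first, keeping each label paired with its score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_pos])
    return hits / n_pos
```

With scores `[0.9, 0.8, 0.7, 0.1]` and labels `[1, 0, 1, 0]`, the top two predictions contain one true positive, so the function returns 0.5.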
Discussion: Our results demonstrate that LRMAHpan significantly improves the accuracy of antigen presentation predictions.
Keywords: MHC; antigen processing; biomedical engineering; deep learning; multi allelic HLA; neoantigen prediction.
Copyright © 2024 Mi, Li, Ye, Dai, Ding, Sun, Shen and Xiao.