Recognition of specific epitopes presented by the major histocompatibility complex by T-cell receptors (TCRs) on the T-cell surface is key to triggering the immune response. Identifying the binding rules of TCR-epitope pairs is crucial for developing immunotherapies, including neoantigen vaccines and drugs. Accurate prediction of TCR-epitope binding specificity via deep learning remains challenging, especially for test cases that are unseen in the training set. Here, we propose TEPCAM (TCR-EPitope identification based on Cross-Attention and Multi-channel convolution), a deep learning model that incorporates self-attention, cross-attention, and multi-channel convolution to improve generalizability and enhance model interpretability. Experimental results demonstrate that our model outperforms several state-of-the-art models on two challenging tasks: a strictly split dataset and an external dataset. Furthermore, the model can learn some interaction patterns between TCR and epitope, as shown by extracting the interpretable matrix from the cross-attention layer and mapping it to three-dimensional structures. The source code and data are freely available at https://github.com/Chenjw99/TEPCAM.
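To make the role of the cross-attention matrix concrete, the following is a minimal NumPy sketch of single-head cross-attention between a TCR sequence and an epitope sequence. It is not the TEPCAM implementation: the random embedding and projection weights, the dimension `d_model`, the helper names `one_hot` and `cross_attention`, and the two example sequences are all assumptions chosen for illustration only. It shows how a residue-by-residue weight matrix (TCR positions as rows, epitope positions as columns) arises, which is the kind of matrix that could then be inspected for interaction patterns.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues

def one_hot(seq):
    """Encode an amino-acid string as a (len(seq), 20) one-hot matrix."""
    idx = [AMINO_ACIDS.index(a) for a in seq]
    out = np.zeros((len(seq), len(AMINO_ACIDS)))
    out[np.arange(len(seq)), idx] = 1.0
    return out

def cross_attention(queries, keys, values):
    """Scaled dot-product attention; returns (context, attention weights)."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over epitope positions
    return weights @ values, weights

# Hypothetical, untrained parameters (random, for illustration only).
rng = np.random.default_rng(0)
d_model = 16
W_embed = rng.normal(size=(len(AMINO_ACIDS), d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

# Illustrative placeholder sequences, not taken from the paper's datasets.
tcr = one_hot("CASSLGQAYEQYF") @ W_embed   # (13, d_model)
epitope = one_hot("GILGFVFTL") @ W_embed   # (9, d_model)

# TCR residues query the epitope residues.
context, attn = cross_attention(tcr @ W_q, epitope @ W_k, epitope @ W_v)
print(attn.shape)  # one row per TCR residue, one column per epitope residue
```

Each row of `attn` sums to 1 and describes how strongly one TCR residue attends to each epitope residue; with trained weights, large entries would be candidates for comparison against residue contacts in a solved structure.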
Keywords: TCR-epitope binding specificity; convolution; cross-attention; deep learning; model interpretability.
© 2023 The Protein Society.