It has recently become possible to simultaneously assay T-cell specificity with respect to large sets of antigens and the T-cell receptor sequence in high-throughput single-cell experiments. Leveraging this new type of data, we propose and benchmark a collection of deep learning architectures to model T-cell specificity in single cells. In agreement with previous results, we found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly. Moreover, we show that variability in single-cell immune repertoire screens can be mitigated by modeling cell-specific covariates. Lastly, we demonstrate that the number of bound pMHC complexes can be predicted in a continuous fashion providing a gateway to disentangle cell-to-dextramer binding strength and receptor-to-pMHC affinity. We provide these models in the Python package TcellMatch to allow imputation of antigen specificities in single-cell RNA-seq studies on T cells without the need for MHC staining.
Keywords: T-cell receptors; antigen specificity; multimodal; single cell; supervised learning.
© 2020 The Authors. Published under the terms of the CC BY 4.0 license.