Background: Major histocompatibility complex (MHC) class I molecules play key roles in host immunity against pathogens by presenting peptide antigens to CD8+ T cells. Many variants of MHC molecules exist, and each has a unique preference for certain peptide ligands. Both experimental approaches and computational algorithms have been used to analyze these peptide-MHC binding characteristics. Traditionally, MHC binding specificities have been described in terms of binding motifs. Such motifs classify certain peptide positions as primary or secondary anchors according to their impact on binding, and they list the preferred and deleterious residues at these positions. This provides a concise and easily communicable summary of MHC binding specificities. However, so far there has been no algorithm to generate such binding motifs in an automated and uniform fashion.
Results: In this paper, we present a computational pipeline that takes peptide-MHC binding data as input and produces a concise MHC binding motif. We tested our pipeline on a set of 18 MHC class I molecules and showed that the derived motifs are consistent with historical expert assignments.
Conclusions: We have implemented a pipeline that formally codifies rules to generate MHC binding motifs. The pipeline has been incorporated into the Immune Epitope Database and Analysis Resource (IEDB), and motifs can be visualized while browsing MHC alleles in the IEDB.