INeo-Epp: A Novel T-Cell HLA Class-I Immunogenicity or Neoantigenic Epitope Prediction Method Based on Sequence-Related Amino Acid Features

Biomed Res Int. 2020 Jun 15;2020:5798356. doi: 10.1155/2020/5798356. eCollection 2020.

Abstract

In silico T-cell epitope prediction plays an important role in the design of immunization experiments and in vaccine preparation. Currently, most epitope prediction research focuses on peptide processing and presentation, e.g., proteasomal cleavage, transport by the transporter associated with antigen processing (TAP), and binding to the major histocompatibility complex (MHC). To date, however, the mechanism underlying the immunogenicity of epitopes remains unclear. It is generally agreed that T-cell immunogenicity may be influenced, to different degrees, by the foreignness, accessibility, molecular weight, molecular structure, molecular conformation, chemical properties, and physical properties of the target peptides. In this work, we tried to combine these factors. First, we collected a substantial amount of experimental HLA-I T-cell immunogenic peptide data, as well as amino acid properties potentially related to immunogenicity. Several features were extracted, including the amino acid physicochemical properties of the epitope sequence, peptide entropy, the eluted ligand likelihood percentile rank (EL rank(%)) score, and a frequency score for immunogenic peptides. Subsequently, a random forest classifier was constructed for T-cell immunogenic, HLA-I-presented antigen epitopes and neoantigens. The classification results for antigen epitopes outperformed previous research (optimal AUC = 0.81; external validation data set AUC = 0.77). Because mutational epitopes generated by coding regions contain alterations of only one or two amino acids, we assumed that these features might also apply to the classification of endogenous mutational neoepitopes, also called "neoantigens." Based on mutation information and sequence-related amino acid features, a neoantigen prediction model was established as well (optimal AUC = 0.78). Finally, an easy-to-use web-based tool, "INeo-Epp," was developed for the prediction of human immunogenic antigen epitopes and neoantigen epitopes.
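
To make the sequence-derived features mentioned above more concrete, the following is a minimal sketch of how two of them might be computed for a single peptide. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define "peptide entropy" or name the physicochemical indices used, so Shannon entropy over residue composition and the Kyte-Doolittle hydropathy scale are assumed here, and the function names (`peptide_entropy`, `mean_hydropathy`, `feature_vector`) are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale. ASSUMPTION: used here only as one
# standard example of a physicochemical index; the abstract does not
# say which indices INeo-Epp uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _validate(peptide: str) -> None:
    """Reject empty peptides and non-standard residue codes."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    unknown = set(peptide) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")


def peptide_entropy(peptide: str) -> float:
    """Shannon entropy (in bits) of the peptide's residue composition.

    ASSUMPTION: one common definition of sequence entropy; the paper's
    exact definition is not given in the abstract.
    """
    _validate(peptide)
    n = len(peptide)
    counts = Counter(peptide)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues."""
    _validate(peptide)
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def feature_vector(peptide: str) -> list[float]:
    """Hypothetical two-element feature vector for one peptide."""
    return [peptide_entropy(peptide), mean_hydropathy(peptide)]


if __name__ == "__main__":
    # GILGFVFTL is a well-known HLA-A*02:01-restricted influenza epitope,
    # used here purely as example input.
    print(feature_vector("GILGFVFTL"))
```

In a full pipeline, vectors like these would be combined with the EL rank(%) score and the immunogenic-peptide frequency score, neither of which is reproduced here, before being passed to a random forest classifier.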

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Area Under Curve
  • Computational Biology / methods*
  • Databases, Protein
  • Epitopes, T-Lymphocyte* / chemistry
  • Epitopes, T-Lymphocyte* / immunology
  • Epitopes, T-Lymphocyte* / metabolism
  • Genes, MHC Class I / immunology*
  • Humans
  • Machine Learning
  • Peptides / chemistry
  • Peptides / immunology
  • Peptides / metabolism
  • Sequence Analysis, Protein
  • Software*
  • T-Lymphocytes* / chemistry
  • T-Lymphocytes* / immunology

Substances

  • Epitopes, T-Lymphocyte
  • Peptides