Dana-Farber repository for machine learning in immunology

J Immunol Methods. 2011 Nov 30;374(1-2):18-25. doi: 10.1016/j.jim.2011.07.007. Epub 2011 Jul 18.

Abstract

The immune system is characterized by high combinatorial complexity that necessitates the use of specialized computational tools for analysis of immunological data. Machine learning (ML) algorithms are used in combination with classical experimentation for the selection of vaccine targets and in computational simulations that reduce the number of necessary experiments. The development of ML algorithms requires standardized data sets, consistent measurement methods, and uniform scales. To bridge the gap between the immunology community and the ML community, we designed the Dana-Farber Repository for Machine Learning in Immunology (DFRMLI). This repository provides standardized data sets of HLA-binding peptides with all binding affinities mapped onto a common scale. It also provides a list of experimentally validated, naturally processed T cell epitopes derived from tumor or virus antigens. The DFRMLI data were preprocessed to ensure consistency and comparability, and are provided with detailed descriptions and statistically meaningful sample sizes for peptides that bind to various HLA molecules. The repository is accessible at http://bio.dfci.harvard.edu/DFRMLI/.
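
The abstract does not say which common scale DFRMLI uses for binding affinities. As an illustration only, the sketch below applies a log transform that is widely used in peptide-MHC binding prediction, 1 - log(IC50) / log(50000), which maps IC50 values in nM onto the range 0 to 1. The function name, the 50,000 nM cap, and the peptide labels are assumptions made for this example and are not taken from the repository.

```python
import math

# Assumed cap in nM: affinities at or weaker than this map to 0.
# This value is a common convention, not one stated in the abstract.
MAX_IC50_NM = 50000.0


def ic50_to_common_scale(ic50_nm: float, max_ic50_nm: float = MAX_IC50_NM) -> float:
    """Map an IC50 value (nM) onto [0, 1]; stronger binders score higher."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive number")
    score = 1.0 - math.log(ic50_nm) / math.log(max_ic50_nm)
    # Clamp so values outside 1..max_ic50_nm stay within the scale.
    return min(1.0, max(0.0, score))


# Hypothetical measurements: (placeholder peptide label, IC50 in nM).
measurements = [("peptide_1", 1.0), ("peptide_2", 500.0), ("peptide_3", 50000.0)]

scaled = {label: ic50_to_common_scale(ic50) for label, ic50 in measurements}

for label, score in scaled.items():
    print(f"{label}: {score:.3f}")
```

Once measurements from different assays are on one bounded scale like this, a single regression or classification model can be trained on the pooled data, which is the kind of uniformity the abstract argues ML development needs.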

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Academies and Institutes
  • Algorithms
  • Allergy and Immunology / statistics & numerical data*
  • Artificial Intelligence*
  • Boston
  • Databases, Factual
  • Epitope Mapping / methods
  • Epitopes, T-Lymphocyte / metabolism
  • HLA Antigens / metabolism
  • Humans
  • Peptides / metabolism
  • Protein Binding

Substances

  • Epitopes, T-Lymphocyte
  • HLA Antigens
  • Peptides