Classification of dendritic cell phenotypes from gene expression data

BMC Immunol. 2011 Aug 29:12:50. doi: 10.1186/1471-2172-12-50.

Abstract

Background: The selection of relevant genes for sample classification is a common task in many gene expression studies. Although a number of tools have been developed to identify optimal gene expression signatures, they often generate gene lists that are too long to be exploited clinically. Consequently, researchers in the field try to identify the smallest set of genes that provides good sample classification. We investigated the genome-wide expression of the inflammatory phenotype in dendritic cells. Dendritic cells are a complex group of cells that play a critical role in vertebrate immunity. Therefore, the prediction of the inflammatory phenotype in these cells may help with the selection of immune-modulating compounds.

Results: A data mining protocol was applied to microarray data for murine cell lines treated with various inflammatory stimuli. The learning and validation data sets consisted of 155 and 49 samples, respectively. The data mining protocol reduced the number of probe sets from 5,802 to 10, then from 10 to 6 and finally from 6 to 3. The performances of a set of supervised classification models were compared. With the following six genes (Il12b, Cd40, Socs3, Irgm1, Plin2 and Lgals3bp), the best accuracy was obtained by Tree Augmented Naïve Bayes and Nearest Neighbour (91.8%). With the smallest set of three genes (Il12b, Cd40 and Socs3), the performance remained satisfactory and the best accuracy was obtained with Support Vector Machine (95.9%). These data mining models, using data for the genes Il12b, Cd40 and Socs3, were validated with a human data set consisting of 27 samples. Support Vector Machines (71.4%) and Nearest Neighbour (92.6%) gave the worst performances, but the remaining models correctly classified all 27 samples.
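To make the classification step in the Results more concrete, the following is a minimal sketch of a one-nearest-neighbour classifier over the three selected genes, together with an accuracy helper. It is an illustration only, not the authors' implementation: the expression values are invented toy numbers, the function names (`predict_1nn`, `accuracy`, `percent`) are hypothetical, and Euclidean distance is an assumption, since the abstract does not state the distance metric used.

```python
import math

# Hypothetical toy expression values (arbitrary units) for the three genes
# named in the abstract, in the order (Il12b, Cd40, Socs3). These numbers
# are invented for illustration and are NOT taken from the study.
TRAIN = [
    ((8.1, 7.9, 7.5), "inflammatory"),
    ((7.6, 8.3, 6.9), "inflammatory"),
    ((2.2, 3.1, 2.8), "steady-state"),
    ((1.9, 2.7, 3.3), "steady-state"),
]


def predict_1nn(sample, train=TRAIN):
    """Return the label of the training sample closest in Euclidean distance."""
    nearest = min(train, key=lambda item: math.dist(sample, item[0]))
    return nearest[1]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def percent(n_correct, n_total):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100 * n_correct / n_total, 1)


if __name__ == "__main__":
    # Classify two made-up samples with the toy training set.
    print(predict_1nn((7.0, 7.0, 7.0)))
    print(predict_1nn((2.0, 3.0, 3.0)))
    # Relate the reported percentages to sample counts.
    print(percent(45, 49), percent(47, 49), percent(25, 27))
```

Assuming each accuracy is computed over every sample in the corresponding set, 91.8% and 95.9% of the 49 murine validation samples correspond to 45 and 47 correctly classified samples, and 92.6% of the 27 human samples corresponds to 25.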

Conclusions: The genes selected by the proposed data mining protocol were shown to be informative for discriminating between inflammatory and steady-state phenotypes in dendritic cells. The robustness of the data mining protocol was confirmed by the accuracy for a human data set, when using only the following three genes: Il12b, Cd40 and Socs3. In summary, we analysed the longitudinal pattern of expression in dendritic cells stimulated with activating agents with the aim of identifying signatures that would predict or explain the dendritic cell response to an inflammatory agent.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • CD40 Antigens / genetics*
  • Cell Differentiation / immunology
  • Data Mining / methods
  • Dendritic Cells / classification*
  • Dendritic Cells / immunology*
  • Dendritic Cells / metabolism
  • Dendritic Cells / pathology
  • Gene Expression Profiling
  • Genome-Wide Association Study
  • Humans
  • Immunity, Cellular
  • Inflammation Mediators / immunology
  • Inflammation Mediators / metabolism
  • Information Systems
  • Interleukin-12 Subunit p40 / genetics*
  • Mice
  • Microarray Analysis
  • Suppressor of Cytokine Signaling 3 Protein
  • Suppressor of Cytokine Signaling Proteins / genetics*

Substances

  • CD40 Antigens
  • IL12B protein, human
  • Inflammation Mediators
  • Interleukin-12 Subunit p40
  • SOCS3 protein, human
  • Suppressor of Cytokine Signaling 3 Protein
  • Suppressor of Cytokine Signaling Proteins