Challenges in interpreting allergen microarrays in relation to clinical symptoms: a machine learning approach

Pediatr Allergy Immunol. 2014 Feb;25(1):71-9. doi: 10.1111/pai.12139. Epub 2013 Oct 16.

Abstract

Background: Identifying different patterns of allergens and understanding their predictive ability in relation to asthma and other allergic diseases is crucial for the design of personalized diagnostic tools.

Methods: Allergen-IgE screening using the ImmunoCAP ISAC® assay was performed at age 11 yrs in children participating in a population-based birth cohort. Logistic regression (LR) and nonlinear statistical learning models, including random forests (RF) and Bayesian networks (BN), coupled with feature selection approaches, were used to identify patterns of allergen responses associated with asthma, rhino-conjunctivitis, wheeze, eczema and airway hyper-reactivity (AHR, positive methacholine challenge). Sensitivity/specificity and area under the receiver operating characteristic curve (AUROC) were used to assess model performance via repeated validation.
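The performance measures named in the Methods can be illustrated with a minimal sketch. It assumes a binary outcome label and one model score per child; the function names (`sensitivity_specificity`, `auroc`), the 0.5 decision cut-off and the toy data are hypothetical and are not taken from the study. AUROC is computed here in its rank-based form, as the probability that a randomly chosen case scores higher than a randomly chosen non-case.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true, scores):
    """Rank-based AUROC: fraction of (case, non-case) pairs ordered correctly.

    Tied scores count as half a correctly ordered pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, illustrative only (not from the study).
y_true = [1, 1, 1, 0, 0, 0]               # 1 = outcome present, e.g. asthma
scores = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]   # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auroc(y_true, scores)
```

Sensitivity and specificity depend on the chosen cut-off, whereas AUROC summarizes discrimination across all cut-offs, which is presumably why both are reported.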

Results: Serum samples for IgE measurement were obtained from 461 of 822 (56.1%) participants. Two hundred and thirty-eight of 461 (51.6%) children had IgE > 0 ISU to at least one of 112 allergen components. Using a binary threshold of >0.3 ISU performed less well than using continuous IgE values, discretizing data or using other data transformations, but the difference was not significant (p = 0.1). With the exception of eczema (AUROC ≈ 0.5), LR, RF and BN achieved comparable AUROC, ranging from 0.76 to 0.82. Dust mite, pollens and pet allergens were highly associated with asthma, whilst pollens and dust mite were associated with rhino-conjunctivitis. Egg/bovine allergens were associated with eczema.
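The input representations compared in the Results can be sketched as follows. Only the >0.3 ISU binary cut-off comes from the text; the log transform and the discretization bin edges below are assumptions chosen for illustration, since the abstract does not state which transformations or bins were used. All function names are hypothetical.

```python
import math


def binarize(ige_isu, threshold=0.3):
    """Binary sensitization flag: 1 if IgE exceeds the threshold (in ISU), else 0."""
    return 1 if ige_isu > threshold else 0


def log_transform(ige_isu):
    """Assumed example of a 'data transformation': log(1 + x) keeps 0 ISU at 0."""
    return math.log1p(ige_isu)


def discretize(ige_isu, edges=(0.3, 1.0, 15.0)):
    """Ordinal level 0..len(edges); the bin edges are illustrative assumptions."""
    return sum(1 for e in edges if ige_isu > e)


# Toy IgE values in ISU for one allergen component (not study data).
values = [0.0, 0.2, 0.5, 3.0, 20.0]

binary = [binarize(v) for v in values]
levels = [discretize(v) for v in values]
logged = [log_transform(v) for v in values]
```

The binary flag discards how far a value lies above the cut-off, while the continuous, transformed and discretized forms retain some or all of that information.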

Conclusions: After validation, LR, RF and BN demonstrated reasonable discrimination ability for asthma, rhino-conjunctivitis, wheeze and AHR, but not for eczema. However, further improvements in threshold ascertainment and/or value transformation for different components, and better interpretation algorithms are needed to fully capitalize on the potential of the technology.

Keywords: Bayesian networks; IgE; airway hyper-reactivity; asthma; children; component-resolved diagnostics; feature selection; logistic regression; machine learning; methacholine; random forests; rhinitis; wheeze.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Allergens / immunology
  • Animals
  • Artificial Intelligence
  • Asthma / diagnosis*
  • Automation, Laboratory
  • Bronchial Hyperreactivity / diagnosis*
  • Bronchial Provocation Tests
  • Child
  • Cohort Studies
  • Diagnostic Tests, Routine
  • Female
  • Humans
  • Hypersensitivity / diagnosis*
  • Immunoglobulin E / blood*
  • Male
  • Microarray Analysis / methods*
  • Population Groups
  • Precision Medicine
  • Predictive Value of Tests
  • Reproducibility of Results

Substances

  • Allergens
  • Immunoglobulin E