Background: The asthma syndrome is influenced by hereditary and environmental factors. Using farm exposure as an example, we studied whether genetic and environmental factors interact in asthma.
Methods: Statistical learning approaches based on penalized regression and decision trees were used to predict asthma in the GABRIELA study with 850 cases (9% farm children) and 857 controls (14% farm children). Single-nucleotide polymorphisms (SNPs) were selected from a genome-wide dataset based on a literature search or by statistical selection techniques. Prediction was assessed by receiver operating characteristic (ROC) curves and validated in the PASTURE cohort.
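As an illustration of ROC-based assessment, the following minimal sketch computes the area under the ROC curve from predicted scores and case/control labels, together with an interval estimate. It is not the study's code: the function names `auc` and `bootstrap_ci`, the toy scores, and the choice of a percentile bootstrap for the interval are assumptions made for this example, since the abstract does not state how its intervals were obtained.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic.

    Equals the probability that a randomly chosen case (label 1) scores
    higher than a randomly chosen control (label 0); ties count one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only cases or only controls.
        if 0 < sum(ys) < len(ys):
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted probabilities and true labels (1 = case, 0 = control).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]

toy_auc = auc(toy_scores, toy_labels)  # 13 of 16 case-control pairs ranked correctly
toy_ci = bootstrap_ci(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.2f} [{toy_ci[0]:.2f}-{toy_ci[1]:.2f}]")
```

The pairwise formulation is quadratic in sample size, which is acceptable for a sketch; for a dataset of the size described here, a rank-based implementation or an established statistics library would be the more practical choice.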
Results: Prediction by family history of asthma and atopy yielded an area under the ROC curve (AUC) of 0.62 [0.57-0.66] in the random forest machine learning approach. By adding information on demographics (sex and age) and 26 environmental exposure variables, the quality of prediction significantly improved (AUC = 0.65 [0.61-0.70]). In farm children, however, environmental variables did not improve prediction quality. Rather, SNPs related to IL33 and RAD50 contributed significantly to the prediction of asthma (AUC = 0.70 [0.62-0.78]).
Conclusions: Asthma in farm children is more likely predicted by other factors than asthma in non-farm children, though in both forms, family history may integrate environmental exposure, genotype and degree of penetrance.
Keywords: childhood asthma; environment; farming; genome-wide association studies; machine learning; penalized regression; random forest; risk prediction; single-nucleotide polymorphisms; statistical learning.
© 2020 The Authors. Pediatric Allergy and Immunology published by European Academy of Allergy and Clinical Immunology and John Wiley & Sons Ltd.