Background & aims: Relatives of individuals with Crohn's disease (CD) carry CD-associated genetic variants and are often exposed to environmental factors that increase their risk for this disease. We aimed to estimate the utility of genotype, smoking status, family history, and biomarkers for calculating risk in asymptomatic first-degree relatives of patients with CD.
Methods: We recruited 480 healthy first-degree relatives (full siblings, offspring or parents) of patients with CD through the Guy's and St Thomas' NHS Foundation Trust and from members of Crohn's and Colitis, United Kingdom. DNA samples were genotyped using the Immunochip. We calculated a risk score for 454 participants, based on 72 genetic variants associated with CD, family history, and smoking history. Participants were assigned to highest and lowest risk score quartiles. We assessed pre-symptomatic inflammation by capsule endoscopy and measured 22 markers of inflammation in stool and serum samples (reference standard). Two machine-learning classifiers (elastic net and random forest) were used to assess the ability of the risk factors and biomarkers to identify participants with small intestinal inflammation in the same dataset.
Results: The machine-learning classifiers identified participants with pre-symptomatic intestinal inflammation: elastic net (area under the curve, 0.80; 95% CI, 0.62-0.98) and random forest (area under the curve, 0.87; 95% CI, 0.75-1.00). The elastic net method identified 3 variables that can be used to calculate odds for intestinal inflammation: combined family history of CD (odds ratio, 1.31), genetic risk score (odds ratio, 1.14), and fecal calprotectin (odds ratio, 1.04). These same 3 variables were among the 5 factors associated with intestinal inflammation in the random forest model.
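As an illustration of how odds ratios of this kind combine, the following is a minimal sketch, not the study's model. It assumes a logistic model in which each coefficient is the natural log of the reported odds ratio, so the odds multiply across predictors. The abstract does not state the unit each odds ratio refers to, nor an intercept, so the increments passed in are hypothetical and only a relative change in odds (not an absolute probability) can be computed. The function and dictionary names are illustrative.

```python
import math

# Odds ratios as reported in the Results. The unit of each predictor
# (e.g. per affected relative, per score point) is NOT given in the
# abstract, so "one unit" below is a placeholder assumption.
ODDS_RATIOS = {
    "family_history": 1.31,
    "genetic_risk_score": 1.14,
    "fecal_calprotectin": 1.04,
}


def log_odds_coefficients(odds_ratios):
    """Convert odds ratios to logistic coefficients (beta = ln(OR))."""
    return {name: math.log(value) for name, value in odds_ratios.items()}


def odds_multiplier(increments, odds_ratios=ODDS_RATIOS):
    """Return the factor by which the odds change for the given increments.

    `increments` maps a predictor name to a change in that predictor,
    expressed in the (assumed) unit of its odds ratio.
    """
    betas = log_odds_coefficients(odds_ratios)
    log_change = sum(betas[name] * delta for name, delta in increments.items())
    return math.exp(log_change)


if __name__ == "__main__":
    # Hypothetical participant: one unit higher on every predictor.
    change = odds_multiplier(
        {"family_history": 1, "genetic_risk_score": 1, "fecal_calprotectin": 1}
    )
    print(f"Relative change in odds: {change:.3f}")
```

Under these assumptions, a one-unit increase in all three predictors multiplies the odds by the product of the three odds ratios; converting that to a probability would additionally require a baseline (intercept), which the abstract does not report.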
Conclusion: Using machine learning classifiers, we found that genetic variants associated with CD, family history, and fecal calprotectin together identify individuals with pre-symptomatic intestinal inflammation who are therefore at risk for CD. A tool for detecting people at risk for CD before they develop symptoms would help identify the individuals most likely to benefit from early intervention.
Keywords: Capsule Endoscopy; Crohn's Disease; Environmental Risk; Fecal Calprotectin; First-Degree Relative; Genetic Risk; Machine Learning.
Copyright © 2020 AGA Institute. Published by Elsevier Inc. All rights reserved.