Background and objectives: Vitamin D (25-hydroxyvitamin D or 25OHD) has a key role in the pathogenesis of several chronic disorders. Vitamin D deficiency is a common global public health problem. We aimed to evaluate the risk factors associated with vitamin D deficiency using a decision tree algorithm.
Methods: A total of 988 adolescent girls, aged 12-18 years, were recruited to the study. Demographic characteristics, serum biochemical factors, all blood count parameters, and zinc, copper, calcium and superoxide dismutase (SOD) were measured. Serum vitamin D levels below 20 ng/ml were considered deficient. Seventy percent of these girls (618 cases) were randomly allocated to a training dataset used to construct the decision tree. The remaining 30% (285 cases) were used as the testing dataset to evaluate the performance of the decision tree. The model included 14 input variables: age, father's educational attainment, waist circumference, waist-to-hip ratio, zinc, copper, calcium, SOD, FBG, HDL-C, RBC, MCV, MCHC and HCT. The model was validated using a receiver operating characteristic (ROC) curve.
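The workflow described in the Methods (a random 70/30 split, then choosing the most informative variable at the root of a tree) can be illustrated with a minimal sketch. It assumes a CART-style Gini impurity criterion, which the abstract does not specify, and it runs on synthetic data: the field names (`zinc`, `age`, `deficient`), the helper functions and the zinc cutoff of 80 are all hypothetical and are not taken from the study.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(rows, feature):
    """Return (threshold, weighted Gini) of the best binary split on one feature."""
    values = sorted({r[feature] for r in rows})
    best = (None, float("inf"))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2  # candidate cut halfway between neighbouring values
        left = [r["deficient"] for r in rows if r[feature] <= thr]
        right = [r["deficient"] for r in rows if r[feature] > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (thr, score)
    return best


def best_root_split(rows, features):
    """Pick the feature whose best split gives the lowest weighted impurity."""
    candidates = {f: best_threshold(rows, f) for f in features}
    feature = min(candidates, key=lambda f: candidates[f][1])
    return feature, candidates[feature][0], candidates[feature][1]


# Synthetic cohort: the label depends only on an arbitrary zinc cutoff (assumption).
rng = random.Random(42)
cohort = []
for _ in range(100):
    zinc = rng.uniform(60, 100)
    cohort.append({"zinc": zinc, "age": rng.randint(12, 18), "deficient": int(zinc < 80)})

train, test = split_train_test(cohort, train_fraction=0.7)
root_feature, root_threshold, root_impurity = best_root_split(train, ["zinc", "age"])
print(root_feature, root_threshold, root_impurity)
```

A full decision tree repeats this search recursively within each branch; the sketch stops at the root node, which is the split that identifies the single most important variable.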
Results: Serum Zn concentration was the most important risk factor associated with vitamin D deficiency. On the testing dataset, the sensitivity, specificity, accuracy and area under the ROC curve (AUC) were 79.3%, 64%, 77.8% and 0.72, respectively.
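For readers less familiar with the four performance measures reported above, the following sketch shows how each is computed from a classifier's predictions. The labels, predictions and scores are a made-up toy example, not study data, and the function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = deficient)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Proportion of truly deficient subjects that the model flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-deficient subjects that the model clears."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Proportion of all subjects classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def auc(y_true, scores):
    """Area under the ROC curve: the probability that a random positive
    receives a higher score than a random negative (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), accuracy(tp, tn, fp, fn), auc(y_true, scores))
```

Sensitivity, specificity and accuracy depend on the chosen classification cutoff, whereas the AUC summarises performance across all cutoffs.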
Conclusions: The results suggest that serum Zn level is an important associated risk factor for identifying vitamin D deficiency among Iranian adolescent girls.
Keywords: 25OHD; Data mining; Decision tree; Vitamin D deficiency.
Crown Copyright © 2019. Published by Elsevier Ltd. All rights reserved.