The worldwide prevalence of thyroid disease, a chronic condition that contributes significantly to global mortality, is on the rise. Machine learning (ML) approaches have shown potential to mitigate this disease by facilitating early detection and treatment. However, stakeholders and patients increasingly demand reliable and credible explanations of the predictions generated in sensitive medical domains. We therefore propose an interpretable thyroid classification model that uses explainable AI to explain its outcomes and to investigate the contribution of predictive features. Two real-time thyroid datasets underwent several preprocessing steps, with class imbalance addressed using the Synthetic Minority Over-sampling Technique with Edited Nearest Neighbors (SMOTE-ENN). Two hybrid classifiers, RDKVT and RDKST, were then trained on the processed features selected by the Univariate and Information Gain feature selection techniques. After training, Shapley Additive Explanations (SHAP) was applied to identify the influential features and the corresponding values contributing to the outcomes. In the experiments, the RDKST classifier achieved the highest performance, with an accuracy of 98.98% when trained on Information Gain selected features. Notably, the features T3 (triiodothyronine), TT4 (total thyroxine), TSH (thyroid-stimulating hormone), FTI (free thyroxine index), and T3_measured significantly influenced the generated outcomes. By balancing classification accuracy with the ability to explain outcomes, this study aims to enhance the clinical decision-making process and improve patient care.
Keywords: Explainable AI; Ensemble methods; Machine learning; SMOTE-ENN; Thyroid disease.
© 2024 The Authors.
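As an illustration of the Information Gain feature selection step mentioned in the abstract, the following is a minimal sketch of entropy-based information gain for categorical features. It is not the authors' implementation: the function names, the binarized feature `TSH_high`, the feature `sex`, and the four toy records are all hypothetical, and continuous hormone measurements such as TSH or TT4 would first need to be discretized before this formula applies.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: four records, two candidate features.
toy_labels = ["sick", "sick", "healthy", "healthy"]
toy_columns = {
    "sex": [0, 1, 0, 1],       # assumed uninformative in this toy example
    "TSH_high": [1, 1, 0, 0],  # assumed binarized TSH flag
}

ranking = rank_features(toy_columns, toy_labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy example the hypothetical `TSH_high` flag separates the two classes perfectly and therefore ranks first, while `sex` carries no information about the label. A selection step would keep the top-ranked features and pass them to the classifier.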