This paper focuses on machine learning (ML) feature selection (FS) algorithms for identifying and diagnosing multidrug-resistant (MDR) tuberculosis (TB). MDR-TB is a global public health problem, and its early detection remains a pressing concern. The study was conducted in the Malakand Division of Khyber Pakhtunkhwa, Pakistan, to add to the knowledge of the disease and to address the identification and early detection of MDR-TB using ML algorithms. ML algorithms, namely random forest (RF), k-nearest neighbors, support vector machine (SVM), logistic regression, least absolute shrinkage and selection operator (LASSO), artificial neural networks (ANNs), and decision trees, are applied to analyse the case-control dataset. These models also identify the most important factors causing MDR-TB infection, which offers additional insight into the disease. The study reveals that close contact with MDR-TB patients, smoking, depression, previous TB history, improper treatment, and interruption of first-line TB treatment have a strong impact on MDR status. Weight loss, chest pain, hemoptysis, and fatigue are likewise important symptoms. Based on accuracy, sensitivity, and specificity, SVM and RF are the suggested models for patient classification.
Copyright © 2021 Mian Haider Ali et al.