The efficacy of machine learning models in lung cancer risk prediction with explainability

Refat Khan Pathan; Israt Jahan Shorna; Md Sayem Hossain; Mayeen Uddin Khandaker; Huda I Almohammed; Zuhal Y Hamd

doi:10.1371/journal.pone.0305035

PLoS One. 2024 Jun 13;19(6):e0305035. doi: 10.1371/journal.pone.0305035. eCollection 2024.

Authors

Refat Khan Pathan¹, Israt Jahan Shorna², Md Sayem Hossain³, Mayeen Uddin Khandaker⁴ ⁵, Huda I Almohammed⁶, Zuhal Y Hamd⁶

Affiliations

¹ Department of Computing and Information Systems, School of Engineering and Technology, Sunway University, Selangor, Malaysia.
² Shamsun Nahar Khan Nursing College, Chattogram, Bangladesh.
³ School of Computing Science, Faculty of Innovation and Technology, Taylor's University Lakeside Campus, Selangor, Malaysia.
⁴ Applied Physics and Radiation Technologies Group, CCDCU, School of Engineering and Technology, Sunway University, Selangor, Malaysia.
⁵ Faculty of Graduate Studies, Daffodil International University, Daffodil Smart City, Savar, Dhaka, Bangladesh.
⁶ Department of Radiological Sciences, College of Health and Rehabilitation Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.

Abstract

Lung cancer remains one of the deadliest cancers worldwide. Researchers, scientists, doctors, and people from other fields continue to contribute to its early prediction and diagnosis. One significant problem in prediction is the black-box nature of machine learning models: although the detection rate is comparatively satisfactory, it is often unclear how a model reached its decision, which causes trust issues among patients and healthcare workers. This work applies multiple machine learning models to a numerical dataset of lung cancer-relevant parameters and compares their performance and accuracy. After the comparison, each model is explained using different methods. The main contribution of this research is to give logical explanations of why a model reached a particular decision, in order to build trust. The research is also compared with a previous study that worked with a similar dataset and took expert opinions regarding its proposed model. We show that, using hyperparameter tuning, our research achieved better results than that study's proposed model and the specialist opinion, with an improved accuracy of almost 100% in all four models.
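The workflow summarized in the abstract (fit a model, tune its hyperparameters, then explain its decisions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the four models, the tuning procedure, or the explanation methods, so a hand-written k-nearest-neighbours classifier, a small grid search over `k`, and permutation importance stand in for them here. The data are synthetic and are not the study's dataset.

```python
import random

def make_data(n, rng):
    """Synthetic stand-in for a numerical risk dataset (not the paper's data)."""
    X, y = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(3)]
        # Hypothetical rule: risk is driven mostly by feature 0, slightly by
        # feature 1, and not at all by feature 2.
        y.append(1 if 0.8 * row[0] + 0.2 * row[1] > 0.5 else 0)
        X.append(row)
    return X, y

def knn_predict(X_train, y_train, row, k):
    """Majority vote among the k nearest training rows (squared Euclidean)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(r, row)), label)
        for r, label in zip(X_train, y_train)
    )[:k]
    return 1 if 2 * sum(label for _, label in nearest) > k else 0

def accuracy(X_train, y_train, X_eval, y_eval, k):
    """Fraction of evaluation rows whose predicted label matches the true one."""
    hits = sum(knn_predict(X_train, y_train, r, k) == t
               for r, t in zip(X_eval, y_eval))
    return hits / len(y_eval)

def tune_k(X_train, y_train, X_val, y_val, grid):
    """Grid search: keep the k with the best validation accuracy."""
    return max(grid, key=lambda k: accuracy(X_train, y_train, X_val, y_val, k))

def permutation_importance(X_train, y_train, X_eval, y_eval, k, rng):
    """Accuracy drop when one feature column is shuffled in the evaluation set."""
    base = accuracy(X_train, y_train, X_eval, y_eval, k)
    drops = []
    for j in range(len(X_eval[0])):
        col = [r[j] for r in X_eval]
        rng.shuffle(col)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(X_eval, col)]
        drops.append(base - accuracy(X_train, y_train, shuffled, y_eval, k))
    return base, drops

rng = random.Random(0)
X, y = make_data(300, rng)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:225], y[150:225]
X_test, y_test = X[225:], y[225:]

best_k = tune_k(X_train, y_train, X_val, y_val, grid=[1, 3, 5, 7])
test_acc, importances = permutation_importance(
    X_train, y_train, X_test, y_test, best_k, rng
)
print("best k:", best_k)
print("test accuracy:", round(test_acc, 3))
print("accuracy drop per shuffled feature:", [round(d, 3) for d in importances])
```

A large accuracy drop for a feature suggests the model relies on it, which is one simple way to give a reason for a prediction pattern. In this synthetic setup the drop should be largest for feature 0, because the labelling rule weights it most heavily.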

Copyright: © 2024 Pathan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Humans
Lung Neoplasms* / diagnosis
Machine Learning*
Risk Assessment / methods

Grants and funding

This study was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R49), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia in the form of a grant awarded to H.I.A. The specific roles of this author are articulated in the ‘author contributions’ section.