The efficacy of machine learning models in lung cancer risk prediction with explainability

PLoS One. 2024 Jun 13;19(6):e0305035. doi: 10.1371/journal.pone.0305035. eCollection 2024.

Abstract

Lung cancer remains one of the deadliest cancers worldwide. Researchers, scientists, doctors, and people from other fields continue to contribute to its early prediction and diagnosis. A significant problem in prediction is the black-box nature of machine learning models: although detection rates are comparatively satisfactory, it is often unclear how a model arrived at a given decision, which causes trust issues among patients and healthcare workers. This work applies multiple machine learning models to a numerical dataset of lung cancer-relevant parameters and compares their performance and accuracy. After the comparison, each model is explained using different methods. The main contribution of this research is to give logical explanations of why a model reached a particular decision, in order to build trust. The results are also compared with a previous study that worked with a similar dataset and took expert opinions on its proposed model. With hyperparameter tuning, our research achieved better results than that study's proposed model and specialist opinion, with accuracy improved to almost 100% in all four models.
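To make the idea of explaining a trained model concrete, the following is a minimal sketch of one model-agnostic explanation technique, permutation feature importance. It is an illustration under stated assumptions, not the method used in the study: the abstract does not name the four models or the explanation methods, so the nearest-centroid classifier, the synthetic two-feature data, and every function name below are hypothetical stand-ins.

```python
import random

def make_synthetic(n=200, seed=0):
    """Hypothetical stand-in data: feature 0 drives the label, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = rng.gauss(3.0 * label, 0.5)  # separates the two classes
        noise = rng.gauss(0.0, 1.0)                # unrelated to the label
        X.append([informative, noise])
        y.append(label)
    return X, y

class NearestCentroid:
    """Toy classifier (an assumption, not one of the study's models)."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(y, model.predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, model.predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

X, y = make_synthetic()
X_train, y_train = X[:150], y[:150]  # simple hold-out split
X_test, y_test = X[150:], y[150:]

model = NearestCentroid().fit(X_train, y_train)
baseline, importances = permutation_importance(model, X_test, y_test)
print("held-out accuracy:", baseline)
print("importance per feature:", importances)
```

A large accuracy drop for a feature indicates that the model relies on it, which is the kind of evidence that can help a clinician judge whether a prediction rests on plausible risk factors. The sketch covers only the explanation step; model comparison and hyperparameter tuning are not shown.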

MeSH terms

  • Humans
  • Lung Neoplasms* / diagnosis
  • Machine Learning*
  • Risk Assessment / methods

Grants and funding

This study was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R49), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia in the form of a grant awarded to H.I.A. The specific roles of this author are articulated in the ‘author contributions’ section.