Comparative evaluation of support vector machines for computer aided diagnosis of lung cancer in CT based on a multi-dimensional data set

Comput Methods Programs Biomed. 2013 Aug;111(2):519-24. doi: 10.1016/j.cmpb.2013.04.016. Epub 2013 May 31.

Abstract

Lung cancer is one of the most common forms of cancer, resulting in over a million deaths per year worldwide. This paper investigates the use of support vector machine (SVM) classification for lung cancer and presents a systematic quantitative evaluation against boosting, decision trees, k-nearest neighbor, LASSO regression, neural networks, and random forests. A large database of 5984 regions of interest (ROIs) with 488 input features (including textural features, patient characteristics, and morphological features) was used to train the classifiers and evaluate their performance. Classifier performance was evaluated within a tenfold cross-validation framework using the receiver operating characteristic (ROC) curve and the Matthews correlation coefficient. The areas under the ROC curve (AUC) for SVM, boosting, decision trees, k-nearest neighbor, LASSO, neural networks, and random forests were 0.94, 0.86, 0.73, 0.72, 0.91, 0.92, and 0.85, respectively. The results showed that SVM classification offered significantly better classification performance than the reference methods. This scheme may be used in the future as an auxiliary tool to differentiate between benign and malignant solitary pulmonary nodules (SPNs) in CT images.
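The abstract names three evaluation ingredients (tenfold cross-validation, AUC, and the Matthews correlation coefficient) without defining them. The following is a minimal Python sketch of how each can be computed, not the authors' implementation. The function names, the round-robin fold assignment, the 0.5 decision threshold, and the six toy labels and scores are all assumptions made for illustration; none of them come from the study.

```python
import math

def kfold_indices(n_samples, k=10):
    """Split sample indices into k disjoint folds (round-robin; an assumption)."""
    return [list(range(i, n_samples, k)) for i in range(k)]

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def matthews_corrcoef(y_true, y_pred):
    """MCC from the binary confusion matrix; returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy data (hypothetical, NOT from the study): 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

auc = roc_auc(y_true, scores)
mcc = matthews_corrcoef(y_true, y_pred)
folds = kfold_indices(len(y_true), k=3)  # the study used k=10 on 5984 ROIs
print(f"AUC={auc:.3f}  MCC={mcc:.3f}  folds={folds}")
```

In a cross-validation run, each fold serves once as the held-out test set while a classifier is trained on the remaining folds, and the metrics are then computed on the held-out predictions. AUC is threshold-free, whereas the Matthews correlation coefficient depends on the chosen decision threshold and stays informative when the benign and malignant classes are imbalanced.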

Keywords: CT image; Curvelet; Solitary pulmonary nodule; Support vector machine; Texture extraction.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Area Under Curve
  • Databases, Factual
  • Decision Trees
  • Diagnosis, Computer-Assisted / instrumentation
  • Diagnosis, Computer-Assisted / methods*
  • Early Detection of Cancer / methods
  • Female
  • Humans
  • Lung Neoplasms / diagnosis*
  • Male
  • Models, Statistical
  • Neural Networks, Computer
  • ROC Curve
  • Radiographic Image Interpretation, Computer-Assisted
  • Regression Analysis
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Support Vector Machine*
  • Tomography, X-Ray Computed / methods