Computer-aided diagnosis of pulmonary nodules on CT scans: improvement of classification performance with nodule surface features

Med Phys. 2009 Jul;36(7):3086-98. doi: 10.1118/1.3140589.

Abstract

The purpose of this work is to develop a computer-aided diagnosis (CAD) system to differentiate malignant and benign lung nodules on CT scans. A fully automated system was designed to segment the nodule from its surrounding structured background in a local volume of interest (VOI) and to extract image features for classification. Image segmentation was performed with a 3D active contour method. The initial contour was obtained as the boundary of a binary object generated by k-means clustering within the VOI and smoothed by morphological opening. A data set of 256 lung nodules (124 malignant and 132 benign) from 152 patients was used in this study. In addition to morphological and texture features, the authors designed new nodule surface features to characterize the lung nodule surface smoothness and shape irregularity. The effects of two demographic features, age and gender, as adjunct to the image features were also investigated. A linear discriminant analysis (LDA) classifier built with features from stepwise feature selection was trained using simplex optimization to select the most effective features. A two-loop leave-one-out resampling scheme was developed to reduce the optimistic bias in estimating the test performance of the CAD system. The area under the receiver operating characteristic curve, A(z), for the test cases improved significantly (p < 0.05) from 0.821 +/- 0.026 to 0.857 +/- 0.023 when the newly developed image features were included with the original morphological and texture features. A similar experiment performed on the data set restricted to primary cancers and benign nodules, excluding the metastatic cancers, also resulted in an improved test A(z), though the improvement did not reach statistical significance (p = 0.07). 
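The initial-contour step described above (k-means clustering within the VOI, then smoothing by morphological opening) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it works on a single 2D slice rather than a 3D volume, uses two intensity clusters and a 3x3 square structuring element, and does not include the active contour refinement. The intensity values and every function name here are hypothetical, chosen only for illustration.

```python
def kmeans_1d(values, iters=50):
    """Two-cluster k-means on scalar intensities; returns (low, high) centroids."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iters):
        thresh = 0.5 * (lo + hi)
        a = [v for v in values if v <= thresh]
        b = [v for v in values if v > thresh]
        if not a or not b:          # degenerate case: only one cluster populated
            break
        new_lo, new_hi = sum(a) / len(a), sum(b) / len(b)
        if new_lo == lo and new_hi == hi:
            break                   # centroids stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi

def _neighbourhood(mask, r, c):
    """Yield 3x3 neighbourhood values; out-of-bounds pixels count as background."""
    h, w = len(mask), len(mask[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield mask[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0

def erode(mask):
    return [[int(all(_neighbourhood(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def dilate(mask):
    return [[int(any(_neighbourhood(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask))

# Hypothetical 7x7 slice: lung background, a 3x3 "nodule", and a thin attached "vessel".
image = [[-800] * 7 for _ in range(7)]
for r in (2, 3, 4):
    for c in (1, 2, 3):
        image[r][c] = 20
for c in (4, 5, 6):
    image[3][c] = 30

lo, hi = kmeans_1d([v for row in image for v in row])
raw = [[int(v > 0.5 * (lo + hi)) for v in row] for row in image]  # brighter cluster
opened = opening(raw)

for row in opened:
    print("".join("#" if v else "." for v in row))
```

In this toy slice the opening removes the one-pixel-wide vessel while leaving the compact nodule block intact; the boundary of the resulting binary object would then serve as the starting contour.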
The two demographic features did not significantly affect the performance of the CAD system (p > 0.05) when they were added to the feature space containing the morphological, texture, and new gradient field and radius features. To investigate whether a support vector machine (SVM) classifier can achieve improved performance over the LDA classifier, the authors compared the performance of the LDA and SVMs with various kernels and parameters. Principal component analysis was used to reduce the dimensionality of the feature space for both the LDA and the SVM classifiers. When the number of selected principal components was varied, the highest test A(z) among the SVMs of various kernels and parameters was slightly higher than that of the LDA in one-loop leave-one-case-out resampling. However, no SVM with fixed architecture consistently performed better than the LDA over the range of principal components selected. This study demonstrated that the authors' proposed segmentation and feature extraction techniques are promising for classifying lung nodules on CT images.
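The abstract does not spell out the two-loop leave-one-out scheme. One common reading is that an outer loop holds out a single test case while an inner leave-one-out loop, run only on the remaining cases, makes the design choices (here, which feature to use), so the held-out case never influences selection. The sketch below follows that reading under stated assumptions: it substitutes a simple one-feature class-mean score for the LDA classifier, substitutes an exhaustive single-feature search for stepwise selection with simplex optimization, and uses synthetic data. All function names are hypothetical.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_score(train_x, train_y, case, feat):
    """Stand-in classifier (not LDA): offset of `case` from the midpoint of the
    two class means on one feature, scaled by the signed gap between the means."""
    pos = [x[feat] for x, y in zip(train_x, train_y) if y == 1]
    neg = [x[feat] for x, y in zip(train_x, train_y) if y == 0]
    mu_p, mu_n = sum(pos) / len(pos), sum(neg) / len(neg)
    gap = mu_p - mu_n
    if gap == 0:
        return 0.0
    return (case[feat] - 0.5 * (mu_p + mu_n)) / gap

def loo_auc(xs, ys, feat):
    """One-loop leave-one-out AUC with the feature fixed in advance."""
    scores = [fit_score(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i], feat)
              for i in range(len(xs))]
    return auc(scores, ys)

def two_loop_loo(xs, ys):
    """Outer loop: hold out one case. Inner loop: pick the feature by
    leave-one-out AUC on the remaining cases only, then score the held-out case."""
    n_feat = len(xs[0])
    test_scores = []
    for i in range(len(xs)):
        tr_x, tr_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        best = max(range(n_feat), key=lambda f: loo_auc(tr_x, tr_y, f))
        test_scores.append(fit_score(tr_x, tr_y, xs[i], best))
    return auc(test_scores, ys)

# Synthetic data (an assumption, not the study's data): feature 0 is
# informative, features 1 and 2 are pure noise.
random.seed(0)
labels = [i % 2 for i in range(30)]
cases = [[random.gauss(2.0 * y, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
         for y in labels]

print("one-loop AUC, feature 0 fixed:", round(loo_auc(cases, labels, 0), 3))
print("two-loop AUC, feature chosen per fold:", round(two_loop_loo(cases, labels), 3))
```

The point of the nesting is that selecting features on all cases and then running a single leave-one-out loop lets the test case leak into the selection step, which tends to inflate the estimated A(z); repeating the selection inside each outer fold avoids that leak at the cost of more computation.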

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Age Factors
  • Algorithms
  • Area Under Curve
  • Diagnosis, Computer-Assisted*
  • Discriminant Analysis
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Imaging, Three-Dimensional
  • Lung Neoplasms / diagnosis*
  • Lung Neoplasms / diagnostic imaging*
  • Lung Neoplasms / pathology
  • Male
  • Neoplasm Metastasis / diagnosis
  • Neoplasm Metastasis / diagnostic imaging
  • Neoplasm Metastasis / pathology
  • Principal Component Analysis
  • Sex Factors
  • Tomography, X-Ray Computed / methods*