This study explores an automatic diagnosis method for predicting unnecessary nodule biopsies from a small, unbalanced, and pathologically proven database. The method is based on a convolutional neural network (CNN) model. Because the samples are few and unbalanced, the presented method aims to improve transfer learning capability via the VGG16 architecture and to optimize the related transfer learning parameters. For comparison, a traditional machine learning method is implemented, which extracts texture features and classifies them with a support vector machine (SVM). The database includes 68 biopsied nodules, of which 16 are pathologically proven benign and the remaining 52 are malignant. To account for the volumetric data in the CNN model, image slices are selected randomly from each nodule volume until all image slices of each nodule have been used. In each experiment, leave-one-out and 10-fold cross-validation are applied to train and test on the 68 randomly selected image slices (one slice per nodule). The final results are the averages over all experimental outcomes. The experiments revealed that features from medical and natural images are similar in that both focus on simpler, less abstract objects, leading to the conclusion that transferring more convolutional layers does not necessarily yield better classification results. Transfer learning from other, larger datasets can supply additional information to small and unbalanced datasets and improve classification performance. The presented method shows the potential to adapt a CNN architecture to improve the prediction of unnecessary nodule biopsies from a small, unbalanced, and pathologically proven volumetric dataset.
Keywords: Convolutional neural networks; Decrease unnecessary biopsy; Lung cancer screening; Small and unbalanced dataset; Transfer learning.
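
The slice-sampling and leave-one-out protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The per-nodule slice counts are hypothetical, and the rule that nodules with fewer slices cycle through their shuffled order again is an assumption, since the abstract does not state how unequal slice counts are handled. The names `slice_schedule` and `leave_one_out` are illustrative only.

```python
import random


def slice_schedule(slice_counts, seed=0):
    """Build one experiment per round; each maps a nodule id to one slice index.

    Every nodule's slices are shuffled once, then drawn in that order, so all
    slices of every nodule are used by the time the longest nodule is exhausted.
    """
    rng = random.Random(seed)
    orders = {}
    for nodule, n_slices in slice_counts.items():
        order = list(range(n_slices))
        rng.shuffle(order)
        orders[nodule] = order

    n_experiments = max(slice_counts.values())
    # Assumption: nodules with fewer slices wrap around and reuse slices.
    return [
        {nodule: order[k % len(order)] for nodule, order in orders.items()}
        for k in range(n_experiments)
    ]


def leave_one_out(nodules):
    """Yield (training nodules, held-out nodule) pairs, one per nodule."""
    for held_out in nodules:
        train = [n for n in nodules if n != held_out]
        yield train, held_out


# Hypothetical slice counts for 68 nodules (between 3 and 7 slices each).
slice_counts = {f"nodule_{i:02d}": 3 + i % 5 for i in range(68)}

experiments = slice_schedule(slice_counts)
splits = list(leave_one_out(sorted(slice_counts)))
```

Each entry of `experiments` holds 68 slices, one per nodule, and each leave-one-out split trains on 67 nodules and tests on the remaining one. Averaging a metric over all splits and all experiments would then correspond to the final results described above.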