Objectives: Convolutional neural networks (CNNs) are a subtype of artificial neural network that has shown strong performance in computer vision tasks, including image classification. To date, there has been limited application of CNNs to chest radiographs, the most frequently performed medical imaging study. We hypothesize that CNNs can learn to classify frontal chest radiographs according to common findings, given a sufficiently large data set.
Materials and methods: Our institution's research ethics board approved a single-center retrospective review of 35,038 adult posteroanterior chest radiographs performed between 2005 and 2015, together with their final reports (56% men, average age of 56 years, patient type: 24% inpatient, 39% outpatient, 37% emergency department), with a waiver of informed consent. The GoogLeNet CNN was trained using 3 graphics processing units to automatically classify radiographs as normal (n = 11,702) or into 1 or more of cardiomegaly (n = 9240), consolidation (n = 6788), pleural effusion (n = 7786), pulmonary edema (n = 1286), or pneumothorax (n = 1299). The network's performance was evaluated using receiver operating characteristic curve analysis on a test set of 2443 radiographs, with board-certified radiologist interpretation as the criterion standard.
Results: Using 256 × 256-pixel images as input, the network achieved an overall sensitivity and specificity of 91% with an area under the curve of 0.964 for classifying a study as normal (n = 1203). For the abnormal categories, the sensitivity, specificity, and area under the curve, respectively, were 91%, 91%, and 0.962 for pleural effusion (n = 782), 82%, 82%, and 0.868 for pulmonary edema (n = 356), 74%, 75%, and 0.850 for consolidation (n = 214), 81%, 80%, and 0.875 for cardiomegaly (n = 482), and 78%, 78%, and 0.861 for pneumothorax (n = 167).
Conclusions: Current deep CNN architectures can be trained with modest-sized medical data sets to achieve clinically useful performance at detecting and excluding common pathology on chest radiographs.
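Illustrative sketch: the receiver operating characteristic analysis named in the methods can be outlined as follows. This is a minimal sketch under stated assumptions, not the authors' code. It assumes one binary reference label per finding (1 = finding present according to the radiologist report) and one continuous network score per radiograph. The function names `roc_points`, `auc`, and `balanced_operating_point` are hypothetical. Choosing the threshold where sensitivity and specificity are closest is also an assumption, suggested only by the near-equal sensitivity and specificity pairs reported in the results.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (false positive rate, true positive rate, threshold)


def roc_points(labels: List[int], scores: List[float]) -> List[Point]:
    """Sweep the decision threshold from high to low and record ROC points.

    A radiograph is called positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative case")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0, float("inf"))]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every case tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos, threshold))
    return points


def auc(points: List[Point]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def balanced_operating_point(points: List[Point]) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, threshold) where the two are closest."""
    fpr, tpr, threshold = min(points, key=lambda p: abs(p[1] - (1.0 - p[0])))
    return tpr, 1.0 - fpr, threshold


if __name__ == "__main__":
    # Toy data invented for illustration only; not taken from the study.
    toy_labels = [1, 1, 0, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    curve = roc_points(toy_labels, toy_scores)
    print("AUC:", auc(curve))
    print("sensitivity, specificity, threshold:", balanced_operating_point(curve))
```

For a multi-finding classifier such as the one described, this analysis would be repeated once per category (normal, pleural effusion, and so on), treating each as its own binary problem.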