Importance: Convolutional neural networks (CNNs) have shown performance equal to that of trained dermatologists in differentiating benign from malignant skin lesions. To improve clinicians' management decisions, additional classification into diagnostic categories might be helpful.
Methods: A convenience sample of 100 pigmented/non-pigmented skin lesions was used for a cross-sectional two-level reader study including 96 dermatologists (level I: dermoscopy only; level II: clinical close-up images, dermoscopy, and textual information). Dermoscopic images were classified by a binary CNN trained to differentiate melanocytic from non-melanocytic lesions (FotoFinder Systems, Bad Birnbach, Germany). The primary endpoint was the accuracy of the CNN's classification in comparison with that of dermatologists reviewing level II information. Secondary endpoints included dermatologists' accuracies according to their level of experience and the CNN's area under the receiver operating characteristic (ROC) curve (AUC).
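To illustrate the two endpoint metrics named above, the following minimal sketch computes accuracy and ROC AUC for a binary classifier. The labels, scores, and the 0.5 decision threshold are hypothetical and are not taken from the study; the abstract does not state the cut-off applied to the CNN's output.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at a given score threshold.

    The 0.5 default is an assumption for illustration only.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute ROC AUC.")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = melanocytic, 0 = non-melanocytic.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

toy_accuracy = accuracy(labels, scores)
toy_auc = roc_auc(labels, scores)
print(f"accuracy = {toy_accuracy:.3f}, ROC AUC = {toy_auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas ROC AUC summarizes ranking quality across all thresholds, which is why the two are reported as separate endpoints.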
Results: The CNN achieved an accuracy of 91.0 % (95 % confidence interval [CI] 83.8 % to 95.2 %) and a ROC AUC of 0.981 (95 % CI 0.962 to 1). In level I, dermatologists showed a mean accuracy of 83.7 % (82.5 % to 84.8 %). With level II information, their accuracy improved to 87.8 % (86.7 % to 88.9 %; p < 0.001). In level II, the CNN's accuracy was higher than that of dermatologists overall (91.0 % versus 87.8 %, p < 0.001). Experts with level II information performed on par with the CNN (91.0 % versus 90.4 %, p = 0.368).
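As a worked check of the reported accuracy interval, the sketch below computes a Wilson score interval for 91 correct classifications out of 100 lesions. The abstract does not state which interval method was used; the Wilson method is an assumption chosen here because it reproduces the reported bounds. The function name is illustrative.

```python
import math
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its CI method; Wilson is used
    here only because it matches the reported figures.
    """
    if n <= 0:
        raise ValueError("n must be positive.")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 %
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 91.0 % accuracy on 100 lesions corresponds to 91 correct classifications.
low, high = wilson_interval(91, 100)
print(f"{low:.1%} to {high:.1%}")  # 83.8% to 95.2%
```

The interval is asymmetric around 91.0 % because the Wilson method shifts its center toward 50 %, which matters when the proportion is close to 0 or 1 and the sample is small.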
Conclusions: The tested CNN accurately differentiated melanocytic from non-melanocytic skin lesions and outperformed dermatologists overall, although experts with level II information performed on par with it. The CNN may support clinicians and could be used in an ensemble approach combined with other CNN models.
Keywords: Convolutional neural network; Ensemble approach; Melanocytic; Two-step dermoscopy algorithm.
Copyright © 2024. Published by Elsevier Ltd.