Computerizing the first step of the two-step algorithm in dermoscopy: A convolutional neural network for differentiating melanocytic from non-melanocytic skin lesions

Julia K Winkler; Katharina S Kommoss; Anastasia S Vollmer; Andreas Blum; Wilhelm Stolz; T Kränke; R Hofmann-Wellenhof; Alexander Enk; Ferdinand Toberer; Holger A Haenssle

doi:10.1016/j.ejca.2024.114297

Eur J Cancer. 2024 Oct:210:114297. doi: 10.1016/j.ejca.2024.114297. Epub 2024 Aug 25.

Affiliations

¹ Department of Dermatology, University of Heidelberg, Heidelberg, Germany.
² Department of Dermatology, University of Heidelberg, Heidelberg, Germany.
³ Public, Private and Teaching Practice, Konstanz, Germany.
⁴ Department of Dermatology, Allergology and Environmental Medicine II, Hospital Thalkirchner Street, Munich, Germany.
⁵ Department of Dermatology and Venerology, Medical University of Graz, Graz, Austria.

PMID: 39217816
DOI: 10.1016/j.ejca.2024.114297

Abstract

Importance: Convolutional neural networks (CNNs) have shown performance equal to that of trained dermatologists in differentiating benign from malignant skin lesions. To improve clinicians' management decisions, additional classification into diagnostic categories might be helpful.

Methods: A convenience sample of 100 pigmented/non-pigmented skin lesions was used for a cross-sectional two-level reader study including 96 dermatologists (level I: dermoscopy only; level II: clinical close-up images, dermoscopy, and textual information). Dermoscopic images were classified by a binary CNN trained to differentiate melanocytic from non-melanocytic lesions (FotoFinder Systems, Bad Birnbach, Germany). Primary endpoint was the accuracy of the CNN's classification in comparison with dermatologists reviewing level-II information. Secondary endpoints included dermatologists' accuracies according to their level of experience and the CNN's area under the curve (AUC) of receiver operating characteristics (ROC).
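The two CNN metrics named above, accuracy and ROC AUC, can be illustrated with a minimal sketch. The scores, labels, and the 0.5 decision threshold below are hypothetical and are not taken from the study; the abstract does not describe the CNN's output format or threshold. AUC is computed here as the probability that a randomly chosen melanocytic lesion receives a higher score than a randomly chosen non-melanocytic one, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of lesions classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical example data (1 = melanocytic, 0 = non-melanocytic).
scores = [0.95, 0.80, 0.40, 0.70, 0.30, 0.10]
labels = [1, 1, 1, 0, 0, 0]

print(f"AUC = {roc_auc(scores, labels):.3f}")
print(f"Accuracy = {accuracy(scores, labels):.3f}")
```

AUC summarizes ranking quality across all thresholds, whereas accuracy depends on the single threshold chosen, which is why the two values can differ considerably for the same classifier.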

Results: The CNN achieved an accuracy of 91.0 % (95 % confidence interval [CI]: 83.8 % to 95.2 %) and a ROC AUC of 0.981 (95 % CI: 0.962 to 1). In level I, dermatologists showed a mean accuracy of 83.7 % (82.5 % to 84.8 %). With level II information, their accuracy improved to 87.8 % (86.7 % to 88.9 %; p < 0.001). When comparing accuracies of the CNN and dermatologists in level II, the CNN's accuracy was higher (91.0 % versus 87.8 %, p < 0.001). For experts with level II information, results were on par with the CNN (91.0 % versus 90.4 %, p = 0.368).
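As a worked check of the CNN's accuracy interval, the sketch below computes a Wilson score interval for 91 correct classifications out of 100 lesions. The abstract does not state which interval method was used; the Wilson method is an assumption here, and the count of 91 is inferred from the reported 91.0 % accuracy on 100 lesions. Under those assumptions the result is close to the reported 83.8 % to 95.2 %.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z defaults to 95 %)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumed inputs: 91 of 100 lesions classified correctly.
low, high = wilson_interval(91, 100)
print(f"{low:.3f} to {high:.3f}")  # approximately 0.838 to 0.952
```

The interval is asymmetric around 0.91 because the Wilson method pulls the center toward 0.5, which behaves better than the simple normal approximation when the proportion is near 0 or 1.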

Conclusions: The tested CNN accurately differentiated melanocytic from non-melanocytic skin lesions and outperformed dermatologists. The CNN may support clinicians and could be used in an ensemble approach combined with other CNN models.

Keywords: Convolutional neural network; Ensemble approach; Melanocytic; Two-step dermoscopy algorithm.

MeSH terms

Algorithms*
Cross-Sectional Studies
Dermatologists
Dermoscopy* / methods
Diagnosis, Differential
Female
Humans
Image Interpretation, Computer-Assisted / methods
Melanocytes / pathology
Melanoma* / diagnostic imaging
Melanoma* / pathology
Neural Networks, Computer*
ROC Curve
Skin Neoplasms* / diagnostic imaging
Skin Neoplasms* / pathology