Deep learning algorithms for melanoma detection using dermoscopic images: A systematic review and meta-analysis

Artif Intell Med. 2024 Sep:155:102934. doi: 10.1016/j.artmed.2024.102934. Epub 2024 Jul 25.

Abstract

Background: Melanoma is a serious risk to human health and early identification is vital for treatment success. Deep learning (DL) has the potential to detect cancer using imaging technologies and many studies provide evidence that DL algorithms can achieve high accuracy in melanoma diagnostics.

Objectives: To critically assess the performance of different DL algorithms in diagnosing melanoma from dermatoscopic images and to discuss the relationship between dermatologists and DL.

Methods: Ovid-Medline, Embase, IEEE Xplore, and the Cochrane Library were systematically searched from inception until 7th December 2021. Studies that reported the diagnostic performance of DL models in detecting melanoma using dermatoscopic images were included if they had specific outcomes and histopathologic confirmation. Binary diagnostic accuracy data and contingency tables were extracted to analyze outcomes of interest, which included sensitivity (SEN), specificity (SPE), and area under the curve (AUC). Subgroup analyses were performed according to human-machine comparison and cooperation. The study was registered in PROSPERO, CRD42022367824.
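
As an illustration of how SEN and SPE follow from an extracted contingency table, the minimal sketch below computes both from a single 2x2 table. The class name, field names, and counts are hypothetical and are not taken from any included study. The sketch covers per-study values only, not the pooling across studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """2x2 table for one study: DL prediction vs. histopathologic reference."""

    tp: int  # melanoma, classified as melanoma
    fp: int  # non-melanoma, classified as melanoma
    fn: int  # melanoma, classified as non-melanoma
    tn: int  # non-melanoma, classified as non-melanoma

    def sensitivity(self) -> float:
        # Proportion of histopathologically confirmed melanomas that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of confirmed non-melanomas that were correctly ruled out.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only.
example = ContingencyTable(tp=80, fp=15, fn=20, tn=85)
print(f"SEN = {example.sensitivity():.2f}, SPE = {example.specificity():.2f}")
# prints "SEN = 0.80, SPE = 0.85"
```

Per-study values of this kind are the inputs to a meta-analytic model. The abstract does not state which pooling model was used, so none is assumed here.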

Results: 2309 records were initially retrieved, of which 37 studies met our inclusion criteria, and 27 provided sufficient data for meta-analytical synthesis. The pooled SEN was 82 % (range 77-86), SPE was 87 % (range 84-90), with an AUC of 0.92 (range 0.89-0.94). Human-machine comparison had pooled AUCs of 0.87 (0.84-0.90) and 0.83 (0.79-0.86) for DL and dermatologists, respectively. Pooled AUCs were 0.90 (0.87-0.93), 0.80 (0.76-0.83), and 0.88 (0.85-0.91) for DL, junior dermatologists, and senior dermatologists, respectively. In the analyses of human-machine cooperation, pooled AUCs were 0.88 (0.85-0.91) for DL, 0.76 (0.72-0.79) for unassisted dermatologists, and 0.87 (0.84-0.90) for DL-assisted dermatologists.

Conclusions: Evidence suggests that DL algorithms are as accurate as senior dermatologists in melanoma diagnostics. Therefore, DL could be used to support dermatologists in diagnostic decision-making. However, further high-quality, large-scale multicenter studies are required to address the specific challenges associated with medical AI-based diagnostics.

Keywords: Deep learning; Human-machine comparison; Human-machine cooperation; Melanoma; Systematic review.

Publication types

  • Meta-Analysis
  • Systematic Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning*
  • Dermoscopy* / methods
  • Humans
  • Melanoma* / diagnosis
  • Melanoma* / pathology
  • Skin / diagnostic imaging
  • Skin / pathology
  • Skin Neoplasms* / diagnosis
  • Skin Neoplasms* / pathology