Ecologically sustainable benchmarking of AI models for histopathology

NPJ Digit Med. 2024 Dec 24;7(1):378. doi: 10.1038/s41746-024-01397-x.

Abstract

Deep learning (DL) holds great promise to improve medical diagnostics, including pathology. Current DL research mainly focuses on performance. Implementing DL may have environmental consequences, but approaches that assess both performance and carbon footprint are missing. Here, we explored an approach for developing DL for pathology that considers both diagnostic performance and carbon footprint, calculated as CO2 or equivalent emissions (CO2eq). We evaluated various DL architectures used in computational pathology, including a large foundation model, across two diagnostic tasks of low and high complexity. We proposed a metric termed 'environmentally sustainable performance' (ESPer), which quantitatively integrates performance and operational CO2eq during training and inference. While some DL models showed comparable diagnostic performance, ESPer enabled prioritizing those with a smaller carbon footprint. We also investigated how data reduction approaches can improve the ESPer of individual models. This study provides an approach that facilitates the development of environmentally friendly, sustainable medical AI.
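
The abstract does not state how operational CO2eq or ESPer is computed, so the following is only a minimal sketch of one common way to estimate operational emissions: measured energy use multiplied by the carbon intensity of the electricity grid, optionally scaled by a data-center overhead factor (PUE). The function name, its parameters, and all numbers are hypothetical and are not taken from the study; the ESPer formula itself is deliberately not reproduced here.

```python
def operational_co2eq_g(energy_kwh: float,
                        carbon_intensity_g_per_kwh: float,
                        pue: float = 1.0) -> float:
    """Estimate operational emissions in grams of CO2eq.

    Assumption (not from the paper): emissions are approximated as
    energy consumed (kWh) x data-center overhead (PUE) x grid carbon
    intensity (g CO2eq per kWh).
    """
    if energy_kwh < 0 or carbon_intensity_g_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    if pue < 1.0:
        raise ValueError("PUE is a ratio of total to IT energy and cannot be below 1")
    return energy_kwh * pue * carbon_intensity_g_per_kwh


# Hypothetical figures for illustration only.
training_g = operational_co2eq_g(energy_kwh=2.0, carbon_intensity_g_per_kwh=400.0)
inference_g = operational_co2eq_g(energy_kwh=0.5, carbon_intensity_g_per_kwh=400.0)
total_g = training_g + inference_g
print(f"training: {training_g} g, inference: {inference_g} g, total: {total_g} g CO2eq")
```

Under this assumption, two models with comparable diagnostic performance can be ranked by their total estimated emissions for training and inference, which is the kind of comparison the abstract describes ESPer as enabling.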