A recent study by Suissa and colleagues explored the clinical relevance of the Dice metric, a medical image segmentation metric commonly used in the field of artificial intelligence (AI). They showed that pixel-wise agreement in physician identification of structures on ultrasound images is variable, and that a relatively low Dice metric (0.34) corresponded to substantial agreement on subjective clinical assessment. We highlight the need to bring structure and clinical perspective to the evaluation of medical AI, which clinicians are best placed to direct.
Keywords: artificial intelligence; evaluation; medical devices; regional anaesthesia; regulation; standardisation; ultrasound.
Copyright © 2024 British Journal of Anaesthesia. Published by Elsevier Ltd. All rights reserved.