In clinical applications, erroneous segmentations of medical images can have dramatic consequences. Current approaches to automatic quality control (QC) of medical image segmentation do not predict segmentation quality at the slice level (2D), resulting in sub-optimal evaluations. Our 2D-based deep learning method performs quality control at the 2D and 3D levels simultaneously for cardiovascular MR image segmentations. We compared it with 3D approaches by training both on 36,540 (2D) / 3,842 (3D) samples to predict Dice Similarity Coefficients (DSC) for four structures of the left ventricle: trabeculations (LVT), myocardium (LVM), papillary muscles (LVPM) and blood (LVC). The 2D-based method outperformed the 3D method. At the 2D level, the mean absolute errors (MAEs) of the DSC predictions for 3,823 samples were 0.02, 0.02, 0.05 and 0.02 for LVM, LVC, LVT and LVPM, respectively. At the 3D level, for 402 samples, the corresponding MAEs were 0.02, 0.01, 0.02 and 0.04. The method was validated in a clinical practice evaluation against semi-qualitative scores provided by expert cardiologists for 1,016 subjects of the UK Biobank. Finally, we provide evidence that multi-level QC could be used to improve clinical measurements derived from image segmentations.
Keywords: CMR Image segmentation; Deep learning; Medical image segmentation automatic quality control; Multi-dimensional quality control.
Copyright © 2021 Elsevier B.V. All rights reserved.
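
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how a Dice Similarity Coefficient and the mean absolute error of DSC predictions are commonly computed. It is illustrative only and is not the authors' implementation: the function names `dice` and `mean_absolute_error`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for this example.

```python
from typing import Sequence


def dice(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Dice Similarity Coefficient between two binary masks.

    Masks are flat sequences of 0/1 values of equal length
    (a representation assumed for this sketch).
    """
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of elements")
    # Count pixels labelled foreground in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_absolute_error(predicted: Sequence[float], true: Sequence[float]) -> float:
    """Mean absolute error between predicted and true DSC values."""
    if len(predicted) != len(true) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, true)) / len(predicted)


# Toy example: a 4-pixel segmentation compared with its reference.
example_dsc = dice([1, 1, 0, 0], [1, 0, 0, 0])
# Toy example: DSC values predicted by a QC model vs. the true DSC values.
example_mae = mean_absolute_error([0.90, 0.80], [0.92, 0.75])
```

In this reading, a QC model outputs a predicted DSC per structure for each slice (2D) or volume (3D), and the MAE summarizes how far those predictions fall from the DSC computed against a reference segmentation.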