Background: Clinical data used to train deep learning models are often not clean. They can contain imperfections in both the imaging data and the corresponding segmentations.
Purpose: This study investigates the influence of data imperfections on the performance of deep learning models for parotid gland segmentation. This was done in a controlled manner by using synthesized data. The insights from this study may be used to improve the performance and reliability of deep learning models.
Methods: The data were synthesized by using the clinical segmentations, creating a pseudo ground-truth in the process. Three kinds of imperfections were simulated: incorrect segmentations, low image contrast, and artifacts in the imaging data. The severity of each imperfection was varied in five levels. Models resulting from training sets from each of the five levels were cross-evaluated with test sets from each of the five levels.
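The cross-evaluation design described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear contrast schedule, the Gaussian noise used as a stand-in for imaging artifacts, and the voxel shift used as a stand-in for segmentation error are all assumptions made for illustration. Only the structure (five severity levels, and every training level evaluated against every test level) follows the text.

```python
import numpy as np

# Hypothetical severity levels; 0 means no imperfection was added.
LEVELS = [0, 1, 2, 3, 4]


def lower_contrast(image, level, n_levels=5):
    """Shrink intensities toward the image mean (assumed linear schedule)."""
    factor = 1.0 - level / n_levels  # level 0 leaves the image unchanged
    mean = image.mean()
    return mean + factor * (image - mean)


def add_artifacts(image, level, rng):
    """Add Gaussian noise as a simple stand-in for imaging artifacts."""
    sigma = 0.1 * level  # assumed noise scale per level
    return image + rng.normal(0.0, sigma, size=image.shape)


def perturb_segmentation(mask, level):
    """Shift the mask by `level` voxels as a crude segmentation error."""
    return np.roll(mask, shift=level, axis=0)


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


def cross_evaluate(train_fn, eval_fn, levels=LEVELS):
    """Train one model per level and score it on the test set of every level.

    train_fn(level) returns a model; eval_fn(model, level) returns a score.
    Both are placeholders supplied by the caller.
    """
    scores = np.zeros((len(levels), len(levels)))
    for i, train_level in enumerate(levels):
        model = train_fn(train_level)
        for j, test_level in enumerate(levels):
            scores[i, j] = eval_fn(model, test_level)
    return scores
```

In this sketch, row i of the returned matrix holds the scores of the model trained at severity level i, and column j corresponds to the test set at severity level j, so the diagonal contains the matched train and test conditions.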
Results: Using synthesized data led to almost perfect parotid gland segmentation when no error was added. Lowering the quality of the parotid gland segmentations used for training substantially lowered the model performance. Additionally, lowering the image quality of the training data by decreasing the contrast or introducing artifacts made the resulting models more robust to data containing those respective kinds of data imperfection.
Conclusion: This study demonstrates the importance of good-quality segmentations for deep learning training and shows that using low-quality imaging data for training can enhance the robustness of the resulting models.
Keywords: data imperfection; deep learning; parotid gland; segmentation; synthesized data.
© 2023 The Authors. Medical Physics published by Wiley Periodicals LLC on behalf of American Association of Physicists in Medicine.