Typically, deep learning models for image segmentation tasks are trained using large datasets of images annotated at the pixel level, which can be expensive and highly time-consuming. A way to reduce the amount of annotated images required for training is to adopt a semi-supervised approach. In this regard, generative deep learning models, concretely Generative Adversarial Networks (GANs), have been adapted to semi-supervised training of segmentation tasks. This work proposes MaskGDM, a deep learning architecture combining some ideas from EditGAN, a GAN that jointly models images and their segmentations, together with a generative diffusion model. With careful integration, we find that using a generative diffusion model can improve EditGAN performance results in multiple segmentation datasets, both multi-class and with binary labels. According to the quantitative results obtained, the proposed model improves multi-class image segmentation when compared to the EditGAN and DatasetGAN models, respectively, by [Formula: see text] and [Formula: see text]. Moreover, using the ISIC dataset, our proposal improves the results from other models by up to [Formula: see text] for the binary image segmentation approach.
Keywords: Semantic segmentation; diffusion model; semi-supervised.