CycleMix: Mixing Source Domains for Domain Generalization in Style-Dependent Data

Aristotelis Ballas (ORCID 0000-0003-1683-8433), Department of Informatics and Telematics, Harokopio University, Omirou 9, Tavros, Athens, Greece, [email protected]
Christos Diou (ORCID 0000-0002-2461-1928), Department of Informatics and Telematics, Harokopio University, Omirou 9, Tavros, Athens, Greece, [email protected]
(2024)
Abstract.

As deep learning-based systems have become an integral part of everyday life, limitations in their generalization ability have begun to emerge. Machine learning algorithms typically rely on the i.i.d. assumption, meaning that their training and validation data are expected to follow the same distribution, which does not necessarily hold in practice. In the case of image classification, one frequent reason that algorithms fail to generalize is that they rely on spurious correlations present in training data, such as associating image styles with target classes. These associations may not be present in the unseen test data, leading to significant degradation of their effectiveness. In this work, we attempt to mitigate this Domain Generalization (DG) problem by training a robust feature extractor which disregards features attributed to image-style but infers based on style-invariant image representations. To achieve this, we train CycleGAN models to learn the different styles present in the training data and randomly mix them together to create samples with novel style attributes to improve generalization. Experimental results on the PACS DG benchmark validate the proposed method (code available at: https://github.com/aristotelisballas/cyclemix).

domain generalization, out-of-distribution classification, deep learning
Conference: 13th Hellenic Conference on Artificial Intelligence (SETN 2024), September 11-13, 2024, Piraeus, Greece.

1. Introduction

The past few years have been marked by an explosion in the use of Artificial Intelligence systems. Spanning industry (Verdouw et al., 2021), medicine (McKinney et al., 2020), academia (Bengio et al., 2013) and even general public use (Ray, 2023), AI systems seem to have established themselves in our daily lives. However, despite their success, widely used and state-of-the-art models still fail to exhibit generalizable attributes when evaluated on data that do not adhere to the i.i.d. assumption. During their training, prominent neural network architectures, such as deep convolutional neural networks (CNNs), often learn to infer based on spurious correlations present in the data (e.g., backgrounds, features attributed to the image style, etc.) and not on truly class-representative properties (Recht et al., 2019; Ballas and Diou, 2024b). Domain Generalization (Zhou et al., 2023) attempts to provide insight into the above issues, by building models which are trained on multiple source data domains but are able to generalize to previously unseen data (target data domains).

Figure 1. Illustration of the proposed CycleMix augmentation method. Before passing through a feature extractor (e.g., ResNet-50), the styles of the source domains are mixed together in order to create samples from novel domains. As the magnitude of each style component is random in each minibatch, the model is constantly provided with previously unseen data samples during training.

In our work, we aim to produce a model that maintains its performance on test image data distributions with different styles than the ones present in the training distribution. We therefore propose augmenting the styles of a model’s training images and creating novel style image domains which could push a CNN to extract meaningful and domain-invariant representations. Our initial findings are validated on PACS (Li et al., 2017), a widely-used DG benchmark, which contains images from 4 different style domains.

To this end, we:

  • Train domain translational Generative Adversarial Networks (GANs) for capturing the style attributes of each source domain,

  • Randomly mix the styles present in the source domains and produce images from novel style domains and

  • Validate our method on a widely-used publicly available DG dataset.

In the sections to come, we briefly: introduce the DG problem setup along with all relevant notations, reference the most important works in DG, present the experimental setup and results and finally, conclude our paper.

1.1. Domain Generalization

Let $\mathcal{D} := \{\mathcal{D}_i\}_{i=1}^{S}$ be a set of $S$ source training domains over an input space $\mathcal{X}$. We then observe $n_i$ training data points from domain $\mathcal{D}_i$, each consisting of an input $\mathbf{x}^{(i)}_j$ and a label $y^{(i)}_j$, i.e. $(\mathbf{x}^{(i)}_j, y^{(i)}_j) \sim \mathcal{D}_i$.
Similarly, let $\mathcal{T} := \{\mathcal{T}_i\}_{i=1}^{T}$ be a set of $T$ unknown target domains, while we assume that there exists a single global labeling function $h(x)$ that maps input observations to their labels, $y_j^{(i)} = h(\mathbf{x}_j^{(i)})$, for all domains $i$ and samples $j$. The aim of Domain Generalization (DG) is to produce a model with parameters $\theta \in \Theta$ which generalizes to both the source domains $\mathcal{D}$ and the unseen target domains $\mathcal{T}$.

2. Related Work

Domain Generalization has emerged as one of the most difficult problems in ML today, finding applications in multiple, varying fields. DG methods can be broadly categorized in the following groups (Wang et al., 2022):

  • Data manipulation: applied algorithms focus on producing generalizable models by increasing the diversity of existing training data (Zhou et al., 2021).

  • Representation learning: given a predictive function $h$, which can be decomposed as $h = f \circ g$ into a representation learning function $g$ and a classifier $f$, methods in this group focus on improving the feature extraction capabilities of $g$. The most common approach is regularizing the loss functions (Arjovsky et al., 2020) or manipulating existing neural network architectures (Ballas and Diou, 2024a, 2023).

  • Learning strategy: the problem of DG has also been studied under numerous machine learning paradigms, such as self-supervised learning, meta-learning, gradient operations, ensemble learning, etc. (Kim et al., 2021).

Our proposed method falls under the first category of data manipulation methods. The use of data augmentation to improve model generalizability has certainly been explored in the past. Most notably, the authors of (Xu et al., 2021) explore the benefits of applying random convolutions to training images and using the results as new data points during model training. Several works have also employed adaptive instance normalization (AdaIN) (Huang and Belongie, 2017) for transferring styles between data samples. For example, SagNets (Nam et al., 2021) aim to make predictions based on the content of an image and disregard features attributed to image style by training style-biased networks, while (Lee et al., 2022) trains a model to learn robust colorization techniques for improved model robustness. In another interesting work, the authors of (Zhou et al., 2021) propose MixStyle, an algorithm that mixes the styles of training instances in each mini-batch to increase the sample diversity of source domains. Another method that produces surprisingly good results is Mixup (Xu et al., 2020), which improves model robustness by training the network on convex combinations of pairs of examples and their labels. Further common augmentation methods or regularization strategies include CutMix (Yun et al., 2019) and Cutout (may2020improved), where patches of images are, respectively, either cut and pasted among training samples or dropped entirely.
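To make the "convex combination" used by Mixup concrete, the following is a minimal sketch on toy vectors. The Beta-distributed mixing coefficient and its parameter value are assumptions chosen for illustration, and the function name `mixup` is hypothetical rather than taken from any cited codebase.

```python
import random

def mixup(x_i, y_i, x_j, y_j, lam):
    """Convex combination of two examples and of their one-hot labels."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y

rng = random.Random(0)
lam = rng.betavariate(0.2, 0.2)  # assumption: Beta-distributed coefficient in [0, 1]

# Two toy "images" (flattened to 2 values) with one-hot labels for 2 classes
x, y = mixup([0.0, 1.0], [1, 0], [1.0, 0.0], [0, 1], lam)
print(lam, x, y)
```

The mixed label remains a valid probability vector because the two weights sum to 1.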

In our work, we propose capturing the style attributes of each domain by employing translational Generative Adversarial Networks and not relying on the features present in each separate sample. By mixing the style-attributes of each domain we are able to create completely novel samples in each mini-batch and improve the robustness of a vanilla feature extractor.

3. Methodology

In this section we provide a brief overview of the CycleGAN algorithm, which was utilized in our research for translating images between domains. We then present the proposed CycleMix methodology for synthesizing images with novel styles.

3.1. Cycle-Consistent Adversarial Networks

CycleGANs (Zhu et al., 2017) were initially proposed for learning translational image mappings between two domains $X$ and $Y$, in the absence of paired examples. Formally, let $D_1$ and $D_2$ be source domains for which we aim to learn a mapping $G: D_1 \rightarrow D_2$, such that the distribution of images drawn from $G(D_1)$ is identical to the distribution of images from $D_2$. The novelty of CycleGANs lies in the addition of an inverse mapping $F: D_2 \rightarrow D_1$ and a Cycle Consistency Loss to the already established Adversarial Loss of GANs. The Cycle Consistency Loss forces the model to reconstruct a translated image back into its original domain, i.e. $F(G(\mathbf{x})) \approx \mathbf{x}$.
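For reference, the cycle-consistency term of Zhu et al. (2017) can be written in the notation of this section as follows, with $G: D_1 \rightarrow D_2$ and the inverse mapping $F: D_2 \rightarrow D_1$. The symbol $\mathcal{L}_{\text{cyc}}$ is introduced here only for illustration.

```latex
\mathcal{L}_{\text{cyc}}(G, F) =
  \mathbb{E}_{\mathbf{x} \sim D_1}\big[\lVert F(G(\mathbf{x})) - \mathbf{x} \rVert_1\big]
  + \mathbb{E}_{\mathbf{y} \sim D_2}\big[\lVert G(F(\mathbf{y})) - \mathbf{y} \rVert_1\big]
```

The first term penalizes an image from $D_1$ that is not recovered after a round trip through $D_2$, and the second term does the same in the opposite direction. This term is added to the adversarial losses of the two generators.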

Figure 2. Indicative examples of the trained CycleGAN translations between domains in the PACS dataset. Despite the relatively small size of the samples, the CycleGAN models are able to capture the style attributes in each domain.

3.2. CycleMix

In our work, we leverage CycleGANs to learn style-mappings between source domains, in the context of Domain Generalization. Our intuition is that by randomly mixing the styles present in source domains and thus synthesizing novel samples, a feature extractor can derive robust representations and learn the features which remain invariant across styles. Specifically, given $S$ source domains we train $S(S-1)/2$ CycleGANs for learning all the possible domain mappings (a single CycleGAN also contains the trained model yielding the inverse mapping between the two domains). An indicative example of translated images between domains is presented in Fig. 2. Having captured the styles of each source domain with the above mappings, we use them as augmentation functions on training data. Specifically, given an image $\mathbf{x}^{(i)}$ drawn from a source domain $D_i$, we translate the image to the remaining source domains and then randomly blend them into a single final image. The entire augmentation operation is as follows:
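The count of $S(S-1)/2$ CycleGANs can be checked with a short enumeration. The sketch below lists the unordered domain pairs and the directed generators they provide. The lowercase domain names and the helper name `cyclegan_pairs` are hypothetical and used only for illustration.

```python
from itertools import combinations

def cyclegan_pairs(domains):
    """Return the unordered domain pairs: one CycleGAN (two generators) per pair."""
    return list(combinations(domains, 2))

# Example: three source domains, with the fourth PACS domain held out as target
sources = ["art_painting", "cartoon", "photo"]
pairs = cyclegan_pairs(sources)

# Each pair (a, b) provides both directed generators, a -> b and b -> a
generators = [(a, b) for a, b in pairs] + [(b, a) for a, b in pairs]
print(len(pairs), len(generators))
```

With $S = 3$ source domains this gives 3 CycleGANs and 6 directed mappings, so every source domain can be translated to each of the other two.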

$$\mathbf{x}^{(i)^{\prime}} = \mathbf{x}^{(i)} + \sum_{\substack{j=1 \\ j \neq i}}^{S} a_j \cdot G_{ij}(\mathbf{x}^{(i)}) \tag{1}$$

where $G_{ij}$ is the GAN for translating images from domain $i$ to domain $j$ and $a_j$ is a random parameter corresponding to the magnitude of style mixing for each source domain (the parameters $a_j$ sum to 1 and are randomly sampled in each minibatch iteration). After this mixing operation, each image is normalized (i.e., the same preprocessing is applied as for the vanilla model input).
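The mixing operation of Eq. (1) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the released implementation: the trained generators are replaced by simple stand-in functions, and the weights are drawn from a flat Dirichlet distribution, which is only one way to obtain random values that sum to 1.

```python
import numpy as np

def cyclemix(x, generators, rng):
    """Apply Eq. (1): x' = x + sum_j a_j * G_ij(x), with the a_j summing to 1."""
    # Assumption: flat Dirichlet weights; the text only states that the
    # weights are random and sum to 1.
    a = rng.dirichlet(np.ones(len(generators)))
    mixed = x.copy()
    for a_j, G_ij in zip(a, generators):
        mixed = mixed + a_j * G_ij(x)
    return mixed, a

# Hypothetical stand-ins for two trained generators G_ij
to_cartoon = lambda img: np.clip(img * 1.2, 0.0, 1.0)
to_sketch = lambda img: np.full_like(img, img.mean())

rng = np.random.default_rng(0)
x = rng.random((3, 32, 32))  # one toy image, channels first
x_mixed, a = cyclemix(x, [to_cartoon, to_sketch], rng)
print(a, x_mixed.shape)
```

Since the weighted translations are added to the original image, the result can leave the original pixel range, so the normalization step described above would follow this operation.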

In our experiments we only randomly augment half of the images present in a mini-batch to preserve the information provided by the initial source domains. An example of style mixed images, along with an illustration of our proposed method, is provided in Fig. 1. After augmentation, we pass the mixed images through a feature extractor and train a classifier to attain their final labels.
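Augmenting only half of each mini-batch can be sketched as below. How the half is selected is an assumption (a uniformly random subset per mini-batch), and strings stand in for image tensors so that the example stays self-contained.

```python
import random

def augment_half(batch, augment, rng):
    """Apply `augment` to a randomly chosen half of the batch; keep the rest as is."""
    chosen = set(rng.sample(range(len(batch)), len(batch) // 2))
    out = [augment(x) if i in chosen else x for i, x in enumerate(batch)]
    return out, chosen

rng = random.Random(0)
batch = [f"img_{i}" for i in range(8)]  # placeholders for image tensors
mixed_batch, chosen = augment_half(batch, lambda x: x + "_mixed", rng)
print(mixed_batch)
```

In a training loop, `augment` would be the style-mixing function, and the untouched half keeps the original source-domain images in every update.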

4. Experiments

4.1. Experimental Setup

To validate our approach, we use the publicly available and widely-adopted PACS (Li et al., 2017) dataset. PACS contains images from 4 separate style domains. As its name suggests, samples can either originate from the Photo, Art Painting, Cartoon or Sketch domain and can be one of 7 classes. As in standard DG experimental setups, we follow the leave-one-domain-out cross-validation protocol (Li et al., 2017; Gulrajani and Lopez-Paz, 2021), meaning that in each training iteration we train a model on 3 source domains and evaluate on a single target domain. For our experiments, we use the DomainBed (Gulrajani and Lopez-Paz, 2021) codebase and train a ResNet-50 feature extractor on a single NVIDIA A100 GPU card.
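The leave-one-domain-out protocol amounts to the following loop over held-out domains. The lowercase domain identifiers and the helper name are hypothetical. The actual experiments rely on the DomainBed codebase, which is not reproduced here.

```python
DOMAINS = ["art_painting", "cartoon", "photo", "sketch"]

def leave_one_domain_out(domains):
    """Yield (source_domains, target_domain) splits, one per held-out domain."""
    for target in domains:
        sources = [d for d in domains if d != target]
        yield sources, target

splits = list(leave_one_domain_out(DOMAINS))
for sources, target in splits:
    print(f"train on {sources}, evaluate on {target}")
```

Each of the four runs therefore trains on 3 source domains and reports accuracy on the remaining one, which corresponds to one column of the results table.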

Table 1. Top-1 accuracy (%) on the PACS dataset. The columns denote the target domains.
Method     Art    Cartoon  Photo  Sketch  Avg
ERM        85.4   75.4     95.9   77.1    84.8
CUTOUT     81.8   81.8     95.8   78.1    84.2
CUTMIX     79.9   75.9     96.6   76.0    82.1
SagNet     84.5   79.5     95.7   78.1    84.4
MIXUP      86.5   76.6     97.7   76.5    84.3
CycleMix   87.7   82.0     96.6   79.9    86.6

4.2. Results

To evaluate the effectiveness of CycleMix as an augmentation approach, we compare it against established techniques such as CUTOUT, CUTMIX and MIXUP. We also use SagNets as a baseline to demonstrate the efficacy of our method, along with a vanilla ResNet-50. Standard image augmentations, such as random resized crop, random horizontal flip, color jitter and normalization, are used during data loading for each method. The results of our experiments are presented in Table 1.

The proposed method has a clear advantage over the baselines, as it surpasses the second-best performing model by around 2 percentage points in average accuracy. CycleMix yields the best results in every target domain except Photo.
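The size of the margin can be read directly from the Avg column of Table 1. The short check below uses only the reported averages, copied from the table.

```python
# Average accuracy (%) per method, as reported in the Avg column of Table 1
avg = {"ERM": 84.8, "CUTOUT": 84.2, "CUTMIX": 82.1,
       "SagNet": 84.4, "MIXUP": 84.3, "CycleMix": 86.6}

ranked = sorted(avg, key=avg.get, reverse=True)
best, runner_up = ranked[0], ranked[1]
margin = round(avg[best] - avg[runner_up], 1)
print(best, runner_up, margin)  # CycleMix ERM 1.8
```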

5. Conclusion

In this work we propose CycleMix, a method aimed at alleviating the problems posed by style-biased predictions in the DG setting. We argue that by mixing the different styles present in a convolutional neural network’s training data, a model can be pushed to focus on the invariant features and extract robust representations. The above claim is supported by results on PACS, a dataset containing images from 4 distinct style distributions, where our method surpasses previously proposed algorithms. However, a key limitation is that the number of CycleGANs to be trained, $S(S-1)/2$ for $S$ source domains, grows quadratically with the number of source domains, which can prove computationally infeasible. In future work, we intend to train StarGAN models, which were proposed for multi-domain image-to-image translation, and explore our methodology on additional datasets.

Acknowledgements.
The work leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 965231 project REBECCA.

References

  • Arjovsky et al. (2020) Martin Arjovsky, Léon Bottou, Ishaan Gulrajani, and David Lopez-Paz. 2020. Invariant Risk Minimization. arXiv:1907.02893 (March 2020).
  • Ballas and Diou (2023) Aristotelis Ballas and Christos Diou. 2023. CNNs with Multi-Level Attention for Domain Generalization. In Proceedings of the 2023 ACM International Conference on Multimedia Retrieval (Thessaloniki, Greece) (ICMR ’23). Association for Computing Machinery, New York, NY, USA, 592–596.
  • Ballas and Diou (2024a) Aristotelis Ballas and Christos Diou. 2024a. Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization. IEEE Transactions on Artificial Intelligence (2024), 1–14.
  • Ballas and Diou (2024b) Aristotelis Ballas and Christos Diou. 2024b. Towards Domain Generalization for ECG and EEG Classification: Algorithms and Benchmarks. IEEE Transactions on Emerging Topics in Computational Intelligence 8, 1 (2024), 44–54.
  • Bengio et al. (2013) Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation Learning: A Review and New Perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 8 (aug 2013), 1798–1828.
  • Gulrajani and Lopez-Paz (2021) Ishaan Gulrajani and David Lopez-Paz. 2021. In Search of Lost Domain Generalization. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net.
  • Huang and Belongie (2017) Xun Huang and Serge Belongie. 2017. Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. In 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, Nashville, TN, USA, 1510–1519.
  • Kim et al. (2021) Daehee Kim, Youngjun Yoo, Seunghyun Park, Jinkyu Kim, and Jaekoo Lee. 2021. SelfReg: Self-supervised Contrastive Regularization for Domain Generalization. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, Montreal, QC, Canada, 9599–9608.
  • Lee et al. (2022) Hyejin Lee, Daehee Kim, Daeun Lee, Jinkyu Kim, and Jaekoo Lee. 2022. Bridging the Domain Gap Towards Generalization in Automatic Colorization. In Computer Vision – ECCV 2022, Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, and Tal Hassner (Eds.). Springer Nature Switzerland, Cham, 527–543.
  • Li et al. (2017) D. Li, Y. Yang, Y. Song, and T. M. Hospedales. 2017. Deeper, Broader and Artier Domain Generalization. In 2017 IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society, Los Alamitos, CA, USA, 5543–5551.
  • McKinney et al. (2020) Scott Mayer McKinney, Marcin Sieniek, Varun Godbole, Jonathan Godwin, Natasha Antropova, Hutan Ashrafian, Trevor Back, Mary Chesus, Greg S Corrado, Ara Darzi, et al. 2020. International evaluation of an AI system for breast cancer screening. Nature 577, 7788 (2020), 89–94.
  • Nam et al. (2021) H. Nam, H. Lee, J. Park, W. Yoon, and D. Yoo. 2021. Reducing Domain Gap by Reducing Style Bias. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, Los Alamitos, CA, USA, 8686–8695.
  • Ray (2023) Partha Pratim Ray. 2023. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems 3 (2023), 121–154.
  • Recht et al. (2019) Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. 2019. Do ImageNet Classifiers Generalize to ImageNet? In International Conference on Machine Learning. PMLR, Curran Associates Inc., Red Hook, NY, USA, 5389–5400.
  • Verdouw et al. (2021) Cor Verdouw, Bedir Tekinerdogan, Adrie Beulens, and Sjaak Wolfert. 2021. Digital twins in smart farming. Agricultural Systems 189 (2021), 103046.
  • Wang et al. (2022) Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, and Philip S. Yu. 2022. Generalizing to Unseen Domains: A Survey on Domain Generalization. IEEE Transactions on Knowledge and Data Engineering 35, 8 (2022), 8052–8072.
  • Xu et al. (2020) Minghao Xu, Jian Zhang, Bingbing Ni, Teng Li, Chengjie Wang, Qi Tian, and Wenjun Zhang. 2020. Adversarial Domain Adaptation with Domain Mixup. In The Thirty-Fourth AAAI Conference on Artificial Intelligence. AAAI Press, Washington, DC, USA, 6502–6509.
  • Xu et al. (2021) Zhenlin Xu, Deyi Liu, Junlin Yang, Colin Raffel, and Marc Niethammer. 2021. Robust and Generalizable Visual Representation Learning via Random Convolutions. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net.
  • Yun et al. (2019) S. Yun, D. Han, S. Chun, S. Oh, Y. Yoo, and J. Choe. 2019. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE Computer Society, Los Alamitos, CA, USA, 6022–6031.
  • Zhou et al. (2023) Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, and Chen Change Loy. 2023. Domain Generalization: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 4 (apr 2023), 4396–4415.
  • Zhou et al. (2021) Kaiyang Zhou, Yongxin Yang, Yu Qiao, and Tao Xiang. 2021. Domain Generalization with MixStyle. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net.
  • Zhu et al. (2017) Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. 2017. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, Nashville, TN, USA, 2242–2251.