Revision as of 19:43, 9 July 2024 by Michael Hardy (→‎Evaluation) | Revision as of 19:44, 9 July 2024 by Michael Hardy (→‎InfoGAN)
Line 298:
The idea of InfoGAN is to decree that every latent vector in the latent space can be decomposed as <math>(z, c)</math>: an incompressible noise part <math>z</math> and an informative label part <math>c</math>. The generator is encouraged to comply with the decree by maximizing <math>I(c, G(z, c))</math>, the [[mutual information]] between <math>c</math> and <math>G(z, c)</math>, while no demand is made on the mutual information between <math>z</math> and <math>G(z, c)</math>. Unfortunately, <math>I(c, G(z, c))</math> is intractable in general. The key idea of InfoGAN is Variational Mutual Information Maximization:<ref>{{Cite journal |last1=Barber |first1=David |last2=Agakov |first2=Felix |date=2003-12-09 |title=The IM algorithm: a variational approach to Information Maximization |url=https://dl.acm.org/doi/abs/10.5555/2981345.2981371 |journal=Proceedings of the 16th International Conference on Neural Information Processing Systems |series=NIPS'03 |location=Cambridge, MA, USA |publisher=MIT Press |pages=201–208 }}</ref> indirectly maximize it by maximizing a lower bound<math display="block">\hat I(G, Q) = \mathbb{E}_{z\sim \mu_Z, c\sim \mu_C}[\ln Q(c \mid G(z, c))]; \quad I(c, G(z, c)) \geq \sup_Q \hat I(G, Q)</math>where <math> Q</math> ranges over all [[Markov kernel]]s of type <math> Q: \Omega_Y \to \mathcal P(\Omega_C)</math>. The InfoGAN game is defined as follows:<ref>{{Cite journal |last1=Chen |first1=Xi |last2=Duan |first2=Yan |last3=Houthooft |first3=Rein |last4=Schulman |first4=John |last5=Sutskever |first5=Ilya |last6=Abbeel |first6=Pieter |date=2016 |title=InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets |url=https://proceedings.neurips.cc/paper/2016/hash/7c9d0b1f96aebd7b5eca8c3edaa19ebb-Abstract.html |journal=Advances in Neural Information Processing Systems |publisher=Curran Associates, Inc. |volume=29 |arxiv=1606.03657 }}</ref><blockquote>Three probability spaces define an InfoGAN game:
Line 308:
There are 3 players in 2 teams: generator, Q, and discriminator. The generator and Q are on one team, and the discriminator is on the other team. The objective function is<math display="block">L(G, Q, D) = L_{GAN}(G, D) - \lambda \hat I(G, Q)</math>where <math> L_{GAN}(G, D) = \mathbb{E}_{x\sim \mu_{\text{ref}}}[\ln D(x)] + \mathbb{E}_{z\sim \mu_Z, c\sim \mu_C}[\ln (1-D(G(z, c)))]</math> is the original GAN game objective, and <math> \hat I(G, Q) = \mathbb E_{z\sim\mu_Z, c\sim\mu_C}[\ln Q(c \mid G(z, c))]</math>. The generator-Q team aims to minimize the objective, and the discriminator aims to maximize it:<math display="block">\min_{G, Q} \max_D L(G, Q, D)</math></blockquote>
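
The objective above can be estimated by plain Monte Carlo sampling. The following is a minimal sketch, not the implementation from the InfoGAN paper: the function name <code>estimate_infogan_objective</code>, the toy generator, the toy <math>Q</math>, the constant discriminator, and the three samplers are all hypothetical choices made only so that the numbers can be checked by hand. It assumes a binary code <math>c \in \{0, 1\}</math> and scalar outputs.

```python
import math
import random


def estimate_infogan_objective(G, Q, D, sample_x, sample_z, sample_c,
                               lam=1.0, n=1000):
    """Monte Carlo estimate of L(G, Q, D) = L_GAN(G, D) - lam * I_hat(G, Q).

    Q(c, y) must return the probability Q(c | y); D(x) must lie in (0, 1).
    Returns the tuple (L, L_GAN, I_hat).
    """
    real_term = fake_term = info_term = 0.0
    for _ in range(n):
        x = sample_x()                      # x ~ mu_ref
        z, c = sample_z(), sample_c()       # z ~ mu_Z, c ~ mu_C
        y = G(z, c)
        real_term += math.log(D(x))         # ln D(x)
        fake_term += math.log(1.0 - D(y))   # ln(1 - D(G(z, c)))
        info_term += math.log(Q(c, y))      # ln Q(c | G(z, c))
    l_gan = (real_term + fake_term) / n
    i_hat = info_term / n
    return l_gan - lam * i_hat, l_gan, i_hat


# Toy setup (all hypothetical, chosen for easy hand-checking).
rng = random.Random(0)
sample_x = lambda: rng.choice([0.0, 1.0])   # stand-in for the reference data
sample_z = lambda: rng.gauss(0.0, 1.0)      # noise part z
sample_c = lambda: rng.randrange(2)         # label part c, uniform on {0, 1}

G = lambda z, c: float(c) + 0.01 * z        # output is dominated by c
Q = lambda c, y: 0.9 if int(y > 0.5) == c else 0.1   # guesses c from y
D = lambda x: 0.5                           # maximally uncertain discriminator

loss, l_gan, i_hat = estimate_infogan_objective(
    G, Q, D, sample_x, sample_z, sample_c, lam=1.0, n=1000)
```

With these toy choices every sample contributes <math>\ln 0.5</math> to each GAN term and <math>\ln 0.9</math> to the information term, so the estimate equals <math>-2\ln 2 - \lambda \ln 0.9</math> up to rounding. In practice <math>G</math>, <math>Q</math>, and <math>D</math> would be neural networks, the generator-Q team would take gradient steps to decrease this quantity, and the discriminator would take steps to increase it.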

Generative adversarial network: Difference between revisions