Learning What and Where to Draw

Reed, Scott; Akata, Zeynep; Mohan, Santosh; Tenka, Samuel; Schiele, Bernt; Lee, Honglak

Computer Science > Computer Vision and Pattern Recognition

arXiv:1610.02454 (cs)

[Submitted on 8 Oct 2016]

Abstract: Generative Adversarial Networks (GANs) have recently demonstrated the capability to synthesize compelling real-world images, such as room interiors, album covers, manga, faces, birds, and flowers. While existing models can synthesize images based on global constraints such as a class label or caption, they do not provide control over pose or object location. We propose a new model, the Generative Adversarial What-Where Network (GAWWN), that synthesizes images given instructions describing what content to draw in which location. We show high-quality 128 x 128 image synthesis on the Caltech-UCSD Birds dataset, conditioned on both informal text descriptions and also object location. Our system exposes control over both the bounding box around the bird and its constituent parts. By modeling the conditional distributions over part locations, our system also enables conditioning on arbitrary subsets of parts (e.g. only the beak and tail), yielding an efficient interface for picking part locations. We also show preliminary results on the more challenging domain of text- and location-controllable synthesis of images of human actions on the MPII Human Pose dataset.

Comments:	In NIPS 2016
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1610.02454 [cs.CV]
	(or arXiv:1610.02454v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1610.02454

Submission history

From: Scott Reed
[v1] Sat, 8 Oct 2016 00:27:57 UTC (9,657 KB)
