Generative Neural Visual Artist (GeNeVA)

The Generative Neural Visual Artist (GeNeVA) task

The GeNeVA task involves a Teller giving a sequence of linguistic instructions to a Drawer for the ultimate goal of image generation.

The Teller is able to gauge progress through visual feedback of the generated image. This is a challenging task because the Drawer needs to learn how to map complex linguistic instructions to realistic objects on a canvas, maintaining not only object properties but relationships between objects (e.g., relative location). The Drawer also needs to modify the existing drawing in a manner consistent with previous images and instructions, so it needs to remember previous instructions. All of these involve understanding a complex relationship between objects in the scene and how those relationships are expressed in the image in a way that is consistent with all instructions given.

An example instruction sequence for the GeNeVA task is shown in the figure on the right.

We introduce the GeNeVA task and a model (GeNeVA-GAN) for performing this task in our paper Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction.

Read the paper

GeNeVA – Examples

Images generated by the GeNeVA-GAN model described in Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction on CoDraw (top row) and i-CLEVR (bottom row) datasets are shown with the provided instructions below:
Example images generated by GeNeVA-GAN

GeNeVA – Datasets

For re-creating the CoDraw and i-CLEVR datasets used for the GeNeVA task, you will have to download the data files and then run the dataset generation code as shown in the following links:

Download the GeNeVA Data Files GeNeVA Datasets - Generation Code

GeNeVA – Pre-trained Models

The pre-trained models for the object detector and localizer model that we used for evaluating the metrics can be downloaded from the following links:

CoDraw Object Detector and Localizer i-CLEVR Object Detector and Localizer

GeNeVA – Training and Evaluation Code

Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task can be obtained from the following link:

GeNeVA - Training and Evaluation Code

Reference

If you use the GeNeVA task, code, or datasets as part of any published research, please cite the following paper:

Alaaeldin El-Nouby, Shikhar Sharma, Hannes Schulz, Devon Hjelm, Layla El Asri, Samira Ebrahimi Kahou, Yoshua Bengio, and Graham W. Taylor. “Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction”. Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2019.

@InProceedings{El-Nouby_2019_ICCV,
    author    = {El{-}Nouby, Alaaeldin and
                 Sharma, Shikhar and
                 Schulz, Hannes and
                 Hjelm, Devon and
                 El Asri, Layla and
                 Ebrahimi Kahou, Samira and
                 Bengio, Yoshua and
                 Taylor, Graham W.},
    title     = {Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month     = {Oct},
    year      = {2019}
}