{"id":585388,"date":"2019-05-09T10:01:00","date_gmt":"2019-05-09T17:01:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=585388"},"modified":"2019-09-23T07:19:51","modified_gmt":"2019-09-23T14:19:51","slug":"generative-neural-visual-artist-geneva","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/generative-neural-visual-artist-geneva\/","title":{"rendered":"Generative Neural Visual Artist (GeNeVA)"},"content":{"rendered":"

\"The<\/p>\n

The Generative Neural Visual Artist (GeNeVA) task

The GeNeVA task involves a Teller giving a sequence of linguistic instructions to a Drawer, with the ultimate goal of generating an image.

The Teller gauges progress through visual feedback from the generated image. This is a challenging task because the Drawer needs to map complex linguistic instructions to realistic objects on a canvas, preserving not only object properties but also the relationships between objects (e.g., relative location). The Drawer also needs to modify the existing drawing in a manner consistent with previous images and instructions, which requires remembering those instructions. All of this demands understanding the relationships between objects in the scene and how those relationships are expressed in the image, in a way that stays consistent with every instruction given.

An example instruction sequence for the GeNeVA task is shown in the accompanying figure.
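To make the setup concrete, the sketch below (in PyTorch, not the authors' implementation) shows the iterative Drawer loop: at every turn the Drawer conditions on the instructions received so far and on the previous canvas, then emits an updated image. All module names, layer sizes, the vocabulary size, and the toy 16x16 canvas are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Drawer(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=128, hid_dim=256, img_ch=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Encodes a single instruction (a sequence of token ids).
        self.instr_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Tracks the dialogue state across turns.
        self.dialog_rnn = nn.GRUCell(hid_dim, hid_dim)
        # Encodes the previous canvas into a feature vector.
        self.img_enc = nn.Sequential(
            nn.Conv2d(img_ch, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, hid_dim, 4, stride=2, padding=1),
            nn.AdaptiveAvgPool2d(1),
        )
        # Decodes dialogue state + canvas features into a new toy 16x16 canvas.
        self.decode = nn.Sequential(nn.Linear(2 * hid_dim, img_ch * 16 * 16), nn.Tanh())
        self.img_ch = img_ch

    def step(self, prev_img, instr_tokens, dialog_state):
        _, instr = self.instr_rnn(self.embed(instr_tokens))      # instr: (1, B, hid_dim)
        dialog_state = self.dialog_rnn(instr.squeeze(0), dialog_state)
        img_feat = self.img_enc(prev_img).flatten(1)              # (B, hid_dim)
        new_img = self.decode(torch.cat([dialog_state, img_feat], dim=1))
        return new_img.view(-1, self.img_ch, 16, 16), dialog_state

# One toy dialogue of three instructions (random token ids), batch size 1.
drawer = Drawer()
canvas, state = torch.zeros(1, 3, 16, 16), torch.zeros(1, 256)
for instr in [torch.randint(0, 1000, (1, 7)) for _ in range(3)]:
    canvas, state = drawer.step(canvas, instr, state)
print(canvas.shape)  # torch.Size([1, 3, 16, 16])
```

In the actual GeNeVA-GAN model the decoder is a conditional GAN generator trained adversarially, rather than the simple linear decoder used in this sketch; the turn-by-turn loop structure is the point being illustrated here.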

We introduce the GeNeVA task and a model for it, GeNeVA-GAN, in our paper Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction.



GeNeVA – Examples

Images generated by the GeNeVA-GAN model described in Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction on the CoDraw (top row) and i-CLEVR (bottom row) datasets are shown with the provided instructions below:

[Figure: example generations on CoDraw and i-CLEVR with their instructions]



GeNeVA – Datasets

To re-create the CoDraw and i-CLEVR datasets used for the GeNeVA task, download the data files and then run the dataset generation code, both available at the following links:

- Download the GeNeVA Data Files
- GeNeVA Datasets - Generation Code
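As a quick sanity check after running the generation code, a few lines of h5py can list what ended up in the generated files. The file name and the assumption that the output is HDF5 are placeholders here, not documented facts; point this at whatever the generation scripts actually write.

```python
import h5py

# "codraw_train.h5" is a hypothetical output file name; the key layout is
# whatever the generation scripts produce, so we simply enumerate it.
with h5py.File("codraw_train.h5", "r") as f:
    def show(name, obj):
        # Print every dataset with its shape and dtype.
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)
```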



GeNeVA – Pre-trained Models

The pre-trained object detector and localizer models that we used for computing the evaluation metrics can be downloaded from the following links:

- CoDraw Object Detector and Localizer
- i-CLEVR Object Detector and Localizer
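For orientation, here is a simplified sketch of how such detector-based metrics are typically computed: the pre-trained detector predicts the set of objects present in each generated image, and that set is scored against the ground-truth scene with precision, recall, and F1. This is not the released evaluation code, and the paper's metrics additionally assess relative object positions using the localizer.

```python
from collections import Counter

def detection_f1(predicted_labels, gold_labels):
    """Multiset precision / recall / F1 between detected and ground-truth object labels."""
    pred, gold = Counter(predicted_labels), Counter(gold_labels)
    overlap = sum((pred & gold).values())            # objects detected with the correct label
    precision = overlap / max(sum(pred.values()), 1)
    recall = overlap / max(sum(gold.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-8)
    return precision, recall, f1

# Example: two of three detected objects match the ground-truth scene.
print(detection_f1(["tree", "sun", "boy"], ["tree", "sun", "girl"]))
```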



GeNeVA – Training and Evaluation Code

Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task can be obtained from the following link:

- GeNeVA - Training and Evaluation Code



Reference

If you use the GeNeVA task, code, or datasets as part of any published research, please cite the following paper:

Alaaeldin El-Nouby, Shikhar Sharma, Hannes Schulz, Devon Hjelm, Layla El Asri, Samira Ebrahimi Kahou, Yoshua Bengio, and Graham W. Taylor. "Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction". Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2019.

```bibtex
@InProceedings{El-Nouby_2019_ICCV,
    author    = {El{-}Nouby, Alaaeldin and
                 Sharma, Shikhar and
                 Schulz, Hannes and
                 Hjelm, Devon and
                 El Asri, Layla and
                 Ebrahimi Kahou, Samira and
                 Bengio, Yoshua and
                 Taylor, Graham W.},
    title     = {Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month     = {Oct},
    year      = {2019}
}
```
\n","protected":false},"excerpt":{"rendered":"

The Generative Neural Visual Artist (GeNeVA) task The GeNeVA task involves a Teller giving a sequence of linguistic instructions to a Drawer for the ultimate goal of image generation. The Teller is able to gauge progress through visual feedback of the generated image. This is a challenging task because the Drawer needs to learn how […]<\/p>\n","protected":false},"featured_media":587743,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"research-area":[13556,13562,13554],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-585388","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-computer-vision","msr-research-area-human-computer-interaction","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[585919,556554],"related-downloads":[586126,610173],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Shikhar Sharma","user_id":36557,"people_section":"Section name 1","alias":"shsh"},{"type":"user_nicename","display_name":"Hannes Schulz","user_id":37188,"people_section":"Section name 1","alias":"haschulz"},{"type":"guest","display_name":"Alaaeldin El-Nouby","user_id":585409,"people_section":"Section name 1","alias":""},{"type":"guest","display_name":"Layla El Asri","user_id":604137,"people_section":"Section name 1","alias":""},{"type":"guest","display_name":"Samira Ebrahimi Kahou","user_id":585412,"people_section":"Section name 1","alias":""},{"type":"guest","display_name":"Yoshua Bengio","user_id":585415,"people_section":"Section name 1","alias":""},{"type":"guest","display_name":"Graham Taylor","user_id":585418,"people_section":"Section name 
1","alias":""}],"msr_research_lab":[437514],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/585388"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":60,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/585388\/revisions"}],"predecessor-version":[{"id":610155,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/585388\/revisions\/610155"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/587743"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=585388"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=585388"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=585388"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=585388"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=585388"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}