{"version":"1.0","provider_name":"Microsoft Research","provider_url":"https:\/\/www.microsoft.com\/en-us\/research","author_name":"Alexis Hagen","author_url":"https:\/\/www.microsoft.com\/en-us\/research\/people\/v-alhage\/","title":"Vision and language pretraining in the absence of caption annotations","type":"rich","width":600,"height":338,"html":"
Novel object captioning surpasses human performance on benchmarks<\/a><\/blockquote>