{"id":563061,"date":"2019-03-28T03:37:19","date_gmt":"2019-03-28T10:37:19","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=563061"},"modified":"2022-09-06T10:01:24","modified_gmt":"2022-09-06T17:01:24","slug":"conversational-agents-that-see","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/conversational-agents-that-see\/","title":{"rendered":"Conversational Agents that See"},"content":{"rendered":"\t\t\t
\n\t\t\t
\n\t\t\t\t\t
\n\t\t

Conversational Agents that See

The dream of a personal, digital assistant has been with us for decades: a digital entity which knows our habits, activities and preferences, and which can converse with us, anticipate our needs, and carry out tasks for us. Recently, we have seen a resurgence of enthusiasm around this concept, particularly in the form of conversational systems on mobile phones, such as Cortana, Siri and Google Now. While these are still in their infancy, they signal new ambitions to realise the vision of artificially intelligent, conversational agents.

In this project we explore what it might mean to augment such agents with the ability to see. By partnering with experts in computer vision, speech and machine learning, we ask whether the ability for an agent to see a person's activities and context might make its capabilities more effective. After all, what we see often qualifies what we say by providing a shared context for conversation. Looking at the context around us brings into the conversation objects of interest, other people, aspects of the environment, ongoing activities and so on. Agents can start to recognise and notice things that are happening in the world around a person. Conversely, conversation qualifies what we see: it can help clarify and add meaning to people, places, objects and events, for example. In both cases, we would expect that adding vision to agents would provide a better understanding of a person and their relationship to the world around them.

\"\"<\/p>

This project is exploring a number of different use scenarios where computer vision and speech converge to better serve the user. Our approach involves user-centred design, which means we use a mix of literature reviews, interview studies, focus groups and ethnographic techniques to generate and refine our ideas. We are also building prototype applications based on those scenarios to test and evolve our concepts. These range from supporting people in capturing and organising objects from the physical world, to supporting different kinds of navigational tasks in both indoor and outdoor environments.
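To make the idea of a shared visual-conversational context concrete, the sketch below shows what a single turn of such an agent might look like. It is a minimal illustration only: SharedContext, detect_objects and generate_reply are hypothetical stand-ins introduced here, not components of the project's prototypes, and a real system would call trained vision and language models where these stubs return canned values.

```python
from dataclasses import dataclass, field

@dataclass
class SharedContext:
    """Shared visual and conversational context the agent reasons over."""
    visible_objects: list = field(default_factory=list)
    dialogue_history: list = field(default_factory=list)

def detect_objects(frame):
    """Hypothetical stand-in for a computer vision component.

    A real prototype would run a trained detector over the camera
    frame; here we return fixed labels so the sketch is runnable.
    """
    return ["mug", "keys", "notebook"]

def generate_reply(utterance, context):
    """Hypothetical stand-in for the conversational component.

    Grounds the reply in the shared visual context rather than in
    the words of the utterance alone.
    """
    if "keys" in utterance.lower() and "keys" in context.visible_objects:
        return "Your keys are on the desk, next to the notebook."
    return "Right now I can see: " + ", ".join(context.visible_objects)

def agent_turn(frame, utterance, context):
    # Vision qualifies conversation: refresh what the agent can see,
    # then answer the utterance against that shared visual context.
    context.visible_objects = detect_objects(frame)
    context.dialogue_history.append(("user", utterance))
    reply = generate_reply(utterance, context)
    context.dialogue_history.append(("agent", reply))
    return reply

if __name__ == "__main__":
    ctx = SharedContext()
    print(agent_turn(frame=None, utterance="Where are my keys?", context=ctx))
```

The point of this structure is that the reply is generated against context.visible_objects rather than against the utterance alone, mirroring the observation above that what an agent sees qualifies what it says.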

This work was carried out in collaboration with the Machine Learning and Perception group at Microsoft Research Redmond and Cambridge.

People: Abigail Sellen, Richard Banks, Antonio Criminisi, Matthew Johnson