{"id":1084857,"date":"2024-10-01T09:29:27","date_gmt":"2024-10-01T16:29:27","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/"},"modified":"2024-11-11T09:21:37","modified_gmt":"2024-11-11T17:21:37","slug":"interactive-multimodal-ai-systems-imais","status":"publish","type":"msr-group","link":"https:\/\/www.microsoft.com\/en-us\/research\/group\/interactive-multimodal-ai-systems-imais\/","title":{"rendered":"Interactive Multimodal AI Systems (IMAIS)"},"content":{"rendered":"
\n\t
\n\t\t
\n\t\t\t\"Multimodal\t\t<\/div>\n\t\t\n\t\t
\n\t\t\t\n\t\t\t
\n\t\t\t\t\n\t\t\t\t
\n\t\t\t\t\t\n\t\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n

Interactive Multimodal AI Systems (IMAIS)

The Interactive Multimodal AI Systems (IMAIS) group focuses on creating interactive systems and experiences that blend the richness and complexity of people and their real, physical world with advanced technology. We seek to leverage multimodal generative AI models that incorporate multiple sensing modalities, such as video and speech, as well as models of spatial reasoning, human behavior, and affect.

Our work is driven by enabling interactions through a wide range of modalities, including but not limited to vision and language, such as touch, gestures, speech, sound, gaze, smell, and other physiological signals. Beyond sensing, we consider display technologies such as computer graphics, audio, and augmented reality integral to delivering the next great computing experience. These systems span a wide range of devices, including handheld, stationary, head-mounted, on-body, and midair, and are explored across diverse environments, from 2D screens to 3D immersive spaces. We recognize the value of building and evaluating systems to validate our basic research and reveal new avenues of innovation.

Research areas: Artificial Intelligence; Audio & Acoustics; Computer Vision; Data Platform & Analytics; Graphics & Multimedia; Human-Computer Interaction
Impact themes: Discovery; Empowerment; Resilience
Research team: Andy Wilson, Judith Amores, Sean Andrist, Dan Bohus, Javier Hernandez, Nick Saw, Bala Thoravi Kumaravel