{"id":966927,"date":"2023-09-07T17:49:22","date_gmt":"2023-09-08T00:49:22","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=966927"},"modified":"2024-03-25T04:54:43","modified_gmt":"2024-03-25T11:54:43","slug":"mindagent-emergent-gaming-interaction","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/mindagent-emergent-gaming-interaction\/","title":{"rendered":"MindAgent: Emergent Gaming Interaction"},"content":{"rendered":"

Large Language Models (LLMs) can perform complex scheduling in a multi-agent system and can coordinate these agents to complete sophisticated tasks that require extensive collaboration. However, despite the introduction of numerous gaming frameworks, the community lacks sufficient benchmarks for building general multi-agent collaboration infrastructure that encompasses both LLM and human-NPC communication. In this work, we propose a novel infrastructure, MindAgent, to evaluate emergent planning and coordination capabilities for gaming interaction. In particular, our infrastructure leverages an existing gaming framework to (i) require understanding of the coordinator for a sizable multi-agent system, (ii) collaborate with human players via proper instructions without fine-tuning, and (iii) establish in-context learning from few-shot prompts with feedback. Furthermore, we introduce CuisineWorld, a new gaming scenario and related benchmark that assesses multi-agent collaboration efficiency and supervises multiple agents playing the game simultaneously. We conduct comprehensive evaluations with a new automatic metric, CoS, for calculating collaboration efficiency. Finally, our infrastructure can be deployed into real-world gaming scenarios in a customized VR game \u201cCuisineWorld\u201d and adapted to the existing broader \u201cMinecraft\u201d gaming domain. We hope our findings on LLMs and the new infrastructure for general-purpose scheduling and coordination can help shed light on how such skills can be obtained by learning from large text corpora.<\/p>\n","protected":false},"excerpt":{"rendered":"

Large Language Models (LLMs) can perform complex scheduling in a multi-agent system and can coordinate these agents to complete sophisticated tasks that require extensive collaboration. However, despite the introduction of numerous gaming frameworks, the community lacks sufficient benchmarks for building general multi-agent collaboration infrastructure that encompasses both LLM and human-NPC communication. […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13556,13545],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-966927","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-1-9","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"Generated the model and infrastructure with Microsoft Gaming 
US.","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent_final.pdf","id":"980802","title":"mindagent_final","label_id":"243109","label":0}],"msr_related_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/arxiv.org\/pdf\/2309.09971.pdf","label_id":"243112","label":0},{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent_final.pdf","id":"980802","title":"mindagent_final","label_id":"243118","label":0}],"msr_attachments":[{"id":980802,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/10\/MindAgent_final.pdf"},{"id":969909,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent-650dee0f7fd72.pdf"},{"id":967923,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/GamingInteraction.pdf"},{"id":967017,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent-64ff093a498c7.pdf"},{"id":966945,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent-64fd883912140.pdf"},{"id":966942,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent-64fd2b9066ac8.pdf"},{"id":966930,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/MindAgent.pdf"}],"msr-author-ordering":[{"type":"text","value":"Steven Gong","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Qiuyuan 
Huang","user_id":36356,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Qiuyuan Huang"},{"type":"text","value":"Xiaojian Ma","user_id":0,"rest_url":false},{"type":"text","value":"Hoi Vo","user_id":0,"rest_url":false},{"type":"text","value":"Zane Durante","user_id":0,"rest_url":false},{"type":"text","value":"Yusuke Noda","user_id":0,"rest_url":false},{"type":"text","value":"Zilong Zheng","user_id":0,"rest_url":false},{"type":"text","value":"Song-chun Zhu","user_id":0,"rest_url":false},{"type":"text","value":"Demetri Terzopoulos","user_id":0,"rest_url":false},{"type":"text","value":"Feifei Li","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Jianfeng Gao","user_id":32246,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Jianfeng Gao"}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144931],"msr_project":[788159,965577],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":788159,"post_title":"Agent AI","post_name":"agent-ai","post_type":"msr-project","post_date":"2023-09-25 21:53:00","post_modified":"2024-02-28 07:03:22","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/agent-ai\/","post_excerpt":"Agent-based multimodal AI systems are becoming a ubiquitous presence in our everyday lives. A promising direction for making these systems more interactive is to embody them as agents within specific environments. The grounding of large foundation models to act as agents within specific environments can provide a way of incorporating visual and contextual information into an embodied system. 
For example, a system that can perceive user actions, human behavior, environment objects, audio expressions, and the…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/788159"}]}},{"ID":965577,"post_title":"Emergent Interaction Agent","post_name":"gaming-interaction","post_type":"msr-project","post_date":"2023-05-22 22:38:00","post_modified":"2023-12-17 10:09:27","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/gaming-interaction\/","post_excerpt":"We collaborated with the X-Box and Mesh teams, explored a new gaming infrastructure, and designed a dynamic real-time system for human players and NPCs with GPT-X on a multi-agent platform. GitHub: MindAgent ArXiv: https:\/\/arxiv.org\/abs\/2309.09971 Demo: MindAgent.mp4 Gaming Interaction Infrastructure: We are very excited to share the good news. Our project \u201cMindAgent: Emergent Gaming Interaction\u201d was recently made public. 
We seek to develop…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/965577"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/966927","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":9,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/966927\/revisions"}],"predecessor-version":[{"id":1015521,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/966927\/revisions\/1015521"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=966927"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=966927"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=966927"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=966927"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=966927"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=966927"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=966927"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=966927"},{"taxonomy"
:"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=966927"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=966927"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=966927"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=966927"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=966927"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=966927"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=966927"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=966927"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}