{"id":1138863,"date":"2025-05-19T09:00:11","date_gmt":"2025-05-19T16:00:11","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1138863"},"modified":"2025-11-26T14:38:25","modified_gmt":"2025-11-26T22:38:25","slug":"magentic-ui-an-experimental-human-centered-web-agent","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/magentic-ui-an-experimental-human-centered-web-agent\/","title":{"rendered":"Magentic-UI, an experimental human-centered web agent"},"content":{"rendered":"\n
\"This<\/figure>\n\n\n\n

Modern productivity is rooted in the web\u2014from searching for information and filling in forms to navigating dashboards. Yet, many of these tasks remain manual and repetitive. Today, we are introducing Magentic-UI, a new open-source research prototype of a human-centered<\/em> agent that is meant to help researchers study open questions on human-in-the-loop approaches and oversight mechanisms for AI agents. This prototype collaborates with users on web-based tasks<\/strong> and operates in real time over a web browser. Unlike other computer use agents that aim for full autonomy, Magentic-UI offers a transparent and controllable experience for tasks that are action-oriented <\/em>and <\/em>require activities beyond just performing simple web searches.<\/p>\n\n\n\n

Magentic-UI builds on Magentic-One<\/a>, a powerful multi-agent team we released last year, and is powered by AutoGen<\/a>, our leading agent framework. It is available under the MIT license at https:\/\/github.com\/microsoft\/Magentic-UI<\/a> and on Azure AI Foundry Labs<\/a>, the hub where developers, startups, and enterprises can explore groundbreaking innovations from Microsoft Research. Magentic-UI is integrated with Azure AI Foundry models and agents. Learn more about how to integrate Azure AI agents into the Magentic-UI multi-agent architecture by following this code sample<\/a>. <\/p>\n\n\n\n\t


Magentic-UI can perform tasks that require browsing the web, writing and executing Python and shell code, and understanding files. Its key features include:<\/p>\n\n\n\n

    \n
  1. Collaborative planning with users (co-planning)<\/strong>. Magentic-UI allows users to directly modify its plan through a plan editor or by providing textual feedback before Magentic-UI executes any actions. <\/li>\n\n\n\n
  2. Collaborative execution with users (co-tasking)<\/strong>. Users can pause the system and give feedback in natural language or demonstrate it by directly taking control of the browser.<\/li>\n\n\n\n
  3. Safety with human-in-the-loop (action guards)<\/strong>. Magentic-UI seeks user approval before executing potentially irreversible actions, and the user can specify how often Magentic-UI needs approvals. Furthermore, Magentic-UI is sandboxed for the safe operation of tools such as browsers and code executors.<\/li>\n\n\n\n
  4. Learning from experience (plan learning)<\/strong>. Magentic-UI can learn and save plans from previous interactions to improve task completion for future tasks. <\/li>\n<\/ol>\n\n\n\n
    Figure 1: Screenshot of Magentic-UI actively performing a task. The left side of the screen shows Magentic-UI stating its plan and progress to accomplish a user\u2019s complex goal. The right side shows the browser Magentic-UI is controlling. <\/em><\/figcaption><\/figure>\n\n\n\n

    How is Magentic-UI human-centered?<\/h2>\n\n\n\n

    While many web agents promise full autonomy, in practice users can be left unsure of what the agent can do, what it is currently doing, and whether they have enough control to intervene when something goes wrong or doesn\u2019t occur as expected. By contrast, Magentic-UI considers user needs at every stage of interaction. We followed a human-centered design methodology in building Magentic-UI by prototyping and obtaining feedback from pilot users during its design. <\/p>\n\n\n\n

    Figure 2: Co-planning – Users can collaboratively plan with Magentic-UI.<\/em><\/figcaption><\/figure>\n\n\n\n

    For example, after a person specifies a task and before Magentic-UI begins to execute, it creates a clear step-by-step plan that outlines what it would do to accomplish the task. People can collaborate with Magentic-UI to modify this plan and then give final approval for Magentic-UI to begin execution. This is crucial because users may have expectations of how the task should be completed; communicating that information could significantly improve agent performance<\/a>. We call this feature co-planning.<\/p>\n\n\n\n
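The co-planning flow described above (propose a plan, let the user edit it, execute only after approval) can be sketched roughly as follows. This is a minimal illustration, not Magentic-UI code: the `Plan` class, its methods, and the rule that any edit resets approval are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    # Hypothetical plan object: an ordered list of step descriptions.
    steps: list
    approved: bool = False

    def edit_step(self, index, text):
        # Assumption for this sketch: any edit requires a fresh approval.
        self.steps[index] = text
        self.approved = False

    def approve(self):
        self.approved = True


def execute(plan, run_step):
    # Refuse to act until the user has given final approval.
    if not plan.approved:
        raise PermissionError('plan must be approved before execution')
    return [run_step(step) for step in plan.steps]
```

A caller would create the plan, apply any user edits, call `approve()`, and only then pass the plan to `execute`.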

    During execution, Magentic-UI shows in real time what specific actions it\u2019s about to take, for example, whether it is about to click on a button or input a search query. It also shows in real time what it observes on the web pages it is visiting. Users can take control of the action at any point in time and give control back to the agent. We call this feature co-tasking.<\/p>\n\n\n\n

    Figure 3: Co-tasking – Magentic-UI provides real-time updates about what it is about to do and what it already did, allowing users to collaboratively complete tasks with the agent.<\/em><\/figcaption><\/figure>\n\n\n\n
    Figure 4: Action-guards \u2013 Magentic-UI will ask users for permission before executing actions that it deems consequential or important. <\/em><\/figcaption><\/figure>\n\n\n\n

    Additionally, Magentic-UI asks for user permission before performing actions that are deemed irreversible, such as closing a tab or clicking a button with side effects. We call these \u201caction guards\u201d. The user can also configure Magentic-UI\u2019s action guards to always ask for permission before performing any action. If the user deems an action risky (e.g., paying for an item), they can reject it. <\/p>\n\n\n\n
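As a rough sketch of the action-guard idea, the snippet below gates each action on a user-selected approval policy. It assumes a simplified model in which every action carries an `irreversible` flag; the names `ApprovalPolicy`, `Action`, `needs_approval`, and `run_guarded` are hypothetical and do not come from the Magentic-UI codebase.

```python
from dataclasses import dataclass
from enum import Enum


class ApprovalPolicy(Enum):
    # Hypothetical policy names, chosen only for this illustration.
    ALWAYS = 'always'                        # ask before every action
    IRREVERSIBLE_ONLY = 'irreversible_only'  # ask only for flagged actions


@dataclass
class Action:
    description: str
    irreversible: bool  # assumed to be set by whoever proposes the action


def needs_approval(action, policy):
    if policy is ApprovalPolicy.ALWAYS:
        return True
    return action.irreversible


def run_guarded(action, policy, ask_user, execute):
    # ask_user returns True to approve and False to reject.
    if needs_approval(action, policy) and not ask_user(action):
        return 'rejected'
    execute(action)
    return 'executed'
```

With `IRREVERSIBLE_ONLY`, a scroll would run without a prompt, while a payment flagged as irreversible would run only if `ask_user` returns `True`.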

    Figure 5: Plan learning \u2013 Once a task is successfully completed, users can request Magentic-UI to learn a step-by-step plan from this experience.<\/em><\/figcaption><\/figure>\n\n\n\n

    After execution, the user can ask Magentic-UI to reflect on the conversation and infer and save a step-by-step plan for future similar tasks. Users can view and modify saved plans in a saved-plans gallery so that Magentic-UI can reuse them later. In a future session, users can launch Magentic-UI with a saved plan to either execute the same task again, like checking the price of a specific flight, or use the plan as a guide to help complete similar tasks, such as checking the price of a different type of flight. <\/p>\n\n\n\n
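To make the save-and-reuse idea concrete, here is a small sketch of a plan gallery. It is only an illustration under stated assumptions: `SavedPlan`, `PlanGallery`, and `adapt` are hypothetical names, plans are kept in memory as JSON strings, and adapting a plan is reduced to a simple text substitution, which is far cruder than having an agent use the plan as a guide.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SavedPlan:
    task: str
    steps: list


class PlanGallery:
    # In-memory stand-in for a saved-plans gallery (hypothetical).
    def __init__(self):
        self._plans = {}

    def save(self, plan):
        # Store as JSON so a plan could be persisted and edited later.
        self._plans[plan.task] = json.dumps(asdict(plan))

    def load(self, task):
        raw = self._plans.get(task)
        return SavedPlan(**json.loads(raw)) if raw is not None else None


def adapt(plan, old, new, new_task):
    # Reuse a saved plan as a guide by swapping one detail in every step.
    steps = [step.replace(old, new) for step in plan.steps]
    return SavedPlan(task=new_task, steps=steps)
```

Loading a plan by its task name covers the case of repeating the same task, while `adapt` stands in for reusing the plan on a similar one, such as a different fare class.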

    Combined, these four features\u2014co-planning, co-tasking, action guards, and plan learning\u2014enable users to collaborate effectively with Magentic-UI.<\/p>\n\n\n\n

    Architecture<\/h2>\n\n\n\n

    Magentic-UI\u2019s underlying system is a team of specialized agents adapted from AutoGen\u2019s Magentic-One<\/a> system. The agents work together to create a modular system:<\/p>\n\n\n\n