Agent governance Archives | Microsoft Copilot Blog
http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/cs-topic/agent-governance/
Mon, 18 May 2026 21:31:44 +0000

The in-depth guide to managing real-time voice agents at scale
http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/the-in-depth-guide-to-managing-real-time-voice-agents-at-scale/
Tue, 19 May 2026 16:00:00 +0000

Explore an in-depth guide to managing customer-facing real-time voice agents with Copilot Studio, from governance foundations to production readiness.

The post The in-depth guide to managing real-time voice agents at scale appeared first on Microsoft Copilot Blog.


Governance built into the foundation of your agent program is what separates a successful production deployment from one that stalls—or fails publicly. This guide explains how to design, manage, and scale customer-facing, real-time voice agents using Microsoft Copilot Studio, with a focus on governance, reliability, and enterprise readiness.


Imagine a customer calling your contact center about a billing dispute. A real-time voice agent answers, identifies the customer, references their account history, resolves the issue, and—when needed—hands off to a live agent with full context preserved. Human agents focus on exceptions, not routine queries.

Now imagine that same scenario without agent governance. The agent was built, published directly to production, and never tested for escalation. Monitoring was not enabled. The first signal of a problem is a customer complaint—or a data exposure.

Customer-facing agents are becoming the front door for how organizations engage with customers, handling intent and outcomes across conversational AI experiences. What began as chat has evolved into always-on agents that resolve issues, take action, and now support real-time voice across digital and contact center environments using platforms like Copilot Studio. The opportunity is massive—but so is the cost of getting the foundation wrong. Just as self-service and Q&A agents redefined support at scale, this shift will fundamentally reshape how companies operate.

Why real-time voice agents require a different governance lens

Most organizations already govern internal AI tools designed for known users and controlled environments. Customer-facing agents operate under fundamentally different conditions: unknown users, public channels, brand exposure, and direct access to customer data and downstream systems. Failures in these customer interactions carry operational, regulatory, and reputational consequences.

This is why governance cannot be treated as a final approval step. As real-time voice agents scale, governance must be built into how they are designed, deployed, monitored, and evolved from the start. Organizations that treat governance as an accelerant—rather than a constraint—can move faster and more confidently than those who bolt it on later.

Principle: Governance as a design principle can streamline approval, which leads to accelerated scale and adoption.

Why real-time voice agents raise the stakes

Text‑based agents require governance, but real‑time voice introduces stricter operational constraints. Latency budgets are tighter, failures are immediately apparent to customers, and interruption handling, turn‑taking, session state, and escalation behavior directly affect service reliability.

Voice agents are typically deployed in high‑impact scenarios such as billing, orders, and service disruptions, where they integrate with Dynamics 365 Contact Center workflows. In these environments, agents must identify callers, reference active cases, execute actions, and escalate predictably.

For real‑time voice, escalation is a first‑class system requirement. Handoffs to human agents must preserve full conversational context and session state, and be validated under load before production traffic is routed.

Model selection also becomes operationally significant. Copilot Studio real‑time voice agents use purpose‑fit models to balance latency, quality, and reliability while remaining governed through a centralized control plane.

What good looks like: A production voice agent deployment has been tested for escalation behavior, latency under load, and handoff context preservation before any customer traffic is routed to it. Monitoring is active from day one, not added after the first incident.
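One way to make handoff context preservation testable is to define what an escalation payload must carry and check it before a call is transferred. The sketch below is illustrative only: `HandoffContext`, its fields, and `missing_handoff_fields` are hypothetical names chosen for this example, not part of Copilot Studio or Dynamics 365 Contact Center.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class HandoffContext:
    """Hypothetical payload handed to a live agent on escalation."""
    caller_id: Optional[str] = None
    case_id: Optional[str] = None
    intent: Optional[str] = None
    transcript: list[str] = field(default_factory=list)


def missing_handoff_fields(ctx: HandoffContext) -> list[str]:
    """Return the names of empty fields, in declaration order."""
    return [f.name for f in fields(ctx) if not getattr(ctx, f.name)]


# Example: the case reference was lost before the transfer.
ctx = HandoffContext(
    caller_id="C-1001",
    intent="billing_dispute",
    transcript=["Caller: I was charged twice."],
)
print(missing_handoff_fields(ctx))  # ['case_id']
```

A check like this could run in a preproduction test suite so that a handoff with dropped context fails a test rather than reaching a customer.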

A governance framework for the full agent lifecycle

Governing customer-facing agents effectively requires capabilities that span the full agent lifecycle. This is especially critical for business-to-consumer (B2C) agents, which operate in always-on, customer-facing contexts and must handle real-time interactions, actions, and sensitive data at scale—particularly in high‑stakes modalities like voice.

Copilot Studio provides this governance as a managed agent platform, enforcing controls through managed operations and managed security across the full lifecycle. Those controls extend from build access and data connectivity to release, monitoring, and auditability. Rather than relying on documentation or custom wiring, governance is centralized in the Microsoft Power Platform control plane and consistently applied across chat, voice, and contact center scenarios.

The following five‑stage governance framework reflects how managed capabilities come together across the full lifecycle of customer-facing agents:

  1. Govern the builder
  2. Govern the build
  3. Govern the release
  4. Govern the runtime
  5. Govern the lifecycle

Stage 1: Govern the builder

Before a single topic is created, agent governance starts with who is allowed to build and what they are allowed to connect.

  • Define builder roles and environments. Specify who can create agents and which environments they can work in, using role‑based access in the Power Platform admin center.
  • Set data access boundaries early. Apply data loss prevention (DLP) policies before development to determine which connectors and data sources agents can use.
  • Maintain environment separation. Use distinct development, test, and production environments to validate changes before deploying them to customer‑facing scenarios.
  • Standardize on managed solutions. Package agents in managed solutions to support versioning, controlled promotion, and rollback across environments.

What good looks like: A new agent builder requests access and is provisioned into a dedicated development environment. DLP policies are pre-applied. They cannot publish to any customer-facing channel without an administrator approval step.
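The idea of setting data access boundaries before development can be modeled as a simple allowlist comparison. This is a minimal sketch under stated assumptions: the connector names and the `blocked_connectors` helper are placeholders, and actual DLP policies are configured and enforced in the Power Platform admin center rather than in code like this.

```python
def blocked_connectors(requested: set[str], allowed: set[str]) -> list[str]:
    """Return requested connectors that fall outside the allowed set, sorted."""
    return sorted(requested - allowed)


# Hypothetical policy and agent design; names are placeholders.
ALLOWED = {"dataverse", "dynamics365"}
requested = {"dataverse", "dynamics365", "public_web_search"}

print(blocked_connectors(requested, ALLOWED))  # ['public_web_search']
```

Running a comparison like this during design review surfaces connector conflicts before a builder discovers them at publish time.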

Stage 2: Govern the build

How an agent is built determines how safe and predictable it is in production.

  • Configure authentication by channel. Decide whether sessions are authenticated (Microsoft Entra ID or supported identity provider [IdP]) or anonymous, and design data access accordingly. (For public-facing scenarios like 800 numbers and public websites, anonymous real-time voice sessions are common.)
  • Set generative AI behavior explicitly. Define and check grounding, topic scope, and allowed behaviors rather than relying on default settings.
  • Validate escalation paths. Test and verify handoff to live agents with full conversation context preserved for all voice scenarios.
  • Apply content moderation intentionally. Define clear engagement boundaries, enforce agent governance and policy controls, and rigorously red‑team and validate edge cases before deploying to production.

What good looks like: Escalation paths are tested before an agent is published to a customer-facing channel, so you can go live with more confidence. Catching errors before the first live escalation is critical to a good customer experience.

Stage 3: Govern the release

Moving an agent from development to production requires controlled, auditable steps.

  • Standardize promotion paths. Promote agents through dev, test, and production using managed solutions and Power Platform pipelines with an auditable change history.
  • Apply preproduction validation gates. Require checks for conversation quality, escalation behavior, latency under load, and data access before publishing.
  • Plan and test rollback. Define and validate rollback procedures for production issues prior to go‑live.
  • Separate publish authorization. Require explicit approval to publish agents to customer‑facing channels, independent of build permissions.

What good looks like: An agent must pass a defined pre-production checklist and receive administrator approval to publish before any customer traffic reaches it. Every version promotion is tracked in the solution history.
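The release controls above can be expressed as a gate that returns every reason a promotion should be blocked. This is a sketch, not a Power Platform pipelines API: `ReleaseRequest`, `REQUIRED_CHECKS`, and `gate_failures` are hypothetical, and the rule that the approver must differ from the builder is one assumed way to interpret "separate publish authorization."

```python
from dataclasses import dataclass
from typing import Optional

# Checks taken from the validation gates listed above.
REQUIRED_CHECKS = (
    "conversation_quality",
    "escalation_behavior",
    "latency_under_load",
    "data_access",
)


@dataclass(frozen=True)
class ReleaseRequest:
    builder: str
    approver: Optional[str]
    passed_checks: frozenset[str]


def gate_failures(req: ReleaseRequest) -> list[str]:
    """Return every reason the release is blocked; an empty list means it may proceed."""
    failures = [
        f"check not passed: {name}"
        for name in REQUIRED_CHECKS
        if name not in req.passed_checks
    ]
    if req.approver is None:
        failures.append("no publish approval recorded")
    elif req.approver == req.builder:
        # Assumption: approval must come from someone other than the builder.
        failures.append("approver must differ from builder")
    return failures


req = ReleaseRequest(
    builder="dana",
    approver="dana",
    passed_checks=frozenset(
        {"conversation_quality", "escalation_behavior", "data_access"}
    ),
)
print(gate_failures(req))
# ['check not passed: latency_under_load', 'approver must differ from builder']
```

Returning all failures at once, rather than stopping at the first, gives the builder a complete list to resolve before requesting approval again.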

Stage 4: Govern the runtime

Once an agent is live, governance shifts from control to visibility and response.

  • Enable runtime observability. Turn on conversation transcripts and analytics in Copilot Studio before routing customer traffic.
  • Define operational thresholds. Monitor metrics such as escalation rate, resolution rate, latency, and session completion, with alerts for deviations.
  • Establish incident response. Define processes for detecting, triaging, and mitigating production issues in voice agents integrated with Dynamics 365 Contact Center.
  • Monitor usage and capacity. Track session volume, message consumption, and capacity limits to support scaling and stability.

What good looks like: Early detection through active monitoring. Voice agents that interact with customers without active monitoring are operating without a safety net. Issues that could persist for hours without analytics can be caught in minutes with these guards in place.
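Operational thresholds are easier to act on when each metric has an explicit limit and direction. The sketch below assumes metrics arrive as a plain dictionary; the metric names, the limits, and the `breaches` function are placeholders for illustration, not recommended values or a Copilot Studio analytics API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float
    higher_is_worse: bool


# Placeholder limits; set these from your own baseline data.
THRESHOLDS = [
    Threshold("escalation_rate", 0.30, higher_is_worse=True),
    Threshold("resolution_rate", 0.60, higher_is_worse=False),
    Threshold("p95_latency_ms", 1500.0, higher_is_worse=True),
    Threshold("session_completion_rate", 0.85, higher_is_worse=False),
]


def breaches(metrics: dict[str, float]) -> list[str]:
    """Flag metrics that cross their limit, and metrics with no data at all."""
    alerts = []
    for t in THRESHOLDS:
        value = metrics.get(t.metric)
        if value is None:
            alerts.append(f"{t.metric}: no data")
        elif (value > t.limit) if t.higher_is_worse else (value < t.limit):
            alerts.append(f"{t.metric}: breached")
    return alerts


sample = {"escalation_rate": 0.42, "resolution_rate": 0.71, "p95_latency_ms": 900.0}
print(breaches(sample))
# ['escalation_rate: breached', 'session_completion_rate: no data']
```

Treating a missing metric as an alert is a deliberate choice here: an agent that reports nothing is indistinguishable from one whose monitoring was never enabled.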

Stage 5: Govern the lifecycle

Voice agents are not static. They evolve as scenarios expand, customer needs change, and the platform advances. Managing change safely is as important as the initial deployment.

  • Version agent configuration. Track changes to topics, actions, authentication, and generative AI settings using application lifecycle management (ALM) and source control.
  • Validate changes preproduction. Test all updates in non‑production environments to avoid regressions in core scenarios, including voice flows and escalation behavior.
  • Coordinate releases operationally. Communicate deployment windows to IT and contact center operations teams.
  • Evolve governance as scale grows. Reassess role-based access control (RBAC), DLP policies, environment strategy, and publishing permissions as agent count and channel coverage expand.

Platform capabilities that support agent governance

Copilot Studio provides a centralized control plane for building, operating, and governing customer‑facing agents. The platform capabilities below directly enable the governance framework described above and should be configured before scaling B2C deployments:

  • Power Platform admin center: Central governance surface for environments, DLP policies, user access, and capacity management; the primary enforcement layer for agent governance.
  • Environment management: Separate development, test, and production environments to support validation and controlled promotion of customer‑facing agents.
  • Data loss prevention (DLP) policies: Environment‑level connector controls that define which data sources and services agents can access before any connections are established.
  • Managed solutions and Power Platform pipelines: Package agents as managed solutions and promote them through environments with version tracking, rollback support, and an auditable change history.
  • Microsoft Entra ID and channel authentication: Configure customer‑facing authentication using Entra ID or supported identity providers to enable secure, scoped access to customer data.
  • Generative AI controls and content moderation: Per‑agent configuration for grounding, topic scope, allowed behaviors, and content filtering, applied deliberately prior to public deployment.
  • Conversation transcripts and analytics: Built‑in logging and analytics providing runtime visibility into agent behavior, escalation patterns, and coverage gaps.
  • Dynamics 365 Contact Center integration: Native escalation to live agents with case context preservation and unified conversation history for voice deployments.
  • Azure Speech: Underlying speech infrastructure for real‑time voice agents, with implications for latency, reliability, and capacity planning.
  • Dataverse security model: Row‑level and business‑unit security controls governing agent access to customer records in Dynamics‑integrated scenarios.

Security, privacy, and compliance for customer-facing agents

For IT and security teams, governance of customer-facing agents must also address data handling, regulatory requirements, and audit readiness. These are not secondary concerns—they’re often the first gate any enterprise B2C deployment must pass through.

Customer data and PII in voice interactions

Real-time voice agents generate conversation transcripts that may contain personally identifiable information. Establish clear retention policies for these transcripts before deployment. Define who has access to conversation logs, how long they are retained, and whether they are subject to deletion requests under applicable privacy regulations.
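A retention policy becomes enforceable once it is reduced to a cutoff date. The following sketch assumes transcripts are identified by an ID with a UTC creation timestamp; the 90-day window, the IDs, and `expired_transcripts` are hypothetical, and the actual retention period should come from your legal and compliance teams.

```python
from datetime import datetime, timedelta, timezone

# Placeholder retention window, not a recommendation.
RETENTION = timedelta(days=90)


def expired_transcripts(created_at: dict[str, datetime], now: datetime) -> list[str]:
    """Return IDs of transcripts created before the retention cutoff, sorted."""
    cutoff = now - RETENTION
    return sorted(tid for tid, ts in created_at.items() if ts < cutoff)


now = datetime(2026, 5, 19, tzinfo=timezone.utc)
transcripts = {
    "t-001": datetime(2026, 1, 5, tzinfo=timezone.utc),
    "t-002": datetime(2026, 4, 30, tzinfo=timezone.utc),
}
print(expired_transcripts(transcripts, now))  # ['t-001']
```

Passing `now` as a parameter keeps the function deterministic, which makes the retention rule straightforward to test and to show as audit evidence.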

Regulatory considerations

Depending on your industry and geography, customer-facing AI agents may be subject to requirements under General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), or sector-specific regulations in financial services or healthcare. Review applicable requirements with your legal and compliance teams before deploying agents to regulated customer scenarios. DLP policies in the Power Platform admin center are a key compliance control.

Audit logging and compliance evidence

Power Platform and Copilot Studio support audit logging through Microsoft Purview and the Power Platform admin center. Ensure audit logging is enabled before production deployment and that logs are retained according to your organization’s compliance requirements.

Credential and secret management

Agents that connect to external systems require credentials and connection strings. Do not store secrets in agent configuration directly. Use environment variables in Power Platform or Azure Key Vault references to manage credentials securely, with access controlled through role assignments.

Note for architects: Security and compliance review should be a gate in Stage 3 (govern the release), not an afterthought discovered during audit. Engage your security and compliance teams in the pre-production validation checklist.


Five anti-patterns that derail production AI deployments

Organizations that have scaled B2C agents successfully tend to avoid the same set of mistakes. These are the patterns most likely to cause problems once customer traffic is live.

  1. Skipping environment separation: Building and publishing agents in the same environment, or directly in production, allows untested changes to reach customers and is one of the most common causes of early deployment issues.
  2. Publishing voice agents without tested escalation: Escalation to a live agent is a core part of voice agent design. Untested handoff paths that fail to preserve customer context degrade the experience more than having no agent at all.
  3. Granting broad DLP exceptions under schedule pressure: Temporarily relaxing DLP policies often becomes permanent, introducing data access risk and audit gaps that are difficult to remediate later.
  4. Treating monitoring as a postlaunch activity: When transcripts, analytics, and alerts are not enabled before go‑live, production issues surface through customer complaints rather than operational signals.
  5. Building open-ended agents without defined scope: Broad, general‑purpose agents are harder to test, govern, and improve than agents scoped to specific customer scenarios with clear success criteria.

How to operationalize voice agents

As teams move from pilots to production, a small set of patterns consistently differentiates voice agent deployments that scale.

  • Start with well‑defined customer scenarios rather than broad open‑ended agents. Clear scope simplifies risk assessment, testing, and measurement. A voice agent designed for order status or billing inquiries is easier to govern and iterate on than one intended to answer arbitrary customer questions.
  • Treat real‑time voice as an extension of existing digital agent governance, not an exception. Teams that have already governed chat‑based agents in Copilot Studio are well positioned to apply the same controls to voice, while accounting for stricter latency, escalation, and runtime requirements.
  • Design escalation as a primary flow, not a fallback. Agents integrated with Dynamics 365 Contact Center should preserve full conversational and case context on handoff. Predictable escalation maintains continuity; dropped context undermines trust.
  • As programs scale, three governance questions remain central:
    • Which customer scenarios are appropriate for automation versus human handling?
    • Where does real‑time voice materially improve the experience versus add operational complexity?
    • How quickly can production issues be detected and resolved once agents are live?

Using Copilot Studio as a governance foundation for agents

Copilot Studio and Power Platform provide a centralized environment for building, operating, and governing agents, which becomes increasingly important as deployments expand from internal use cases to customer‑facing channels.

Establish governance once in Copilot Studio, and scale it across chat, voice, and backend‑driven agents without fragmentation. As a centralized control plane, the platform helps you enforce consistent policies and maintain operational oversight as agents expand across channels, regions, and customer scenarios.

For organizations already using Copilot Studio, many of the governance capabilities described here are available today. Support for real-time voice agents in Copilot Studio is now generally available in North America, with deployments delivered first through Dynamics 365 Contact Center. Language support, additional regions, and broader publishing channels will expand over time as part of Copilot Studio’s ongoing roadmap.

Learn more in the announcement blog for real-time voice agents.

Governance readiness checklist for customer-facing voice agents

Before deploying a customer-facing or real-time voice agent to production, verify governance readiness across these core dimensions.

Access and environment

  • Separate development, test, and production environments are provisioned
  • Role-based access is configured—developers cannot publish directly to production
  • Advanced connector policy is applied to all environments before development begins
  • Publishing permissions for customer-facing channels require administrator approval

Build and configuration

  • Authentication and identity are configured appropriately for the channel (authenticated or anonymous)
  • Generative AI settings, grounding, and content moderation are configured deliberately
  • Credential and secret management uses environment variables or Azure Key Vault references
  • The agent is packaged in a managed solution with tracked versioning

Testing and release

  • Escalation paths to live agents have been tested with context preservation verified
  • Latency and behavior have been validated under simulated load
  • A pre-production validation checklist has been completed and signed off
  • A rollback procedure has been defined and tested
  • Audit logging is enabled and log retention meets compliance requirements

Runtime and operations

  • Conversation transcripts and analytics are active before first customer interaction
  • Operational thresholds (escalation rate, session completion rate) are defined with alerts
  • An incident response procedure is defined and communicated to operations teams
  • Usage and consumption monitoring is in place for capacity planning
  • A change management process is defined for updating live agents

Getting started with customer-facing agents

Organizations ready to operationalize B2C agents should begin with the following steps:

  • Align on priority scenarios. Agree on customer scenarios, scope, success criteria, and escalation requirements before any development begins.
  • Set up environments and governance. Configure separate dev, test, and production environments and apply DLP policies before granting developer access. Define role‑based access and require administrator approval for publishing to customer‑facing channels.
  • Engage security and compliance early. Review applicable regulatory requirements and establish data retention policies for conversation transcripts.
  • Build and validate deliberately. Start with a scoped agent, use managed solutions, and be sure to test and verify escalation paths.
  • Confirm readiness before go-live. Complete the governance readiness checklist and enable monitoring and escalation thresholds prior to routing customer traffic.

With the right foundation in place, teams can scale customer‑facing and real‑time voice agents—while maintaining the reliability, security, and operational integrity IT teams are responsible for protecting.

Resources for governing AI agents

New and improved: Agent governance, intelligent workflows, and connected app experiences
http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/new-and-improved-agent-governance-intelligent-workflows-and-connected-app-experiences/
Mon, 11 May 2026 16:00:00 +0000

See what's new in Copilot Studio, April 2026: updates to workflows, increased control over agent operations, and an expanded agent usage estimator.

The post New and improved: Agent governance, intelligent workflows, and connected app experiences appeared first on Microsoft Copilot Blog.


As organizations scale their use of AI agents, IT teams face a familiar challenge: how do you expand automation without losing control? Individual agents can be powerful, but as they connect through workflows and integrate across systems, requirements for visibility, governance, and predictability become much more complex. And capability must be grounded in confidence.

The April 2026 updates in Microsoft Copilot Studio focus on building that confidence across the platform. From increasing visibility and governance for admins to expanding intelligent workflow capabilities, these features help you move from isolated automation to connected, reliable systems.

Build and scale agents with better visibility and control

As agents expand across organizations and business processes, admins need clear visibility into how they’re performing, how they’re secured, and what they’ll cost to run. These updates help you manage agents more effectively without adding more friction—or risk.

See agent performance and status more clearly

Copilot Studio now surfaces agent status directly in the authoring experience, giving you immediate insight into each agent’s security and protection posture. You can quickly identify issues like authentication gaps or policy impacts and investigate them at the source. This helps reduce guesswork and speed up resolution.

As you gain clearer visibility into agent performance, you can also share those insights more safely. The Analytics Viewer role, now generally available, introduces read-only access to an agent’s Analytics page.

The Analytics Viewer role allows us to provide meaningful performance insights to business and operational stakeholders while maintaining strict production governance. It cleanly separates operational visibility from agent configuration and publishing rights.

Mohamed Arhab, Solution Architect, City of Montreal

Allowing analysts and stakeholders to monitor performance, without giving them the ability to modify the agent, helps resolve a long-standing tradeoff between visibility and control. Now it’s easier to share insights broadly while maintaining clear separation of responsibilities.

Speaking of extending visibility and control, there’s more good news: Microsoft Agent 365 is now generally available. Agent 365 is the centralized control plane for managing agents across your environment. This brings together visibility into agent inventory, permissions, behavior, and activity in one place so that you can monitor and govern agents consistently, not just where they’re built.

For Copilot Studio customers, this means the agents you create can be managed alongside agents from Microsoft 365 and partner ecosystems, with shared policies, security controls, and lifecycle oversight. As Agent 365 continues to expand its integrations and multi-agent capabilities, it further strengthens Copilot Studio’s role as the place where agents are built—while governance scales across the full system. Learn more about Agent 365.

Plan and scale with clearer cost visibility

The expanded agent usage estimator now includes Dynamics 365 agents, such as Sales Qualification Agent and Customer Service Agent. By forecasting Copilot credit consumption across both Copilot Studio and Dynamics 365 scenarios in one place, you can model usage more accurately and scale deployments with fewer cost surprises.

With these recent admin updates, the result is fewer bottlenecks, better-informed decisions, and a clearer path to scaling agents across your organization.

Expand workflows into intelligent, governed automation systems

In Copilot Studio, workflows are step-by-step automation processes that complete actions or tasks in a deterministic, reliable way. As workflows become the backbone of business automation, these new updates help you extend their capabilities—bringing in more AI-powered reasoning, centralized governance, and a growing ecosystem of tools in a way that’s reliable and secure by design.

Design and validate workflows with more clarity

One powerful way to make your workflows more adaptable and effective is by embedding Copilot Studio agents directly into them. Using agent nodes inside workflows means that instead of just performing the task with rigid logic, the workflow can delegate reasoning, decisions, or output generation to an agent at any prescribed step of the process.

This makes workflows more resilient to real-world situations, which vary widely, while still following the defined structure that makes IT teams less nervous.

In addition to embedding agents, you can now also add and configure AI actions directly within the flow to understand requests, route work, and generate content dynamically. And with the ability to test individual steps using sample inputs, teams can validate behavior earlier, debug more effectively, and refine workflows before they’re deployed.

In practice: Unifi, North America’s largest provider of aviation ground handling services, used Copilot Studio and Power Platform to automate legal contract review by combining agents with deterministic workflows. Instead of relying on a single agent, they broke the process into coordinated steps that extract, classify, and validate key terms across documents. This system reduced contract processing from days to minutes and delivers the same level of performance as much more expensive, off-the-shelf products built specifically for the legal industry.

The result is a workflow experience that’s more adaptable and more predictable to operate. This helps give teams—both makers and administrators—more confidence in creating more sophisticated automation that doesn’t sacrifice clarity or control.

Scale workflows across systems with built-in governance

Speaking of clarity and control, there are also new updates to workflows that help you scale automation without introducing new governance risks.

Workflows can now connect to a broader ecosystem of tools, including Model Context Protocol (MCP) server-enabled tools (preview), which makes it easier to take action across systems while staying within Microsoft security, permission, and compliance boundaries. This allows workflows to execute tasks and involve users for review and approval within governed processes.

We’ve also introduced a centralized, admin-controlled environment for Workflows Agent. This makes it easier to apply data loss prevention (DLP) policies consistently and maintain visibility across automation, so workflows remain compliant by design, even as they scale.

Together, these updates make it easier to move from isolated automations to connected, intelligent systems. With those systems, you can scale workflows across your organization with greater confidence, control, and flexibility.

Bring business apps directly into your agents

As agents become part of everyday work, a common gap emerges: they can generate insight, but acting on that insight often requires switching tools, re-creating context, or handing work off across systems. Support for apps in agents, now generally available, helps to close that gap.

Turn intent into action inside Copilot Chat

Agents built in Copilot Studio can now surface rich, interactive app experiences directly in Copilot Chat, allowing users to review data, update records, approve requests, or create assets in place. Instead of switching tools or re-creating context, work happens seamlessly within the flow of conversation. This helps reduce friction and empowers teams to move faster from insight to execution.

[Animation: Adobe Express embedded in Microsoft 365 Copilot chat, where a user accesses design templates and visuals directly within the conversation.]

Work across the systems your business already runs on

Apps in agents bring together Microsoft and partner applications—from Power Apps to Dynamics 365 and beyond—so agents can take action across the systems your teams already use. These experiences are built and orchestrated in Copilot Studio, where you define how agents interact with apps, data, and workflows to support real business processes.

Extend and scale with trusted integrations

Through the Agent Store, you can adopt ready-made agent experiences or extend your own with partner-built integrations—while maintaining enterprise-grade security, permissions, and admin control. Options include:

  • Adobe Express (seen above)
  • Box
  • Figma
  • Monday.com
  • Wix

These options (and more) make it easier to scale agent usage across your organization without losing oversight.

These capabilities, all generally available now, help teams shift agents from being informational tools to operational ones. They bring real business actions into Copilot Studio agents in a way that’s both more functional for users and manageable for IT—helping teams complete work efficiently while maintaining the governance needed to scale.

Learn more about apps in agents.

What else is new and improved in Copilot Studio

  • Evaluation insights and automation updates now make it easier to generate test cases from analytics, simulate multi-turn interactions, and automate evaluations through APIs and connectors. You can turn real user conversations into targeted test sets, better reflect complex, real-world scenarios, and run evaluations programmatically. Together, these capabilities help you operationalize agent quality and maintain confidence as you scale.
  • Custom metrics for outcome-based measurement help you track what actually matters to your business, not just usage. Define success in your own terms—like resolution rates or conversions—and automatically evaluate conversations against those outcomes, making it easier to understand impact, align stakeholders, and make data-driven decisions.
  • Work IQ API is now available in public preview to bring Copilot’s intelligence layer—grounded in organizational context, memory, and signals—into your own agents and workflows. With built-in orchestration and enterprise-grade security, you can build agents that understand what’s happening across your business without managing raw data or complex integrations.
  • Agent-to-agent (A2A) communication is now supported in Work IQ, allowing agents to collaborate as peers and delegate tasks using shared organizational context. This makes it easier to build multi-agent systems that can coordinate work, maintain context across interactions, and deliver more grounded, role-aware outcomes.
  • GPT-5.5 Thinking is now available in Copilot Studio early release cycle environments as GPT-5.5 Reasoning, further expanding model choice with its more advanced analysis capabilities. This model is also rolling out across Microsoft 365 Copilot in Copilot Chat, Word, Excel, and PowerPoint.
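
To make the idea of an outcome-based custom metric concrete, here is a minimal sketch of a resolution-rate calculation. It does not use the Copilot Studio evaluation APIs; the `Conversation` record, its `resolved` and `escalated` fields, and the sample data are hypothetical stand-ins for whatever outcome labels your own evaluations produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    # Hypothetical record; the field names are illustrative assumptions.
    conversation_id: str
    resolved: bool
    escalated: bool


def resolution_rate(conversations):
    """Fraction of conversations resolved without escalation to a human."""
    if not conversations:
        return 0.0
    resolved = sum(1 for c in conversations if c.resolved and not c.escalated)
    return resolved / len(conversations)


# Made-up sample data, for illustration only.
sample = [
    Conversation("c1", resolved=True, escalated=False),
    Conversation("c2", resolved=True, escalated=True),
    Conversation("c3", resolved=False, escalated=False),
    Conversation("c4", resolved=True, escalated=False),
]

print(f"Resolution rate: {resolution_rate(sample):.0%}")
```

Defining the metric as a small, explicit function like this makes it easier for stakeholders to agree on what counts as success before the numbers are reported.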

Stay up to date on all things Copilot Studio

More is coming across voice channels, workflows, and the building experience. Check out all the updates as we ship them, as well as new features releasing in the next few months here: What’s new in Microsoft Copilot Studio.

To learn more about Microsoft Copilot Studio and how it can transform productivity within your organization, visit the Copilot Studio website or sign up for our free trial today.

The post New and improved: Agent governance, intelligent workflows, and connected app experiences appeared first on Microsoft Copilot Blog.

]]>
New and improved: Multi-agent orchestration, connected experiences, and faster prompt iteration http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/new-and-improved-multi-agent-orchestration-connected-experiences-and-faster-prompt-iteration/ Wed, 01 Apr 2026 16:00:00 +0000 Learn what's new in Copilot Studio: Multi-agent systems are now generally available, plus recent updates to the Prompt Editor and governance controls.

The post New and improved: Multi-agent orchestration, connected experiences, and faster prompt iteration appeared first on Microsoft Copilot Blog.

]]>

Microsoft Copilot Studio helps organizations move beyond isolated AI experiences and build connected systems of agents that can scale, adapt, and deliver real business value. Recent enhancements focus on making it easier for agents to work together across tools and data sources, while giving makers more control over how those agents behave in production.

What you’ll see this month: New generally available capabilities for multi-agent coordination across Microsoft Fabric, the Microsoft 365 Agents SDK, and open Agent-to-Agent (A2A) protocols—all of which help agents collaborate across your ecosystem and perform more valuable work. Plus, you’ll find updates to prompt authoring, model choice, and governance controls that can help make it faster to build and refine high-quality agent experiences with confidence.

Agents that work together across your entire ecosystem

The challenge in scaling AI inside an organization isn’t creating a useful agent. It’s getting many agents, across teams and tools, to work together in a way that’s reliable and repeatable.

In many organizations, data teams might build one kind of agent, app teams another, and productivity teams yet another. Each agent can be valuable on its own, but once a workflow needs knowledge from one system, reasoning from another, and action in a third, teams often run into brittle handoffs and custom integration work. This slows agent adoption and makes it harder to move from promising pilots to real business impact.

This month, Copilot Studio takes a meaningful step forward: several multi-agent capabilities are rolling out to general availability over the next few weeks, giving your teams new ways to connect and orchestrate agents across your ecosystem. These updates include Microsoft Fabric integration, Microsoft 365 Agents SDK orchestration, and Agent-to-Agent (A2A) communication—all designed to help your agents operate together as a coordinated system rather than in isolated silos.

Multi-agent support for Microsoft Fabric

With multi-agent support, your Copilot Studio agents can work with Fabric agents to reason over enterprise data and analytics at scale. That means you can connect business-facing agent experiences more directly to the data estate they already rely on, without treating every data-intensive scenario like a one-off engineering project. Instead of working with limited or disconnected data, these agents will be able to operate with full business context—helping make their outputs more accurate, relevant, and actionable.

Multi-agent support for the Microsoft 365 Agents SDK

Using the Microsoft 365 Agents SDK, teams can now orchestrate Copilot Studio agents alongside agents built for Microsoft 365 experiences. Instead of recreating the same logic across multiple agents (think retrieving data, applying business rules, or completing common tasks), you’ll be able to reuse and combine existing capabilities. This makes it easier to compose cross-app workflows from what’s already been built, reducing duplication and keeping experiences more efficient and consistent.

Agent-to-Agent (A2A) support

With A2A support, Copilot Studio agents can directly communicate with and delegate work to other agents—first-party, second-party, or third-party—using an open protocol that allows universal access. This matters because the future of enterprise AI will not belong to a single stack. Organizations need to build agents on platforms that can participate in a broader ecosystem, not just operate within one product boundary. Copilot Studio A2A provides that interoperability and power.

The impact of multi-agent systems

We’ve already seen the power of this approach with the Ask Microsoft web agent, one of our early “customer zero” implementations. As site traffic and knowledge sources grew, the single-agent architecture began to strain, creating slower response times. Using Copilot Studio, the team upgraded the agent to a modern architecture with generative orchestration and multi-agent coordination.

Now, multiple sub-agents handle different parts of the site—Microsoft Azure, Microsoft 365, pricing, trials, and more—while the main agent orchestrates them to provide fast, coherent, multi-turn responses. This setup allows Ask Microsoft to answer complex questions involving multiple products or services, and to tailor responses based on where the customer is on the site.

Building a more advanced assistant with Copilot Studio has meaningfully raised the bar for our customer experience and enabled us to scale faster across products to deliver real business impact.

Alyse Muttera, Director of eCommerce Programs at Microsoft

To show how this approach works in other organizations, consider a common scenario at a bank. The loan department has one agent handling mortgage applications, while the banking department runs a separate agent for account inquiries. A customer, however, expects a single seamless experience.

Multi-agent orchestration lets each specialized agent manage its area of expertise while coordinating responses behind the scenes. For instance, if a customer asks about a mortgage payment and their account balance in the same interaction, the system delivers a cohesive, context-aware answer that combines insights from both agents—no juggling multiple interfaces required.
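
The bank scenario above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not how Copilot Studio implements orchestration: the two agent functions, their canned answers, and the keyword-based `ROUTES` table are all hypothetical, and simple keyword matching stands in for the generative orchestration a real system would use.

```python
from typing import Callable, Dict

Agent = Callable[[str], str]


def mortgage_agent(question: str) -> str:
    # Hypothetical specialist; a real agent would query loan systems.
    return "Your next mortgage payment of $1,200 is due on the 1st."


def accounts_agent(question: str) -> str:
    # Hypothetical specialist; a real agent would query core banking.
    return "Your checking balance is $3,450."


# Keyword routing is a stand-in for intent detection by an orchestrator.
ROUTES: Dict[str, Agent] = {
    "mortgage": mortgage_agent,
    "balance": accounts_agent,
}


def orchestrate(question: str) -> str:
    """Send the question to every matching specialist and merge the answers."""
    lowered = question.lower()
    answers = [agent(question) for keyword, agent in ROUTES.items()
               if keyword in lowered]
    if not answers:
        return "Sorry, I can't help with that yet."
    return " ".join(answers)


print(orchestrate("What is my mortgage payment and my account balance?"))
```

The point of the design is that each specialist stays small and owned by its department, while the customer only ever talks to the orchestrator.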

When specialized agents work together behind the scenes, customers can get a unified experience and employees can get time back.

That’s exactly the kind of impact Coca‑Cola Beverages Africa is realizing today by using Copilot Studio agents and Microsoft Dynamics 365 to autonomously run planning cycles and automate workflows end to end, saving planners 1 to 1.5 hours every day.

These features will be fully available to all eligible customers as of April 2026. Three capabilities, one outcome: agents that can operate more like a system and less like a collection of disconnected point solutions.

Build prompts faster while maintaining control

As agent experiences grow more sophisticated, the quality of the prompt an agent maker uses matters more. A great prompt yields more powerful results from agents than a good one, and iteratively refining prompts is key to unlocking those results.

But in practice, prompt iteration has historically felt disjointed and slow. Makers had to interrupt their flow of work to jump into a separate editor, make a small change, test it, and then repeat the process. That friction can add up quickly, especially when teams are tuning prompts for specialized business scenarios.

The new immersive Prompt Builder, now generally available, helps reduce that friction by bringing prompt editing directly into each agent’s Tools tab. You can update instructions, switch models, add inputs or knowledge, and test changes—all in one place. Instead of breaking context every time you want to refine an agent’s behavior, you can iterate while staying grounded in the agent you’re building.

This matters most in real-world scenarios where prompt behavior is tied to domain knowledge and policy nuance. For example, a team building an agent to support clinical documentation might need to refine instructions, swap in a better knowledge source, and test outputs against terminology that is common in healthcare but more likely to trigger default safeguards. Doing that from one workspace can make iteration faster and help lower the effort required to get a production-ready result.

More options for prompts: Content moderation and model choice

Speaking of triggering default safeguards, Copilot Studio has also added content moderation settings for prompts, now generally available in supported regions. This gives makers more control over harmful content sensitivity on managed models, including turning down that sensitivity to help unblock legitimate scenarios in industries like healthcare, insurance, and law enforcement, where default settings may be overly restrictive for the content being processed.

For even more control over prompts, the Prompt Tool now supports Anthropic Claude Opus 4.6 and Claude Sonnet 4.5 in paid experimental preview in the United States. That gives makers more choice in matching the right model to the right prompt, rather than forcing every scenario into the same tradeoff profile. This feature is great for teams that want more flexibility in how they balance performance, reasoning depth, and cost.

Together, these improvements help teams move faster on prompt iteration while maintaining the control and flexibility required in production scenarios.

What else is new and improved in Copilot Studio

We have also recently released several additional updates across automation, meetings, retrieval quality, and model support.

  • ServiceNow and Azure DevOps connector quality improvements are now generally available. These help agents better understand operational questions, retrieve the right ticket or work item data, and return more complete, actionable answers automatically.
  • Evaluation automation APIs are now generally available through Microsoft Power Platform APIs and connectors. These APIs help make it easier to run evaluations programmatically and integrate quality checks into continuous integration and continuous delivery (CI/CD) workflows.
  • Agents for Microsoft Teams meetings can now access real-time meeting transcripts and group chat. This supports scenarios like answering questions during the meeting, surfacing relevant information, or helping track decisions and follow-ups as they happen.
  • Model Context Protocol (MCP) apps and Apps SDK support have expanded how agents connect to your external work apps, helping to make it easier to integrate business systems and enable agents to take action across your broader ecosystem, not just respond with information.
  • Additional model support, including Grok 4.1 Fast, GPT-5.3 Thinking, and GPT-5.4 Instant in paid experimental preview, gives makers more options as they tune experiences for speed, cost, and capability.
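
As one way to picture the CI/CD integration mentioned in the list above, the sketch below shows a simple quality gate over evaluation results. It deliberately does not call the Power Platform APIs: the result format (a list of dictionaries with a `passed` flag), the `gate` function, and the thresholds are assumptions made for illustration, and fetching real results is left out.

```python
def gate(results, min_pass_rate=0.9):
    """Return (ok, pass_rate) for a list of evaluation results.

    Each result is assumed to be a dict with a boolean "passed" key.
    An empty result set fails the gate rather than passing silently.
    """
    if not results:
        return False, 0.0
    passed = sum(1 for r in results if r["passed"])
    pass_rate = passed / len(results)
    return pass_rate >= min_pass_rate, pass_rate


# Made-up results standing in for the output of an evaluation run.
sample_results = [
    {"test_case": "billing-dispute", "passed": True},
    {"test_case": "password-reset", "passed": True},
    {"test_case": "escalation", "passed": True},
    {"test_case": "refund-policy", "passed": False},
]

ok, rate = gate(sample_results, min_pass_rate=0.75)
exit_code = 0 if ok else 1  # a pipeline step could pass this to sys.exit
print(f"pass rate {rate:.0%}, exit code {exit_code}")
```

A non-zero exit code is the conventional way to fail a pipeline step, so a check like this can block a publish when agent quality regresses.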

Overall, these updates reflect a broader, continuing shift in Copilot Studio: moving from building individual AI experiences to building connected, governed systems that can fit more naturally into how work already happens. As you scale up your organization’s use of multi-agent ecosystems, these updates will help your teams reach further across channels and knowledge sources to more accurately fulfill your business needs.

Stay up to date on all things Copilot Studio

More is coming in April 2026 across voice channels, workflows, and the building experience. Check out all the updates as we ship them, as well as new features releasing in the next few months here: What’s new in Microsoft Copilot Studio.

To learn more about Microsoft Copilot Studio and how it can transform productivity within your organization, visit the Copilot Studio website or sign up for our free trial today.

]]>
Addressing the OWASP Top 10 Risks in Agentic AI with Microsoft Copilot Studio http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/addressing-the-owasp-top-10-risks-in-agentic-ai-with-microsoft-copilot-studio/ Mon, 30 Mar 2026 16:00:00 +0000 Agentic AI introduces new security risks.

The post Addressing the OWASP Top 10 Risks in Agentic AI with Microsoft Copilot Studio appeared first on Microsoft Copilot Blog.

]]>
Agentic AI is moving fast from pilots to production. That shift changes the security conversation. These systems do not just generate content. They can retrieve sensitive data, invoke tools, and take action using real identities and permissions. When something goes wrong, the failure is not limited to a single response. It can become an automated sequence of access, execution, and downstream impact.

Security teams are already familiar with application risk, identity risk, and data risk. Agentic systems collapse those domains into one operating model. Autonomy introduces a new problem: a system can be “working as designed” while still taking steps that a human would be unlikely to approve, because the boundaries were unclear, permissions were too broad, or tool use was not tightly governed.

The OWASP Top 10 for Agentic Applications (2026) outlines the top ten risks associated with autonomous systems that can act across workflows using real identities, data access, and tools.

This blog is designed to do two things: First, it explores the key findings of the OWASP Top 10 for Agentic Applications. Second, it highlights examples of practical mitigations for risks surfaced in the paper, grounded in Agent 365 and foundational capabilities in Microsoft Copilot Studio.

OWASP helps secure agentic AI around the world

OWASP (the Open Worldwide Application Security Project) is an online community led by a nonprofit foundation that publishes free and open security resources, including articles, tools, and documentation used across the application security industry. In the years since the organization’s founding, OWASP Top 10 lists have become a common baseline in security programs.

In 2023, OWASP identified a security gap that needed urgent attention: traditional application security guidance wasn’t fully addressing the nascent risks stemming from the integration of LLMs and existing applications and workflows. The OWASP Top 10 for Agentic Applications was designed to offer concise, practical, and actionable guidance for builders, defenders, and decision-makers. It is the work of a global community spanning industry, academia, and government, built through an “expert-led, community-driven approach” that includes open collaboration, peer review, and evidence drawn from research and real-world deployments.

Microsoft has been a supporter of the project for quite some time, and members of the Microsoft AI Red Team helped review the Agentic Top 10 before it was published. Pete Bryan, Principal AI Security Research Lead on the Microsoft AI Red Team, and Daniel Jones, AI Security Researcher on the Microsoft AI Red Team, also served on the OWASP Agentic Systems and Interfaces Expert Review Board.

Agentic AI delivers a whole range of novel opportunities and benefits. However, unless it is designed and implemented with security in mind, it can also introduce risk. OWASP Top 10s have been the foundation of security best practice for years. When the Microsoft AI Red Team gained the opportunity to help shape a new OWASP list focused on agentic applications, we were excited to share our experiences and perspectives. Our goal was to help the industry as a whole create safe and secure agentic experiences.

Pete Bryan, Principal AI Security Research Lead

The 10 failure modes OWASP sees in agentic systems

Read as a set, the OWASP Top 10 for Agentic Applications makes one point again and again: agentic failures are rarely just “bad output”; they are bad outcomes. Many risks show up when an agent can interpret untrusted content as instruction, chain tools, act with delegated identity, and keep going across sessions and systems. Here is a quick breakdown of the types of risk called out in greater detail in the Top 10:

Agent goal hijack (ASI01): Redirecting an agent’s goals or plans through injected instructions or poisoned content.

Tool misuse and exploitation (ASI02): Misusing legitimate tools through unsafe chaining, ambiguous instructions, or manipulated tool outputs.

Identity and privilege abuse (ASI03): Exploiting delegated trust, inherited credentials, or role chains to gain unauthorized access or actions.

Agentic supply chain vulnerabilities (ASI04): Compromised or tampered third-party agents, tools, plugins, registries, or update channels.

Unexpected code execution (ASI05): Turning agent-generated or agent-invoked code into unintended execution, compromise, or escape.

Memory and context poisoning (ASI06): Corrupting stored context (memory, embeddings, RAG stores) to bias future reasoning and actions.

Insecure inter-agent communication (ASI07): Spoofing, intercepting, or manipulating agent-to-agent messages due to weak authentication or integrity checks.

Cascading failures (ASI08): A single fault propagating across agents, tools, and workflows into system-wide impact.

Human–agent trust exploitation (ASI09): Abusing user trust and authority bias to get unsafe approvals or extract sensitive information.

Rogue agents (ASI10): Agents drifting or being compromised in ways that cause harmful behavior beyond intended scope.

For security teams, knowing that these issues are top of mind across the global community of agentic AI users is only the first half of the equation. What comes next is addressing each of them through properly implemented controls and guardrails.

Build observable, governed, and secure agents with Microsoft Copilot Studio

In agentic AI, the risk isn’t just what an agent is designed to do, but how it behaves once deployed. That’s why governance and security must span both development (where intent, permissions, and constraints are defined) and operation (where behavior must be continuously monitored and controlled). For organizations building and deploying agents, Copilot Studio provides a secure foundation to create trustworthy agentic AI. From the earliest stages of the agent lifecycle, built-in capabilities help ensure agents are safe and secure by design. Once deployed, IT and security teams can observe, govern, and secure agents across their lifecycle.

In development, Copilot Studio establishes clear behavioral boundaries. Agents are built using predefined actions, connectors, and capabilities, limiting exposure to arbitrary code execution (ASI05), unsafe tool invocation (ASI02), or uncontrolled external dependencies (ASI04). By constraining how agents interact with systems, the platform reduces the risk of unintended behavior, misuse, or redirection through indirect inputs. Copilot Studio also emphasizes containment and recoverability. Agents run in isolated environments, cannot modify their own logic without republishing (ASI10), and can be disabled or restricted when necessary (ASI07, ASI08). For example, if a deployed support agent is coaxed (via an indirect input) to “add a new action that forwards logs to an external endpoint,” it can’t quietly rewrite its own logic or expand its toolset on the fly; changes require republishing, and the agent can be disabled or restricted immediately if concerns arise. These safeguards prevent localized agent failures from propagating across systems and reinforce a key principle: agents should be treated as managed, auditable applications, not unmanaged automation.
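
The containment principle described above (a fixed toolset, changes only through republishing, and a kill switch) can be illustrated with a small sketch. This is a conceptual model, not Copilot Studio's implementation: the `PublishedAgent` class, its methods, and the `lookup_order` tool are hypothetical names invented for the example.

```python
from types import MappingProxyType


class PublishedAgent:
    """Toy model of an agent whose toolset is frozen at publish time."""

    def __init__(self, name, tools):
        self.name = name
        # Read-only view over a private copy: no tools can be added at runtime.
        self._tools = MappingProxyType(dict(tools))
        self.enabled = True

    def invoke(self, tool_name, *args):
        if not self.enabled:
            raise PermissionError(f"{self.name} is disabled")
        if tool_name not in self._tools:
            raise PermissionError(f"{tool_name!r} is not in the published toolset")
        return self._tools[tool_name](*args)

    def republish(self, tools):
        # Changing the toolset produces a new version via an explicit step.
        return PublishedAgent(self.name, tools)


def lookup_order(order_id):
    # Hypothetical tool with a canned response.
    return f"order {order_id}: shipped"


agent = PublishedAgent("support", {"lookup_order": lookup_order})
print(agent.invoke("lookup_order", "42"))

# An injected request to call an unpublished tool is refused.
try:
    agent.invoke("forward_logs", "https://example.invalid")
except PermissionError as err:
    print("blocked:", err)
```

Because the only path to a larger toolset is `republish`, every expansion of what the agent can do passes through a step that can be reviewed and audited.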

To support governance and security during operation, Microsoft Agent 365 will be generally available on May 1. Currently in preview, Agent 365 enables organizations to observe, govern, and secure agents across their lifecycle, providing IT and security teams with centralized visibility, policy enforcement, and protection capabilities for agentic AI.

Once agents are deployed, security and IT teams can use Agent 365 to gain visibility into agent usage, manage how agents are used, and enforce organizational guardrails across their environment. This includes insights into agent usage, performance, risks, and connections to enterprise data and tools. Teams can also implement policies and controls to help ensure safe and compliant operations. For example, if an agent accesses a sensitive document, IT and security teams can detect the activity in Agent 365, investigate the associated risk, and quickly restrict access or disable the agent before any impact occurs. Key capabilities include:

Access and identity controls alongside policy enforcement to ensure agents operate within the appropriate user or service context, helping reduce the risk of privilege escalation and applying guardrails like access packages and usage restrictions (ASI03).

Data security and compliance controls to prevent sensitive data leakage and detect risky or non-compliant interactions (ASI09).

Threat protection to identify vulnerabilities (ASI04) and detect incidents such as prompt injection (ASI01), tool misuse (ASI02), or compromised agents (ASI10).

Together, these capabilities provide continuous oversight and enable rapid response when agent behavior deviates from expected boundaries.
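
For teams that want to track this mapping in their own reviews, here is a minimal sketch that records which OWASP risk IDs this post tags against each control and lists any IDs left untagged. The risk names and ASI tags are transcribed from the text above; the grouping labels and variable names are our own shorthand, and an untagged ID only means this post does not explicitly map a control to it, not that the risk is unmitigated.

```python
# Risk IDs and names as listed in this post.
OWASP_AGENTIC_TOP_10 = {
    "ASI01": "Agent goal hijack",
    "ASI02": "Tool misuse and exploitation",
    "ASI03": "Identity and privilege abuse",
    "ASI04": "Agentic supply chain vulnerabilities",
    "ASI05": "Unexpected code execution",
    "ASI06": "Memory and context poisoning",
    "ASI07": "Insecure inter-agent communication",
    "ASI08": "Cascading failures",
    "ASI09": "Human-agent trust exploitation",
    "ASI10": "Rogue agents",
}

# Controls and the ASI tags this post attaches to them (labels are shorthand).
MITIGATIONS = {
    "Predefined actions, connectors, and capabilities": {"ASI02", "ASI04", "ASI05"},
    "No self-modification without republishing": {"ASI10"},
    "Disable or restrict agents": {"ASI07", "ASI08"},
    "Access and identity controls": {"ASI03"},
    "Data security and compliance controls": {"ASI09"},
    "Threat protection": {"ASI01", "ASI02", "ASI04", "ASI10"},
}

covered = set().union(*MITIGATIONS.values())
untagged = sorted(set(OWASP_AGENTIC_TOP_10) - covered)

for risk_id in untagged:
    print(f"No control explicitly tagged for {risk_id}: {OWASP_AGENTIC_TOP_10[risk_id]}")
```

Keeping the mapping as data rather than prose makes gaps visible and gives security reviews a checklist to extend with organization-specific controls.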

Keep learning about agentic AI security

Agentic AI changes not just what software can do, but how it operates, introducing autonomy, delegated authority, and the ability to act across systems. The shift places new demands on how systems are designed, secured, and operated. Organizations that treat agents as privileged applications, with clear identities, scoped permissions, continuous oversight, and lifecycle governance, are better positioned to manage and reduce risk as they adopt agentic AI. Establishing governance early allows teams to scale innovation confidently, rather than retroactively building controls after the agents are embedded in workflows. Here are some resources to look over as the next step in your journey:

OWASP Top 10 for Agentic Applications (2026): The baseline: top risks for agentic systems, with examples and mitigations.

Microsoft AI Red Team: How Microsoft stress-tests AI systems and what teams can learn from that practice.

Microsoft Security for AI: Microsoft’s approach to protecting AI across identity, data, threat protection, and compliance.

Microsoft Agent 365: The enterprise control plane for observing, governing, and securing agents.

Microsoft AI Agents Hub: Role-based readiness resources and guidance for building agents.

To learn more about Microsoft Security solutions, visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us on LinkedIn (Microsoft Security) and X (@MSFTSecurity) for the latest news and updates on cybersecurity.

OWASP Top 10 for Agentic Applications content © OWASP Foundation. This content is licensed under CC BY-SA 4.0. For more information, visit https://creativecommons.org/licenses/by-sa/4.0/

]]>
Powering Frontier Transformation with Copilot and agents http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/powering-frontier-transformation-with-copilot-and-agents/ Mon, 09 Mar 2026 13:00:00 +0000 Wave 3 marks a new version of Microsoft 365 Copilot, moving beyond assistance to embedded agentic capabilities.

The post Powering Frontier Transformation with Copilot and agents appeared first on Microsoft Copilot Blog.

]]>
Frontier Transformation starts with a simple idea: AI must do more than optimize what already exists. It must unlock new levels of creativity, innovation, and growth. And it must show up inside real work, grounded in real context, and solve real problems for people and organizations. We’ve found that to do this, the two most important elements are intelligence and trust. Intelligence ensures AI is contextual, relevant, and grounded. Trust ensures AI can scale safely, securely, and responsibly. Our announcements today show how intelligence and trust together turn AI from experimentation into durable, enterprise-wide value.

Wave 3 of Microsoft 365 Copilot

Wave 3 marks a new version of Microsoft 365 Copilot, moving beyond assistance to embedded agentic capabilities. And this is just the start, with much more product innovation to follow in the months ahead.

Copilot Cowork

Working closely with Anthropic, we have brought the technology that powers Claude Cowork into Microsoft 365 Copilot. It’s this multimodel advantage that makes Copilot different. Your work is not limited by one brand of models. Copilot hosts the best innovation from across the industry and chooses the right model for the job regardless of who built it. This is a pattern of work that will only become more powerful as new models and ways of working emerge.

Copilot Cowork brings long‑running, multi‑step work into Microsoft 365 Copilot, moving beyond prompts and responses toward execution that unfolds over time. And, with Work IQ, it has the full context of your work, not just fragments of data, so it can reason over all relevant materials. Instead of asking Copilot to generate a single artifact, Cowork allows you to delegate meaningful work and stay in the loop as that work progresses.

With Cowork, Copilot can break down complex requests into steps, reason across tools and files, and carry work forward with visible progress and opportunities to steer. Tasks are no longer confined to a single turn or a single app. They can run for minutes or hours, coordinating actions and producing real outputs along the way.

Cowork is built with enterprise needs in mind. Work is observable. Actions are transparent. Documents are immediately enterprise knowledge that’s protected and ready to share. Progress can be reviewed, guided, or stopped. And everything operates within Microsoft’s security, identity, and governance framework, so organizations can adopt these capabilities with confidence.

By combining Anthropic’s agentic model for multi-step tasks with Microsoft 365, Cowork delivers a managed, enterprise‑grade experience that pairs powerful reasoning with the controls enterprises expect. This is the promise of Copilot: the best AI innovation from across the industry delivered quickly with the intelligence of Work IQ and trust of Microsoft’s Enterprise Data Protection. Cowork is being tested with a limited set of customers as a research preview and will be available through the Frontier program in March.

Join the Frontier program to get access to Microsoft’s latest AI innovations.

Microsoft 365 Copilot in Word, Excel, PowerPoint, and Outlook 

Today, many AI tools treat the creation of an artifact as a single-shot task. They connect to Microsoft 365 data but miss key context. They create content that doesn’t follow how apps natively work. They create version sprawl by producing files that are locally downloaded. And they do not respect the existing confidentiality protections within an organization.

Wave 3 of Copilot will now work alongside you in Word, Excel, PowerPoint, and Outlook, creating, editing, and refining high-quality content from start to finish inside a document, spreadsheet, presentation, or email. And it uses Work IQ to stay grounded in the context of your work, so edits always reflect what is current and relevant across your files, meetings, chats, and relationships.

Copilot does the heavy lifting by updating existing work: refining a Word document into a polished draft, improving Excel spreadsheets with real formulas, producing slides in PowerPoint that match how your organization builds decks (including understanding layouts, object styles, and brand kits), and drafting and refining emails directly in Outlook. And because this work happens inside the apps where people already work, every change is transparent, reviewable, and reversible as you iterate.

During preview, we described these capabilities as “Agent Mode.” As we moved toward general availability, it became clear that this isn’t a separate mode at all—it’s core to how this next wave of Copilot works.

Microsoft 365 Copilot enforces existing Microsoft 365 permissions and sensitivity labels and saves files to OneDrive and SharePoint—with tenant-level controls—so protected content isn’t processed when extraction isn’t allowed. This means organizations can apply governance, audit, compliance, and retention policies at scale.

These new Copilot experiences are generally available in Excel and Word, with PowerPoint and Outlook starting to roll out over the coming months.

Agents in chat

Not all work starts inside a document or an app. Often, it begins conversationally—with a question, an idea, or a rough intent that needs to be turned into action.

That’s why, in Wave 3, chat in Copilot is the entry point for chat‑first creation and execution. From chat, you can create documents, spreadsheets, and presentations directly from a conversation, or ask Copilot to take common workplace actions—like scheduling a meeting or drafting and sending an email to your team—without copying and pasting between tools or switching contexts. These end‑to‑end workflows move work forward immediately and set Copilot apart.

Chat in Copilot is where the ecosystem comes together. Built‑in agents for Word, Excel, PowerPoint, and Outlook let you move easily from conversation into app‑native work. And with agents in Copilot supporting open standards like Apps SDK and MCP Apps, your apps can now surface directly within chat—enabling live, interactive experiences where work actually happens. From sales and customer service insights in Microsoft Dynamics 365, to custom apps built with Microsoft Power Apps, to partner experiences from Adobe, Monday.com, and Figma, Copilot brings your critical tools and insights together in one place.

Copilot also makes it easy for people across your organization to build agents that support their day‑to‑day work using Agent Builder. Meanwhile, IT and business leaders can create more sophisticated business process agents with Microsoft Copilot Studio—from employee onboarding to procurement. Recent updates to Copilot Studio help organizations evaluate agent quality, coordinate multiple agents, and ensure agents work together across systems—while remaining observable, governable, and secure at enterprise scale. 

Copilot works directly inside apps when work is underway, and agents in chat provide the starting point when work begins with a conversation.

Excel, Word, and PowerPoint Agents are rolling out to general availability in chat in Copilot. Schedule from chat and custom instructions are available today, and send email from chat is rolling out with broad availability this spring.

Multi‑model intelligence

Wave 3 also advances Microsoft’s commitment to model choice in Copilot, so intelligence can show up in the right way for the work at hand, without requiring you to think about models at all.

Many AI tools lock users into a single vendor’s models. Others force people to choose between tools, experiences, or modes depending on the task. That fragmentation creates friction for individuals and complexity for organizations. Leaders end up managing overlapping tools, inconsistent experiences, and rising costs as teams bring their own AI into the business.

At the same time, IT and business decision‑makers are forced into long‑lived vendor bets, even as the pace of model innovation accelerates and better capabilities emerge elsewhere. The result is broken context for users, unnecessary overhead for organizations, and the burden of model selection pushed onto people who just want to get work done.

In contrast, Microsoft 365 Copilot brings leading models from multiple providers directly into the work experience. With Wave 3, Claude is now available in mainline chat in Copilot via the Frontier program, alongside the latest generation of OpenAI models, which continue to roll out with new releases. This means users can access advanced reasoning and multistep capabilities in their everyday Copilot conversations, not just specialized tools. Copilot automatically applies the right model for the task, all grounded in your enterprise context and protected by Microsoft’s security and governance controls.

Agent 365

As organizations adopt agents as part of everyday work, the challenge shifts from experimentation to operating them with trust, safety, and control at scale. IDC projects agent use will increase by an order of magnitude over the next few years, with hundreds of millions, and soon billions, of agents operating across enterprises. That scale creates a new dilemma for IT and security leaders: how to manage agents across the organization without rebuilding infrastructure, weakening security posture, or slowing innovation. This is exactly the scenario Agent 365 was designed for.

Agent 365 is the control plane for agents. In practical terms, it gives IT and security leaders one place to observe, secure, and govern every agent across the organization, and it provides the confidence to move from agent experimentation to enterprise-scale operations. Agent 365 extends the management, security, and governance processes organizations already use for employees to agents, so they can stay in control as agents become part of daily work.

The idea is simple: there is no need to reinvent the wheel. The fastest path to getting agents under control is to manage them in a similar manner to managing users, using familiar Microsoft solutions including the Microsoft Admin Center for agent management and Microsoft Security solutions like Defender, Entra, and Purview for agent security and governance.

Agent 365 will be generally available on May 1, priced at $15 per user per month.

Introducing Microsoft 365 E7: The Frontier Suite

Frontier transformation is real when both sides of the system move together: people and AI operating across the enterprise.

Microsoft 365 E7: The Frontier Suite closes the gap, equipping employees with AI across email, documents, meetings, spreadsheets, and business application surfaces, while giving IT and security leaders the observability and governance needed to operate AI at enterprise scale.

Copilot and agents work together with shared intelligence, understanding context, history, priorities, and constraints. Trust is built in by default—with user data, enterprise data, and agent actions protected through identity, policy, and observability—so AI can scale across the workforce without compromising security or compliance.

Microsoft 365 E7 will be available for purchase on May 1 at a retail price of $99 per user per month, and includes Microsoft 365 Copilot, Agent 365, Microsoft Entra Suite, and Microsoft 365 E5 with advanced Defender, Entra, Intune, and Purview security capabilities to help secure users, delivering comprehensive protection across agents and users.

Get started today

Wave 3 of Microsoft 365 Copilot marks a turning point in how AI shows up at work. Agentic capabilities are embedded directly into Word, Excel, PowerPoint, Outlook, and Copilot Chat, bringing multi‑model intelligence into everyday workflows. Agent 365 makes this shift operational by giving organizations a way to observe, govern, and secure agents as they move from experimentation to enterprise‑scale use. Microsoft 365 E7 brings it all together by unifying productivity, AI, identity, and security into a single foundation.

Together, these changes make frontier transformation real: intelligence that understands the context of work, and trust that allows AI to scale safely across the workforce. When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done.

  • Visit Microsoft365.com/copilot or download the Microsoft 365 app on your mobile device to get started.
  • For the latest research and insights on AI at work, visit WorkLab.
  • Learn from our engineering leaders how Microsoft delivers AI built for work at the Microsoft Frontier Transformation digital event on March 9, 2026, at 8:00 AM PT.

Footnotes

Microsoft 365 E7 is available with and without Teams.

1. IDC Info Snapshot, sponsored by Microsoft, 1.3 Billion AI Agents by 2028, May 2025 #US53361825

The post Powering Frontier Transformation with Copilot and agents appeared first on Microsoft Copilot Blog.

New and improved: Agent evaluations, computer use, and advanced maker training http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/new-and-improved-agent-evaluations-computer-use-and-advanced-maker-training/ Wed, 04 Mar 2026 19:15:00 +0000 Explore Copilot Studio feature updates that support secure, scalable agent development—from enhanced agent evaluations to improved automation tools.

The post New and improved: Agent evaluations, computer use, and advanced maker training appeared first on Microsoft Copilot Blog.


Microsoft Copilot Studio and Agent Builder in Microsoft 365 Copilot are designed to help customers reliably create agents that scale and deliver real, sustained business value—not just prototypes. Recent enhancements focus on making it easier to move from building an agent to running one confidently across complex, dynamic environments, with consistent quality and the ability to evolve as business needs change.

Discover the latest capabilities in agent evaluations, exciting updates for computer-using agents (including expanded model support), a new Agent Academy Operative training path, and more. Plus, learn how you can use these capabilities to help ensure your agents are ready for scale.

Build trust at scale with enhanced agent evaluations in Copilot Studio

Agents aren’t “set and forget.” Prompts evolve, models update, and data changes—which raises a critical question as agents take on real work: can we trust them at scale? Agent evaluations answer that question with evidence. They’re designed to turn expectations into measurable checks, help teams catch regressions early, and provide a repeatable way to assess agent quality as behavior and context evolve.

For example, a finance leader rolling out an agent for expense policy guidance or month‑end analysis needs to trust its behavior before moving beyond a pilot. With enhanced agent evaluations in Copilot Studio, teams can now validate performance using their own scenarios, policies, and production data—measuring quality, usability, and responsiveness across a full test set instead of isolated cases.

Side‑by‑side comparisons then help catch regressions before changes go live. Meanwhile, built‑in transparency and session replays support internal and external stakeholder review. The result is a clear, evidence‑based path from experimentation to trusted deployment.

Here’s a quick rundown of the latest eval enhancements, all available in public preview.

Holistic and multi-dimensional agent evaluation

  • Set-level grading framework: You can now evaluate agents across an entire test set instead of individual test cases, enabling an accurate measure of overall quality. By consolidating results from multiple tasks, makers can better understand real-world performance by seeing how agents maintain quality across a range of scenarios.
  • Multiple graders per test set: With the ability to apply multiple grading approaches—such as quality, performance, and usability assessments—to the same test set, teams can gain a more complete evaluation without the complexity of managing separate test sets.
  • Comparative testing: Teams can compare multiple agent versions side by side, which can make it easier to spot regressions and validate improvements before pushing the best version live.
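To make set-level grading and comparative testing concrete, here is a minimal sketch in Python. It is not Copilot Studio's implementation: the grader names, scores, and the `tolerance` threshold are all made up for illustration. It only shows the idea of collapsing per-case scores into one score per grader, then comparing two agent versions to flag regressions.

```python
from statistics import mean

# Hypothetical per-test-case scores (0.0 to 1.0) for two agent versions,
# keyed by grader name. The numbers are made up for illustration.
results_v1 = {
    "quality":     [0.9, 0.8, 0.7, 1.0],
    "performance": [0.6, 0.7, 0.8, 0.7],
    "usability":   [0.8, 0.8, 0.9, 0.7],
}
results_v2 = {
    "quality":     [0.9, 0.9, 0.8, 1.0],
    "performance": [0.5, 0.6, 0.6, 0.7],
    "usability":   [0.8, 0.9, 0.9, 0.8],
}

def set_level_scores(results):
    """Collapse per-case scores into one score per grader for the whole set."""
    return {grader: round(mean(scores), 3) for grader, scores in results.items()}

def find_regressions(baseline, candidate, tolerance=0.02):
    """Return graders whose set-level score dropped by more than `tolerance`."""
    base, cand = set_level_scores(baseline), set_level_scores(candidate)
    return {
        grader: (base[grader], cand[grader])
        for grader in base
        if cand[grader] < base[grader] - tolerance
    }

regressions = find_regressions(results_v1, results_v2)
print(regressions)
```

In this made-up data, the second version improves on quality and usability but regresses on performance, which is the kind of trade-off a single test case or a single grader would hide.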

Improved transparency and control

  • User reactions and feedback: Makers can now provide quick feedback on evaluation results using a simple thumbs up or thumbs down action. This feedback helps Copilot Studio capture signals about evaluation accuracy, grader alignment, and edge cases, which means our team can continuously refine our evaluation models and improve result quality for agent makers.
  • Open activity map in evaluation: Direct integration with the activity map gives teams immediate insight into how agents executed tasks, helping identify where issues occurred faster and improve optimization.
  • Enterprise-grade auditing: Advanced session replays, action logs, and Microsoft Purview integration offer detailed visibility into agent behavior, helping makers preserve quality and streamline troubleshooting.

Streamlined workflow and data integration

  • CSV downloadable format: Makers can now download a ready-to-use comma-separated values (CSV) template that follows the exact structure required for importing test cases into evaluation. Instead of creating files from scratch—and running into formatting errors, missing columns, or failed imports—teams can rely on a validated template that can help shorten setup time and remove unnecessary friction.
  • Import production data into evaluation: Real-world production data can now be imported directly into evaluations, providing high-quality test sets that reflect actual user interactions. This is designed to improve evaluation accuracy and help makers tune agents more closely to their specific audiences.
  • Import and export of test sets, test cases, and results: Makers can import or export test sets, individual test cases, and evaluation results. This helps simplify teamwork and support repeatable testing across environments—essentials for enterprise-scale agent development.
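If your team assembles test-case files by hand or by script, a quick pre-import check can catch the formatting problems described above. The sketch below is a rough illustration only: the column names `question` and `expected_response` are assumptions, not the columns of the actual Copilot Studio template, so swap in the headers from the template you download.

```python
import csv
import io

# ASSUMPTION: hypothetical column names; the real template defines the
# required columns, so replace these with the headers it uses.
REQUIRED_COLUMNS = ("question", "expected_response")

def validate_test_cases(csv_text):
    """Return a list of human-readable problems found in a test-case CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column: {col}" for col in missing]
    problems = []
    for row in reader:
        for col in REQUIRED_COLUMNS:
            # Short rows yield None for absent fields, so guard before strip().
            if not (row[col] or "").strip():
                problems.append(f"line {reader.line_num}: empty {col}")
    return problems

sample = (
    "question,expected_response\n"
    "How many vacation days do I get?,Full-time employees get 20 days.\n"
    "Who approves parental leave?,\n"
)
print(validate_test_cases(sample))
```

Running a check like this before import gives you a list of specific lines to fix instead of a failed upload.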

Scale automation across real-world systems with nimbler computer use

Most organizations don’t lack ideas for automation. Instead, the challenge tends to be fragmented systems, limited APIs, legacy desktop tools, and workflows that span multiple departments. Replacing everything isn’t realistic. But maintaining brittle, script-based automation isn’t sustainable either.

Copilot Studio’s computer-using agents (CUAs) can address this gap by interacting directly with web and desktop interfaces, supporting automation across systems that weren’t designed to integrate. They facilitate automation in complex, dynamic environments where traditional robotic process automation (RPA) falls short.

Consider a customer support organization handling service requests across disconnected systems. When a customer submits a support request, a computer-using agent can:

  1. Retrieve customer and entitlement details from the customer relationship management (CRM) system.
  2. Create or update a case in the service management system.
  3. Pull relevant troubleshooting steps from a knowledge base.
  4. Update the case status and resolution checklist in Microsoft SharePoint.
  5. Notify the assigned service representative and escalate if service-level agreements (SLAs) are at risk.

This would be impossible with RPA alone because the workflow spans systems that were never designed to integrate. Although individual pieces could be automated, a person historically needed to initiate each step. With computer use, the organization can now accelerate this process and reduce missed steps without redesigning existing systems.
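The last step in the list above depends on a rule for when an SLA counts as "at risk." As a minimal sketch, assuming a simple rule that is not taken from Copilot Studio (escalate once a fixed fraction of the SLA window has elapsed), the check might look like this in Python. The `warning_fraction` parameter and the sample times are hypothetical.

```python
from datetime import datetime, timedelta

def sla_at_risk(opened_at, sla, now, warning_fraction=0.8):
    """True once the elapsed time reaches `warning_fraction` of the SLA window."""
    return (now - opened_at) >= sla * warning_fraction

opened = datetime(2026, 3, 4, 9, 0)
sla = timedelta(hours=8)

# 3 hours elapsed out of an 8-hour SLA: still comfortably inside the window.
print(sla_at_risk(opened, sla, datetime(2026, 3, 4, 12, 0)))
# 7 hours elapsed: past 80% of the window, so the case should be escalated.
print(sla_at_risk(opened, sla, datetime(2026, 3, 4, 16, 0)))
```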

And the latest updates enhance the value of your computer-using agents, adding key capabilities that enable improved flexibility, security, and scalability:

  • Expanded model availability: We’ve added Claude Sonnet 4.5 as an additional model choice for CUAs. You can choose between Anthropic models and OpenAI’s Computer-Using Agent to get the best possible results for your task.
  • Built-in credentials: Simplify and secure authentication with built-in credentials that require minimal setup. Users simply input their username and password once, and Copilot Studio stores the credentials securely.
  • Enterprise-grade logging and auditing: New monitoring tools, integrated with Microsoft Purview, enhance computer-using agent session visibility. This includes detailed logs of agent activity and session replays with screenshots that support traceability and compliance processes.
  • Cloud PC pool: Powered by Windows 365 for Agents, this scalable, managed cloud infrastructure integrates with Microsoft Entra and Intune. These PC pools auto-scale based on workload demand, helping you handle spikes without over-provisioning.

The more tools teams have to drive operational efficiency while maintaining control over automated workflows, the more confidently they can adopt computer use. That’s why these updates help elevate computer-using agents as a more reliable, adaptable solution for enterprises looking to scale their use of agentic automation.

Learn to build multi-agent systems with the Agent Academy Operative path

Finished the Recruit training from the Copilot Studio Agent Academy and looking to go deeper? The new Operative path unlocks the next level of training for agent makers who are ready to build their skills. It’s designed for practitioners who already have their first agent working and want to expand their skills to build more sophisticated, production-ready solutions.

The Operative path walks learners through building a complex, multi-agent hiring automation system, using it as an applied learning example that can be adapted to any business scenario.

Along the way, participants develop critical skills such as writing clear and effective agent instructions, selecting and evaluating AI models, and applying advanced prompt patterns, agent flow integration, and Model Context Protocol (MCP). The curriculum also emphasizes operational readiness, including feedback loops, telemetry, and AI safety throughout the agent lifecycle.

By the end of the path, learners can gain a deeper understanding of how to design, build, and architect scalable multi-agent systems that can evolve with business needs. For creators ready to move from basic agents to more advanced, reliable solutions, the Operative path provides a practical and structured next step.

What else is new and improved in Copilot Studio

Now, let’s take a quick look at some other exciting updates—all generally available (GA)—that further enhance your Copilot Studio (and Agent Builder) experience:

  • Copy agents from Agent Builder into Copilot Studio to scale impact: Agents that start as individual ideas in Agent Builder and prove team-wide value can now be opened directly in Copilot Studio for a more extensive maker experience. This unlocks advanced features such as topics, automations, expanded publishing channels, and enterprise governance controls, including data loss prevention and application lifecycle management. For example, a support representative’s personal helper agent can be expanded into a shared tool that categorizes tickets, suggests responses, and routes issues to the right specialists—without rebuilding from scratch.
  • Query your agent inventory from Azure Resource Graph: The Microsoft Power Platform agent inventory, which organizes and displays all your published Copilot Studio and Agent Builder agents, is now generally available. Admins can query this inventory programmatically using Azure Resource Graph to access detailed data about both draft and published agents across the tenant, using the Azure portal, CLI, PowerShell, or REST API.
  • Generate icons for your agents using AI in Agent Builder: Makers can now generate custom agent icons directly in Agent Builder using AI. Instead of browsing or creating artwork manually, they simply describe how the icon should look—using the agent’s description or a custom prompt—and get a unique icon designed to stand out in the Agent Store.
  • Try the Copilot Studio extension for Visual Studio Code: The Copilot Studio extension lets teams version, edit, and deploy agents directly from Visual Studio Code, making it easier to align with existing software development workflows.
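For admins who want to script the agent inventory query mentioned above, here is a rough Python sketch that builds (but does not send) a request to the Azure Resource Graph REST API. The endpoint is the documented Resource Graph `resources` operation. The table name and resource type in the query are assumptions for illustration, so check the agent inventory documentation for the actual names. The token is a placeholder.

```python
import json
import urllib.request

# Resource Graph REST endpoint; confirm the api-version you want to target.
ENDPOINT = (
    "https://management.azure.com/providers/"
    "Microsoft.ResourceGraph/resources?api-version=2021-03-01"
)

# ASSUMPTION: the table and type names below are placeholders. Look up the
# real ones in the agent inventory documentation before running a query.
QUERY = (
    "PowerPlatformResources "
    "| where type == 'microsoft.copilotstudio/agents' "
    "| project name, location, properties "
    "| limit 50"
)

def build_request(token):
    """Build (but do not send) a Resource Graph query request."""
    body = json.dumps({"query": QUERY}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

request = build_request("<access-token>")
print(request.get_method(), request.full_url)
print(json.loads(request.data)["query"])
```

To actually run it, you would pass the request to `urllib.request.urlopen` with a valid Azure access token. The same query text can also be pasted into Resource Graph Explorer in the Azure portal.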

The big takeaway: Stronger Copilot Studio tools for more scalable agent experiences

These updates aren’t just new features; they strengthen the tools teams rely on to create agents that scale with their business. By enhancing flexibility, security, and visibility, these updates are designed to make it easier to scale agents without starting over each time.

This continuity helps makers innovate quickly while IT teams maintain control over governance, compliance, and performance—bridging the gap between rapid iteration and enterprise-grade reliability. Why? Because at the end of the day, the best agents are those that are built to grow with your needs, and with these updates, that evolution becomes more attainable every month.

Stay up to date on all things Copilot Studio

Check out all the updates as we ship them, as well as new features releasing in the next few months here: What’s new in Microsoft Copilot Studio.

To learn more about Microsoft Copilot Studio and how it can transform productivity within your organization, visit the Copilot Studio website or sign up for our free trial today.

New resources and guidance to plan, build, and operate enterprise-ready agents http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/new-resources-and-guidance-to-plan-build-and-operate-enterprise-ready-agents/ Thu, 12 Feb 2026 17:00:00 +0000 Explore the new and redesigned guidance hubs to help your organization plan, build, and operate agents with clarity throughout the agent lifecycle.

The post New resources and guidance to plan, build, and operate enterprise-ready agents appeared first on Microsoft Copilot Blog.

As organizations move from early AI experiments to deploying agents at scale, they often ask: How do we architect agents responsibly, integrate them into existing systems, and run them reliably at scale?

To help teams like yours answer these complex questions faster and move with confidence, we’ve launched the new agent architecture guidance hub and a refreshed Microsoft Copilot Studio guidance hub. These on-demand resources offer end‑to‑end documentation across the agent lifecycle—from design and planning through operations, governance, and advanced architectural patterns.

Built on established practices from Microsoft engineering teams and real‑world deployments, these hubs give architects, developers, and IT a shared blueprint to work from. And they were designed to help your team make smarter architectural decisions, accelerate delivery with practical how‑to guidance, and scale safely with trusted governance, security, and responsible AI practices.

Whether you’re building your first agent or scaling across your enterprise, these hubs can help you start—and stay—on the right path.

Now, let’s explore what each hub offers and how to put them to work for your organization.

Meet the new agent architecture guidance hub

The new agent architecture guidance hub is a technology‑agnostic playbook for designing secure, reliable, and accountable agents. Unlike the Copilot Studio guidance hub and Azure Well‑Architected guidance, this hub focuses on the principles and patterns required to build scalable agent systems—regardless of platform, tools, or runtime.

Grounded in the same practices Microsoft 365‑grade agents use, this hub distills lessons from real‑world deployments into a single source of truth. It provides clear answers to foundational architecture questions, such as how your agents should be structured, how they should run, and how they should be governed at scale.

Use the agent architecture guidance hub to:

  • Identify fit for purpose by mapping your scenario to the right agent flows, components, and reference architectures.
  • Design for operability by building reliability in from the start, using deployment lifecycle and evaluation guidance.
  • Establish trust, traceability, and transparency through responsible AI practices, governance, auditability, and security practices.
  • Optimize search and tool‑use patterns by adopting retrieval, grounding, and tool‑execution approaches used in Microsoft 365 Copilot.

Discover the redesigned Copilot Studio guidance hub

The reimagined Copilot Studio guidance hub is your end‑to‑end playbook for designing, building, and operating agents in Copilot Studio. Unlike architecture‑level resources, such as the agent architecture guidance, this hub focuses on hands‑on implementation—so makers, developers, and IT admins know exactly how to execute their work inside the product.

The newly reorganized and expanded hub now mirrors the full lifecycle of an agent. It’s built around five practical stages—Plan, Implement, Manage, Improve, and Extend—so your team can quickly find the right guidance at the right moment, whether you’re starting fresh or scaling an existing deployment:

  • Stage 1: Plan. Align on business goals, define success measures, apply responsible AI considerations, and design effective language understanding before building anything. This helps to ensure every agent starts with a clear purpose, measurable outcomes, and a responsible foundation.
  • Stage 2: Implement. Focus on the design and build work inside Copilot Studio. Learn generative orchestration patterns, build topics effectively, integrate systems and APIs, and publish agents with confidence using patterns established to work in production.
  • Stage 3: Manage. Operate agents with governance, ALM, capacity planning, project security, testing guidance, and compliance best practices. This stage helps teams define the guardrails and decisions needed to maintain trust, reliability, and control over time.
  • Stage 4: Improve. Center continuous optimization around analytics, KPIs, and conversation insights to drive measurable improvements in accuracy, containment, deflection, and user satisfaction—turning real usage data into targeted enhancements.
  • Stage 5: Extend. Go beyond out‑of‑the‑box capabilities with hands‑on extension guidance. Use the Copilot Studio Kit and work with the Microsoft 365 Agents SDK to add custom logic, actions, and richer workflows tailored to your organization’s unique scenarios.

Together, these stages make this hub a practical, step-by‑step playbook for building agents in Copilot Studio that are useful, safe, and maintainable from day one—and that can scale as your needs grow.

Build agents with confidence


Successful agents require more than a powerful platform—you also need clearer choices, practical guardrails, and a way to spend less time reinventing the wheel. The new agent architecture guidance hub and Copilot Studio guidance hub (together with our other resources like the Copilot Studio adoption site and Copilot Studio community forum) make it easier to go from early experiments to confident, repeatable delivery.

Use the agent architecture guidance hub to clarify what to build and why. Then, turn to the Copilot Studio guidance hub when you’re ready to design, build, and operate those agents more effectively in Copilot Studio.

Whether you’re experimenting with your first agent or managing a collection of agents in Microsoft Copilot Studio, put these resources to work to make your next build easier, safer, and faster.

How to evaluate AI agents in Microsoft Copilot Studio http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/how-to-evaluate-ai-agents/ Tue, 03 Feb 2026 17:00:00 +0000 Agent Evaluation in Copilot Studio helps makers move from early optimism to grounded confidence as agents grow in complexity and impact.

The post How to evaluate AI agents in Microsoft Copilot Studio appeared first on Microsoft Copilot Blog.

When makers first build an agent, their confidence increases as that agent takes shape. A few test prompts. Some promising answers. A sense that things are working. So, they share that agent with their team.

Then, reality arrives. 

The people who use the agent phrase questions differently. Conversations stretch across multiple turns. Context accumulates. Permissions turn out to be table stakes. The right tools need to be invoked. Edge cases appear. Suddenly, the question becomes “Can I actually trust how the agent behaves?”

Agent evaluations exist for this exact moment. AI agents do not behave the same way twice. Their responses shift with model updates, data changes, prompts, tools, and context. What works today may drift tomorrow.

Thankfully, agent evaluations reinforce confidence in the agents you build. Let’s walk through how you can make the most of this capability.

What exactly are agent evaluations?

Agent evaluations (or “evals”) are the standardized mechanism that makes agent variability visible and manageable. Unlike debugging, evals are not a one-time check or a manual review. They are a consistent process that helps you stay ahead of what could go wrong and improve agent performance over time.

By running evaluations, makers can launch agents into production knowing how they’ll behave, not just how they hope they’ll behave. They can also ensure that an agent’s behavior remains stable over time.

As such, every maker should be evaluating all their agents. But this initiative can start with a few quick evaluations that require minimal setup, using default data and default grading to unlock quick signals.

However, as your agents mature, you’ll likely need to evolve this strategy, configuring additional evaluations that test behaviors in specialized scenarios.

Agent evaluation in 8 simple steps

Imagine you’re a maker who just built an internal human resources (HR) agent that helps employees understand leave policies, benefits, and when to escalate to HR systems.

Here’s how you’d evaluate this agent in Microsoft Copilot Studio, from deciding what to evaluate to understanding real-world behaviors and confidently iterating:

Step 1: Decide what you’re evaluating

Before you can run an evaluation, you need to be clear about what you’re trying to validate. 

This starts with defining the scenario. What kind of behavior are we testing? What assumptions are we making about the user’s intent, the context, and the information the agent has available? A well-defined scenario sets the foundation for meaningful results.

With this information, you’ll need to define your scope. Some evaluations focus on a narrow behavior to get a precise signal. Others cover a wider range of interactions to reflect real usage. A narrower scope makes results easier to interpret, while a broader scope helps surface risks that only appear at scale. 

You’ll need to make these choices deliberately, because they affect how useful your evaluation will be. By explicitly defining the scenario and scope, you get signals that are relevant, reliable, and aligned with how you expect people to use the agent in practice.

Step 2: Ground evaluation in real user behavior 

Once you’ve defined the scope, the next question emerges: “What are we evaluating against?” 

Strong evaluations start with realistic data. Not idealized prompts, but the messy, imperfect ways people actually ask questions. For your HR agent, this includes vague phrasing, partial information, and mixed intents like asking about leave while referencing a personal situation. 

You can bring data from multiple sources, including manually authored scenarios, AI-assisted generation to broaden coverage, imported datasets, and even historical or production conversations.
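When you pull test questions from several sources, the same prompt often shows up more than once with small differences in casing or spacing. Here is a minimal Python sketch of one way to merge sources while keeping track of where each case came from. The source names and sample questions are invented for illustration, and this is not how Copilot Studio itself combines data.

```python
def normalize(question):
    """Lowercase and collapse whitespace so near-identical prompts match."""
    return " ".join(question.lower().split())

def merge_test_cases(*sources):
    """Merge (source_name, questions) pairs, keeping the first copy of each question."""
    seen = set()
    merged = []
    for source_name, questions in sources:
        for question in questions:
            key = normalize(question)
            if key and key not in seen:
                seen.add(key)
                merged.append({"source": source_name, "question": question.strip()})
    return merged

# Made-up examples of the three kinds of input described above.
manual = ["How much parental leave do I get?"]
generated = ["how much  parental leave do I get?", "can i take leave if my kid is sick"]
production = ["leave + benefits question, my dad is in hospital"]

test_set = merge_test_cases(
    ("manual", manual), ("ai_generated", generated), ("production", production)
)
for case in test_set:
    print(case["source"], "|", case["question"])
```

Keeping the `source` label on each case makes it easier to see later whether failures cluster in authored, generated, or real-world prompts.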

Add data from multiple sources to ensure agent evaluations capture nuance in their assessments

We recommend starting with a small but meaningful test set, focusing on the high-value scenarios that matter most to your business.

This data ensures that the evaluation inputs reflect real behavior, not the maker’s assumptions. But even with this data in place, you’ll likely ask: “How will this help me judge whether the agent behaved as expected?” This brings us to step three.

Step 3: Define your evaluation logic

Sometimes makers start with default grading to understand baseline behavior, before deciding what they want to measure more precisely. 

Meanwhile, others define more specific grading logic upfront based on what they already know and what they want to validate. 

Evaluation logic does not require full certainty at the start. It provides a structured way to observe outcomes and refine what matters over time. 

Makers can choose from a collection of ready-to-use graders and even combine multiple graders within a single evaluation to get a richer, multi-dimensional view of agent behavior. 

Graders provide a richer, multi-dimensional view of agent behavior

For example, your HR agent configuration might include three separate graders:

  1. General quality grader to assess whether the response is complete and addresses the full question.
  2. Classification grader, where you describe the expected behavior using natural language prompts.
  3. Capability grader to confirm the agent uses the right topic or tool at the right time.
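To picture how three graders can look at the same response from different angles, here is a simplified Python sketch. The real graders in Copilot Studio are configured in the product and are not simple string checks. The `AgentRun` record, the labels, the tool names, and the sample response below are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    """Hypothetical record of one evaluated response."""
    response: str
    label: str  # behavior label assigned to the response
    tools_used: list = field(default_factory=list)

def quality_grader(run, required_points):
    """Pass if the response mentions every point the full answer needs."""
    return all(point.lower() in run.response.lower() for point in required_points)

def classification_grader(run, expected_label):
    """Pass if the response was classified as the expected behavior."""
    return run.label == expected_label

def capability_grader(run, expected_tool):
    """Pass if the agent invoked the expected topic or tool."""
    return expected_tool in run.tools_used

# Illustrative data only; the policy details are invented.
run = AgentRun(
    response="You can take up to 12 weeks. I have opened an HR ticket for you.",
    label="escalates_to_hr",
    tools_used=["create_hr_ticket"],
)

report = {
    "general_quality": quality_grader(run, ["12 weeks", "HR ticket"]),
    "classification": classification_grader(run, "escalates_to_hr"),
    "capability": capability_grader(run, "lookup_policy"),
}
print(report)
```

Here the response passes on quality and classification but fails the capability check because the agent never called the expected `lookup_policy` tool, which a single overall score would not reveal.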

Even better, you can make these expectations explicit: what matters, what does not, and what “good behavior” looks like in this scenario. By defining evaluation logic upfront, you’ll reduce ambiguity, make success observable and explainable, and shift quality from subjective judgment to measurable signal. 

Step 4: Set the right identity context 

Once you’ve outlined what you’re testing, you need to define which identity the evaluation should run under. Specifically, which user profile should the agent treat as the sender of the questions during evaluation?

The user context you select determines the agent’s behavior, including what data it can retrieve and reason over. It also ensures evaluations catch permission‑related risks early, such as inappropriate data access.

So, making this choice explicit helps avoid a common source of false confidence. When results are reviewed later, makers can trust that successes and failures are grounded in the same access boundaries their users will experience.

For example, an HR agent that references internal policy articles may behave very differently depending on whether it’s responding to a full-time employee or a contractor.

Running the evaluation under only the intended user identity ensures evaluation results reflect real conditions rather than an idealized setup. This can help you identify and mitigate unexpected behavior, such as sharing your company’s healthcare options with a contractor.
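The effect of identity context can be sketched with a toy example. Assume, purely for illustration, that each knowledge article is tagged with the audiences allowed to read it and that retrieval is filtered by the caller's profile. The article titles, profile names, and functions below are hypothetical and do not represent Copilot Studio's permission model.

```python
# Hypothetical knowledge articles, each tagged with who may read it.
ARTICLES = [
    {"title": "Leave policy overview", "audience": {"employee", "contractor"}},
    {"title": "Healthcare plan options", "audience": {"employee"}},
]

def retrieve(profile):
    """Stand-in for the agent's retrieval step, scoped to the caller's profile."""
    return [a["title"] for a in ARTICLES if profile in a["audience"]]

def leaked_titles(profile, retrieved):
    """Titles the profile received but is not allowed to see."""
    allowed = {a["title"] for a in ARTICLES if profile in a["audience"]}
    return [title for title in retrieved if title not in allowed]

for profile in ("employee", "contractor"):
    retrieved = retrieve(profile)
    print(profile, retrieved, "leaks:", leaked_titles(profile, retrieved))

# A misconfigured run that ignores identity shows up as a leak.
unscoped = [a["title"] for a in ARTICLES]
print("contractor (unscoped):", leaked_titles("contractor", unscoped))
```

Running the same test set once per profile, and checking each run for content outside that profile's boundaries, is one way to surface the contractor scenario before real users do.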

Step 5: Evaluate the agent’s responses

Now, it’s time to run your evaluation. Based on the data you provided, Copilot Studio simulates real user prompts and the agent generates responses within your prescribed user context. Each configured grader then evaluates a different aspect of the response, such as quality, correctness, or capability.

This evaluation process turns individual answers into structured signals. Together, these signals make agent behavior observable, repeatable, and explainable at scale. 

The maker is no longer relying on intuition or spot checks to assess their agent’s quality. They’ve created a disciplined feedback loop that replaces assumptions with evidence and transforms agent quality from a subjective impression into a measurable outcome. 

Step 6: Step back to see the bigger picture

Once your evals gather sufficient signals, your focus shifts outward: “What does this tell me overall?” 

Aggregated results provide a high-level view of quality, consistency, and trends across scenarios and graders. For the HR agent, this might reveal strong performance on common policy questions, but weaknesses around edge cases or escalation behavior. 

Aggregated results provide a high-level view of agent quality and behavior trends

With these signals, you can better prioritize. Not every failure matters equally. Patterns matter more than anomalies. And evaluation becomes a decision-support tool, not just a reporting surface. 
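One way to picture the aggregated view is a pass rate per scenario and grader. The following sketch assumes results have been exported as simple `(scenario, grader, passed)` tuples; that shape and the `pass_rates` function are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pass_rates(results):
    """Compute the pass rate for each (scenario, grader) pair.

    results: iterable of (scenario, grader, passed) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [passed count, total count]
    for scenario, grader, passed in results:
        bucket = totals[(scenario, grader)]
        bucket[0] += int(passed)
        bucket[1] += 1
    return {key: passed / total for key, (passed, total) in totals.items()}
```

Grouping this way surfaces patterns, such as a whole scenario with a low pass rate, rather than individual anomalies.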

Step 7: Investigate why single cases pass or fail

High-level signals are useful, but confidence is sturdiest when it’s grounded in the details. 

When a maker drills into a specific test case, explainability comes to the foreground. They can see which grader triggered a failure, how the agent responded across turns, which knowledge sources it used, and whether it invoked the expected tool or topic. 

This is often the turning point. Instead of guessing why something went wrong, you can finally understand what actually happened. Were the agent’s instructions unclear? Was the data incomplete? Did the agent confidently answer the prompt when it should have escalated it? 

With this newfound understanding, you can make informed changes to your agent, adjusting instructions, data, or behavior based on what the evaluation revealed. 

Makers can drill down into a single use case using Microsoft Copilot Studio's agent evaluations

Step 8: Validate progress through comparison 

Evaluation doesn’t end with a single run and a few gathered signals. Agents change over time. Instructions get updated. Data grows. Tools are added. 

With evaluations as an always-on motion, you can compare runs. You can check whether things are improving and catch regressions early. This ongoing view helps your team answer a simple but critical question: “Are we actually getting better?” 

For your HR agent, evaluations might confirm that an update made to the instructions reduced hallucinations without harming coverage. Confidence is no longer anecdotal. It is earned through evidence. 
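Comparing runs can be as simple as diffing per-metric scores and flagging drops beyond a tolerance. The sketch below assumes each run is summarized as a dictionary of metric name to pass rate; the metric names, the `find_regressions` function, and the 0.05 default tolerance are hypothetical choices, not product behavior.

```python
def find_regressions(baseline, candidate, tolerance=0.05):
    """Return metrics whose score dropped by more than `tolerance`.

    baseline, candidate: dicts mapping metric name -> pass rate (0.0 to 1.0).
    Metrics missing from the candidate run are skipped, not treated as failures.
    """
    regressions = {}
    for metric, old_score in baseline.items():
        new_score = candidate.get(metric)
        if new_score is not None and old_score - new_score > tolerance:
            regressions[metric] = (old_score, new_score)
    return regressions
```

In the HR example, an instruction update that improves groundedness but quietly reduces coverage would show up here as a single flagged metric.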

Make agent evaluations your confidence loop

Evaluations don’t slow you down. They accelerate progress. Each iteration builds understanding and offers clarity. Each run reduces uncertainty. And each comparison strengthens trust, empowering you to build with confidence.

That confidence is what encourages teams to move from test to production, and from promising prototypes to agents that can be relied on in real business scenarios at scale. 

Ready to run your first agent evaluation? Get tactical guidance for configuring evals in Copilot Studio—complete with best practice evaluation methodologies.

New to Copilot Studio? Discover how you can transform your business by building, evaluating, managing, and scaling custom AI agents—all in one place.

The post How to evaluate AI agents in Microsoft Copilot Studio appeared first on Microsoft Copilot Blog.

6 core capabilities to scale agent adoption in 2026 http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/6-core-capabilities-to-scale-agent-adoption-in-2026/ Mon, 26 Jan 2026 17:00:00 +0000 Learn six key capabilities organizations are using to scale Copilot Studio agent adoption in 2026—plus practical considerations for enterprise deployment.

The post 6 core capabilities to scale agent adoption in 2026 appeared first on Microsoft Copilot Blog.

Before 2025, most AI agents were still experimental: narrow in scope, manually triggered, and siloed to individuals or teams. Over the past 12 months, that’s changed dramatically. Organizations have moved from exploring AI to expecting measurable impact from their agents.

This shift marks the moment AI moved from helping people do work faster to helping organizations optimize their workflows.

Microsoft Copilot Studio has played a central role in this transition. It gives you more flexibility to evaluate and use the models best suited to your business as agent adoption scales.

In 2025, we laid the groundwork for what scalable, impactful agentic work should look like. In 2026, we believe the organizations that benefit most will be the ones that build on that foundation. These six capabilities define what organizations need to make agent adoption stick in 2026 and beyond:

  1. Ability for anyone to turn intent into agents
  2. Agents that can own workflows from end to end
  3. Power to coordinate agents for real outcomes
  4. Flexibility to control your agent models
  5. Agents that can act across your systems
  6. Capability to scale agents without sacrificing control

Organizations that have all six aren’t just experimenting with agents. They’re operationalizing them, turning curiosity into confidence, and transmuting innovation into sustained business value.

1. Ability for anyone to turn intent into agents

Historically, building an agent meant translating business intent into technical instructions. This process slowed adoption and limited who could participate. In 2025, that barrier fell away. Conversation became the agent-making interface in both Copilot Studio and the Agent Builder in Microsoft 365 Copilot Chat. Now, people can describe what they want done using natural language and create an agent to do it. These agents can interpret intent, context, and goals thanks to their underlying model and knowledge, not specially built code.

That shift is designed to empower everyone on your team to build agents. Sales leaders, operations managers, and human resource (HR) officials no longer need to wait for technical assistance to automate everyday work. Meanwhile, IT teams retain clarity and structure under the hood, with agents grounded in logic that can be reviewed, refined, and governed—all in Copilot Studio.

The results? Faster agent creation, broader participation, and fewer translation gaps between business needs and technical execution.

For example, a sales operations manager can now describe and publish an agent that:

  • Monitors pipeline changes, such as changed estimated close dates.
  • Flags deals that may be at risk, based on predefined criteria (e.g., no activity with stakeholders for over a month).
  • Notifies account owners with recommended next steps based on the type of flag.
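The "predefined criteria" in the list above can be thought of as plain rules over deal data. Here is a minimal Python sketch of that logic; in Copilot Studio the agent would be described in natural language rather than coded, so the `Deal` fields, the flag names, and the 30-day threshold are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Deal:
    name: str
    owner: str
    last_stakeholder_activity: date
    previous_close: date  # estimated close date before the latest change
    current_close: date   # estimated close date now


def risk_flags(deal: Deal, today: date, max_idle: timedelta = timedelta(days=30)) -> list[str]:
    """Return the risk flags that apply to a deal, in a fixed order."""
    flags = []
    if today - deal.last_stakeholder_activity > max_idle:
        flags.append("no_recent_activity")
    if deal.current_close > deal.previous_close:
        flags.append("close_date_slipped")
    return flags
```

Each flag could then map to a different recommended next step in the notification sent to the account owner.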

The payoff: More people can build knowledgeable, context-aware, and helpful agents, which can translate to less bottlenecking on centralized teams and faster time to value.

2. Agents that can own workflows from end to end

For many teams, early adoption wins came from AI assistance: drafting content, summarizing meetings, answering questions. Useful, but incremental. In 2025, agents crossed an important threshold; they evolved from helping with work to handling it on your behalf. With agent flows and the Workflows Agent, agents can now own repeatable processes from end to end, automatically advancing work when required.

In other words, agents unlock new opportunities to streamline and scale how work gets done. An onboarding process no longer stalls due to a missed handoff. A request doesn’t linger in a queue waiting for manual follow-up. Agents move work along reliably with automated approvals, escalating to humans only when judgment is required. For leaders, that can mean faster cycle times and fewer hidden bottlenecks. For teams, it can translate to more time spent on decisions—not coordination.

For example, a company could use Copilot Studio to automate a multi-step process for expense submission, validation, and reimbursement. The process:

  • Triggers when an employee submits a wellness or reimbursement request.
  • Guides the employee through required forms and documentation in a single, user-friendly flow.
  • Validates submissions against global wellness policy rules and regional guidelines.
  • Routes requests across the appropriate software as a service (SaaS) tools and internal HR systems.
  • Escalates exceptions to a human only when needed.
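The validate-then-route steps above amount to a small decision procedure. The sketch below is a toy version under stated assumptions: the `Request` fields, the regional caps in `REGIONAL_CAPS`, and the outcome labels are hypothetical, and a real agent flow would call HR systems rather than return strings.

```python
from dataclasses import dataclass


@dataclass
class Request:
    employee: str
    region: str
    amount: float
    has_receipt: bool


# Hypothetical per-region reimbursement caps.
REGIONAL_CAPS = {"US": 500.0, "EU": 400.0}


def route(request: Request) -> str:
    """Decide the next step for a reimbursement request."""
    if request.region not in REGIONAL_CAPS:
        return "escalate"  # no policy rule covers this region
    if not request.has_receipt:
        return "request_documents"  # ask the employee for missing paperwork
    if request.amount > REGIONAL_CAPS[request.region]:
        return "escalate"  # over the cap, so a human decides
    return "approve"
```

Only the two "escalate" branches involve a person; everything else advances automatically, which matches the idea of escalating exceptions only when needed.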

The payoff: Faster resolutions using consistent criteria, less potential for human error, and a daily pain point made smoother with an agent.

3. Power to coordinate agents for real outcomes

Often, meaningful business outcomes don’t happen in a single step or system. As soon as agents move beyond simple tasks, coordination becomes increasingly challenging. Multi-agent systems addressed this complexity head-on in 2025, allowing agents to specialize, delegate, and collaborate toward shared goals.

Instead of designing one agent to handle every step, organizations can now compose agents that mirror how teams already work. One agent might monitor signals, while another gathers or validates information, and a third prepares recommendations or takes action.

Together, these agents are designed to deliver outcomes that would be difficult for any single agent to manage alone. More importantly, they remove a layer of decision-making from the stakeholder. Instead of figuring out which system or agent holds the right answer, you can simply ask your question and let the agentic system coordinate the rest. Complex workflows become easier to reason about, evolve, and scale without adding mental overhead for the people involved.

For example, a manufacturing company might use:

  • One agent grounded in internal policy and safety documentation.
  • Another agent trained on equipment manuals and training materials.
  • A third agent connected to supplier-provided expertise.
  • A coordinating agent that evaluates each question and routes it to the right source automatically.
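To make the coordinating role concrete, here is a deliberately simple router in Python. A production orchestrator would typically use model reasoning rather than keyword counts; the keyword approach, the agent names, and the `ROUTES` table are hypothetical stand-ins used only to show the routing shape.

```python
# Hypothetical specialist agents and the keywords that suggest each one.
ROUTES = {
    "policy": ("safety", "policy", "compliance"),
    "equipment": ("manual", "calibrate", "maintenance"),
    "supplier": ("supplier", "vendor", "lead time"),
}


def route_question(question: str) -> str:
    """Pick the specialist whose keywords best match, or hand off to a human."""
    text = question.lower()
    scores = {
        agent: sum(keyword in text for keyword in keywords)
        for agent, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "human"
```

The person asking never chooses a destination; the coordinator does, and it falls back to a human when no specialist matches.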

The payoff: Less guesswork about which system or agent to use. Just ask, and the right expertise can come together behind the scenes. This can help keep complex work cohesive, not cobbled together.

4. Flexibility to control your agent models

As agents moved into real business workflows, one reality became clear: not every task has the same requirements or permissions. Some scenarios call for deeper reasoning. Others prioritize repeatability and efficiency at scale. Still others must meet strict regulatory, security, or data residency standards.

In 2025, Copilot Studio expanded model choice to meet those needs. It now supports Anthropic models, chat and reasoning-specific models, access to thousands of models through Microsoft Foundry, and bring-your-own-model options. You can select the right model for each workload while IT teams maintain policy alignment and oversight. This gives your organization flexibility in how agents behave and perform, without fragmenting the experience.

For example, an organization in a regulated field might use:

  • One model optimized for policy interpretation and complex reasoning.
  • Another tuned for cost efficiency in high-volume, repeatable requests.
  • Central governance to ensure each model is applied appropriately.
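The pattern in this list, a model per workload plus a central check, can be sketched as a lookup guarded by an allowlist. The model identifiers, workload names, and `model_for` function below are hypothetical placeholders, not Copilot Studio settings.

```python
# Hypothetical allowlist maintained by a central governance team.
APPROVED_MODELS = {"reasoning-model", "efficient-model"}

# Hypothetical assignment of models to workloads.
WORKLOAD_MODELS = {
    "policy_interpretation": "reasoning-model",
    "high_volume_requests": "efficient-model",
}


def model_for(workload, assignments=WORKLOAD_MODELS, approved=APPROVED_MODELS):
    """Return the model assigned to a workload, refusing unapproved models."""
    model = assignments[workload]
    if model not in approved:
        raise PermissionError(f"{model} is not approved for use")
    return model
```

Keeping the allowlist separate from the assignments lets teams change which model a workload uses without bypassing the governance check.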

The payoff: Instead of compromising between performance and compliance, agents can be configured to match the realities of the work they support—and evolve as those requirements change.

5. Agents that can act across your systems

For years, AI has been good at suggesting what people should do, but it hasn’t been equipped to help make it happen. In 2025, capabilities like Model Context Protocol (MCP) and computer use began to close that gap. Agents can now connect to systems, navigate interfaces, and take action across tools—not just give recommendations.

This addresses one of the biggest gaps in early AI adoption by reducing the handoffs that drastically slow work. When agents can act across environments to update records, trigger workflows, and interact with real systems (like clicking around a website and filling out form fields), work moves forward automatically, at any time of day. This can help reduce delays, manual errors, and the risk that important follow-ups get lost between tools or teams.

For example, an operations agent could autonomously:

  • Identify a supply issue based on predefined signals.
  • Update the system of record with the latest status.
  • Fill out and file a ticket to initiate remediation.
  • Notify relevant stakeholders with context and next steps.

The payoff: Faster response times, fewer handoffs, and agents that operate across real-world systems, not just chat windows.

6. Capability to scale agents without sacrificing control

Widespread agent adoption raises a familiar concern: How do you prevent innovation from outpacing governance? Leaders want to move quickly, but not at the expense of visibility, security, or cost control. In 2025, Copilot Studio addressed that gap by bringing lifecycle management, agent evaluations, and enterprise controls directly into the agent experience.

Organizations can now understand which agents are in use, how they’re performing, and what they cost across environments. Admin controls are designed to align agent behavior with intended use, while agent evaluations support ongoing quality and improvement. Paired with Microsoft Agent 365, organizations get a unified view of agents across Microsoft 365 Copilot and Copilot Studio, giving business and IT leaders the clarity needed to scale with confidence.

For example, IT leaders can:

  • See which agents are used, by whom, and at what cost.
  • Evaluate agent quality and performance over time.
  • Communicate performance insights to business leaders to help increase buy-in, investment, and adoption.
  • Apply consistent governance without slowing innovation.
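As an illustration of "which agents are used, by whom, and at what cost," the sketch below rolls up usage records per agent. The record shape `(agent, user, cost)`, the cost unit, and the `summarize` function are assumptions for this example; actual reporting would come from the product's analytics.

```python
from collections import defaultdict


def summarize(usage):
    """Roll up usage records into distinct users and total cost per agent.

    usage: iterable of (agent, user, cost) records, with cost in any one unit.
    """
    summary = defaultdict(lambda: {"users": set(), "cost": 0.0})
    for agent, user, cost in usage:
        summary[agent]["users"].add(user)
        summary[agent]["cost"] += cost
    return {
        agent: {"users": len(stats["users"]), "cost": round(stats["cost"], 2)}
        for agent, stats in summary.items()
    }
```

A rollup like this gives business and IT leaders a shared starting point for deciding which agents to expand, improve, or retire.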

The payoff: Agents can move from pilots to production faster, with fewer surprises and clearer business impact.

How to turn agentic momentum into results

The question for 2026 isn’t whether agents will be used—it’s how deliberately they’ll be put to work. Over the past year, the foundations for scalable agent adoption came together. The opportunity now is to move from experimentation to widespread execution.

We believe the organizations that get the most value in the year ahead will do three things consistently:

  1. Broaden who builds by empowering business teams to create and refine agents in partnership with IT teams, who provide guardrails without stifling creativity.
  2. Standardize how agents are shared and reused, so successful patterns move beyond individual productivity into team and enterprise workflows.
  3. Measure what matters as a matter of course, using visibility into usage, quality, and cost to guide where agents are expanded, improved, or retired.

When business and IT teams operate from the same foundation, agents stop being side projects and start becoming part of how work happens. That’s how teams move faster, reduce rework, and work together with AI and automation to create true business transformation.

Where to start—and how to go further

Your best agentic year isn’t defined by how many agents you build, but by how many people rely on them to get work done. Copilot Studio gives you the foundation to do exactly that. Now, 2026 is about building out, driving adoption, and scaling up.

Try this three-step plan for building and scaling your agent strategy with Copilot Studio:

  1. Get quick wins. Start by focusing on business-to-employee (B2E) assistive agents. Try downloading the Employee Self-Service Agent from the Agent Store.
  2. Create a Center of Excellence (COE). Set up a central team that can help triage cross-team needs and get the broader organization comfortable with agents. This team could include a representative from every department or be made up of agent champions (regardless of where they sit in the org). A great COE can help reduce geographic silos and bring consistency to an AI strategy.
  3. Measure and reward adoption. What gets measured gets focus and investment. Compare the situation today with the situation post-agent adoption. Did the agent provide value? Has it improved what you set out to change? Prove the progress, and then you can move on to the next process.

Get started today and turn agent curiosity into capability, confidence, and commitment this year.

Why Microsoft Copilot Studio is the foundation for agentic business transformation http://approjects.co.za/?big=en-us/microsoft-copilot/blog/copilot-studio/why-microsoft-copilot-studio-is-the-foundation-for-agentic-business-transformation/ Tue, 18 Nov 2025 16:00:00 +0000 Explore new Microsoft Copilot Studio updates to shape agent behavior, enforce organizational standards, and support agentic business transformation.

The post Why Microsoft Copilot Studio is the foundation for agentic business transformation appeared first on Microsoft Copilot Blog.


Today’s leading organizations are going through an agentic business transformation. This change takes AI from concept to measurable impact by automating existing workflows and using agents to enhance productivity and reinvent entire functions. Copilot Studio, Copilot’s agent platform, provides a fully managed solution for accomplishing this.

Using Copilot Studio, organizations around the world can quickly bring the benefits of AI to their business. Copilot Studio empowers companies to streamline and automate their processes with agentic workflows, create single-purpose agents to solve specific problems, and develop multi-agent solutions that drive measurable business outcomes at scale. The result: a scalable, secure, and governable foundation that supports the needs of IT administrators and business owners measuring return on investment (ROI). This system accelerates agentic transformation by delivering speed-to-value without sacrificing quality or control.

At the same time, with Microsoft 365 Copilot, users can easily use AI to improve their personal and team productivity. This tailored experience for Microsoft 365 Copilot users offers a fast, guided way to set up agents to support your work and automate everyday tasks, removing them from your plate.

Today, we’re excited to share new capabilities in Copilot Studio that support all of these scenarios and groups that use our product, making it easier for makers and administrators to shape agent behavior, enforce organizational standards, and extend functionality with AI.

End-user improvements

Our Copilot Studio experience for building agents and workflows, along with the agent-building capabilities in Microsoft 365 Copilot, continues to support agent creation for all users, from professional makers and IT administrators driving enterprise AI transformation to employees building agents and workflows for their personal use. Recent updates focus on making the process simpler and more efficient.

What’s new in Microsoft 365 Copilot

  • Redesigned creation experience: Build and refine agents through an improved conversational interface that guides users and taps into an expanded set of work-related knowledge sources.
  • File generation with natural language: Agents built in Microsoft 365 Copilot can now create Word, Excel, and PowerPoint files in seconds using natural language commands.
  • Seamless upgrade path: Copy agents from Microsoft 365 Copilot to Copilot Studio in one click, unlocking advanced AI agent customization.
  • Workflows agent in Microsoft 365 Copilot: Create, build, and manage workflows using natural language in chat. Boost productivity with quick scenarios like daily triage, weekly digests, and lightweight approvals, all directly within Copilot.

Microsoft Copilot Studio shows a user creating an agent named ‘Project Horizon Tracker’ with options to add tools, sources, and configure capabilities while uploading work content for the agent to access.

Maker improvements

IT application developers and other professional makers in the business can already build sophisticated agents in Copilot Studio without needing to code. Copilot Studio includes capabilities such as connecting and acting across more than 1,400 systems of record via Model Context Protocol (MCP), Power Platform connectors, and the Microsoft Graph. It also includes broad and deep tooling like autonomously writing and executing code, delivering rich out-of-the-box agent analytics and ROI measurement, and more, all built on the Microsoft governance and security platform. We’re excited to share new capabilities that give makers even more flexibility and control to design enterprise agents tailored to their unique organizational needs.

  • Choose your own model: Select from leading options like OpenAI’s GPT‑5 and Anthropic’s Sonnet 4.5 and Opus 4.1 to power your agents. This empowers you to tailor agent intelligence to fit your specific business scenario, optimize performance, experiment with new capabilities, and deliver agents that meet your organization’s unique needs.
  • Ensure agents are ready for launch, and don’t regress over time, with Evaluations: Built-in evaluation tools help you test agents against real-world scenarios, compare versions, and track performance with clear metrics. Evaluations can give teams greater confidence that their investments are performing as expected.
  • Computer use: Agents can now automate tasks across apps and websites, using secure Windows 365 experiences—from hosted browsers for quick web automation to IT-managed Cloud PC pools for rapid scalability.

Admin improvements

As agents become central to automating work and transforming workflows, Copilot Studio is introducing new governance and protection capabilities designed to help organizations maintain strong oversight.

  • Expanded agent analytics: Clear insights into connected and child agent performance, detailed visibility into Copilot Credits consumption and limits, AI-generated summaries of top analytics insights, and the ability to query analytics using natural language.
  • Real-time protection: Copilot Studio integrates with Microsoft Defender and other trusted security platforms, providing continuous monitoring and protection against threats like prompt injection—helping every agent run more safely.
  • Microsoft Entra Agent ID: Every agent made in Copilot Studio now gets a unique Microsoft Entra Agent ID, making it simple to register, manage, and govern your entire agent fleet.

Agent 365 and Copilot Studio: Unified control for agents

Agents are handling more responsibilities across enterprise operations, and Copilot Studio is your launchpad for building them. With the introduction of Agent 365, the control plane for agents, the rich governance and management capabilities we offer today (including sharing controls, advanced connector policies, agent inventory, zoned environment management, and more) will also be surfaced in the Agent 365 platform for agents built in Copilot Studio.

Additionally, in Copilot Studio, makers can now build agents that use the new Agent 365 MCP servers. These servers allow agents to schedule meetings in Microsoft Teams, draft documents in Word, send emails in Outlook, and update customer relationship management (CRM) records in Microsoft Dynamics 365. This supports delivery of intelligent, compliant workflows and agents with built-in audit trails and granular policy enforcement—all from one platform.

Agent 365 is available starting today in Microsoft 365 Admin Center with Frontier, Microsoft’s early access program for the latest AI innovations.

Scale to the Frontier Firm with control

True transformation happens when agents are built for scale, governed for compliance, and measured for impact. Copilot Studio delivers that foundation, so organizations can build enterprise multi-agent systems, automate workflows with precision, and reimagine processes while minimizing risk.

EY’s results show what’s possible when you invest in a comprehensive agent platform, built on Microsoft. They are just one of many enterprise organizations implementing agents with Copilot Studio. In this case, their PowerPost Agent built on Copilot Studio led to major improvements in journal processing:

  • 95% reduction in lead time
  • 37% cost savings [1]

That’s the difference between cobbling together siloed agent platforms and investing in a managed, scalable agent platform like Copilot Studio: agents and agentic process design that are repeatable, auditable, and scalable.

Get started today

To learn more about Copilot Studio and how it can transform your organization’s productivity, visit the Copilot Studio website and sign up for a free trial today. Take the Agent Readiness Assessment to benchmark your organization’s agent maturity across five critical areas—strategy, data, process, culture, and security—and get a personalized report to accelerate scalable agent adoption and drive agentic business transformation.

Want to explore all of Copilot Studio’s adoption content? Visit the Copilot Studio adoption page.


[1] EY redesigns its global finance process with Microsoft Power Platform
