Governing AI agents at scale: Lessons from our journey at Microsoft
Inside Track Blog (How Microsoft does IT), Thu, 21 May 2026
http://approjects.co.za/?big=insidetrack/blog/governing-ai-agents-at-scale-lessons-from-our-journey-at-microsoft/

The post Governing AI agents at scale: Lessons from our journey at Microsoft appeared first on Inside Track Blog.


Empowering employees and protecting your organization through agent governance

Welcome to the agentic frontier

Agents are expanding the frontier of enterprise AI. By creating tools that surface knowledge, take actions, and even reinvent workflows, organizations can apply the power of AI to business processes in new and innovative ways.

But this shift raises questions for business and IT leaders: How do you get the benefits of agents without putting your organization and employees at risk? How do you encourage citizen developers to create agents freely while maintaining control, security, privacy, and compliance?

At Microsoft Digital, the company’s IT organization, we’re putting practical governance structures in place to ensure our internal agents are useful, safe, and properly scoped. Through a deliberate strategy of empowerment with established guardrails, we’re unlocking the potential of agentic transformation while maintaining the trust that defines our work.

The AI maturity model and frontier transformation

Agentic AI has made a new operational model possible, one that blends machine intelligence with human judgment, creating AI-operated, human-led teams.

We call organizations that enact this model Frontier Firms.

As organizations move toward this new operational state, they progress from foundational AI assistance through escalating levels of agentic maturity and complexity. First, humans operate with help from an AI assistant like Microsoft 365 Copilot. Then, human-agent teams work together. But the future lies with humans leading teams of agent users: AI agents that perform core labor with relative autonomy.

Pattern 1: Human with assistant—every employee has an AI assistant that helps them work better and faster.
Pattern 2: Human-agent teams—agents join teams as “digital colleagues,” taking on specific tasks at human direction.
Pattern 3: Human-led, agent-operated—humans set direction, and agents execute business processes and workflows, checking in as needed.

Capturing the benefits of this model relies on many factors, but in our experience at Microsoft Digital, two tenets are instrumental to a successful transformation:

  1. Empowering employees and teams to create and experiment with their own agents
  2. Properly governing those agents to protect the enterprise

It’s a balance. If you set agent builders free without the proper guardrails, you risk data overexposure, agent sprawl, and security vulnerabilities. However, being too restrictive about governance stifles individual imagination, workflow reinvention, and innovation that can come from agentic AI.


We’re here to help you find the right balance for your organization.

This guide shares what we’ve learned along the way. As you read, you’ll follow our journey as Customer Zero at Microsoft, and you’ll gain access to tips and resources that we’ve assembled to help you apply our expertise to your own agent governance practice.

Every organization is different, and your experience will differ from ours in terms of risk tolerance, technical capability, resourcing, and more. This guide highlights some principles and best practices you can apply to your own business context, needs, and objectives.

“At Microsoft, we’ve moved beyond envisioning the agentic future into operating within it every day,” says Brian Fielder, vice president of Microsoft Digital. “Our experience as Customer Zero gives us a unique perspective on what it takes to govern AI agents at scale, turning early lessons into proven practices that help organizations innovate with confidence.”

Now is the time to seize this opportunity. Follow along to start your own journey toward frontier transformation and capture the benefits of trusted, connected agentic intelligence.

Learn from our experience governing agents

Within Microsoft Digital, we’ve been acting as Customer Zero for frontier transformation by creating the tools, infrastructure, and processes that power agents at Microsoft.

Our goal is to make it easy for employees to engage with agentic tools freely and adaptably while maintaining safety and responsibility. The path to this objective relies on a three-pronged approach to governance:

  • Embedded governance functionality: Agent creation and publishing tools should incorporate good guidance, governance, and guardrails out of the box, making the agents people create essentially self-governing.
  • IT oversight: This is a new space and a new way of working, so it isn’t feasible for all agents to self-govern at this point. As an IT organization, we fill gaps in governance through reviews and oversight. We establish risk-based policies around types of agents, exposure and sharing, and other pivots.
  • User education: It’s almost impossible to predict every governance gap and need, so educating our users helps them avoid accidentally increasing risk. Our Agents at Microsoft team and individual change managers are the guides for these efforts. Employees can also refer to resources like Microsoft Learn courses and the Agent Builders SharePoint hub.

Throughout this journey, we’ve empowered our employees to create all kinds of agents, ranging from simple personal tools built by people working in every function, with every level of technical skill, all the way to AI-powered enterprise tools designed by professional developers for use across lines of business and even the entire company.

As part of the process, we’ve incorporated guardrails to ensure less technical employees are limited to tools that simply retrieve enterprise knowledge, such as SharePoint agent builder or Copilot Studio Agent Builder, while software engineers get the full power of any tool they need that can take action or automate workflows, including Microsoft Foundry and Microsoft 365 Agent Toolkit.

SharePoint

  • Lowest level of difficulty
  • For all roles
  • Function: information-retrieval only
  • Microsoft 365 content
  • Light governance
  • Lowest risk

Copilot Studio Agent Builder

  • Low difficulty
  • For all roles
  • Function: information-retrieval only
  • Microsoft 365 content and web sources
  • Light governance
  • Low risk

Copilot Studio (full)

  • Low to moderate difficulty
  • For all roles
  • Function: task completion
  • Microsoft 365 content + connectors to external channels
  • Advanced governance
  • Higher potential for risk

Agent Toolkit, Foundry

  • Highest difficulty
  • For developers
  • Function: workflow automation
  • Multiple internal and external channels
  • Advanced governance
  • Highest potential for risk
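
To make the tiers concrete, here is a minimal sketch that captures them as a lookup table. It is an illustration only, not how we implement this internally: the `ToolTier` class, its field names, and the `tools_for` helper are hypothetical, and the values are transcribed from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolTier:
    name: str
    difficulty: str
    audience: str      # "all roles" or "developers"
    function: str
    governance: str
    risk: str


# Values transcribed from the tier lists above; the structure itself is illustrative.
TOOL_TIERS = [
    ToolTier("SharePoint", "lowest", "all roles", "information retrieval", "light", "lowest"),
    ToolTier("Copilot Studio Agent Builder", "low", "all roles", "information retrieval", "light", "low"),
    ToolTier("Copilot Studio (full)", "low to moderate", "all roles", "task completion", "advanced", "higher"),
    ToolTier("Agent Toolkit, Foundry", "highest", "developers", "workflow automation", "advanced", "highest"),
]


def tools_for(is_developer: bool) -> list[str]:
    """Return the tier names open to a builder, based only on the audience field."""
    return [
        tier.name
        for tier in TOOL_TIERS
        if tier.audience == "all roles" or is_developer
    ]
```

Keeping the tiers as data rather than scattered conditionals makes it easier to revise them as tools and policies change.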

Over the course of this journey, we’ve learned valuable lessons about effective agent governance, including:

  • How to build an impactful but flexible governance strategy
  • Strategies for creating an AI-ready data ecosystem
  • Ways to apply appropriate policies and controls for highly diverse agents
  • Approaches for tracking the impact and value of agents

Chapter 1: Building your agent governance strategy

Thinking through your organizational needs and building a framework to govern agents

As we’ve incorporated agents into different aspects of our organization, we’ve also deepened their involvement in employees’ daily workflows and core business processes. Because of this, we’re diligent about the governance guardrails and policies that protect our organization.

We’ve accumulated a wealth of knowledge and insights in this area through our efforts governing Microsoft 365 Copilot. Based on this experience, our key priorities included:

  • Effectively applying controls to ensure users and apps don’t get access to privileged information
  • Preventing employees from creating agents that violate company policies
  • Balancing the freedom for employees to share their creations with the need to prevent agent sprawl
  • Delineating which agents are authoritative and applicable for enterprise functions and which ones are meant for employees’ own personal use
  • Inventorying agents to provide lifecycle management
  • Securing and protecting confidential data while respecting our responsible AI principles: Fairness, reliability and safety, privacy and security, transparency, accountability, and inclusiveness
  • Unlocking telemetry that enables us to govern agents effectively

By focusing on each of these dimensions, our governance team has centered its efforts on the value these agents provide to the company while also ensuring organizational safety and trust. To realize this value, we emphasize three key principles that help protect both our employees and the organization:

Security

We’ve established standards for data classification, policies for handling confidential information, and other security measures to protect data from unauthorized access, misuse, and disclosures. Microsoft Purview powers these capabilities through data labeling, rights management, and data loss prevention.

Privacy

Privacy compliance measures keep personal data protected and ensure agents adhere to regulatory frameworks in the regions where we operate. We conduct regular privacy assessments for all applications, including high-impact agents.

Regulation

Regulatory compliance assessments ensure agents meet prevailing legal standards. Our legal and compliance teams carefully monitor AI guidelines, regulations, and laws as they evolve so we can understand and incorporate them into these assessments.

We incorporated elements of our tenant’s minimum bar for governance into how we secure agents. Those include Microsoft Purview Information Protection, a functional inventory, activity logging, lifecycle management, and the ability to properly isolate agents so that they don’t cross data boundaries.

Our overarching tenant governance strategy is to govern items like documents and data at the container level. However, within a SharePoint site, for example, the added functionality of agents demands that we introduce further controls like sharing limits, breadth of knowledge sources, agent metadata, and information about an agent’s behaviors.

Turning priorities into principles

To operationalize governance, we developed six principles that guide our approach to agents. They form the governance foundation for a wide matrix of agent creation and usage opportunities.

  1. We ensure a strong data hygiene foundation so we can trust our data estate as employees build and use agents.
  2. We empower employees to build personal agents that can access permitted services and data sources to help automate and accelerate their tasks.
  3. We empower teams and lines of business to build agents with known lower-risk patterns to accelerate impact.
  4. We provide a smooth release path for engineering teams to develop agents designed for enterprise functions so they can access all the services and sources they need. This includes the same software development lifecycle (SDLC) reviews and certifications as other enterprise software, which we outline in Chapter 3.
  5. We accelerate innovation through agent and automation templates while maintaining an AI Center of Excellence (CoE) to help teams think through their opportunities.
  6. We reimagine employee experiences and task execution to simplify and optimize productivity.

Securing control through agent lifecycles

As we strategized to operationalize good governance, agent lifecycles became one of our most crucial tools. We superimposed the enterprise lifecycle on top of these policies, with both user-based and attestation-based lifecycles.

This means we treat agents owned by individual employees like any other user app and delete them when their owners leave the organization. Meanwhile, we ensure that agents owned by teams have a lifecycle that’s defined by the tenant and tied to attestation, our internal enterprise SDLC, and accountability confirmations.

This approach helps us combat sprawl by eliminating agents that no longer serve a purpose. It provides a solid foundation for more fine-tuned, matrixed policies and practices.
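
The two lifecycle types can be sketched as a single retirement check. This is a simplified illustration under stated assumptions: the `Agent` class, the `should_retire` function, and the 180-day attestation window are all hypothetical, and the actual interval is whatever your tenant policy defines.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical attestation window; the actual interval is a tenant policy decision.
ATTESTATION_WINDOW = timedelta(days=180)


@dataclass
class Agent:
    name: str
    owner_type: str                       # "individual" or "team"
    owner_active: bool = True             # False once the owning employee has left
    last_attested: Optional[date] = None  # used for team-owned agents only


def should_retire(agent: Agent, today: date) -> bool:
    """Decide whether an agent has reached the end of its lifecycle."""
    if agent.owner_type == "individual":
        # User-based lifecycle: the agent goes when its owner leaves.
        return not agent.owner_active
    # Attestation-based lifecycle: retire if never attested or if attestation lapsed.
    if agent.last_attested is None:
        return True
    return today - agent.last_attested > ATTESTATION_WINDOW
```

Running a check like this on a schedule across the agent inventory is one way to keep sprawl from accumulating unnoticed.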

Governing amid real-time technology acceleration

One recent development illustrates how the rapid advancement of AI technology requires us to stay ahead of policy for new features.

Model Context Protocol (MCP) adds new capabilities, but also new risks and challenges. It’s a simple standard that lets AI systems communicate with the right tools and data without custom integration work. Instead of building a new connection or API every time, teams plug into a common pattern.

That standardization delivers speed and flexibility, but it also changes the security equation. We’ve extended our security and governance practices to account for MCP servers.

Our practices and policies help us govern agents effectively in this new environment. First, we assess security across four layers: Applications and agents, the AI platform, data, and infrastructure. We establish a secure-by-default strategy by positioning every remote MCP server behind our API gateway and establishing practices for vetting, identity management, automation that slows agents at the right moments, context trimming, and server isolation.
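
As a rough illustration of what secure-by-default can mean for MCP, the sketch below denies any remote endpoint that isn't routed through a gateway and on a vetted list. Everything named here is an assumption for the example: the gateway host, the server names, the URL shape, and the `is_allowed_mcp_endpoint` function are hypothetical, not real Microsoft endpoints or APIs.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration; not real endpoints.
GATEWAY_HOST = "api-gateway.example.com"
VETTED_SERVERS = {"hr-knowledge", "ticketing"}


def is_allowed_mcp_endpoint(url: str) -> bool:
    """Allow a remote MCP endpoint only if it sits behind the gateway and is vetted.

    Assumed URL shape: https://<gateway-host>/mcp/<server-name>/...
    Anything else is denied, which is what makes the policy secure by default.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname != GATEWAY_HOST:
        return False
    parts = [p for p in parsed.path.split("/") if p]
    return len(parts) >= 2 and parts[0] == "mcp" and parts[1] in VETTED_SERVERS
```

The design choice worth noting is the default: an unknown server is rejected until someone vets it, rather than accepted until someone blocks it.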

As you define policies for governing your own agentic ecosystem, you can take inspiration from our process. Start by asking questions about what you want to accomplish and what you want to protect, then move on to establishing your most important priorities. From there, you can cement those priorities into policies.

Learning from our approach to agent governance strategy

Match policies to progress on your AI journey

The complexity of agent governance depends on the maturity of your organization and where you are in your adoption journey. Start slowly to let that maturity grow over time.

A strong policy framework is the foundation

Lean on existing app governance policies, then layer agent-specific structures on top.

Take your cues from established standards

Global regulations around privacy, security, and responsible AI provide a good baseline for establishing governance policies. Assign teams to work through these regulations and incorporate their insights into your agent governance strategy.

Decide on your comfort level with risk

Bring cross-disciplinary experts together from across your organization to determine what level of risk is acceptable for different agents and their use cases. Put guardrails in place for low-risk scenarios and establish processes for supporting more complex or sensitive use cases. Evaluate what data sources agents can extract information from. Establish whether users have shared sensitive data sources.

Change is constant

Plan to reassess and revise your governance structure regularly. Agents are evolving rapidly, as is the tooling surrounding them, so maintaining good governance policies will be an ongoing practice.

Governance is a value driver for employees

Governance isn’t just about protecting your organization. It also provides the right patterns to make sure your employees are getting value from agents. Establish strong measures of business value and a robust methodology for management and assessment of agents through ongoing tracking. This kind of observation and telemetry is foundational and should be a key part of your governance efforts.

Key takeaways

Use these tips based on what we learned here at Microsoft to build your strategy for agent governance at your company:

  • Establish a cross-disciplinary agent Center of Excellence. Bring together stakeholders across the organization to define priorities, goals, and shared practices for agent adoption.
  • Right-size oversight based on risk. Determine your organization’s risk tolerance and define which agents require more or less involvement from IT, security, and compliance teams.
  • Operationalize agent oversight and management. Establish an oversight model and implement tools that help manage agents at scale.
  • Establish change management and adoption. Determine and implement a strategy for driving adoption to educate and empower employees.
  • Create a centralized governance and information hub. Provide employees and agent builders with a single place to find guidance, standards, and governance information.


Chapter 2: Establishing a solid data foundation for agent governance

Setting agents up for success using a secure, robust data foundation

Operating according to an escalating maturity model means we’ve done the foundational work to secure and govern our data estate for Microsoft 365 Copilot. Many of the same principles apply to agents, with the added complexity of incorporating additional data sources.

To lead these efforts, we established a cross-functional team of data professionals within our AI CoE. This team consists mostly of Microsoft Digital employees who support corporate functions like Corporate, External, and Legal Affairs (CELA) and Global Workplace Services. Together with our AI CoE, this team helped us define what it means to have AI-ready data.

In essence, AI-ready data just means information we’ve certified for AI workloads. We certify those data sources using Microsoft Purview to identify defects in our core data products, and we’ve also built AI-powered assessments to certify which data lakes are AI-ready.

In most ways, governance is tool-agnostic and rooted in basic principles. With robust data labeling, data hygiene, and permissions in place alongside our AI tools, which respect labels by default, we can confidently give every employee the ability to build basic agents and trust in our governance guardrails. For decades, the challenge for data analysts and engineers has been maintaining a consistently reliable source of truth despite inconsistent data quality, insufficient governance, and years of collecting data in silos. Microsoft Fabric and Microsoft Purview can help resolve these issues.

We’re embracing a more balanced, federated approach to data management today. We call this approach a data mesh. Rather than allowing unchecked decentralization or forcing all our data into a single centralized system, the data mesh formalizes domain ownership while embedding governance, quality, and interoperability directly into shared platforms.

Graphic shows our data mesh architecture surrounded by the platform services layer and the data management zones layer.
Our data mesh architecture helps us preserve trust and establish a strong governance foundation while preventing data from becoming siloed.

The data mesh connects and distributes data products across domains, enabling shared data access and compute while scaling beyond centralized architectures.

Platform services are standardized blueprints that embed security, interoperability, policies, standards, and core capabilities, providing guardrails that enable speed without fragmentation.

Data management zones provide centralized governance capabilities for policy enforcement, lineage, observability, compliance, and enterprise-wide trust.

With this approach, our domain teams publish data as well-defined, discoverable products, while common standards for security, metadata, and compliance are enforced through automation rather than manual processes. This model preserves enterprise trust and consistency without sacrificing speed or autonomy. By adopting a data mesh mindset, we can scale analytics and AI more effectively across the organization while still keeping ownership closely connected to the business focus.

Confidentiality labels, the practical framework for data protection

To operate according to Zero Trust principles, we needed a coherent system that lets us see, label, and protect data. Otherwise, the burden of data loss prevention would fall solely on employees, who would have to exercise individual discretion whenever they decided how to house and share potentially sensitive content.

With labeling, it’s important to strike a balance between the depth needed to support an array of data governance controls and the simplicity needed to ensure labeling isn’t burdensome for users.

We decided on four overarching labels for container and file classification, each with its own sub-labels. The highest-level schema looks like this:

  1. Highly confidential: We only share our most critical data with named recipients.
  2. Confidential: Any items crucial to achieving our goals feature limited distribution.
  3. General: Employees can share daily work, like personal settings and postal codes, internally throughout Microsoft.
  4. Public: We share unrestricted data meant for public consumption freely. That includes information like publicly released source code and openly announced financials.

For our risk tolerance and organizational needs, we made the decision to protect data designated confidential or higher. As a result, we contain data flows to their tenants and only trust suitable storage destinations for content. That suitability depends on a storage location’s ability to gate which connectors can work with particular source data and sensitivity labels.
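
Because the four labels form an ordered scale, the "confidential or higher" rule reduces to a simple comparison. The sketch below is a minimal illustration; the `Sensitivity` enum and `requires_protection` function are hypothetical names, and sub-labels are left out.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four top-level labels, ordered from least to most sensitive."""
    PUBLIC = 1
    GENERAL = 2
    CONFIDENTIAL = 3
    HIGHLY_CONFIDENTIAL = 4


# Threshold from the text: data labeled confidential or higher is protected.
PROTECTION_THRESHOLD = Sensitivity.CONFIDENTIAL


def requires_protection(label: Sensitivity) -> bool:
    """Return True when a label meets or exceeds the protection threshold."""
    return label >= PROTECTION_THRESHOLD
```

An organization with a different risk tolerance would only need to move `PROTECTION_THRESHOLD`, which is one reason an ordered schema is easier to govern than a flat set of tags.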

The administrators responsible for workspaces like SharePoint sites set default labels. These labels serve as a foundation for appropriate access and circulation for objects within those containers, which takes the burden of labeling off employees. The sensitivity labels that administrators apply map to several different categories of policies that can anticipate and help to mitigate data loss and risk.

They communicate four key areas:

  1. Breadth of availability: Labels determine whether the workspace is broadly available internally or is a private site.
  2. External permissions: We administer guest allowance via the group’s classification, allowing specified partners to access teams when appropriate.
  3. Sharing guidelines: We tie important governance policies to the container’s label. For example, can an employee share this workspace outside of Microsoft? Is this group limited to a specific division or team? Is it restricted to specific people? The label establishes these rules.
  4. Conditional access: While we haven’t implemented this policy at Microsoft, tying identity and device verification to container labels can introduce additional governance controls.

Within Microsoft Digital, we’ve put a lot of thought into how each of our labels aligns with relevant policies. You can see more of the logic behind our sensitivity labels and their policies in this graphic:

A chart shows the different types of data container labels and what level of access is given for each one.
Our Microsoft Digital schema clearly lays out what each container sensitivity label means and how it affects content.

If a container owner needs different policies for a set of files to provide greater external access, they can self-service new groups without accidentally violating our governance practices.

At Microsoft, we use Microsoft Purview, which is our suite of data estate management tools, but you can use your tool of choice to apply labels in your environment. Microsoft tools will respect them. Microsoft Purview helps us accomplish three important tasks: mapping our labeling structure onto the relevant policies, verifying them against our standards, and backstopping self-service data loss prevention practices through automation.

Automation is particularly useful. We’ve configured Microsoft Purview Information Protection to scan automatically for wayward credentials, malicious user behaviors, and other sensitive information in items without the proper protections. When Purview detects a violation, our governance team receives alerts that prompt them to contain the risk by upgrading an item’s sensitivity label or requiring employees to remedy the issue.
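
Purview handles this detection for us, so the following is not its API or its logic. It is only a toy sketch of the underlying idea: flag items that contain credential-like text but lack a protective label. The patterns, the label set, and the `needs_alert` function are hypothetical and far simpler than a production scanner.

```python
import re

# Deliberately simple, hypothetical patterns; real scanners use far richer detection.
CREDENTIAL_PATTERNS = [
    re.compile(r"(?i)\b(password|passwd|pwd)\s*[:=]\s*\S+"),
    re.compile(r"(?i)\b(api[_-]?key|secret)\s*[:=]\s*\S+"),
]

# Labels treated as already providing protection, per the schema above.
PROTECTED_LABELS = {"Confidential", "Highly confidential"}


def needs_alert(text: str, label: str) -> bool:
    """Flag an item that contains credential-like text but lacks a protective label."""
    if label in PROTECTED_LABELS:
        return False
    return any(pattern.search(text) for pattern in CREDENTIAL_PATTERNS)
```

In a real workflow, a positive result would route to the governance team, who decide whether to upgrade the label or ask the owner to remove the content.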

The result is a system that allows flexibility for employees to self-manage their digital workspaces while providing guardrails that help our governance experts take appropriate actions without overtaxing their time and resources.

Our approach within Microsoft Digital is just one way to create an AI-ready data estate, but aspects of our story will hold true for almost any organization. Consider establishing a body to take over responsibility for AI-ready data, developing your primary goals for AI-ready data, unifying your data estate, and implementing a system of confidentiality labels.

Learning from our approach to agent governance strategy

Define the responsibility for AI-ready data

Identify and assign enterprise data owners to implement and oversee the processes that guarantee data quality.

Create intuitive labels

Your employees will be the ones applying labels, so make those labels intuitive. For example, “highly confidential” is easy to understand, while “business-critical” could be interpreted in many ways from a sensitivity standpoint.

Don’t overwhelm your users

Make labeling simple and intuitive to ensure it isn’t overwhelming. Employees should have a limited set of choices to keep things comprehensible.

Use existing defaults

Identify the security needs and regulatory compliance that are specific to your organization and use built-in governance controls available through Microsoft tools.

Key takeaways

You can use these tips based on what we learned here at Microsoft to tackle agent governance at your company:

  • Establish a cross-functional data council. Form a data council to help promote a culture of AI-ready data with professionals from all relevant disciplines, including human resources, legal, security, IT, and anyone else who can share relevant expertise.
  • Certify datasets for AI workloads. Limit agents to datasets that have been certified as “AI-ready” to minimize hallucinations and reasoning errors.
  • Define your labeling parameters. Keep to no more than five main labels with no more than five sub-labels each. The fewer you use, the better.
  • Align your sensitivity labels with policies. Consider how your labels line up with breadth of availability, external permissions, sharing guidelines, and conditional access.


Chapter 3: A matrixed approach to agent governance

Governing different types of agents for different contexts, built with different toolsets

Our customers have expressed a strong desire to start building agents, but they’re concerned about where to begin and how to manage those agents once they’re built. They worry about persistent problems such as hallucinations and agent sprawl. These concerns are especially pronounced on IT teams.

During our Customer Zero journey, we’ve learned that the diversity of agent types and creation methods means there’s no one-size-fits-all approach to governance. Generalized approaches will only get you so far.

We’ve found it helpful to think about different kinds of agents along an escalating spectrum of development complexity:

The Microsoft Digital agent controls model, spanning citizen, partnered, and professional development models and their relevant tools.
The agent controls model we’ve developed at Microsoft Digital spans different agent-building methods for different kinds of creators using a spectrum of tools.

There’s an entire matrix of different parameters that apply to an agent at any level of this spectrum, and they all require different policies. Those parameters include:

  • Level of reach: Personal agents, limited sharing (like development environments or team boundaries), or enterprise-wide distribution
  • Agent-building tool: SharePoint agent builder, Agent Builder in Microsoft 365 Copilot, Microsoft Copilot Studio, or tools geared to more professional developers (such as Microsoft Foundry or Microsoft 365 Agent Toolkit)
  • Knowledge sources and content accuracy: Public sites, SharePoint and OneDrive, directly uploaded files, enterprise apps and systems, or third-party knowledge bases

An overview of the range of agent-building tools and our matrixed approach to governing them across different parameters.
Our matrixed approach to agent creation and governance spans a wide array of tools, knowledge sources, actions, channels, and more.

Each of these parameters creates a pivot that we need to govern, and we’ve carefully assembled a set of policies and controls to account for them. As our understanding and use of agents advances, we’re continually updating how we match their characteristics and capabilities with relevant policies and any applicable reviews.

Within Microsoft Digital, we’ve adopted a risk-based approach that helps us establish a matrixed model for agent governance. The foundational idea is that we identify potential harms for each kind of agent, then assign policies for the level of review and oversight they require.

For example, simple agents that can only read and present data tend to be low risk. Because their access is tied to their creators’ identities and access, our data governance structures and guardrails can prevent overexposure. But for agents that have capabilities like writing data, taking action, or creating items, more reviews are necessary.
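
The read-versus-write distinction can be sketched as a small classification function. This is an illustration under assumptions, not our actual review engine: the capability strings, the tier names, and the `review_level` function are hypothetical.

```python
# Capabilities the text names as needing more review; the identifiers are hypothetical.
ELEVATED_CAPABILITIES = {"write_data", "take_action", "create_items"}


def review_level(capabilities: set[str]) -> str:
    """Map an agent's capabilities to a review tier.

    Read-only agents inherit their creator's access, so existing data
    governance covers them; anything that can change state gets more review.
    """
    if capabilities & ELEVATED_CAPABILITIES:
        return "additional-review"
    return "no-review"
```

A real matrix would also pivot on reach, tool, and knowledge sources, but the principle is the same: identify the potential harm first, then assign the level of oversight.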

A matrix of agent governance policies, pivoted by parameter

The following matrix enumerates the factors that determine how we govern different kinds of agents created using different tools. This matrix helps our employees understand the agent creation process and helps us maintain safety and control.

SharePoint agent builder

What users can build: Knowledge-only agents
These agents reason over Microsoft 365 Copilot collaboration data, and they’re gated to the SharePoint environment where they’re created.

Technical proficiency: No-code

Knowledge sources: SharePoint, custom instructions

Capabilities: Not applicable

Actions and plug-ins: Not applicable

Sharing and publishing: Copilot navigation in SharePoint, sharing by link, sharing in Microsoft Teams chat

Custom engine or bring-your-own model: Not applicable

Reviews: No review needed
IT doesn’t gate knowledge-only agents outside of governance tied to SharePoint sites. Microsoft Digital honors reactive take-down requests like any other self-service construct, but does not provide proactive gating.

Agent Builder in Microsoft 365 Copilot

What users can build: Knowledge-only agents
These agents feature graph connectors from a preapproved catalog to expose additional data.

Technical proficiency: No-code

Knowledge sources: SharePoint, external websites, custom instructions, additional internal knowledge sources via graph connectors

Capabilities: Code interpreter, image generator

Actions and plug-ins: Not applicable

Sharing and publishing: Individual use, sharing by link

Custom engine or bring-your-own model: Not applicable

Reviews: No review necessary
These agents only access graph data available in Copilot. Microsoft Digital honors reactive take-down requests like any other self-service construct, but does not provide proactive gating.

Microsoft Copilot Studio

What users can build: Task and custom agents
These agents connect to more systems through connectors and orchestration logic to handle more complex scenarios. We might publish agents at this level of complexity and utility to our agent catalog for wide organizational use.

Technical proficiency: Low-code or pro-code

Knowledge sources: SharePoint, external websites, custom instructions, additional internal knowledge sources via advanced graph connectors, Power Platform connectors

Capabilities: Not applicable

Actions and plug-ins:
Retrieval and task agents: Read-only actions
Custom agents: Read or write actions using Power Platform connectors

Sharing and publishing:
Retrieval or task agents in a personal developer environment: Sharing by link with up to 10 people
Custom agents: Publishing to 10 people or the agent catalog in Microsoft 365 Copilot Chat
Broad publishing: Requires a review similar to professionally developed apps, including an understanding of the agent’s data implications

Custom engine or bring-your-own model: Custom Azure OpenAI large language models (LLMs)

Reviews: Custom agents for our catalog require reviews for security, privacy, accessibility, responsible AI, and an environment-specific maker stack review.

Microsoft Foundry

What users can build: Retrieval, task, and custom agents
These agents may or may not connect to more systems through connectors and orchestration logic to handle more complex scenarios. We might publish agents produced at this level of complexity and utility as Microsoft Teams apps or to our agent catalog for wide organizational use.

Technical proficiency: Pro-code

Knowledge sources: SharePoint, external websites, custom instructions, additional internal knowledge sources via graph connectors

Capabilities: Code interpreter, image generator, Teams chats and channels

Actions and plug-ins: API actions

Sharing and publishing: Publishing as an app in Teams or as an agent in the catalog in Copilot Chat

Custom engine or bring-your-own model: Custom Azure OpenAI large language models (LLMs)

Reviews: Custom agents for publishing as a Teams app or in our catalog require reviews for security, privacy, accessibility, responsible AI, and an environment-specific maker stack review.

In addition to mapping out our policies for governing agents, the matrix illustrates how we see their relative utility across the organization. It demonstrates an escalation from personally useful to organizationally useful agents. Their governance policies and controls escalate accordingly.
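
One way to make a matrix like this usable by tooling is to encode it as data. The sketch below transcribes the proficiency, agent type, and review rows from the matrix above into a Python dictionary. The dictionary layout, key names, and helper functions are illustrative assumptions, and the listed reviews apply to agents published broadly (to the catalog or as a Teams app).

```python
# Reviews the matrix lists for broadly published custom agents.
BROAD_PUBLISHING_REVIEWS = [
    "security",
    "privacy",
    "accessibility",
    "responsible AI",
    "environment-specific maker stack",
]

# Values come from the matrix; the structure and key names are assumptions.
GOVERNANCE_MATRIX = {
    "SharePoint agent builder": {
        "proficiency": "No-code",
        "builds": ["knowledge-only"],
        "reviews": [],
    },
    "Agent Builder in Microsoft 365 Copilot": {
        "proficiency": "No-code",
        "builds": ["knowledge-only"],
        "reviews": [],
    },
    "Microsoft Copilot Studio": {
        "proficiency": "Low-code or pro-code",
        "builds": ["task", "custom"],
        "reviews": BROAD_PUBLISHING_REVIEWS,
    },
    "Microsoft Foundry": {
        "proficiency": "Pro-code",
        "builds": ["retrieval", "task", "custom"],
        "reviews": BROAD_PUBLISHING_REVIEWS,
    },
}


def reviews_for(tool: str) -> list:
    """Look up the reviews the matrix lists for agents built with a tool."""
    return list(GOVERNANCE_MATRIX[tool]["reviews"])


def tools_without_proactive_review() -> list:
    """Tools whose agents rely on tenant governance rather than IT gating."""
    return [tool for tool, row in GOVERNANCE_MATRIX.items() if not row["reviews"]]


print(tools_without_proactive_review())
print(reviews_for("Microsoft Foundry"))
```

Keeping the matrix in one structure like this means a policy change is a data edit rather than a code change.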

Regionality is an additional concern. Regulatory requirements vary, so it’s important to keep in mind that certain kinds of data access and actions might be perfectly permissible in one region but not in another.

One example is our Employee Self-Service Agent, a central resource employees can turn to for help with IT support, HR questions, and facilities requests. Because it can access potentially sensitive personal information, this agent required additional review from European works councils to ensure it met all relevant workplace standards.

As you facilitate experimentation and innovation with agents across your workforce, from citizen developers to pro developers, consider adopting a similar matrixed approach to agent governance. It starts with understanding your organization’s needs, your risk tolerance, and the different employee populations you want to equip with agent-building capabilities.

Learning from our matrixed approach to agent governance

Figure out your building environment strategy

Decide which scenarios match up with specific environments and make those environments available to the relevant employees.

Design governance structures that scale from low-code to more advanced agentic tools

With the proliferation of AI agents, platform-level approvals similar to the Power Platform model at Microsoft can ensure rapid innovation while requiring review for individual high-impact scenarios.

Build trust through transparency and structure

A clear, well-documented approval process helps internal regulatory advisors understand new AI technologies and establishes the trust needed for productive, long-term collaboration.

Treat regional partners as strategic allies in the agentic future

Early feedback on digital agents from regional partners like works councils helps improve product design, accelerate approvals, and reduce fear or misconceptions about AI in the workplace.

Don’t forget that Copilot Studio is part of Power Platform

You can use what you’ve learned empowering citizen developers in Power Platform to guide your work with agents.

Key takeaways

Use these tips based on what we learned here at Microsoft to tackle agent governance at your company:

  • Establish your tolerance for risk. Determine where the most prevalent risks emerge across different populations and kinds of agents. Remember, you control the guardrails in your environment.
  • Determine what agent-building tools you want to roll out and who can use them. Different populations benefit from different agent-building capabilities. Put thought into what individuals and teams can create and the degree of partnership each level will need from IT.
  • Define your governance parameters for different kinds of agents. Determine the best ways to hedge against risk at every level. For example, you might choose to trust in tenant governance for simple agents and establish reviews for more complex tools.

Chapter 4: Tracking, impact, and value

Managing agents and assessing their business impact for the organization

It’s clear that agents bring astonishing capabilities to the enterprise. For many organizations, what remains unclear is exactly how to measure their impact. Without that information, businesses are at a loss for ways to articulate value and drive improvement.

Tracking agents is also a crucial component of preventing sprawl: We need to understand what agents we have, how employees are using them, what critical processes they’re supporting, and if they’re contributing value or need to be retired.

We’re at the beginning of our impact-tracking journey, but our work can provide a starting point for your own efforts to measure the value of AI initiatives at your organization.

Managing our agent catalog through comprehensive tracking

Microsoft Digital partners with other internal organizations to ensure we’re prioritizing the right agents and avoiding agent sprawl. Ideally, these engagements take place before teams start building their agents so we can avoid wasted effort or duplicated work.

Still, ongoing management efforts are crucial to keeping our agent ecosystem healthy. Telemetry is the key to assessing usage and ensuring compliance. We’ve developed our own internal tooling to ensure that:

  • Metadata is complete and available
  • The tooling tells us the right information about our agents
  • The tools connect properly with other compliance tooling, like Microsoft Purview
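
A metadata completeness check is one of the simpler pieces of this kind of tooling to prototype. The following is a minimal sketch, not our internal implementation. The required field names and the inventory records are hypothetical.

```python
# Hypothetical required fields (assumption); use whatever your policies mandate.
REQUIRED_FIELDS = ("owner", "description", "build_tool")


def missing_metadata(agent: dict) -> list:
    """List required fields that are absent or empty for one agent record."""
    return [field for field in REQUIRED_FIELDS if not agent.get(field)]


def incomplete_agents(inventory: list) -> dict:
    """Map agent name to its missing fields, skipping complete records."""
    report = {}
    for agent in inventory:
        gaps = missing_metadata(agent)
        if gaps:
            report[agent["name"]] = gaps
    return report


# Illustrative inventory records, not real agents.
inventory = [
    {"name": "hr-faq", "owner": "ana", "description": "HR answers", "build_tool": "Copilot Studio"},
    {"name": "lab-bot", "owner": "", "description": "Lab booking"},
]
print(incomplete_agents(inventory))
```

A report like this can feed an owner notification or a compliance dashboard, depending on how your enforcement workflow is set up.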

This telemetry also reveals agent behaviors, shows how agents do their work, and tracks events, actions, and policy baselines.

These capabilities help us gain visibility into policy adherence and violations, and then conduct enforcement actions. We also track the speed of reaction and mitigation. AI-ready data and robust guardrails mean we head off most violations before they occur.

A robust inventory, an agile policy framework, and an automated workflow for enforcement are cornerstones for successfully governing agents at scale.

The release of Microsoft Agent 365, now in early access, represents the next step in agent observability and management, two key aspects of agent governance and sprawl mitigation. This control plane for agents incorporates many of our learnings as we’ve bridged governance gaps through IT intervention.

Some of the key aspects of the control plane:

The registry

Provides a complete view of agents, and the enterprise agent store makes it easy to find the right agents for each role and business process within familiar workflows in Microsoft 365 Copilot and Teams.

Visualization

Delivers the observability layer, including role-specific oversight, compliance and audit features, and performance measurements that can help organizations track their agents’ impact and see where they contribute value.

Interoperability

Ensures Agent 365 is open to any Microsoft-built or partner ecosystem, while delivering work intelligence through access to data and Microsoft 365 apps.

Security features

Provide crucial confidence through visibility into security posture, detection and response capabilities, and intelligent runtime defense.

As Customer Zero for Agent 365, we’re excited to have a platform for observability and telemetry that encompasses everything from agentic creation through usage.

Tracking governance from agent inception

Professionally developed agents add a new dimension of tracking and governance, because we need standards in place to ensure compliant agent building and to remediate any issues.

We use our Azure DevOps instance to catalog apps on our tenant, and we’ve applied this practice to agents created professionally for lines of business and enterprise agents. This tool contains our service tree with product and app log registration, which is tied to our KPI dashboard and scoring system that validates agent data against our policies.

Our expectation is that all new apps and agents start from a place of compliance. Any new agent is registered through this platform, and we expect adherence within the first 14 days. In our experience, the introduction of new metrics, policies, or timeframes as our governance policies evolve is where agents tend to drop out of compliance. The priority is restoring compliant status.

We’ve established a series of metrics to help track and manage these expectations:

  • Enablement velocity
  • Renewal velocity
  • Agents in compliance
  • Time to remediation of noncompliance
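
Two of these metrics, agents in compliance and time to remediation, are straightforward to compute once registration and remediation dates are recorded. Here is a minimal sketch assuming a hypothetical `AgentRecord` shape. Only the 14-day window comes from the text above; the field names are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from statistics import mean

COMPLIANCE_WINDOW = timedelta(days=14)  # adherence expected within 14 days


@dataclass
class AgentRecord:
    name: str
    registered: date
    compliant: bool
    # (flagged, remediated) date pairs for past noncompliance findings
    remediations: list = field(default_factory=list)


def overdue(records, today):
    """Names of agents still noncompliant after the 14-day window."""
    return [
        r.name
        for r in records
        if not r.compliant and today - r.registered > COMPLIANCE_WINDOW
    ]


def compliance_rate(records):
    """Share of registered agents currently in compliance."""
    if not records:
        return 0.0
    return sum(r.compliant for r in records) / len(records)


def mean_days_to_remediation(records):
    """Average days from a finding being flagged to being remediated."""
    durations = [
        (fixed - flagged).days
        for r in records
        for flagged, fixed in r.remediations
    ]
    return mean(durations) if durations else None
```

Enablement velocity and renewal velocity would need additional timestamps, such as request and approval dates, that this sketch doesn't model.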

Through a DevOps process built on our preexisting software development lifecycle practices, we’ve applied governance not only to agents themselves, but to the process of building them professionally.

Measuring progress and unlocking value

Properly measuring value depends on concrete definitions of success and metrics that support it. Articulating AI’s impact came with several challenges. First, we had to land on a consistent taxonomy for different measurement areas. Then we needed to make the relevant data accessible, ensure its quality, and confirm it made sense.

The Microsoft Digital AI Value Framework is our flexible, modular tool for measuring the impact of our AI initiatives. With tools for measurement firmly in place, we can effectively demonstrate value and guide further decision-making.

Revenue impact

Direct contributions to revenue generation and business growth

Example metrics:

  • Increased sales or customers
  • Improved customer targeting
  • Higher lead quality
  • Deal velocity

Productivity and efficiency

Efficiency gains while completing tasks and processes without a reduction in quality

Example metrics:

  • Increased throughput
  • Process optimization
  • Task automation

Security and risk management

Improvements in identifying, preventing, and managing security vulnerabilities and risks

Example metrics:

  • Vulnerability detection or prevention
  • Reduction in data security incidents
  • Increased compliance with responsible AI standards

Employee and customer experience

The impact of AI initiatives on employee satisfaction, engagement, and productivity

Example metrics:

  • Employee or customer engagement and satisfaction with products or services
  • Improved employee health scores

Quality improvement

Enhancements in the quality of deliverables, services, and processes

Example metrics:

  • Higher-quality deliverables
  • Confidence in code quality
  • Accuracy of numbers

Cost savings

Reduction in operational costs and resource allocation efficiencies

Example metrics:

  • Operational efficiencies
  • Improved resource allocation
  • Future cost avoidance
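
A taxonomy like this can also be held as data, so you can check which dimensions an initiative's metrics actually cover. The sketch below is an assumption-laden illustration rather than the framework's real implementation. The dimension names and example metrics come from the lists above, and the helper functions are hypothetical.

```python
# Dimension names and example metrics transcribed from the framework above.
VALUE_DIMENSIONS = {
    "Revenue impact": [
        "Increased sales or customers",
        "Improved customer targeting",
        "Higher lead quality",
        "Deal velocity",
    ],
    "Productivity and efficiency": [
        "Increased throughput",
        "Process optimization",
        "Task automation",
    ],
    "Security and risk management": [
        "Vulnerability detection or prevention",
        "Reduction in data security incidents",
        "Increased compliance with responsible AI standards",
    ],
    "Employee and customer experience": [
        "Employee or customer engagement and satisfaction with products or services",
        "Improved employee health scores",
    ],
    "Quality improvement": [
        "Higher-quality deliverables",
        "Confidence in code quality",
        "Accuracy of numbers",
    ],
    "Cost savings": [
        "Operational efficiencies",
        "Improved resource allocation",
        "Future cost avoidance",
    ],
}


def coverage(tracked_metrics):
    """Map each dimension to the tracked metrics that fall under it."""
    return {
        dimension: [m for m in examples if m in tracked_metrics]
        for dimension, examples in VALUE_DIMENSIONS.items()
    }


def uncovered_dimensions(tracked_metrics):
    """Dimensions for which the initiative tracks no metric at all."""
    return [d for d, hits in coverage(tracked_metrics).items() if not hits]


print(uncovered_dimensions({"Deal velocity", "Task automation"}))
```

Not every initiative needs a metric in every dimension; a check like this simply makes the gaps explicit so they are a choice rather than an oversight.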

We plan to use the following capabilities to improve the overall ecosystem:

  • Filtering our agent inventory on specific criteria like the type of agent or how it was built
  • Enhancing governance-specific actions we can take with agents in areas like ownership and quarantining
  • Gaining visibility into trends like agent usage
  • Ingesting agent blueprints and defining policy templates

We’re still in the midst of our agentic measurement journey at Microsoft, but the blueprint for tracking already exists. Your organization might be in the early stages of agent readiness and deployment. If that’s the case, it could be helpful for you to internalize the lessons we’ve learned as Customer Zero and apply them as early as possible in your own journey toward AI maturity.

Learning from our agent adoption experience

Think proactively, not retroactively

If you put effort into tracking agentic impact early in your AI maturity journey, you’ll be poised to start capturing insights immediately instead of applying your methodology retroactively.

Involve a wide array of stakeholders

This workstream needs oversight from different kinds of stakeholders, including your leadership team, IT, Microsoft 365 administrators, agent developers and builders, and employee champions. That will provide the sponsorship, expertise, and perspective you need for success.

Different measurements will be appropriate for different phases of your initiatives

These measurements include monthly, weekly, or daily active usage; consider which metrics make sense at each phase of an AI initiative.
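
As a rough illustration, daily, weekly, and monthly active usage can all be derived from the same event log. The sketch below assumes usage events arrive as hypothetical `(user, day)` pairs and treats "monthly" as a rolling 30-day window, which is an assumption rather than a definition from our framework.

```python
from datetime import date, timedelta


def active_users(events, end, days):
    """Count distinct users with an event in the `days`-day window ending on `end`."""
    start = end - timedelta(days=days - 1)
    return len({user for user, day in events if start <= day <= end})


def usage_summary(events, end):
    """Daily, weekly, and rolling 30-day active users as of `end`."""
    return {
        "daily": active_users(events, end, 1),
        "weekly": active_users(events, end, 7),
        "monthly": active_users(events, end, 30),
    }


# Illustrative events, not real usage data.
events = [
    ("ana", date(2026, 5, 21)),
    ("ben", date(2026, 5, 18)),
    ("ana", date(2026, 5, 1)),
    ("cy", date(2026, 4, 25)),
]
print(usage_summary(events, date(2026, 5, 21)))
```

Early in a rollout, weekly or monthly figures tend to be the more stable signal; daily usage becomes more meaningful once an agent is embedded in routine work.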

Establish a continuum of value

Agents need to tie into real business goals, so it’s important to establish metrics that actually speak to those objectives. Cascade business goals to concrete KPIs with well-defined timelines and track those diligently.

Embrace the red

Try to think of underperformance not as failure, but as data. Performance data over time helps you course correct or pivot, making sure you invest where it matters.

Key takeaways

Here are some important steps to keep in mind as you embark on your own tracking and measurement efforts for agents:

  • Establish priorities and parameters for tracking agents. Consider measurements that relate to sprawl, usage, and coverage, and build them into your telemetry tooling.
  • Pull your stakeholders together to establish measurement parameters. Cascade business priorities into measurable value.
  • Conduct ongoing tracking. Establish a cadence for tracking and reviewing progress with your team.

Governing the frontier to scale innovation

AI agents are rapidly becoming core contributors to how work gets done. As our experience within Microsoft Digital demonstrates, realizing their full potential demands more than powerful tools or enthusiastic builders. It requires thoughtful governance that evolves alongside your AI maturity, protects what matters, and gives employees the confidence to innovate responsibly.

As you consider your own strategy for managing agents, it can be helpful to keep one truth in mind: Governance is a catalyst for progress, not a barrier. By embedding guardrails into tools, grounding agent creation in AI‑ready data, applying risk‑based and matrixed policies, and reinforcing all of it through adoption and education, we’ve been able to expand agentic capability without sacrificing security, privacy, or trust.

From our experience, we’ve learned that governance works best when it’s:

  • Proportional, scaling with risk and agent complexity
  • Embedded, not bolted on after the fact
  • Human‑led, recognizing that accountability and judgment remain essential
  • Iterative, adapting as technology, regulations, and business needs evolve

When you design governance this way, it allows experimentation, learning, and impact at scale. Employees feel empowered to build agents that solve real problems, while IT and compliance teams gain visibility and control without becoming bottlenecks. Crucially, leaders can measure value, manage risk, and make informed decisions about where to invest next.

This is the foundation of the Frontier Firm: Organizations where humans lead and agents operate, guided by clear principles and trusted systems.

As you continue your AI maturity journey, remember that there is no single, correct governance model. Your approach will reflect your risk tolerance, regulatory environment, data maturity, and organizational culture. The practices outlined here provide a proven starting point informed by real-world deployment at enterprise scale.

“At Microsoft, we believe the future of agentic AI depends on governance that empowers people first,” says Vijaya Alaparthi, principal group product manager in Microsoft Digital. “The structures should be invisible when they’re working, intentional when they’re needed, and trusted by everyone they serve.”

Now is the moment to act. Start with strong foundations. Empower your builders. Measure what matters. And treat governance not as a constraint, but as a strategic advantage that allows your organization to move faster, innovate safely, and lead confidently on the agentic frontier.

Key takeaways

Here are the high-level learnings and insights that you need to consider as you embark on your own agent governance journey, based on what we’ve learned here at Microsoft:

  • Treat governance as an enabler of innovation, not a brake. Effective agent governance is what makes large‑scale innovation possible. When you embed guardrails into platforms, data, and processes, employees can build and experiment confidently without exposing the organization to unnecessary risk or slowing progress.
  • Match governance rigor to agent risk and maturity. Not all agents need the same level of oversight. A risk‑based, matrixed approach lets organizations trust lightweight, personal agents while applying deeper reviews to agents that write data, take actions, or operate across business‑critical systems.
  • Start with AI‑ready data and zero‑trust foundations. Strong agent governance rests on secure, well‑labeled, high‑quality data. Clear ownership, intuitive sensitivity labels, default protections, and automation reduce reliance on user judgment and allow agents to operate safely at scale.
  • Embed governance where agents are built and used. The most effective governance is built into tools and workflows, not enforced through manual reviews alone. Defaults, limits, identity‑based access, lifecycle controls, and telemetry should apply automatically so agents are governed by design.
  • Plan for the full agent lifecycle to prevent sprawl. Agent inventories, ownership models, attestation, and retirement processes are essential. Governance needs to account for how you create, share, evolve, audit, and ultimately decommission agents, whether individuals or enterprise teams are responsible for building them.
  • Reinforce governance through adoption and education. Guardrails work best when employees understand them. Targeted adoption programs, clear guidance, prerequisites for advanced tools, and visible leadership sponsorship can help employees build responsibly and recognize their role in protecting the organization.
  • Measure what matters to prove value and drive improvement. Visibility drives trust. Telemetry, observability, and clear metrics that span productivity, quality, risk reduction, and experience allow organizations to track impact, course‑correct early, and continuously improve their agent ecosystem.

Try it out

Get started building and managing agents at your company with Microsoft Agent 365.

]]>
23618
Supercharging network operations at Microsoft with AI-based unified network intelligence http://approjects.co.za/?big=insidetrack/blog/supercharging-network-operations-at-microsoft-with-ai-based-unified-network-intelligence/ Thu, 21 May 2026 15:30:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23737 At Microsoft, our network engineers work across multiple systems, including topology views, telemetry dashboards, logs, incidents, tickets, and fragmented tools. They piece together signals from these sources to understand what’s happening during an incident, often under considerable time pressure. But this kind of fragmentation slows down reasoning. Engineers spend more time navigating tools than diagnosing […]

The post Supercharging network operations at Microsoft with AI-based unified network intelligence appeared first on Inside Track Blog.

]]>
At Microsoft, our network engineers work across multiple systems, including topology views, telemetry dashboards, logs, incidents, tickets, and fragmented tools. They piece together signals from these sources to understand what’s happening during an incident, often under considerable time pressure.

But this kind of fragmentation slows down reasoning. Engineers spend more time navigating tools than diagnosing issues.

To address this, the Microsoft Infrastructure, Networking, and Tenant organization in Microsoft Digital, the company’s IT organization, is building Infrastructure Graph (IGraph), a unified platform that brings topology, real-time telemetry, and operational context into a single view.

On top of this foundation, agentic capabilities enable AI agents to reason across these signals, surfacing insights, explaining issues, and recommending next steps. This shifts the experience from exploring data to making decisions faster and with greater confidence.

This visualization layer and intelligence platform provides a view of our entire Microsoft enterprise network—including more than 20,000 on-premises devices across 900 sites worldwide—to instantly surface the most critical issues and offer proactive recommendations to our engineers.

“Engineers increasingly face fragmented visibility,” says Astha Sinha, a product manager in the Infrastructure, Networking, and Tenant team in Microsoft Digital. “We wanted to unify live telemetry, topology, and context into one single intelligent visualization experience and show engineers what’s really important, so they don’t have to dive into oceans of data.”

Network insight at speed

IGraph displays the following in a single pane-of-glass view for a given site:

  • Topology and dependency context: Visualizes routers, switches, access points, client devices, and their relationships, enriched with path and dependency awareness to localize impact areas
  • Real-time health and telemetry insights: Surfaces live performance signals (utilization, errors, abnormal behavior) and correlates them directly onto the topology to highlight where the network is degraded or “running hot”
  • Operational and incident context: Integrates incidents, tickets, and change signals into the graph, enabling engineers to understand what is happening, where, and which systems are affected in a single view
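
The general idea of overlaying telemetry and incidents onto a topology graph can be sketched in a few lines of Python. This is not IGraph's implementation: the device names, the adjacency-dictionary layout, the incident record shape, and the 0.85 utilization threshold are all hypothetical.

```python
# Hypothetical site topology: each device maps to its downstream neighbors.
TOPOLOGY = {
    "router-1": ["switch-1", "switch-2"],
    "switch-1": ["ap-1", "ap-2"],
    "switch-2": ["ap-3"],
    "ap-1": [],
    "ap-2": [],
    "ap-3": [],
}

# Hypothetical live signals; ap-3 deliberately has no telemetry yet.
TELEMETRY = {
    "router-1": {"utilization": 0.42, "errors": 0},
    "switch-1": {"utilization": 0.91, "errors": 17},
    "switch-2": {"utilization": 0.30, "errors": 0},
    "ap-1": {"utilization": 0.55, "errors": 2},
    "ap-2": {"utilization": 0.12, "errors": 0},
}

INCIDENTS = [{"id": "INC-1", "device": "switch-2"}]


def annotate(topology, telemetry, incidents, hot_threshold=0.85):
    """Overlay health signals and incident IDs onto every topology node."""
    view = {}
    for device, neighbors in topology.items():
        metrics = telemetry.get(device, {})
        view[device] = {
            "neighbors": neighbors,
            "hot": metrics.get("utilization", 0.0) >= hot_threshold,
            "errors": metrics.get("errors", 0),
            "incidents": [i["id"] for i in incidents if i["device"] == device],
        }
    return view


def hot_devices(view):
    """Devices whose utilization meets or exceeds the threshold."""
    return [device for device, node in view.items() if node["hot"]]


view = annotate(TOPOLOGY, TELEMETRY, INCIDENTS)
print(hot_devices(view))
print(view["switch-2"]["incidents"])
```

Because every signal lands on the same node record, a UI or an agent can answer "what is wrong and where" from one structure instead of joining several sources by hand.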

On top of this visualization layer, the team is building an agentic layer using Azure Foundry that allows AI agents to discover and use external tools and data sources.

Without IGraph Agent, accessing data involves pulling from multiple existing sources, including servers and logs, with mixed latency (from minutes to hours). This fragmentation makes near-real-time reasoning almost impossible, as agents lack a unified, low-latency view of topology and telemetry.

“Fragmentation across operational data sources was only part of the problem,” says Vinod Kumar Singh, a principal software engineer in the Infrastructure, Networking, and Tenant team in Microsoft Digital. “The harder challenge was externalizing and structuring the implicit domain knowledge engineers rely on, then integrating it with real-time telemetry and topology to enable low-latency, context-aware reasoning in the agentic layer.”

How IGraph works

The user starts in context. Say they’re on the IGraph UI for Building 32. They can already see the building topology, recent incidents, support tickets, and live health and performance metrics.

The engineer can ask a natural language question such as, “The internet is not working in Building 32—what’s going on?”

The AI agent begins reasoning across UI context (location, devices, open incidents), topology (involved devices and neighbors), historical metrics, and real-time device calls. It works with specialized MCP servers and agents to identify impacted devices, test live responsiveness, measure neighboring impact, verify data flow, and flag abnormal utilization or error trends.

Using this context, IGraph pulls in the relevant logs, real-time telemetry, and incident history to complete the analysis.

Instead of raw metrics and hundreds of rows of data, the agent returns a clean summary that provides a view of the failing device, the health of neighboring devices, and the blast radius. It shows what’s broken, what’s still healthy, the likely causes, and next actions.

The engineer stays in one UI throughout and isn’t forced to switch between tools or manually correlate data.
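
To illustrate the kind of summary described above, the sketch below computes a blast radius with a breadth-first traversal and reports the health of a failing device's immediate neighbors. It is a simplified stand-in, not the IGraph agent's logic: the directed downstream adjacency model, the device names, and the health labels are assumptions.

```python
from collections import deque


def blast_radius(topology, failing):
    """Devices reachable downstream of `failing`, in breadth-first order."""
    seen = {failing}
    queue = deque([failing])
    impacted = []
    while queue:
        for child in topology.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


def summarize(topology, health, failing):
    """Condense raw graph data into failing device, neighbors, and impact."""
    upstream = [d for d, kids in topology.items() if failing in kids]
    downstream = topology.get(failing, [])
    return {
        "failing": failing,
        "neighbor_health": {
            n: health.get(n, "unknown") for n in upstream + downstream
        },
        "blast_radius": blast_radius(topology, failing),
    }


# Hypothetical building topology and health states.
topology = {
    "router-1": ["switch-1", "switch-2"],
    "switch-1": ["ap-1", "ap-2"],
    "switch-2": ["ap-3"],
}
health = {"router-1": "healthy", "ap-1": "degraded"}
print(summarize(topology, health, "switch-1"))
```

A production system would also weigh redundancy and alternate paths, since a device with a healthy backup uplink is not truly in the blast radius.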

“Engineers spend a lot of time firefighting,” says Abhijit Vijay, a principal software engineer manager on the team in Microsoft Digital. “The visualization layer gives them the view they need to quickly solve the incidents. It helps free up their time to engage in more systemic improvements on their applications.”

The impact of incident visibility

IGraph offers a new real-time telemetry layer that:

  • Uses a UI that surfaces telemetry and topology by correlating data from upstream systems
  • Decreases effective latency for users, enabling near-real-time insights (often within seconds)
  • Provides near-real-time signals in the UI on health, performance, routing state, and neighboring device relationships

Combined, these capabilities give network engineers an up-to-the-moment view of what’s happening across the network, before small issues can cascade into larger incidents.

By making live telemetry easier to access and interpret, IGraph helps teams move from reactive troubleshooting to proactive prevention.

“Our goal is to accelerate how network engineers understand what’s happening, enabling them to shift from reactive troubleshooting to proactive prevention—identifying and mitigating issues before they occur,” says Nevedita Mallick, a principal product manager for the Infrastructure, Networking, and Tenant team in Microsoft Digital.

That speed and clarity are especially important for new engineers.

Complex networks rely on unwritten knowledge and experience built up over time, which can slow onboarding and make troubleshooting harder than it needs to be. IGraph shortens that learning curve by making the network’s relationships and current state immediately visible.

“The tool delivers value right away, especially for newer engineers,” says Manjiri Keskar, a principal cloud network engineer in the Infrastructure, Networking, and Tenant team in Microsoft Digital. “Instead of having to piece things together, they get an instant view of the network that shows how devices are connected and displays the already-surfaced incidents directly on the graph.”

What’s next for IGraph Agent

Without IGraph Agent, network analysis is largely reactive.

Teams often address failures after customers have already felt the impact, instead of preventing issues by acting when early warning signs appear.

“Agentic AI is transforming networking DevOps from manual, reactive operations into intelligent, intent-driven systems that can provision, validate, and troubleshoot networks autonomously,” says Sonika Munde, a senior network engineer in the Infrastructure, Networking, and Tenant team in Microsoft Digital. “Looking ahead, it will power self-healing networks and dramatically accelerate buildouts, allowing engineers to focus on architecture, strategy, and innovation.”

That unified network intelligence will let IGraph Agent communicate with multiple lightweight agents that continuously analyze network conditions, dramatically compressing response times.

“What used to happen in hours will happen in minutes,” Munde says.

Now, the team is pushing further. One example is layering in weather intelligence to help engineers anticipate issues before they materialize, as big storms can trigger power fluctuations that ripple through the network. By visualizing this data, engineers can proactively communicate with customers and take mitigation steps that protect operational workloads.

Overall, IGraph lets teams focus on prevention. Engineers spend less time navigating dashboards and cross-checking data and more time detecting patterns and surfacing emerging risks. Manual analysis is reduced as the agent highlights insights in real time.

A photo of Thompson.

“By bringing telemetry, topology, and AI together in one intelligent layer, we’re turning fragmented signals into real-time intelligence so teams can move faster, act earlier, and protect the critical workloads that power Microsoft.”

Jason Thompson, principal group product manager, Infrastructure, Networking, and Tenant team, Microsoft Digital

The technology is poised to go even further. IGraph will eventually help power self-healing networks and speed up network build-outs, freeing engineers to focus on architecture and innovation. The future vision for the tool includes fully automated predictive network intelligence across all Microsoft campuses, with agents that monitor, reason, recommend responses, and safely take action.

“By bringing telemetry, topology, and AI together in one intelligent layer, we’re turning fragmented signals into real-time intelligence so teams can move faster, act earlier, and protect the critical workloads that power Microsoft,” says Jason Thompson, a principal group product manager for the Infrastructure, Networking, and Tenant team in Microsoft Digital.

Key takeaways

To move from reactive operations to proactive AI-supported network management, we recommend starting with these steps:

  • Start consolidating real-time telemetry into a single view. Even a lightweight dashboard is enough to prepare for AI-driven insights later.
  • Identify high-frequency incident types to target for AI triage. Pick the most common or disruptive scenarios and map out what data engineers currently review for them.
  • Document the decision logic your engineers use today. Before implementing AI, capture the human reasoning steps to help guide your approach.
  • Pilot an agentic solution with one network segment or site. Start with one building, one lab, or a small testbed.
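
As one way to act on the third step, documented decision logic can start as a small, reviewable set of rules before any AI is involved. The Python sketch below is purely illustrative: the `Incident` fields, the thresholds, and the action names are hypothetical and are not taken from IGraph Agent.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """A hypothetical incident record; field names are assumptions."""
    device: str
    kind: str              # e.g. "link_flap" or "power_fluctuation"
    affected_users: int
    storm_warning: bool = False


def triage(incident: Incident) -> str:
    """Encode the reasoning steps an engineer might follow, in priority order."""
    # Weather-related power issues: communicate early and prepare mitigation.
    if incident.kind == "power_fluctuation" and incident.storm_warning:
        return "notify_customers_and_prepare_mitigation"
    # Broad impact (threshold is an assumed value): escalate to a person.
    if incident.affected_users >= 100:
        return "page_on_call_engineer"
    # Common, lower-impact scenario: queue for review.
    if incident.kind == "link_flap":
        return "open_ticket_for_review"
    return "log_only"
```

Writing the rules down this way makes them easy for engineers to review and correct, and gives a pilot a baseline to compare agent recommendations against.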

The post Supercharging network operations at Microsoft with AI-based unified network intelligence appeared first on Inside Track Blog.

]]>
23737
Staying human: How we’re using AI to transform the sales experience at Microsoft http://approjects.co.za/?big=insidetrack/blog/staying-human-how-were-using-ai-to-transform-the-sales-experience-at-microsoft/ Thu, 21 May 2026 15:15:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23718 At first glance, AI transformation can look like a technology deployment project: New tools arrive, training programs launch, dashboards go live, and leaders focus on speed, scale, and rollout discipline. But in practice, the technical side of transformation is only part of the story. The missing piece is us humans. When we encounter these kinds […]

The post Staying human: How we’re using AI to transform the sales experience at Microsoft appeared first on Inside Track Blog.

]]>
At first glance, AI transformation can look like a technology deployment project: New tools arrive, training programs launch, dashboards go live, and leaders focus on speed, scale, and rollout discipline.

But in practice, the technical side of transformation is only part of the story. The missing piece is us humans.

When we encounter these kinds of challenges internally at Microsoft, we think of ourselves as “Customer Zero.” We roll out our technology across our own organization first, learning what works and what doesn’t in real time and at scale so we can pass our lessons on to you.

A photo of Bertrand.

“After an early wave of enthusiasm for Copilot, adoption declined. People questioned whether AI was relevant to their role, worried about what it might mean for their work, and disengaged when the change they experienced didn’t match the change they imagined.”

Daniel Bertrand, senior director, AI Transformation Office

We learned valuable lessons about AI adoption and sustainable change when we deployed Microsoft 365 Copilot across our Microsoft Commercial organization, one of the company’s largest sales and service organizations. What we observed led us to reset our strategy and build a more human-centered process for deploying and driving adoption of our AI technology.

Driving AI adoption with role relevance and daily habits

Here on the Customer Zero team in Microsoft Customer and Partner Solutions (MCAPS), our 60,000-employee-strong sales organization, we saw that getting access to Copilot didn’t automatically result in widespread AI adoption.

“After an early wave of enthusiasm for Copilot, adoption declined,” says Daniel Bertrand, a senior director on the AI Transformation Office team in MCAPS. “People questioned whether AI was relevant to their role, worried about what it might mean for their work, and disengaged when the change they experienced didn’t match the change they imagined.”

Initially, people used Copilot like a search engine and expected it to make work go away. When that didn’t happen automatically, they didn’t know how to approach prompting the AI, or how to create value with it. The gap between access and know‑how is where adoption slowed.

A photo of Neece Robien.

“I knew from experience that people prefer to hear from—and learn alongside—those closest to their day-to-day work, to build trust and confidence.”

Susan Neece Robien, senior director of adoption and change, AI Transformation Office

We reframed the problem from “How do we scale the technology?” to “What does this change feel like for people doing the work every day?”

By talking to people in our larger organization about why they were reluctant to work with Copilot, we discovered the adoption barrier was less about the technology being available and more about whether people trusted it, understood how it fit their role, and felt confident enough to build new habits around it.

The ‘Adoption-in-a-Box’ approach

After these conversations, we changed our strategy across the board.

“I knew from experience that people prefer to hear from—and learn alongside—those closest to their day‑to‑day work, to build trust and confidence,” says Susan Neece Robien, a senior director of adoption and change on the AI Transformation Office team. “That led me to conceptualize Adoption‑in‑a‑Box—a repeatable approach that combines behavior‑change guidance, peer influence, habit‑forming activities, and light gamification so people can experiment with AI in a non‑threatening way and build confidence over time.”

We rolled out the Adoption-in-a-Box concept across the team in the following ways:

  • Emphasized visible leadership support: We circulated videos and “day in the life” PowerPoint 1-pagers of how our leaders were using Copilot.
  • Formed a community of early adopters: They became peer champions for adoption, evangelizing best practices and leading workshops.
  • Created a Role Hub: The hub contained practical, role-specific learning about how to use Copilot rather than doing high-level general trainings.
  • Ran prompt campaigns: To get our team started with habitually using AI in their daily roles, we ran prompt campaigns to make prompt learning accessible and actionable.
  • Created the Copilot Cup: We encouraged friendly competitions with leadership support. We also ran hackathons and prompt-based scavenger hunts to gamify learning about and using the AI for our team.
  • Created ongoing measurement mechanisms: We stood up dashboards with monthly, weekly, and daily average usage reports. We also ran quarterly surveys to track sentiment around AI adoption on the team.

After our initial success with Adoption-in-a-Box, we scaled it to adoption leads, who brought the model to life within their teams.

When people feel safe in experimenting with AI and incorporating it into their day-to-day work, that’s when it provides real value for the organization and the individual. We’ve learned that sustainable, scalable AI transformation succeeds when we put people first.

Key takeaways

If you’re wondering how to encourage your own team to adopt new AI technology into their workflows, you can learn from our experience:

  • Prioritize visible leadership participation. Leaders set the tone of any transformation, and AI adoption is no exception.
  • Roll out for role relevance. Specificity is the key here: How does AI relate to each person’s individual role? If the tool provides value and saves time, people will incorporate it into their workflow.
  • Establish daily habits. Sustainable adoption means people use the tool on a daily basis in the natural flow of their work. Give them low-friction opportunities to learn the ropes.
  • Encourage peer-to-peer experimentation. Early adopters can be a valuable resource for showing others the way. Lowering the stakes by having a peer guide employees in a workshop or one-on-one can take the pressure off as they experiment with the tech.

The post Staying human: How we’re using AI to transform the sales experience at Microsoft appeared first on Inside Track Blog.

]]>
23718
How Work IQ is supercharging our AI usage at Microsoft http://approjects.co.za/?big=insidetrack/blog/how-work-iq-is-supercharging-our-ai-usage-at-microsoft/ Thu, 21 May 2026 15:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23773 At Microsoft, we’re constantly thinking about the future of work—how the power of AI and agents is transforming the way knowledge workers do their jobs, streamlining workflows, and boosting employee productivity. These innovations have come in many different forms across every group and function at the company. It’s impossible to capture them all in a […]

The post How Work IQ is supercharging our AI usage at Microsoft appeared first on Inside Track Blog.

]]>
At Microsoft, we’re constantly thinking about the future of work—how the power of AI and agents is transforming the way knowledge workers do their jobs, streamlining workflows, and boosting employee productivity.

These innovations have come in many different forms across every group and function at the company. It’s impossible to capture them all in a single concept or story, but one of the ways that we’ve activated the power of AI for our workforce is Work IQ.

Work IQ isn’t a product.

It’s a shared intelligence layer that enables Microsoft 365 Copilot and AI agents to reason over and understand your organization’s work data, then use that context to generate more relevant responses and actions. This means that the entire Microsoft Graph—including rich unstructured data from your Teams chats and meetings, Outlook emails, Word documents, PowerPoint presentations, and more—is now part of your AI-powered work experience.

A photo of Hasan.

“It’s not really a brand-new capability, but more an evolution of what users already know, which is access to the grounding data in their Microsoft tenant. The difference is that Work IQ adds an additional layer to provide more context, allowing for richer and more relevant results.”

Aisha Hasan, principal product manager, Microsoft Digital

Work IQ enables Copilot to not only tailor answers to your role and responsibilities, but also to understand who your most frequent collaborators are, comprehend details about your latest projects, surface deliverables and deadlines, and intuit next steps. Additionally, Work IQ makes it easy for any AI agent to take advantage of the same rich enterprise data to return and act on more contextual results.

“It’s not really a brand-new capability, but more an evolution of what users already know, which is access to the grounding data in their Microsoft tenant,” says Aisha Hasan, a principal product manager in Microsoft Digital. “The difference is that Work IQ adds an additional layer to provide more context, allowing for richer and more relevant results.”

At Microsoft Digital, the company’s IT organization, we’ve seen firsthand how this intelligence layer is accelerating employee adoption of Copilot and agentic AI as outputs become more perceptive and valuable. Work IQ is a foundational step toward a future where AI has moved beyond isolated assistance and become a trusted professional helper—sometimes described as a digital colleague—that carries out tasks and anticipates needs in every aspect of daily work.

How Work IQ impacts everyday work

One of the most instructive aspects of Work IQ’s impact across our organization is that it happened without a traditional deployment. There was no enablement event for employees or operational playbook distributed to administrators. It didn’t require any changes to the application interfaces. Yet over time, our employee Copilot interactions improved in measurable ways.

A photo of Willingham.

“There was a period where we weren’t adding new content to Copilot, and yet I noticed our metrics for quality and user satisfaction kept going up. Why was that? It was because of all these incremental improvements that we refer to as Work IQ.”

Dodd Willingham, principal product manager, Microsoft Digital

This was a direct consequence of introducing a shared intelligence layer into a Microsoft environment that was already rich in work signals. Those work signals are extremely valuable data that was difficult to extract meaning from before the advent of AI. As the technology advanced, we could take full advantage of this data to inform and improve agentic responses.

As Customer Zero for the company, Microsoft Digital was at the forefront of measuring the impact of Work IQ. Our employees saw significant gains in relevance, grounding, and answer coherence in Copilot that were visible in the metrics, even during times when the underlying content remained relatively static. That’s the Work IQ difference.

“There was a period where we weren’t adding new content to Copilot, and yet I noticed our metrics for quality and user satisfaction kept going up,” says Dodd Willingham, a principal product manager in Microsoft Digital. “Why was that? It was because of all these incremental improvements that we refer to as Work IQ.”

At a systems level, Work IQ reasons across a broad cross-section of Microsoft 365 data, including:

  • Outlook email content, thread structure, and interaction patterns
  • Teams chats, channels, and meeting transcripts
  • Calendar events and scheduling metadata
  • Documents and files across Word, PowerPoint, Excel, OneDrive, and SharePoint
  • Signals that show who collaborates with whom, how often, and in what context

Work IQ can also access structured data in tools like Dynamics 365, Power BI, Power Apps, and other business applications. The ability to extract context and interpret structured and unstructured data in a unified intelligence layer is the reason why Work IQ is making such a difference for our employees.

Making Outlook better

Outlook provides a useful lens on how Work IQ functions because it’s both heavily used by our employees and a highly contextual tool. Although the application hasn’t outwardly changed, the way Copilot interacts with inbox and calendar data has evolved, in part due to richer context provided by Work IQ.

A photo of Marzynski.

“The intelligence works behind the scenes as you use Outlook. Your inbox just gradually feels more relevant. Outlook adapts to your work patterns, making your inbox feel more like an assistant, instead of a filing cabinet of communications.”

Matthew Marzynski, principal product manager, core experiences, Microsoft Digital

Now when you turn to Copilot in Outlook to summarize email threads, it can surface decision points, action owners, and unresolved issues. Instead of treating email as a collection of messages and providing rote summaries, Copilot perceives it as a record of decisions and commitments over time.

Calendar-related experiences are on a similar trajectory. Meeting preparation and follow‑up suggestions are now drawing on prior interactions with the same participants, relevant documents that were previously shared, and historical patterns around similar meetings.

A graphic showing the three layers of Work IQ: data layer, context layer, and skills and tools layer.
Work IQ uses AI to apply contextual reasoning over different sources of work data, improving the results generated by the skills and tools that our knowledge workers use every day, such as Microsoft 365 Copilot.

Work IQ isn’t rule-based automation layered on top of Outlook. Users aren’t configuring new filters or workflows. Instead, the system is adapting based on observed patterns, meaning user behavior can remain the same while output quality improves.

“The intelligence works behind the scenes as you use Outlook,” says Matthew Marzynski, a principal product manager for core experiences in Microsoft Digital. “Your inbox just gradually feels more relevant. Outlook adapts to your work patterns, making your inbox feel more like an assistant, instead of a filing cabinet of communications.”

Applying persistent memory

Another important aspect of Work IQ is the ability to retain persistent memory of each employee’s role, responsibilities, and work context. Copilot and other agents no longer need to be continually prompted with details about who the user is and what they’re working on. They learn that information and remember it going forward.

This feature, also called persistent understanding, builds trust and increases efficiency each time an employee turns to AI for help with their work. AI systems that depend on manual context-setting don’t scale well across large organizations, which we at Microsoft Digital learned as we tested and deployed Copilot across the company.

“The user no longer has to tell the agent, ‘I work in this area, so please tailor your response to that’ every time,” says Anishkumar Ramakrishnan, a principal PM manager in Microsoft Digital. “With Work IQ, Copilot and agents recall it going forward. It remembers things that the user doesn’t even remember themselves about their past work and actions. This is the promise of intelligent context.”

From answers to action: Work IQ and AI agents

As organizations move toward integrating AI agents into all aspects of their day-to-day work, the value of Work IQ increases. Any agent—not just a general-purpose agent like Copilot—that can interpret vast amounts of your unstructured work data is going to produce results that are far more relevant than one that simply draws on general knowledge about a topic or process.

A photo of Jangir.

“Before, a builder had to go connector by connector and be very prescriptive—calendar read, email read, meeting access—just to build an agent. Now they can simply point the agent to Work IQ, and it gains contextual access across mail, calendar, meetings, and files through a single connector (API or MCP server).”

Naveen Jangir, principal architect, Microsoft Digital

Early agent implementations relied on narrower task-specific access to data. For each agent, a developer would have to build connections to a particular document library, mailbox, or set of calendar data. Each connection required separate consent and management, which generally resulted in a more limited scope.

But with Work IQ, builders can create agents using Microsoft Copilot Studio or other development platforms (such as Microsoft Foundry) that use APIs or Model Context Protocol (MCP) servers to connect to Microsoft Graph data. This enables them to bring the full power of enterprise data to any agentic creation, not just Microsoft 365 agents.

“Before, a builder had to go connector by connector and be very prescriptive—calendar read, email read, meeting access—just to build an agent,” says Naveen Jangir, a principal architect in Microsoft Digital. “Now they can simply point the agent to Work IQ, and it gains contextual access across mail, calendar, meetings, and files through a single connector (API or MCP server).”

This shift doesn’t just simplify agent development—it fundamentally expands what agents are capable of. Instead of operating within narrow, predefined tasks, agents can now reason across a broader work context to deliver better outcomes. For example, an agent supporting a project manager can surface relevant email threads, identify key stakeholders from meeting activity, reference the latest project documents, and highlight upcoming deadlines—all within a single interaction.

Intelligence without bypassing governance

From a governance perspective, Work IQ doesn’t introduce a new security model. Instead, it operates entirely within the existing Microsoft 365 data protection boundaries that our company and our customers already rely on.

The intelligence layer can access this enterprise data, but it does so while honoring permissions, sensitivity labels, access policies, and compliance controls defined at the source. Work IQ can only surface or act on information that the user—or an agent identity acting on the user’s behalf—is already authorized to access.

This inheritance model is intentional. Governance remains rooted in the data layer, not in the AI layer. Work IQ respects established controls such as identity‑based access and tenant policies, which means agents are generally given less access than human users.

“An agent user only gets access to what is explicitly shared with it,” Jangir says. “Human users typically have broader default access. By design in Work IQ, agents can usually see less than people, not more.”
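
One way to picture this inheritance model is as a filter applied at the data layer before anything reaches the AI layer. The Python sketch below is a simplified illustration under stated assumptions, not how Work IQ is implemented: the `Document` type, its `allowed` set, and the `visible_to` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A hypothetical content item with access defined at the source."""
    title: str
    allowed: frozenset  # identities (people or agents) granted access


def visible_to(identity: str, documents: list) -> list:
    """Return titles of only the documents this identity is already authorized to read."""
    return [d.title for d in documents if identity in d.allowed]


docs = [
    Document("Project plan", frozenset({"alice", "planning-agent"})),
    Document("Salary review", frozenset({"alice"})),
]
```

In this sketch, `visible_to("alice", docs)` returns both titles, while `visible_to("planning-agent", docs)` returns only "Project plan", because nothing else was explicitly shared with the agent identity. The access decision lives entirely in the data, so the layer that consumes the results never needs its own permission rules.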

For IT and security teams, this places the emphasis squarely on data discipline and identity controls, which are complementary security layers. Work IQ amplifies the value of well‑governed data and exposes weaknesses where governance is inconsistent. Admins remain in control of access and can turn off APIs and MCP server connections if they want to limit an agent’s data access.

Work IQ, Fabric IQ, and Foundry IQ

As we’ve scaled up Copilot and agentic AI internally, one lesson has become clear: Intelligence works best when it’s part of a layered infrastructure rather than working on its own.

That’s why Work IQ is just one context layer we’re using at Microsoft. We’ve also developed Fabric IQ and Foundry IQ, which are complementary layers in our overall data strategy. Each of these addresses a different aspect of enterprise intelligence.

A graphic showing the overlap of the three intelligence layers to produce more powerful agentic results.
Work IQ combines with the Fabric IQ and Foundry IQ intelligence layers to create a shared business ontology that enables the completion of more complex agentic tasks.

The three layers serve distinct but connected purposes:

  • Work IQ focuses on unstructured productivity data, helping AI understand how people work across email, meetings, documents, and collaboration signals.
  • Fabric IQ applies similar reasoning to analytical and structured data, adding context and explanation to metrics, trends, KPIs, and other business signals.
  • Foundry IQ provides the foundation for builders to create agents that draw from both worlds, connecting intelligence across Microsoft 365, analytics platforms, and line‑of‑business systems.

Taken together, these layers also contribute to something deeper: the emergence of a shared business ontology. By extracting and aligning business entities—such as people, projects, and processes—from both structured data in Fabric IQ and the unstructured signals captured by Work IQ, the system perceives meaningful connections that previously were hidden. This unified understanding allows agents to reason across domains with greater precision, linking metrics to the real work and making insights more actionable in context.

This architecture matters because it removes artificial seams. Agents shouldn’t need to shift between separate contexts for work content, enterprise data, or application logic. The IQ layers make it possible to deliver a single agentic experience that reasons consistently, applies governance uniformly, and moves with users across environments. Just as importantly, the same controls—identity, permissions, labeling, and policy—flow through each layer, keeping trust intact as capability expands.

At Microsoft, Work IQ and the other context layers are helping Copilot and agents to accelerate beyond AI experimentation. They are now vital operational tools that make everyone more productive across the global enterprise. Context and intelligence in agentic tools are a key part of the future of work, at Microsoft and for our customers as well.

Key takeaways

Here are some things to keep in mind as you prepare your own organization to take full advantage of Work IQ:

  • Treat the technology as infrastructure, not a feature. We didn’t formally roll out Work IQ. Its value emerged gradually as it improved Copilot responses and as our agent builders could more easily tap into unstructured enterprise data.
  • Expect improvements in AI quality without changes to your data. We saw measurable gains in relevance and user satisfaction even when underlying content remained the same, driven by better contextual reasoning across existing work signals.
  • Focus on how employees work, not just what content exists. Work IQ improves AI outcomes by connecting people, relationships, and activity patterns, resulting in more actionable and grounded responses.
  • Use Work IQ to move from assistance to action with agents. By giving agents access to contextual enterprise data through a unified layer, we enabled more automated workflows without requiring developers to manage dozens of connectors manually.
  • Invest in data governance early to maximize AI value. Because Work IQ inherits permissions and policies from the data layer, its effectiveness—and safety—relies on clear labeling, intentional access design, and disciplined data management.
  • Enable self-service collaboration data so it’s available for Work IQ. Work IQ can only ground on data that is both available and not purposefully hidden. We make sure that our meetings are AI-enabled (and often recorded) and allow self-service in Teams and SharePoint, so the data is not hidden from Work IQ.
  • Build toward a unified intelligence model across work and data. Combining Work IQ with Fabric IQ and Foundry IQ means agents can operate seamlessly across different kinds of data and incorporate more intelligence into their output and actions.

The post How Work IQ is supercharging our AI usage at Microsoft appeared first on Inside Track Blog.

]]>
23773
Unfolding our AI-in-IT story: What to expect at the 2026 Microsoft 365 Community Conference http://approjects.co.za/?big=insidetrack/blog/unfolding-our-ai-in-it-story-what-to-expect-at-the-2026-microsoft-365-community-conference/ Mon, 20 Apr 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23224 This article is about an event that is now completed. We leave the post up on our site as a record of the conference and the topics covered by some of our Microsoft Digital subject matter experts. At Microsoft Digital, the company’s IT organization, we shape and propel many of our groundbreaking products through our […]

The post Unfolding our AI-in-IT story: What to expect at the 2026 Microsoft 365 Community Conference appeared first on Inside Track Blog.

]]>
This article is about an event that is now completed. We leave the post up on our site as a record of the conference and the topics covered by some of our Microsoft Digital subject matter experts.

At Microsoft Digital, the company’s IT organization, we shape and propel many of our groundbreaking products through our role as the company’s Customer Zero—and we want to tell that story. At this year’s Microsoft 365 Community Conference, we hosted a variety of sessions focused on change management, AI adoption, and how we manage governance in the era of the Frontier Firm.

As Customer Zero for Microsoft 365 Copilot, we embedded the technology into our employees’ daily workflows and carefully monitored the results. That journey from early experimentation to broad adoption of the tool across our organization continues to guide the company as we explore what comes next.

Today, that’s agents.

“Copilot changes how our employees work. Agents are changing how the work gets done. Our focus is to make the technology practical and valuable, so people want to use it daily.”

Stephan Kerametlian, senior director, business program management, Microsoft Digital

We’ve reached a level of maturity with Copilot that allows us to move from individual productivity to systems that can reason and collaborate on our behalf. Our focus now is on driving the adoption of agents across the company, grounding them in our workflows to solve problems.

“Copilot changes how our employees work,” says Stephan Kerametlian, a senior director in Microsoft Digital. “Agents are changing how the work gets done. Our focus is to make the technology practical and valuable, so people want to use it daily.”

Adoption doesn’t happen without trust

As we’ve empowered employees with more capable AI tools that can help automate tasks and make decisions, we’ve been equally focused on making sure the right safeguards are in place.

Innovation and safety are extremely important—the challenge is to enable both at the same time. And this is where governance comes in.

“We’ve spent a lot of time getting governance right. This means giving people confidence, not slowing them down. When employees know the guardrails are there, they feel empowered to experiment and innovate safely.”

David Johnson, principal PM architect, Microsoft Digital

At Microsoft, good governance is what makes innovation sustainable. It’s how we protect the company, our data, and our customers, while still giving employees the freedom to build and push boundaries with AI.

“We’ve spent a lot of time getting governance right,” says David Johnson, a principal PM architect in Microsoft Digital. “This means giving people confidence, not slowing them down. When employees know the guardrails are there, they feel empowered to experiment and innovate safely.”

How Microsoft does IT: Managing and governing agents—empower with risk-aligned oversight

Session description: See how Microsoft Digital empowers employees with tools to build and manage agents. From agent management with Microsoft Agent 365, to securing our environment with Microsoft Defender, to managing our productivity estate with Microsoft Purview, this session offers broad insights into how we use our own technology to accelerate agentic innovation while mitigating risk.

Speakers: David Johnson, Naveen Jangir, and Mike Powers

A photo of Johnson

David Johnson leads our internal Microsoft 365 and productivity services with responsibility for tenant strategy, architecture, and governance. He oversees how we empower employees with guardrails, along with our capability onboarding and tenant configuration.

A photo of Jangir

Naveen Jangir is a principal architect in Microsoft Digital. He drives Microsoft 365 security and compliance strategy and leads tenant architecture and capability onboarding, while overseeing secure adoption of services across the enterprise.

A photo of Powers

Mike Powers is a senior service engineer and AI administrator in Microsoft Digital who manages Copilot features, Agent 365, and enterprise AI operations. He partners with internal product groups and security stakeholders to make sure AI tools and agents are deployed responsibly and governed effectively.

More on AI agents and governance at Microsoft


Inside Microsoft: Reclaiming engineering time with AI in Azure DevOps

Session description: AI tools embedded directly into Azure DevOps (ADO) are changing how engineering teams work, eliminating manual tasks without creating separate tools or increasing cognitive load. This session explores how ADO AI Chat and the AI Work Item Assistant accelerate coding workflows at Microsoft. You’ll learn how to improve your backlog quality, sprint hygiene, and downstream effectiveness of GitHub Enterprise and Copilot, helping your teams reclaim capacity and focus on the work that moves products forward.

Speakers: Gopal Panigrahy and Sumit Dutta

A photo of Panigrahy

Gopal Panigrahy is a product leader and member of our product management team in Microsoft Digital. He’s an advocate for our customer-first approach to product development and is passionate about helping people overcome challenges in the era of AI.

A photo of Dutta

Sumit Dutta is a product-minded technology leader working at the intersection of AI, enterprise platforms, and scalable product design. Offering a strong blend of engineering knowledge and product strategy, he focuses on building systems that are not just functional but also extensible and reliable.

More on AI and IT engineering at Microsoft


How Microsoft does IT: Microsoft 365 governance in the age of Copilot and agents

Session description: Microsoft 365 Copilot and Copilot agents are powerful tools, but without proper governance, you could be putting your company at risk. In this lightning talk, you’ll learn how Microsoft Digital protects our enterprise while enabling employee innovation with Copilot and agents.

Speaker: David Johnson

A photo of Johnson

Johnson brings hands-on experience operating Copilot and AI-powered agents inside Microsoft, with a focus on identity, permissions, data boundaries, and real-world misuse prevention. He takes real-world lessons and makes them practical for others.

More on governance at Microsoft


Accelerating AI adoption with Copilot controls: Lessons from Microsoft Digital

Session description: Microsoft 365 Copilot and AI agents unlock productivity gains, but without careful oversight they can also introduce security and compliance risks. The session covers how the Copilot Control System helps scale AI safely, including adoption insights and satisfaction signals. You’ll also see demos of popular agents, including the Employee Self-Service Agent and the Admin agent.

Speakers: Amy Ceurvorst and Reshma Kapoor

A photo of Ceurvorst

Amy Ceurvorst is a director of business programs in Microsoft Digital. She’s worked extensively with Copilot controls and evangelizes a unified view of the reports that help administrators understand Copilot health.

A photo of Kapoor

Reshma Kapoor is a senior product manager in Microsoft Digital with 20 years of experience leading and shipping products at scale. She is customer‑obsessed, grounding product decisions in real customer signals to deliver intuitive, high‑impact experiences.

More on AI and Copilot adoption and deployment


How Microsoft does IT: Driving adoption of Microsoft 365 Copilot and agents across Microsoft

Speakers: Cadie Kneip and Stephan Kerametlian

Session description: Our team at Microsoft Digital led the first enterprise-scale deployment of Microsoft 365 Copilot, launching to more than 300,000 employees and vendors worldwide. Learn how the team drove adoption using change management strategies to encourage employees to thread Copilot into their daily work. Now we’re doing the same for agents across the enterprise. Learn best practices for accelerating adoption and maximizing value while guiding your own journey with Copilot and AI agents.

A photo of Kneip

Cadie Kneip is a senior business program director and the Copilot Champs community lead in Microsoft Digital. She specializes in turning complex AI initiatives into confidence-building pathways that help employees thrive in an AI-powered workplace. 

A photo of Kerametlian

Stephan Kerametlian is a senior director in Microsoft Digital, where he leads our global change management efforts for Copilot and agents. He thrives on learning how people use AI and on finding ways to get more people to embrace the technology.

More on adoption and deployment of Copilot and agents


Real-world adoption stories: A fireside chat with a key customer

Session description: Pull back the curtain on the customer experience with Copilot adoption. Join this fireside chat with a Microsoft customer to hear about lessons learned and the real impact that Copilot is delivering across their organization. You’ll glean practical insights you can apply immediately at your own company. 

Speakers: Karuana Gatimu and Sam Crewdson

A photo of Gatimu

Karuana Gatimu is a director of Customer Advocacy – AI & Collaboration in Microsoft Digital and a solution architect driven by a passion for people, storytelling, and leadership. With 30 years of experience at the intersection of technology and human impact, she turns complex innovation into compelling narratives that help organizations adopt change and deliver business value.

A photo of Crewdson.

Sam Crewdson, a principal product manager in Microsoft Digital, is passionate about turning user insights into product improvements. His work focuses on driving adoption of the latest SharePoint features and helping users take advantage of the power of both SharePoint and OneDrive. Working at the intersection of IT, users, feedback, and strategy, he translates real‑world business needs into collaborative experiences that scale.  

More insights on Copilot adoption


The post Unfolding our AI-in-IT story: What to expect at the 2026 Microsoft 365 Community Conference appeared first on Inside Track Blog.

]]>
23224
Microsoft CISO advice: Read our four tips for securing your network http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-read-our-four-tips-for-securing-your-network/ Thu, 19 Mar 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22779 Geoff Belknap, CVP and operating CISO for Core and Enterprise, shares four key practices your business can use to be prepared for managing network security incidents. Learn from our experience Network isolation (Secure Future Initiative) “Knowing where devices are, who owns them, and what they’re supposed to be doing is pretty important in the middle […]

The post Microsoft CISO advice: Read our four tips for securing your network appeared first on Inside Track Blog.

]]>
Geoff Belknap, CVP and operating CISO for Core and Enterprise, shares four key practices your business can use to be prepared for managing network security incidents.

“Knowing where devices are, who owns them, and what they’re supposed to be doing is pretty important in the middle of an incident,” Belknap says.

Watch this video to see Geoff Belknap discuss how we’re securing our network at Microsoft. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=nWPaaTHGE-M.)

Key takeaways

Here are best practices you can use to secure your network:

  • Build a complete inventory. Keep track of what your network devices are, who owns them, and what they do.
  • Capture robust telemetry. Make sure your operational teams have the tools they need to see and analyze access and authentication logs.
  • Use dynamic access control. Manage who can send packets on the corporate network by applying policies.
  • Deprecate old network assets. Cyberattackers know to look for older, unpatched network devices. You can reduce the attack surface by replacing older devices.
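
As a rough illustration of the inventory and deprecation tips above, here is a minimal Python sketch that flags devices with no recorded owner, no recorded role, or an old patch date. The `Device` fields, the 180-day threshold, and the sample device names are assumptions made for this example, not details of how Microsoft tracks its network.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Device:
    """One inventory record (hypothetical fields for illustration)."""
    name: str
    owner: str          # accountable team or person; empty if unknown
    role: str           # what the device is supposed to be doing
    last_patched: date  # date of the most recent patch


def needs_attention(devices, today, max_patch_age_days=180):
    """Return (device name, reason) pairs for inventory gaps and stale devices."""
    findings = []
    for d in devices:
        if not d.owner:
            findings.append((d.name, "no owner recorded"))
        if not d.role:
            findings.append((d.name, "no role recorded"))
        if (today - d.last_patched).days > max_patch_age_days:
            findings.append((d.name, "patch age exceeds threshold"))
    return findings


# Sample data, invented for this sketch.
inventory = [
    Device("core-sw-01", "netops", "core switch", date(2026, 2, 1)),
    Device("lab-rtr-07", "", "lab router", date(2024, 6, 15)),
]
findings = needs_attention(inventory, today=date(2026, 3, 19))
for name, reason in findings:
    print(f"{name}: {reason}")
```

In practice the records would come from a discovery tool or asset database rather than a hard-coded list, and the threshold would follow your own patching policy.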

]]>
22779
Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-explore-our-four-tips-for-securing-your-customer-support-ecosystem/ Thu, 12 Mar 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22635 Microsoft business operations teams know all too well that cyberattackers seek to exploit customer support pathways. Tools that can unlock customer accounts or aid in troubleshooting issues in complex environments are a rich target. “The path attackers really like to use is to compromise support tooling and laterally move to your core tooling,” says Raji […]

The post Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem appeared first on Inside Track Blog.

]]>
Microsoft business operations teams know all too well that cyberattackers seek to exploit customer support pathways. Tools that can unlock customer accounts or aid in troubleshooting issues in complex environments are a rich target.

“The path attackers really like to use is to compromise support tooling and laterally move to your core tooling,” says Raji Dani, Deputy Chief Information Security Officer (CISO) for Microsoft business operations.

Dani and her team focus on understanding and mitigating the risks within customer support operations. In this video, she shares principles and practices for every business that relies on online tools in their customer support ecosystem.

Watch this video to see Raji Dani discuss four customer support ecosystem security principles. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=rJ87jjz3vvo.)

Key takeaways

Here are best practices you can apply to your customer support ecosystem:

  • Create dedicated and isolated support identities. Use standardized support identities with phish-resistant multifactor authentication based in a separate identity ecosystem.
  • Implement least privilege and enforce device protection. Only grant the access needed for a given task and nothing more.
  • Ensure tooling does not have high privilege access to customer data. Architect secure tools and manage service-to-service trust and high privileged access.
  • Implement strong telemetry. Anomalous patterns in logs and telemetry data are often the first clue a cyberattack is underway.
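
To make the telemetry tip concrete, here is a minimal Python sketch that flags hours in which a count (for example, failed sign-ins on a support tool) sits far above a baseline. The z-score approach, the threshold of 3.0, and the sample numbers are assumptions chosen for illustration; the article does not describe a specific detection method.

```python
from statistics import mean, pstdev


def anomalous_hours(baseline, recent, threshold=3.0):
    """Return indexes in `recent` whose count is far above the baseline.

    A count is flagged when it lies more than `threshold` population
    standard deviations above the baseline mean.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A flat baseline has no spread, so flag anything above it.
        return [i for i, count in enumerate(recent) if count > mu]
    return [i for i, count in enumerate(recent) if (count - mu) / sigma > threshold]


# Hourly failed sign-in counts, invented for this sketch.
baseline_counts = [4, 6, 5, 5, 4, 6]
recent_counts = [5, 6, 40, 4]
flagged = anomalous_hours(baseline_counts, recent_counts)
print(flagged)
```

A simple threshold like this is only a starting point. Real telemetry usually needs seasonality handling and context about who accessed what.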

]]>
22635
Hardening our digital defenses with Microsoft Baseline Security Mode http://approjects.co.za/?big=insidetrack/blog/hardening-our-digital-defenses-with-microsoft-baseline-security-mode/ Tue, 18 Nov 2025 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=20811 Security isn’t just a feature—it’s a foundation. As threats grow more varied, widespread, and sophisticated, enterprises need to rethink how they protect their environments. That’s why we, in Microsoft Digital, the company’s IT organization, took a necessary step forward and deployed Microsoft Baseline Security Mode internally across the company. Engage with our experts! Customers or […]

The post Hardening our digital defenses with Microsoft Baseline Security Mode appeared first on Inside Track Blog.

]]>
Security isn’t just a feature—it’s a foundation.

As threats grow more varied, widespread, and sophisticated, enterprises need to rethink how they protect their environments. That’s why we, in Microsoft Digital, the company’s IT organization, took a necessary step forward and deployed Microsoft Baseline Security Mode internally across the company.

Baseline Security Mode is a new approach to endpoint protection that enforces secure-by-default configurations across our enterprise. And it’s not just about locking things down—it’s about doing so in a way that’s scalable, manageable, and respectful of user experience.

This is a story for every organization trying to balance usability with security. Baseline Security Mode is designed to help IT teams enforce protections without breaking productivity. It’s a shift toward proactive defense with standardized secure settings.

Understanding the need for Microsoft Baseline Security Mode

Security must evolve with the environment.

At Microsoft Digital, we’ve built a strong foundation of endpoint protection over the years. But as our ecosystem expanded—more devices, more workloads, more diverse user needs—we saw an opportunity to take our security posture to the next level.

Our existing configurations were effective, but they reflected the natural complexity of a large enterprise. Different teams had different requirements. Some relied on legacy technologies that had served them well. Others needed flexibility to support specialized workflows. Over time, this led to variation in how security policies were applied.

We wanted to unify that approach.

Baseline Security Mode emerged as a way to streamline and strengthen our defenses. It was about building on what worked. We started by identifying areas where legacy protocols and configurations could be modernized. That included technologies like ActiveX controls and older authentication flows, which we carefully evaluated and phased out where appropriate.

We also improved how we gather and use telemetry. Initially, we had limited visibility into how certain features were used. That made it harder to predict the impact of changes. So, we ran pilots, collected feedback, and refined our approach. Baseline Security Mode was a game changer here, providing built-in reports that gave us the visibility we needed to observe the impact of applying settings in our environment. For example, when we reviewed blocking legacy file formats, we discovered that some workflows depended on them. We responded quickly, offering alternatives and guiding users through the transition.

Ease of use was a priority.

We built intuitive controls into the Microsoft 365 admin center, allowing IT admins to manage policies with just a few clicks. No more manual scripts. No more guesswork. We also introduced exception handling to support specialized needs, ensuring that security didn’t come at the cost of productivity.

We worked closely with internal stakeholders, including compliance teams and work councils, to validate every step and build trust. We made sure the experience was smooth, the tools were reliable, and the changes were clearly communicated.

This wasn’t just a technical upgrade—it was a cultural shift.

Baseline Security Mode gave us a way to unify our security posture while honoring the diversity of our environment. It’s a smarter, more scalable way to protect our endpoints, and it reflects everything we’ve learned from years of experience.

Putting consistent security configuration into practice

Baseline Security Mode establishes a new standard, enabling organizations to be secure by default.

It is the result of a collaborative effort among multiple product teams at Microsoft, building on their security and incident-handling expertise. It’s designed to simplify and strengthen endpoint protection across Windows and Microsoft 365. The feature lives in the Microsoft 365 admin center, where IT admins can enforce modern security policies with just a few clicks.

The product teams delivered 22 features across five workloads: Office, OneDrive and SharePoint, Teams, Substrate, and Identity. Each one targets a specific risk, such as blocking legacy authentication, disabling insecure protocols, or restricting ActiveX.

When we deployed Baseline Security Mode as Customer Zero at Microsoft Digital, our job was to validate these features and controls in real-world enterprise conditions.

We pushed for exception handling.

Some users still relied on legacy formats or protocols. Certain teams, for example, needed access to older Office features. So, we worked with the product team to ensure exceptions could be built into the UI.

That flexibility was key. We knew from experience that without it, customers might hesitate to adopt the feature.

“When we blocked certain file formats, users were confused by the error messages and thought they were blocked from saving the file,” says Harshitha Digumarthi, a senior product manager at Microsoft Digital. “So, we ran pilots, gathered feedback, and helped the product team build an improved error experience to save blocked formats to safe, newer formats.”

We also pushed for better telemetry.

A photo of Gonis.

“When we heard about Baseline Security Mode, it was still in ideation. There were no tools in the Microsoft 365 admin center yet. We had to figure out how to enable this internally while the product team built the capabilities in parallel.”

Markus Gonis, senior service engineer, Microsoft Digital

At first, we had only a few days of data. That wasn’t enough to understand how features were used or what impact they would have. So we worked with the product team to expand telemetry, improve error reporting, and reduce false positives, including identifying bugs that skewed metrics and made troubleshooting harder.

We ran the deployment through our Tenant Trust Program and work council reviews to ensure global compliance. That gave us—and our customers—confidence.

Baseline Security Mode isn’t just a feature. It’s a shift in how we think about security, and we’re proud to have helped shape it.

Deploying Baseline Security Mode at Microsoft Digital

Rolling out Baseline Security Mode wasn’t just a technical exercise—it was a cross-team effort that demanded precision, patience, and partnership.

Microsoft Digital took the lead on deployment. We acted as Customer Zero, testing every feature in real-world conditions before it reached customers. That meant working closely with the product team to validate functionality, identify bugs, and shape the user experience.

“When we heard about Baseline Security Mode, it was still in ideation,” Gonis says. “There were no tools in the Microsoft 365 admin center yet. We had to figure out how to enable this internally while the product team built the capabilities in parallel.”

Telemetry was limited. We had only 30 days of data to work with. That made it hard to predict how changes would affect users, so we ran pilots with internal user acceptance testing cohorts and we deployed in phases.

For some legacy protocols, usage was low. In these cases, the features being deployed made removing these protocols seamless. Where usage was higher or unclear, a more detailed approach was required.

First, a few thousand users. Then 50,000. Then 100,000. Eventually, the entire Microsoft tenant. We paused between each wave to monitor help desk tickets, gather feedback, and confirm that our mitigation strategies were working.
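
The wave-and-pause pattern described here can be sketched in a few lines of Python. This is a hypothetical illustration, not the tooling we used: the 0.5 percent help desk ticket threshold, the function names, and the sample ticket counts are all assumptions, and 3,000 simply stands in for "a few thousand."

```python
def should_advance(users_in_wave, help_desk_tickets, max_ticket_rate=0.005):
    """Return True if the ticket rate for a wave is within the allowed limit."""
    if users_in_wave <= 0:
        raise ValueError("users_in_wave must be positive")
    return help_desk_tickets / users_in_wave <= max_ticket_rate


def cleared_waves(wave_sizes, tickets_per_wave, max_ticket_rate=0.005):
    """Walk the waves in order and stop at the first one that exceeds the limit.

    Returns the sizes of the waves that passed the pause-and-review check.
    """
    cleared = []
    for size, tickets in zip(wave_sizes, tickets_per_wave):
        if not should_advance(size, tickets, max_ticket_rate):
            break  # hold the rollout here and investigate before continuing
        cleared.append(size)
    return cleared


# Wave sizes loosely follow the text; ticket counts are invented.
waves = [3_000, 50_000, 100_000]
tickets = [6, 400, 100]
result = cleared_waves(waves, tickets)
print(result)
```

Ticket volume is only one signal. A real gate would also weigh user feedback and whether mitigations are working, as the paragraph above notes.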

Communication was critical.

We ran targeted campaigns, sent individual emails, and published technical reports explaining what was changing, why it mattered, and how users could adapt. We even used Viva Engage to notify users directly. It was important to explain to users why longstanding functionality was being removed and how they could mitigate any impact.

We did a lot of work with the product team to ensure the user experience and the IT pro experience both exceeded expectations.

“It was a great Customer Zero experience,” says John Philpott, principal product manager within Microsoft Digital. “Our security teams stood to benefit from Baseline Security Mode features, and we helped the product team find bugs and the issues that just hadn’t come up in early testing or at a large scale. It was a win-win situation.”

We flagged inconsistencies in policy syntax, pushed for better error handling, and worked with the product team to align deployment tools across workloads.

But we didn’t stop at deployment. We tracked progress, validated telemetry, and signed off on each feature before it moved into broader rollout. We even helped pave the way for the next iterations, identifying features that needed more design work or deeper telemetry before they could be deployed.

This was a true partnership. The product team built the features. We tested them, validated them, and helped make them better.

Baseline Security Mode is now live across Microsoft. And it’s ready for the world.

Capturing real benefits

Baseline Security Mode is more than a set of policies—it’s a platform for proactive defense.

The product team built it to reduce legacy risks and enforce modern security standards across Microsoft 365 workloads. Microsoft Digital validated it in production, surfacing bugs, shaping telemetry, and confirming that the features worked as intended.

We tested 22 features across Office, OneDrive & SharePoint, Substrate, Identity, and Teams. Each one targeted a specific vulnerability—like blocking ActiveX controls, disabling Exchange Web Services, or enforcing phishing-resistant authentication for admins.

We flagged critical ActiveX dependencies in third-party apps, something the product group hadn’t found, which enabled them to initiate removal. That kind of early detection helped fix issues before the features reached customers.

We found regressions in PowerShell and legacy authentication flows. The OneDrive and SharePoint team caught a high-impact bug and worked with the product team to resolve it.

That validation mattered.

We also helped shape the admin experience.

Exception handling was built into the UI. Admins could create security groups, assign users, and manage exclusions directly in the Microsoft 365 admin center.

“There’s no need to handle everything manually,” Philpott says. “Simply click here and then here to disable. It’s a much simpler process.”

Extending benefits to Microsoft customers

Baseline Security Mode is ready for enterprise.

We’ve tested it. We’ve hardened it. And we’ve made it easier to adopt.

Microsoft Digital’s deployment journey helped shape the product into something customers can trust. We didn’t just validate features—we made sure they worked in real-world environments, across diverse teams, and under the pressure of scale.

The product team designed the features to be enterprise-ready. We ran them through our Tenant Trust Program and work council reviews to ensure compliance across global regions. That gave us confidence, and it gave customers confidence too.

The benefits are clear. We’ve reduced our attack surface. We’ve improved compliance. We’ve made it easier for IT teams to enforce security without disrupting workflows. And we’ve laid the groundwork for secure-by-default computing across Microsoft.

Customers can do the same.

Start small. Run pilots. Monitor impact. Use the tools in the Microsoft 365 admin center to deploy policies, manage exceptions, and guide users through the change. And don’t be afraid to ask for help—our journey has shown that collaboration between deployment teams and product teams makes all the difference.

Baseline Security Mode is ready, and we’re ready to help others adopt it.

Looking ahead

The first wave of Baseline Security Mode—BSM 2025—delivered 22 features across five major workloads. Microsoft Digital helped validate and deploy those features across the enterprise. And the next wave of features is already in motion.

And it’s bigger, with 46 features, more than double what we had in the first round. The product team is expanding coverage to include deeper protocol restrictions, broader app controls, and more granular authentication policies.

We’re also preparing for broader industry adoption.  

Governments, regulators, and enterprise customers are asking for secure-by-default configurations. Baseline Security Mode is our answer. And the next version will make it even easier to adopt.

We’ll continue to lead as Customer Zero. We’ll test new features, validate insights surfaced by telemetry, and share feedback with the product team. We’ll run pilots, monitor impact, and guide users through the change. And we’ll keep pushing for simplicity, scalability, and trust.

Because security isn’t a one-time project. It’s a mindset, and it’s Microsoft’s highest priority.

Key takeaways

Ready to adopt Baseline Security Mode? Here are some actions we recommend based on our deployment experience:

  • Start with a pilot: Test Baseline Security Mode with a small group of users to identify legacy dependencies and gather feedback before scaling.
  • Use the Microsoft 365 admin center for deployment: Apply policies and manage exceptions directly through the UI—no scripting required.
  • Identify and plan for exceptions early: Work with business units to understand where legacy formats or protocols are still needed and create security groups for exclusions.
  • Communicate proactively with users: Launch campaigns to explain upcoming changes, their impact, and how users can adapt.
  • Validate telemetry and error reporting: Ensure your environment captures enough data to monitor the impact of new policies and troubleshoot effectively.
  • Engage your compliance and governance stakeholders: Review new policies with internal governance teams to ensure alignment with organizational and regional standards.
  • Treat security as an ongoing journey: Continue to monitor, iterate, and evolve your security posture as new threats and features emerge.

]]>
20811
Powering agentic AI adoption at Microsoft: Our ‘Customer Zero’ story http://approjects.co.za/?big=insidetrack/blog/powering-agentic-ai-adoption-at-microsoft-our-customer-zero-story/ Thu, 13 Nov 2025 18:45:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=20862 At Microsoft, we are enabling our employees, teams, and organizations to build AI agents to help them complete important tasks—from individual employees in the personal productivity tenant all the way to enterprise-wide agents that are available to everyone. Engage with our experts! Customers or Microsoft account team representatives from Fortune 500 companies are welcome to […]

The post Powering agentic AI adoption at Microsoft: Our ‘Customer Zero’ story appeared first on Inside Track Blog.

]]>
At Microsoft, we are enabling our employees, teams, and organizations to build AI agents to help them complete important tasks—from individual employees in the personal productivity tenant all the way to enterprise-wide agents that are available to everyone.

In short, we’re all-in on agentic AI, and we want to help you get there, too.

“We’ve made a lot of progress deploying and driving adoption of Microsoft 365 Copilot since it was released, and we’re now doing the same when it comes to enabling our employees and our teams to build agents that make us more productive,” says Brian Fielder, vice president of Microsoft Digital, the company’s IT organization. “We’re Customer Zero at Microsoft, which means we’re the first to deploy and use the technology and services that we sell to our customers. Those learnings give us a unique perspective and story to share with you about the journey we’ve been on with AI and agents.”

We have two collections of agentic AI content that we think will be useful to you.

The first set of stories documents our vision and strategy for agents. They walk you through our experience deploying agentic AI, our work to create tools that enable our employees to dive in, and, through smart governance, empower everyone at Microsoft to be confident and creative with how they use agents while keeping the company safe and secure.

Our second set of stories highlights some of the most interesting and effective agents that our employees, teams, and organizations have built. These stories not only give you examples of agents that we’ve built, they also show how you can build similar agents for your organization based on the collective experience of our employees and teams at Microsoft.

“We hope you find reviewing the journey we’ve been on practical and useful,” Fielder says. “When it comes to agents, we’re still at the start. We expect to learn much more as we continue, lessons we’ll share here—stay connected and we’ll continue to share our story with you.”  


Deploying agentic AI at Microsoft


Agents we’ve deployed internally at Microsoft


Key takeaways

We hope that you find our agentic AI stories useful. We wanted to share a mixture of our strategy and vision around enabling our employees to deploy agents, and to share stories that feature some of the most promising agents that our employees and teams have built and deployed.

We also understand that it can feel challenging to know where to start—it was for us. Here are some things we learned along the way that should help you:

  • Governing agents is complex and depends on the overall AI maturity of your organization. Start slowly to build that maturity before unleashing too many new agents in your environment.
  • A strong policy framework is the foundation. Lean on existing app governance policies, then layer agent-specific structures on top.
  • Invest in data infrastructure and AI platforms. Building robust data infrastructure ensures your organization is prepared to leverage AI, and supports scalable, innovative, and secure AI-driven solutions.
  • Develop a building environment strategy. Decide what scenarios match up with specific environments and make the right environments available to the relevant employees.
  • Global regulations around categories like privacy, security, and responsibility provide a good baseline for establishing governance policies. Have relevant teams work through these regulations and incorporate their insights into your agent governance.
  • Foster a culture of creativity and teamwork. Champion an AI-forward culture where innovation and collaboration drive the adoption of agentic AI.
  • Develop AI expertise through training and development. As agentic AI transforms workflows and business outcomes across every industry, upskilling will empower your teams to navigate the rapid advances of AI, drive innovation, and ensure your organization stays competitive.
  • Align AI initiatives with strategy. Ensuring AI initiatives align with business goals maximizes their impact and positions your organization to succeed in the rapidly evolving world of agentic AI.
  • Implement ethical AI practices. You can use Microsoft’s Responsible AI principles as a guide. Adopting ethical AI practices builds trust, ensures responsible innovation, and prepares your organization to navigate the evolving landscape as AI becomes central to business operations and decision-making.

]]>
20862
Vuln.AI: Our AI-powered leap into vulnerability management at Microsoft http://approjects.co.za/?big=insidetrack/blog/vuln-ai-our-ai-powered-leap-into-vulnerability-management-at-microsoft/ Thu, 16 Oct 2025 16:05:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=20623 In today’s hyperconnected enterprise landscape, vulnerability management is no longer a back-office function—it’s a frontline defense. With thousands of devices from a multitude of vendors, and a relentless stream of Common Vulnerabilities and Exposures (CVEs), here at Microsoft we faced a challenge familiar to every IT decision maker: how to scale vulnerability response without scaling […]

The post Vuln.AI: Our AI-powered leap into vulnerability management at Microsoft appeared first on Inside Track Blog.

]]>
In today’s hyperconnected enterprise landscape, vulnerability management is no longer a back-office function—it’s a frontline defense. With thousands of devices from a multitude of vendors, and a relentless stream of Common Vulnerabilities and Exposures (CVEs), here at Microsoft we faced a challenge familiar to every IT decision maker: how to scale vulnerability response without scaling cost, complexity, or risk.

Enter Vuln.AI, an intelligent agentic system developed by our team in Microsoft Digital—the company’s IT organization—to transform how we identify, prioritize, and resolve vulnerabilities across our enterprise network.

Manual methods can’t keep up

As a company, we detect over 600 million cybersecurity threats every day, according to our latest Digital Defense Report. Some of those signals are bad actors probing our internal network and infrastructure looking for unpatched vulnerabilities. Our infrastructure supports over 300,000 employees and vendors, 25,000 network devices, and over 560 buildings across 102 countries. This scale means we face a constant stream of vulnerabilities—each requiring triage, impact analysis, and remediation.

“While AI enables amazing capabilities for knowledge workers, it also increases the threat landscape, since bad actors using AI are constantly probing for vulnerabilities. Vuln.AI helps keep Microsoft safe by identifying and accelerating the mitigation of vulnerabilities in our environment,” says Brian Fielder, a vice president within Microsoft Digital. 

Historically, our Infrastructure, Networking, and Tenant team here in Microsoft Digital relied on manual assessments to determine which network devices were impacted by new vulnerabilities. Traditional vulnerability scanning tools generate a lot of false positives and false negatives, and a significant amount of analysis still falls to security engineers, requiring manual validation before any vulnerability impact can be communicated to device owners. These manual methods were time-consuming, error-prone, and reactive—our security engineers were spending hours on each vulnerability, at times missing critical threats or sinking too much time into false alarms.

With the vast number of vulnerabilities coming in every day, security engineers needed a scalable way to quickly analyze, prioritize, and respond.

The solution: Vuln.AI

We already achieved dramatic impact with our AI Ops and Network Infrastructure Copilot, which is on track to save us over 11,000 hours of network service management time per year. We built Vuln.AI on top of that investment:

  1. The Research Agent analyzes vulnerability feeds and network metadata from our Infrastructure Data Lakehouse (IDL) built on top of Azure Data Explorer, which regularly ingests data from our device vendors and other sources. Once new vulnerabilities are detected, it automates the identification of impacted devices and integrates with other internal tooling for validation and reporting.
  2. The Interactive Agent acts as a gateway for engineers and device owners to ask follow-up questions and initiate remediation. Through agent-to-agent interaction, it leverages our Network Infrastructure Copilot to query the research agent’s findings. This agentic interface enables real-time decision-making and contextual insights.

Together, these agents are significantly improving our network security operations. The results we’re seeing so far are compelling:

  • A 70% reduction in time to vulnerability insights, enabling faster prioritization and mitigation, minimizing exposure windows.
  • Lower risk of compromise through increased accuracy, quicker detection, and containment of threats.
  • A stronger compliance posture that supports adherence to financial, legal, and regulatory requirements.
  • Higher accuracy in identifying vulnerable devices, reducing false positives and missed threats.
  • Engineering hours saved and reduced fatigue, significantly improving productivity.

Our gains translate to lower operational risk, faster response times, and more resilient infrastructure—critical outcomes for any enterprise navigating today’s threat landscape.

“AI’s true power lies in the problem it’s applied to,” says Ankit Bansal, a senior product manager within Microsoft Digital. “Start by identifying the most time-consuming or painful task in your organization, then explore how AI can augment or improve it. Begin with a small, targeted enhancement and iterate continuously.”

How Vuln.AI works

The system continuously ingests CVE data from our device suppliers’ API feeds and from a publicly available database of known cybersecurity vulnerabilities. It correlates that data with device attributes, such as hardware model and OS, to identify the potential impact on the network and surface actionable insights.
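To make the correlation step concrete, here is a minimal Python sketch. It is not Vuln.AI's implementation: the `Device` and `Advisory` types, the device names, and the placeholder CVE identifier are all hypothetical, and matching on exact model and OS strings is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    model: str
    os_version: str


@dataclass(frozen=True)
class Advisory:
    cve_id: str
    affected_models: frozenset
    affected_os_versions: frozenset


def impacted_devices(advisory, inventory):
    """Return devices whose model and OS version both appear in the advisory."""
    return [
        d for d in inventory
        if d.model in advisory.affected_models
        and d.os_version in advisory.affected_os_versions
    ]


# Hypothetical inventory and advisory, for illustration only.
inventory = [
    Device("sw-01", "SwitchX-48", "9.3.1"),
    Device("sw-02", "SwitchX-48", "9.4.0"),
    Device("rt-01", "RouterY-8", "9.3.1"),
]
advisory = Advisory(
    cve_id="CVE-0000-0001",  # placeholder identifier, not a real CVE
    affected_models=frozenset({"SwitchX-48"}),
    affected_os_versions=frozenset({"9.3.1", "9.3.2"}),
)

hits = impacted_devices(advisory, inventory)
print(f"{len(hits)} device(s) impacted by {advisory.cve_id}: {[d.name for d in hits]}")
```

In this toy inventory only `sw-01` matches on both attributes. A production system would also need to handle version ranges and vendor-specific naming, which this sketch leaves out.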

Engineers interact with the system via Copilot, Teams, or custom tooling, which allows seamless integration with our network security teams’ daily workflows.

“We built a hybrid approach in Vuln.AI to guide LLMs through complex security advisories,” says Blaze Kotsenburg, a software engineer in Microsoft Digital. “By combining structured function calls, templated prompts, and data validation, we keep the model focused on producing reliable, actionable insights for vulnerability mitigation.”
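One way to picture combining templated prompts with data validation is the Python sketch below. It makes no model call and is not the team's code: the prompt wording, the JSON keys, and the `validate_reply` helper are assumptions chosen for illustration.

```python
import json
from string import Template

# Hypothetical prompt template; the key names are illustrative assumptions.
PROMPT = Template(
    "Read the advisory below and reply with JSON only, using the keys "
    '"cve_id", "affected_models", and "fixed_version".\n\n'
    "Advisory:\n$advisory_text"
)


def build_prompt(advisory_text):
    """Fill the template so every advisory is presented the same way."""
    return PROMPT.substitute(advisory_text=advisory_text)


def validate_reply(raw_reply, known_models):
    """Parse a model reply and reject anything malformed or unknown."""
    data = json.loads(raw_reply)  # raises ValueError on non-JSON output
    for key in ("cve_id", "affected_models", "fixed_version"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
    if not isinstance(data["affected_models"], list):
        raise ValueError("affected_models must be a list")
    unknown = set(data["affected_models"]) - set(known_models)
    if unknown:
        raise ValueError(f"models not in inventory: {sorted(unknown)}")
    return data


# A well-formed reply passes; a reply naming an unknown model is rejected.
reply = '{"cve_id": "CVE-0000-0001", "affected_models": ["SwitchX-48"], "fixed_version": "9.4.0"}'
result = validate_reply(reply, known_models={"SwitchX-48", "RouterY-8"})
```

Checking the reply against the known inventory means a hallucinated model name fails loudly instead of flowing into a report.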

When it came to building Vuln.AI, we relied heavily on our own technology platforms, including: 

  • Azure AI Foundry for model development and deployment
  • Azure Data Explorer to store device metadata and CVEs
  • Agent-to-agent interaction with Network Copilot to query our database for device and inventory knowledge
  • Azure OpenAI models for natural language processing and classification
  • Azure Durable Functions for fine-grained orchestration and custom LLM workflows

“We chose Durable Functions for Vuln.AI because it allowed us to confidently orchestrate complex, stateful research,” says Mike Lollis, a senior software engineer in Microsoft Digital.  “The reliability and simplicity of the framework meant we could shift our focus to engineering the intelligence behind the agent, especially the prompting strategies used in Vuln.AI’s backend processing.”

Vuln.AI in action

Consider a common scenario: a new CVE that affects a network switch has just been published. Vuln.AI’s research agent immediately flags the vulnerability, maps it to potentially affected devices in our network inventory, and pushes the findings to an internal database.

This data then becomes immediately accessible in our internal tools, where it is validated and approved by security engineers. Following this, network engineers are provided with precise information about their vulnerable devices.

Engineers can prompt Vuln.AI’s interactive agent to instantly retrieve a summary like the following:

“12 devices impacted by CVE-2025-XXXX. Would you like me to suggest some next steps for mitigation or remediation?”

With Vuln.AI, network engineers can now begin vulnerability response operations much more quickly—no spreadsheet wrangling and no delays.

“AI is only as good as the data you provide,” says Linda Lee, a product manager II within Microsoft Digital. “Much of the success with Vuln.AI came from our dedicated efforts to source comprehensive vulnerability data and device attributes. For effective AI-powered solutions, you really need to invest in a strong data foundation and a strategy for how to integrate into the rest of your infrastructure.”

It’s about automating manual workflows and research.

“Vuln.AI has reduced our triage time by over 50%,” says Vincent Bersagol, a principal security engineer in Microsoft Digital.

This is allowing our engineers to focus on deeper analysis.

“The synergy between security and AI engineering has unlocked a new level of precision in vulnerability insights,” Bersagol says. “This is just the beginning.”

The journey ahead

Our journey with AI-powered vulnerability management has only just begun. Looking ahead, our roadmap for Vuln.AI includes:

  • Extending data coverage to include more hardware suppliers
  • Integrating more detailed device profiles for more targeted vulnerability response
  • Supporting autonomous workflows to streamline network engineers’ remediation efforts
  • Incorporating other AI agents to support more security use cases

These enhancements will further reduce risk, accelerate response times, and empower engineers to focus on more strategic initiatives.

“Trust is the foundation of everything we do in Microsoft Digital,” Bansal says. “Securing our network is essential to upholding that trust. Intelligent solutions like Vuln.AI not only help us stay ahead of emerging threats—they also establish the blueprint for integrating AI more deeply into our security operations.”

For IT leaders, Vuln.AI offers a blueprint for modern vulnerability management:

  • Scalable: Handles thousands of devices and vulnerabilities with ease
  • Accurate: Reduces false positives and missed threats
  • Efficient: Saves time, money, and resources
  • Secure: Built on Microsoft’s trusted AI and security frameworks

In a world where every second counts and any threat can be costly, Vuln.AI transforms vulnerability management from a bottleneck into a competitive advantage for Microsoft.

Key takeaways

As your organization looks for ways to improve security and threat response in a fast-changing landscape, consider the following insights on how AI is reshaping vulnerability management at Microsoft:

  • Fight fire with fire: The threat landscape has broadened dramatically due to bad actors using AI. Supplementing your own efforts with AI can help you manage your risk more effectively than traditional vulnerability management.
  • Agility is key: Effective vulnerability response hinges on acting fast. An AI-powered solution like Vuln.AI can cut the time needed to analyze and mitigate vulnerabilities by over 50%, enabling organizations to enhance security operations at scale.
  • The future is now: Looking ahead, Microsoft Digital will integrate agentic workflows into more security operations, boosting efficiency in risk prevention, threat detection and response, thereby enabling security practitioners and developers to focus on more strategic projects.

Keeping our in-house optical network safe with a Zero Trust mentality
http://approjects.co.za/?big=insidetrack/blog/keeping-our-in-house-optical-network-safe-with-a-zero-trust-mentality/
Thu, 16 Oct 2025 16:00:00 +0000

The post Keeping our in-house optical network safe with a Zero Trust mentality appeared first on Inside Track Blog.

When it comes to corporate connectivity at Microsoft, a minute of lost connection can lead to catastrophic disruptions for our product teams, sleepless nights for our network engineers, and millions of dollars of lost value for the company.

That’s why we built our own optical network at our headquarters in Washington state, and that’s why we’re building similar networks at other regional campuses around the United States and the rest of the world.

With so much on the line, we need to make sure these in-house networks never go down.

But how are we doing that?

We’re applying the same robust Zero Trust approach we take to security and identity. While our optical networks are extremely reliable, any complex system can be knocked offline. In keeping with the Zero Trust mentality we have as a company, we trust the integrity of what we’ve built, but we don’t assume it can’t fail. We needed a backup system that goes beyond redundancy to provide true resilience.

Driven by this goal, we created a Zero Trust Optical Business Continuity Disaster Recovery (BCDR) network that combines two fully independent optical systems designed to sustain uninterrupted services, even during systemic failures. The result is more confidence for our employees and vendors, less pressure on our network engineers, and comprehensive network resilience that will protect us against a major outage.

The urgency of resilience

In 2021, our team in Microsoft Digital, the company’s IT organization, deployed our first next-generation optical network to serve the exclusive network needs of our Puget Sound metro campuses. It offers more bandwidth on less fiber for a lower operational cost than leasing from traditional carriers.

“Puget Sound is a highly concentrated developer network where we need to provide very high throughput,” says Patrick Alverio, principal group software engineering manager for Infrastructure and Engineering Services within Microsoft Digital. “Our optical system is the backbone of all that traffic.”

Our state-of-the-art optical network fulfills our need for fast and reliable connectivity at up to 400 Gbps between core sites, labs, data centers, and the internet edge. We built this network on the Reconfigurable Optical Add/Drop Multiplexer (ROADM) technology, delivering dynamic reconfiguration, colorless, directionless, contentionless (CDC) capabilities, flexible grid support, remote provisioning, and automation. It also features a full-mesh topology that provides a layer of redundancy.

But what if the entire ROADM-based system fails?

There are plenty of operational risks that can derail even the most robust network. Anything from misconfigured automation scripts to policy changes to misaligned software versioning to simple human error can cause outages.

To some degree, those kinds of minor disruptions are inevitable. But catastrophic events like fiber cuts, failures in the ROADM operating system, or even natural disasters have the potential for even more wide-ranging disruption.

During a catastrophic outage, thousands of engineers, developers, researchers, and other technical employees who need access to crucial lab environments and data centers could lose connectivity. That can sabotage feature delivery, disrupt product patches, interrupt updates, and halt all kinds of core product functions.

On top of normal software development operations, new AI tools demand massive bandwidth and consistent uptime. Finally, our hybrid networks feature paths integrated with Microsoft Azure that consume on-premises resources, so they also stand to benefit from increased resilience.

A catastrophic network outage can cause incredible damage to all of these business functions. In fact, we experienced exactly that in 2022.

A fiber cut combined with a ROADM system hardware reboot caused a five-minute outage in our Puget Sound metro region. In this environment, every minute of lost connectivity can result in significant financial impact, making network resilience absolutely essential.

“We don’t want even a second of downtime,” says Vinoth Elangovan, senior network engineer, who designed and implemented the Zero Trust Optical BCDR network for Microsoft. “We needed a life raft for when failures occur that could also function as a standby network for core site migrations or platform upgrades.”

Delivering greater network resilience

To ensure we could deliver uninterrupted network connectivity even in the midst of a catastrophic outage, we needed to consider the technical demands of a truly resilient system. Five design pillars helped us assemble our architectural criteria:

  1. Independent optical systems: To provide true resilience, our primary and BCDR platforms needed to operate autonomously.
  2. Physically independent paths: Circuits should avoid shared conduits, fibers, and splices to operate completely independently.
  3. Separate control software: The primary and backup networks should operate through dedicated network management systems (NMSs), automation, and provisioning domains.
  4. Unified client interface: Both systems needed to terminate into the same interface to unify service for clients and applications.
  5. Survivability by design: We couldn’t assume that any system would be immune to failure. Instead, we built for the best possible outcomes.
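The second pillar, physically independent paths, lends itself to a simple automated check. The Python sketch below assumes each circuit can be described as a list of the conduits, fibers, and splices it traverses. The element names are hypothetical, and a real audit would draw on fiber plant records.

```python
def shared_elements(primary_path, backup_path):
    """Return the physical elements that both circuits depend on."""
    return set(primary_path) & set(backup_path)


# Hypothetical circuit descriptions, for illustration only.
primary = ["conduit-A", "fiber-101", "splice-17", "conduit-C"]
backup = ["conduit-B", "fiber-202", "splice-31", "conduit-C"]

overlap = shared_elements(primary, backup)
if overlap:
    print("Not independent, shared:", sorted(overlap))
else:
    print("Paths are physically independent")
```

Here the two circuits share `conduit-C`, so a single cut in that conduit could take down both. Under the pillar above, any non-empty overlap is a design defect.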

The result was the Zero Trust Optical BCDR architecture, a layered approach to optical networking. It consists of our primary, ROADM-based transport layer and a secondary, MUX-based transport layer, both terminating into a single logical port channel.

“Our core responsibility is the employee experience, so our main design thrust was making sure service is seamless and uninterrupted—even during an outage.”

Vinoth Elangovan, senior network engineer, Hybrid Core Network Services, Microsoft Digital

Both systems are live and active, which means they deliver production services through their own independent fibers, power supplies, and software stacks. By layering fully independent optical domains and logically unifying them at the Ethernet edge, the network can sustain a complete failure of one system and maintain continuity.

That physical and operational independence is the difference between simple redundancy and robust resilience.

“Our core responsibility is the employee experience, so our main design thrust was making sure it’s seamless and uninterrupted—even during an outage,” Elangovan says.

Optical network backed by a BCDR network

A schematic of an optical network running between different nodes and backed up by a BCDR network.
The optical network in our Puget Sound region connects core sites to labs, datacenters, and the internet edge, while the BCDR network provides backup connections to deliver resilience in case of a catastrophic network failure.

A typical ROADM optical network connects campus and data center sites to the internet edge. Our design features three interconnected optical rings, with two internet edges as multi-directional nodes, while other sites operate as dual-degree nodes with bidirectional redundancy. Meanwhile, our campuses and datacenters are designated as critical sites and equipped with Optical BCDR links to ensure enhanced resiliency. In the event of a complete Optical ROADM line failure, these critical sites retain connectivity.

In the event of an outage on the primary network, the port channel handles forwarding continuity automatically, shifting WAN traffic between optical paths in real time.

The transition occurs seamlessly and transparently, with no noticeable impact to clients.

Coupling at the Ethernet layer provides clients and applications with one logical interface, automatic load balancing and traffic distribution, and seamless failover, regardless of which optical domain is providing service.
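The behavior described above, one logical interface that spreads flows across member links and keeps forwarding when one fails, can be illustrated with a toy Python model. This is only a sketch: real port channels hash in switch hardware, and the `PortChannel` class, the member names, and the CRC32 hash are assumptions made for the example.

```python
import zlib


class PortChannel:
    """Toy model of one logical interface backed by several member links."""

    def __init__(self, members):
        self.state = {name: True for name in members}  # True means link is up

    def set_link(self, name, up):
        self.state[name] = up

    def pick_member(self, flow_id):
        """Hash a flow onto one of the members that is currently up."""
        active = sorted(n for n, up in self.state.items() if up)
        if not active:
            raise RuntimeError("no active member links")
        return active[zlib.crc32(flow_id.encode()) % len(active)]


# Hypothetical member names standing in for the two optical domains.
pc = PortChannel(["roadm-primary", "mux-bcdr"])
flows = [f"10.0.0.{i}->10.1.0.{i}" for i in range(8)]

before = {f: pc.pick_member(f) for f in flows}
pc.set_link("roadm-primary", False)  # simulate loss of the primary optical domain
after = {f: pc.pick_member(f) for f in flows}

print("members in use before failure:", sorted(set(before.values())))
print("members in use after failure:", sorted(set(after.values())))
```

After the simulated failure every flow lands on the surviving member, and clients still address the same logical interface.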

“Our initial goal was to provide high-throughput connectivity for major labs, with less than six minutes of downtime per year,” says Blaine Martin, principal engineering manager for Hybrid Core Network Services in Microsoft Digital. “That represents a service level of 99.999% network continuity, and we’re aiming for even better moving forward.”
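The relationship between an availability percentage and annual downtime is simple arithmetic, sketched here in Python. The 365.25-day year is an assumption, and the 99.9% and 99.99% rows are included only for comparison.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes, averaging in leap years


def downtime_minutes_per_year(availability_percent):
    """Convert an availability target into allowed downtime per year."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} minutes/year")
```

At 99.999%, the budget works out to roughly 5.26 minutes per year, which is consistent with the "less than six minutes" goal quoted above.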

A new era of confidence for network engineers

For the network engineers who keep Microsoft employees and resources connected, the Zero Trust Optical BCDR network relieves much of the pressure that comes from resolving outages.

When a network goes down, engineers have an enormous set of responsibilities to manage: processing the incident report, assigning severity, performing checks, notifying internal teams, providing updates, and engaging with physical support teams—all with a profound urgency to restore productivity.

Dialing those pressures back has been a huge benefit.

“Before, we were dependent on a single system, even with redundancies, so the human experience was like firefighting,” says Kevin Bullard, Microsoft Digital principal cloud network engineering manager responsible for maintaining WAN interconnectivity between labs. “Now, if the primary optical network is having a problem, I don’t even see it.”

There will always be pressure on network engineers to restore connectivity during an outage, but they can breathe easier knowing it won’t cost the company millions of dollars as the time to resolve ticks away. And in non-emergency situations like core site migrations, the BCDR network provides a much easier way to shunt services while the main network is offline.

“Our internal users have become more confident that they can stay connected, no matter what,” says Chakri Thammineni, principal cloud network engineer for Infrastructure and Engineering Services in Microsoft Digital. “That gives the people responsible for maintaining our enterprise networks incredible peace of mind.”

Fortunately, there hasn’t been a substantial network outage in the Puget Sound metro area since 2022. But our network engineering teams know that if and when it happens, the BCDR network will be ready to maintain service continuity.

With our Puget Sound network protected, we have plans in place to extend this model to other metro areas. Naturally, we have to balance population, criticality, and the knowledge that elevated reliability and availability come with a cost.

Our selection criteria for new BCDR networks have largely centered around two factors: expansions of AI-critical infrastructure and concentrations of secure access workspaces (SAWs) for technical employees. With these criteria in mind, we’re planning new BCDR networks first in the Bay Area and Dublin, then in Virginia, Atlanta, and London.

Zero Trust Optical BCDR architecture represents a paradigm shift in enterprise network resilience, and we’re committed to expanding the model to benefit both conventional workloads and the expanding infrastructure demands of AI.

“We’re always looking ahead into industry trends to stay at the bleeding edge, whether that’s in the technology we provide for our customers or the networks we use to do our own work,” Alverio says. “We refuse to accept the status quo, and we’re elevating the experience for employees across Puget Sound and Microsoft as a whole.”

Driving AI innovation in optical network resilience

Our journey towards an AI-driven optical network is gaining momentum.

As part of our Secure Future initiative, we’ve automated our Optical Management Platform credential rotation and are actively developing intelligent incident management ticket enrichment, auto-remediation, link provisioning, deployment validation, and capacity planning.

AI plays a central role in this transformation.

With Microsoft 365 Copilot and GitHub Copilot integrated into our engineering workflows, we’re accelerating development cycles, improving code accuracy, and uncovering optimization opportunities that would otherwise take hours of manual effort.

These Copilots are also helping our engineers analyze network patterns, simulate outcomes, and validate deployment logic before execution, reducing human error and strengthening our Zero Trust posture. Over time, we’re evolving toward a system where AI not only assists but proactively predicts potential disruptions, recommends remediations, and continuously learns from operational telemetry.

These advancements are paving the way for a future where our optical infrastructure can anticipate issues, recover faster, and operate with the agility and assurance expected in a Zero Trust environment.

Key takeaways

If you’re considering implementing your own optical and BCDR networks, consider these tips:

  • Understand the technical components of resilience: Independent optical systems, physically independent paths, separate control software, a unified client interface, and survivability by design are the key technical components of true resilience.
  • Plan from a preparedness and value perspective: Evaluate the critical points in your infrastructure and determine where you can get the most value out of resilient connectivity.
  • Ensure your teams have the right skillset: Carefully consider the right workforce to run those systems and be accountable for their operation.

Transforming security and compliance at Microsoft with Windows Hotpatch
http://approjects.co.za/?big=insidetrack/blog/transforming-security-and-compliance-at-microsoft-with-windows-hotpatch/
Thu, 02 Oct 2025 16:05:00 +0000

The post Transforming security and compliance at Microsoft with Windows Hotpatch appeared first on Inside Track Blog.

Security updates are essential, and every security admin knows that when it comes to applying these updates, faster is better to mitigate the risk. However, security updates have always come with a catch: Windows needs to reboot to apply them.

Reboots mean interrupted productivity and downtime for users.

For us at Microsoft Digital, Microsoft’s internal IT organization, Windows Hotpatch changes the equation.

It’s a new way to deliver critical Windows updates without rebooting. That means faster compliance, less downtime, and happier users.

We’re using it across Microsoft and it’s already transforming how we think about security and productivity.

“Hotpatch is helping Microsoft reach compliance faster than ever—no reboots, no delays, secure systems at scale, and a seamless experience that keeps users more productive. The risk exposure window is reduced drastically, making our environment safer and more resilient,” says Harshitha Digumarthi, a senior program manager within Microsoft Digital.

Hotpatch installs updates while the system is running—no reboot required. That means we can patch faster, stay compliant, and keep users happy.

And it’s not just us.

Microsoft enterprise customers are already scaling deployments to millions of devices. We’re seeing a shift in how organizations think about patching and how they can shorten patch times. With Hotpatch, patching is no longer a disruption; it’s just part of the flow.

Increasing productivity and security with Hotpatch

Hotpatch is a servicing technology that delivers cumulative security updates—released on Patch Tuesday, the second Tuesday of each month—without requiring a system reboot. Instead of replacing binaries on disk and restarting the system, Hotpatch modifies in-memory code while the system is running.
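For teams that schedule around the release cadence, the date of Patch Tuesday is easy to compute. The Python sketch below uses only the standard library, and the `patch_tuesday` helper is a name introduced here for illustration.

```python
import calendar
from datetime import date


def patch_tuesday(year, month):
    """Return the second Tuesday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday is 0, Tuesday is 1
    days_to_first_tuesday = (calendar.TUESDAY - first_weekday) % 7
    return date(year, month, 1 + days_to_first_tuesday + 7)


print(patch_tuesday(2025, 10))  # prints 2025-10-14
```

October 1, 2025 falls on a Wednesday, so the first Tuesday is October 7 and the second is October 14.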

This means updates take effect immediately, with no downtime, no maintenance windows, and no disruption to users.

Hotpatch payloads are small by design. Smaller updates mean faster downloads, quicker installs, and minimal impact on performance. CPU usage stays low. No spikes. No slowdowns. Just updates that run in the background and finish silently.

“The experience is so seamless you don’t even know what happened,” says Nevine Geissa, a partner group program manager within the Windows product team. “There are no process restarts, no logging out, no performance impact. No glitch in the video playing or transaction dropping. Everything just works as if nothing has happened.”

Because hotpatch updates happen so painlessly in the background, IT administrators may want to understand how the process works and what validation steps are involved. That’s why we test hotpatch updates with the same rigorous standards we apply to all our security updates.

A photo of Geissa.

“Hotpatch updates go through the exact same validation and rigor that a standard security update goes through. There is no compromise on quality whatsoever. Your device is always as secure as your non-hotpatch device.”

Nevine Geissa, partner group program manager, Windows Servicing and Delivery

Even in cases of zero-day vulnerabilities, Hotpatch can deliver out-of-band updates to enrolled devices without requiring a reboot.

Hotpatch is available for Windows 11 version 24H2 or later, Windows 365, Azure Virtual Desktop, Windows Server 2022/2025 Azure Edition, and Azure Arc connected Windows Server 2025 Datacenter and Standard editions.

The technology has matured over years of internal development.

“Hotpatch updates go through the exact same validation and rigor that a standard security update goes through,” Geissa says. “There is no compromise on quality whatsoever. You will always be at the exact same level of security.”

Hotpatch has evolved and grown.

“It started as internal server capability in Azure and then expanded to our Windows Server 2022 customers,” says Nikita Deshpande, a senior customer experience program manager within the Windows Servicing and Delivery product team at Microsoft. “The tooling and OS support have matured such that we can now offer Hotpatch to AMD64 and Arm64 client machines too.”

Hotpatch integrates seamlessly with Autopatch, a cloud-based service from Microsoft that automates the process of keeping Windows devices up to date. Designed for enterprise environments, and powered by Microsoft Intune, Autopatch manages updates for Windows, Microsoft 365 Apps for enterprise, Microsoft Edge, and Microsoft Teams, reducing the manual effort required by IT administrators.

Any new policy in our environment created with Autopatch automatically enables Hotpatch—if the device meets requirements. Admins can set up rings, monitor compliance, and roll out updates with just a few clicks.

“It’s the better together story,” Deshpande says. “Autopatch streamlines everything. Add Hotpatch, and it takes Windows Update to a whole new level.”

Implementing Windows Hotpatch internally at Microsoft

Implementing Hotpatch at Microsoft Digital involved more than developing and deploying a feature. It also meant establishing trust with customers.

The journey started years ago in Azure with virtual machines, then moved to Windows Server across physical and virtual instances. Now, it’s on Windows 11 clients and scaling fast, but getting here took deep collaboration.

Our team in Microsoft Digital partnered with the product team from the start. We were co-designers with experience in this space. We helped shape the rollout, validate the experience, and make sure Hotpatch was ready for enterprise scale.

Then we scaled. We expanded to 40,000, then 80,000, then 120,000 devices. We’re on track to reach 450,000 devices at Microsoft in the next four months.

We also wanted a great admin experience for the product. Its features support a smooth rollout, and its reporting gives admins the visibility to monitor rollouts and measure impact. We continue to collaborate with the Windows product team to equip administrators with comprehensive insights and actionable recommendations for Hotpatch.

“We worked closely with the product team to make sure admins had the right metrics to measure the success,” Digumarthi says. “It’s not just about implementation—it’s about knowing it worked.”

We ran early adopter programs and insider rings to gather feedback from across Microsoft. That feedback loop helped refine the experience, improve reporting, and ensure the rollout was smooth.

Achieving security without compromising on productivity

Hotpatching is changing how we think about security.

“With Hotpatch, we’re seeing 81% of Microsoft’s enrolled devices become compliant within 24 hours of Patch Tuesday and 90% of enrolled devices are patched within five days.”

Harshitha Digumarthi, senior program manager, Microsoft Digital

Before, it took our team up to nine months to reach 95% compliance for security patching.

That’s nine months of exposure and nine months of risk.

With Hotpatch, we’re achieving 95% compliance in less than three weeks.

“With Hotpatch, we’re seeing 81% of Microsoft’s enrolled devices become compliant within 24 hours of Patch Tuesday, and 90% of enrolled devices are compliant within five days,” Digumarthi says.
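Figures like these come down to simple arithmetic over per-device timestamps. The Python sketch below shows one way to compute the share of enrolled devices that became compliant within a given window after an update is released. The function name, the sample fleet, and the release timestamp are all made up for illustration and are not our internal tooling or data.

```python
from datetime import datetime, timedelta


def share_compliant_within(release_time, compliant_times, window):
    """Return the fraction of enrolled devices compliant within `window`.

    `compliant_times` holds one entry per enrolled device: the datetime the
    device became compliant, or None if it is not compliant yet.
    """
    if not compliant_times:
        return 0.0
    deadline = release_time + window
    on_time = sum(
        1 for moment in compliant_times
        if moment is not None and moment <= deadline
    )
    return on_time / len(compliant_times)


# Illustrative release timestamp and a made-up five-device fleet.
release = datetime(2025, 1, 14, 18, 0)
fleet = [
    release + timedelta(hours=3),
    release + timedelta(hours=20),
    release + timedelta(days=2),
    release + timedelta(days=6),
    None,  # this device has not reported compliance
]

within_24_hours = share_compliant_within(release, fleet, timedelta(hours=24))
within_5_days = share_compliant_within(release, fleet, timedelta(days=5))
print(f"{within_24_hours:.0%} within 24 hours, {within_5_days:.0%} within five days")
```

Dividing by every enrolled device, including those that have not reported yet, keeps the percentage honest: a device that never checks in counts against compliance rather than disappearing from the total.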

That’s not just faster. It’s safer.

“We’re reducing the risk window,” Digumarthi says. “From vulnerability discovery to patch deployment, we’re closing the gap—without disrupting users.”

And it’s not just internal. Since general availability in April, Hotpatch has scaled to over 4.5 million devices globally. That growth shows trust and momentum.

It also shows value. Admins spend less time chasing updates. End users stay productive. And security teams get the compliance they need—without the friction.

“Hotpatching eliminates the trade-off between security and productivity,” Deshpande says. “You don’t have to choose anymore.”

Improving the user experience

Hotpatching doesn’t just improve security—it transforms the user experience.

For end users, it’s invisible.

Updates happen in the background.

No pop-ups. No restarts. No performance hits.

“It’s so seamless,” Geissa says. “There’s no bubble. No prompt. It just works.”

The first few times, users might see a green banner letting them know they’ve been hotpatched.

It’s subtle. It’s clean.

It’s so effective that it’s become a kind of badge among Microsoft insiders.

“It’s really helpful as an end user—I feel more secure,” says Senthil Selvaraj, a principal group product manager at Microsoft Digital. “I don’t need to keep checking and making sure my device is up to date. It just is.”

That’s the magic.

Hotpatching doesn’t interrupt work—it protects it.

It helps other systems stay current too. When the OS is secure, dependent apps and services can update more reliably. That ripple effect improves the overall health of the device.

Admins also see the benefits. Intune reporting shows which devices are ready, which have updated, and which need attention. That visibility helps IT teams track compliance without chasing down machines or relying on manual checks.
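As a rough illustration of that kind of view, the Python sketch below groups exported device records into the three buckets described above. The record shape and the status labels (`ready`, `updated`, `needs_attention`) are assumptions made for this example and do not reflect the actual Intune report schema.

```python
# Hypothetical status labels mirroring the three groups described above.
STATUSES = ("ready", "updated", "needs_attention")


def group_by_status(records):
    """Group device names by reported status, keeping every status key present."""
    groups = {status: [] for status in STATUSES}
    for record in records:
        status = record["status"]
        if status not in groups:
            raise ValueError(f"unknown status: {status!r}")
        groups[status].append(record["name"])
    return groups


def updated_share(groups):
    """Fraction of all listed devices that have already taken the update."""
    total = sum(len(names) for names in groups.values())
    return len(groups["updated"]) / total if total else 0.0


# Made-up sample export with four devices.
sample = [
    {"name": "LAPTOP-01", "status": "updated"},
    {"name": "LAPTOP-02", "status": "ready"},
    {"name": "LAPTOP-03", "status": "updated"},
    {"name": "LAPTOP-04", "status": "needs_attention"},
]

groups = group_by_status(sample)
print(groups["needs_attention"], f"{updated_share(groups):.0%} updated")
```

Keeping an explicit list of devices that need attention, rather than only a percentage, is what lets a team follow up on specific machines instead of chasing the whole fleet.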

For enterprises, it means fewer help desk calls. Fewer complaints. Fewer delays.

Looking forward

Hotpatching is just getting started.

At Microsoft Digital, we’re expanding from 100K to 450K devices in the next four months. That’s nearly every eligible device in our fleet.

Externally, adoption is accelerating. We’ve gone from zero to almost 4.5 million devices since private preview in November 2024. That includes Microsoft and customer fleets, and the number keeps growing.

But scale is just the beginning.

The product team is exploring ways to improve compliance visibility—giving admins deeper insights into patch status, readiness, and impact. That means better reporting, smarter dashboards, and tighter integration with compliance tools.

We’re also working to make adoption easier.

Documentation is improving, Intune reporting is evolving, and we’re building clearer guidance for customers to validate their environments, understand their risk posture, and deploy Hotpatch confidently.

The vision is simple: secure every device, without disruption.

Key takeaways

Here are several key actions you can take to successfully implement Windows Hotpatch in your organization:

  • Check your eligibility and prerequisites. Understand your eligibility and set up the prerequisites in your environment to be hotpatch-capable.
  • Monitor devices and report compliance. Use Intune and other reporting tools to track device readiness, update status, and compliance, even for unmanaged environments.
  • Communicate the benefits to users. Inform users that hotpatching maintains their ability to reboot while enhancing device security with minimal disruption.
  • Deliver a seamless update experience. Emphasize the uninterrupted, restart-free, and performance-neutral nature of updates for users.

The post Transforming security and compliance at Microsoft with Windows Hotpatch appeared first on Inside Track Blog.

]]>
20455