Security and risk management Archives - Inside Track Blog
http://approjects.co.za/?big=insidetrack/blog/tag/security-and-risk-management/
How Microsoft does IT

Microsoft CISO advice: How to build Trustworthy Agentic AI
http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-how-to-build-trustworthy-agentic-ai/
Thu, 16 Apr 2026 15:15:00 +0000

Building production-ready solutions with agentic AI comes with inherent risks. When agents make mistakes or hallucinate, the potential impacts can multiply rapidly.

“It turns out that it’s very easy to write AI-powered software, but it’s very hard to write AI-powered software that works right in real-world cases,” says Yonatan Zunger, CVP and deputy CISO for Microsoft.

Zunger explains how important testing is if you want to build trustworthy agentic AI.

Watch this video to see Yonatan Zunger explain how to build trustworthy agentic AI. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=eNU7c48541M)

Key takeaways

Here are best practices to apply while building trustworthy agentic AI:

  • Prototype. Test. Iterate. Think of and try prompts your real users might give your agentic AI. Use real data. From those trials, build a set of test cases and keep testing.
  • Use AI tools to amplify testing. Evaluating agents requires a “try it and repeat it” mindset. Using AI Foundry with tools such as the Python Risk Identification Tool (PyRIT) amplifies these assessment capabilities.
  • Record your tests. Applying this practice, as you would with unit testing, enables you to repeat evaluations as your data models and agents evolve.
  • Don’t skimp on testing. Test early, test often, test with real data. This is the best way to understand what your agent might do when it encounters the unexpected.
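To make these practices concrete, here is a minimal sketch of a recorded test suite. Everything in it is illustrative: `run_agent` is a hypothetical stand-in for your agent's entry point, and the recorded cases imitate prompts captured from real users.

```python
# Minimal sketch of recorded agent test cases. `run_agent` is a
# hypothetical placeholder; substitute your real agent entry point.

def run_agent(prompt: str) -> str:
    # Placeholder agent: returns a canned answer keyed by topic.
    if "refund" in prompt.lower():
        return "Refunds are processed within 5 business days."
    return "I'm not sure; let me route you to a human."

# Cases built from real user prompts, recorded so they can be replayed
# every time the model, system prompt, or underlying data changes.
RECORDED_CASES = [
    {"prompt": "How long do refunds take?", "must_contain": "5 business days"},
    {"prompt": "Tell me about quantum llamas", "must_contain": "human"},
]

def replay_recorded_cases() -> list[str]:
    """Re-run every recorded case and return the prompts that failed."""
    failures = []
    for case in RECORDED_CASES:
        answer = run_agent(case["prompt"])
        if case["must_contain"] not in answer:
            failures.append(case["prompt"])
    return failures
```

In practice you would run these cases in a test runner such as pytest and keep appending new cases as real users surface unexpected prompts.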

The post Microsoft CISO advice: How to build Trustworthy Agentic AI appeared first on Inside Track Blog.

Microsoft CISO advice: The importance of a written AI safety plan
http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-the-importance-of-a-written-ai-safety-plan/
Thu, 09 Apr 2026 16:00:00 +0000

Yonatan Zunger, CVP and Deputy CISO for Microsoft, has spent his career considering complex questions of security and privacy while building platform infrastructure and solutions. His experience underpins his advice on how to build a safety plan for working with AI. First and foremost, his advice is to have a written plan.

“Make it an expectation in your organization that people will create safety plans and have them for everything,” Zunger says. “People get so excited about having clarity in front of them that they end up making much more systematic, careful plans, and the rate of errors goes down dramatically.”

Watch this video to see Yonatan Zunger discuss his advice for creating an AI safety plan. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=H5reZ0uw0EA)

Key takeaways

Here are questions and ideas to consider as you create a safety plan for your AI systems:

  • Define the problem. What problem are you trying to solve? A simple and clear problem statement is always a great starting point before building anything, including an AI agent.
  • Outline the solution. What is the basis of your solution? Can you explain your solution to an end user? What does a developer or administrative user of your solution need to know about what it is and does?
  • List the things that can go wrong. What can go wrong with your solution? Creating this list is the first step to figuring out how to deal with those issues.
  • Document your plan. What is your plan to address identified concerns? Identify the process you will follow when something goes wrong.
  • Draft your plan early and update it as your solution matures. Your safety plan can be as simple as a list or outline and should evolve as you prepare to build your solution.
  • Get feedback and buy-in. When you review the plan with stakeholders and leaders in your team and organization, you may uncover risks or issues you had not thought of. You also build awareness and agreement on what to do when something goes wrong.
  • Make a template and build its use into your processes. This tip is for anyone who leads a team or influences process development. Encourage using a safety template in all your projects to bring clarity and structure to how you work with AI.
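One lightweight way to turn these questions into a reusable template is a structure like the sketch below. The field names and example values are our own illustration, not a Microsoft template.

```python
from dataclasses import dataclass, field

@dataclass
class SafetyPlan:
    """Illustrative safety-plan template; all field names are assumptions."""
    problem_statement: str                                     # Define the problem
    solution_outline: str                                      # Outline the solution
    failure_modes: list[str] = field(default_factory=list)     # What can go wrong
    mitigations: dict[str, str] = field(default_factory=dict)  # Plan per failure mode
    reviewers: list[str] = field(default_factory=list)         # Feedback and buy-in

    def unmitigated(self) -> list[str]:
        """Failure modes that still lack a documented response."""
        return [f for f in self.failure_modes if f not in self.mitigations]

# Hypothetical example: a draft plan, started early, to be updated as
# the solution matures.
plan = SafetyPlan(
    problem_statement="Summarize support tickets for triage.",
    solution_outline="LLM summarizer behind an internal API.",
    failure_modes=["hallucinated ticket details", "leaked customer data"],
    mitigations={"hallucinated ticket details": "human review before routing"},
)
```

A plan like this can start as a short outline and grow with the solution; the `unmitigated` check makes it easy to spot risks that still lack a documented response before review with stakeholders.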

The post Microsoft CISO advice: The importance of a written AI safety plan appeared first on Inside Track Blog.

Harnessing AI: How a data council is powering our unified data strategy at Microsoft
http://approjects.co.za/?big=insidetrack/blog/harnessing-ai-how-a-data-council-is-powering-our-unified-data-strategy-at-microsoft/
Thu, 09 Apr 2026 16:00:00 +0000

Information technology is an ever-evolving landscape. Artificial Intelligence is accelerating that evolution, providing employees with unprecedented access to information and insights. Data-driven decision making has never been more critical for businesses to achieve their goals.

In light of this priority, we have established a Microsoft Digital Data Council to help accelerate our companywide AI-powered transformation.

Our data council is a cross-functional team with representation from multiple domains within Microsoft, including Microsoft Digital, the company’s IT organization; Corporate, External, and Legal Affairs (CELA); and Finance.


Our data council’s mission is to drive transformative business impact by establishing a cohesive data strategy across Microsoft Digital, empowering interconnected analytics and AI at scale. Our vision is to guide our organization toward Frontier Firm maturity through a clear blueprint for high-quality, reliable, AI-ready data delivered on trusted, scalable platforms.

“By championing robust data governance, literacy, and responsible data practices, our data council is a crucial part of our AI-powered transformation,” says Naval Tripathi, principal engineering manager in Microsoft Digital. “It turns enterprise data into a strategic capability that fuels predictive insights and intelligent outcomes across the organization.”

Our evolving data strategy

Over the past two decades, we at Microsoft—along with other large enterprises—have continuously evolved our data strategies in search of the right balance between control and agility. Early approaches were highly decentralized, with different teams owning and managing their own data assets. While this enabled local optimization, it also resulted in inconsistent quality and limited enterprise-wide insight.

Our subsequent shift toward centralized data platforms brought much-needed standardization, security, and scalability. However, as data platforms grew more sophisticated, ownership often drifted away from the business domains closest to the data, slowing responsiveness and diluting accountability.

Today, we and other leading companies are embracing a more balanced, federated approach, often described as a data mesh. Rather than forcing all our data into a single centralized system or allowing unchecked decentralization, the data mesh formalizes domain ownership while embedding governance, quality, and interoperability directly into shared platforms.

With this approach, our domain teams publish data as well-defined, discoverable products, while common standards for security, metadata, and compliance are enforced through automation rather than manual processes. This model preserves enterprise trust and consistency without sacrificing speed or autonomy.

By adopting a data mesh mindset, we can scale analytics and AI more effectively across the organization while still keeping ownership closely connected to the business focus. The result is a system that supports innovation at the edges, strong governance at the core, and seamless collaboration across domains, enabling the transformation of data from a technical asset to a strategic, enterprise-wide capability.

Quality, accessibility, and governance

To scale enterprise data and AI, organizations must first ensure their data is trusted, discoverable, and responsibly governed. At Microsoft Digital, our data strategy is designed to create data foundations that power intelligent applications and effective decision making across the company.


By implementing a data mesh strategy at scale, we aim to unlock valuable data insights and analytics, enabling advanced AI scenarios. Our data council focuses on three core dimensions that make AI-ready data possible:

  • Quality: Making sure enterprise data is reliable and complete
  • Accessibility: Enabling secure and discoverable access to data
  • Governance: Protecting and managing our data responsibly

Together, these dimensions form the foundation for scalable innovation and AI-powered data use. They connect data silos and ensure consistent, high‑quality access across the enterprise—enabling both humans and AI systems to work from the same trusted data foundation. As AI use cases mature, this foundation allows AI agents to retrieve and reason over data through enterprise endpoints, while supporting advanced analytics, data science, and broader technology.

“High-quality, well-governed data is essential to accelerate implementation and adoption of AI tools,” says Miguel Uribe, a principal PM manager in Microsoft Digital. “Data quality, accessibility, and governance are imperatives for AI systems to function effectively, and recognizing that is propelling our data strategy.”

Quality

AI-ready data is available, complete, accurate, and high-quality. By adopting this standard, our data scientists, engineers, and even our AI agents are better able to locate, process, and govern the information needed to drive our organization and maximize AI efficiencies.

Using Microsoft Purview, our data council monitors data attributes to ensure fidelity and enforces standards for accuracy and completeness.
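Purview provides its own tooling for this, but the idea of automated attribute monitoring can be sketched generically. The rules, field names, and sample records below are illustrative assumptions only.

```python
# Illustrative completeness and accuracy rules; this is a generic sketch,
# not Purview's actual API or detection logic.

REQUIRED_FIELDS = {"id", "owner", "region"}
VALID_REGIONS = {"NA", "EMEA", "APAC"}

def quality_report(records: list[dict]) -> dict:
    """Score completeness (required attributes present) and flag
    accuracy violations (region outside the expected domain)."""
    incomplete = {i for i, r in enumerate(records)
                  if not REQUIRED_FIELDS <= r.keys()}
    inaccurate = [i for i, r in enumerate(records)
                  if i not in incomplete and r["region"] not in VALID_REGIONS]
    total = len(records)
    return {
        "completeness": (total - len(incomplete)) / total if total else 1.0,
        "accuracy_violations": inaccurate,
    }

# Hypothetical batch of records to score.
records = [
    {"id": 1, "owner": "data-team", "region": "NA"},
    {"id": 2, "owner": "data-team", "region": "MOON"},  # accuracy violation
    {"id": 3, "owner": "data-team"},                    # missing "region"
]
```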

Accessibility

Ensuring that our employees get access to the information they need while prioritizing security is a foundational element of our enterprise data strategy. Microsoft Fabric allows us to unify our organization’s siloed data in a single “mesh” that enables advanced analytics, data science, data visualization and other connected scenarios.

Microsoft Purview then gives us the ability to democratize that data responsibly. By implementing a data mesh architecture, our employees can work confidently, unencumbered by siloed or inaccessible data, and with the assurance that the data they’re working with is secure.

A graphic shows how the data mesh architecture allows employees to access data they need, with platform services and data management zones surrounding this architecture.
The data mesh architecture enables our employees to do their work efficiently while preventing the data they’re working on from becoming siloed.

The data mesh connects and distributes data products across domains, enabling shared data access and compute while scaling beyond centralized architectures.

Platform services are standardized blueprints that embed security, interoperability, policies, standards, and core capabilities—providing guardrails that enable speed without fragmentation.

Data management zones provide centralized governance capabilities for policy enforcement, lineage, observability, compliance, and enterprise-wide trust.  

Governance

As organizations scale AI capabilities, strong governance becomes essential to ensure security, compliance, and ethical data use. Data governance—which includes establishing data policies, ensuring data privacy and security, and promoting ethical AI usage—is critical, as is compliance with General Data Protection Regulation (GDPR) and Consumer Data Protection Act (CDPA) regulations, among others.

However, governance is not only a technical capability; it’s also a cultural commitment.

Responsible data use must be embedded into the way teams manage data and build AI solutions. Through Microsoft Purview, we implemented an end-to-end governance framework that automates the discovery, classification, and protection of sensitive data across the enterprise data landscape.

This unified approach allows teams to innovate confidently, knowing that the data powering their insights and AI systems is trusted and protected, as well as responsibly managed.
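As a rough illustration of automated discovery and classification (far simpler than what Purview actually does), pattern-based rules can tag fields that contain sensitive values:

```python
import re

# Illustrative classifiers only; production classification relies on far
# more robust detection than these regexes.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify(text: str) -> set[str]:
    """Return the set of sensitivity labels detected in a text field."""
    return {label for label, pat in PATTERNS.items() if pat.search(text)}
```

Once fields are labeled this way, protection policies (masking, access restrictions, retention) can be applied automatically rather than by manual review.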

“AI systems are only as reliable as the data that powers them,” Uribe says. “By investing in trusted and well-managed data, we accelerate not only the adoption of AI tools but our ability to generate meaningful insights and intelligent outcomes.”

The data catalog as the discovery layer

By serving as a common discovery layer for humans and AI, the data catalog ensures that governance translates directly into speed, accuracy, and trust at scale.

A unified data strategy only succeeds if both people and AI systems can consistently find the right data. At Microsoft, this is enabled by our enterprise data catalog, which operationalizes the standards set by our data council. 

For business users, the catalog provides intuitive search, ownership transparency, and trust signals—enabling confident self‑service analytics. For AI agents, the same catalog exposes machine‑readable metadata, allowing agents to programmatically discover canonical datasets, validate schema and freshness, and respect governance constraints.
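What machine-readable catalog metadata might enable for an agent can be sketched as follows. The catalog entry, field names, and checks are hypothetical; a real enterprise catalog exposes richer metadata through its own APIs.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical catalog entry with machine-readable metadata an agent
# could use to validate schema, freshness, and governance constraints.
CATALOG = {
    "sales.orders": {
        "owner": "sales-data-team",
        "schema": {"order_id", "amount", "region"},
        "last_refreshed": datetime.now(timezone.utc) - timedelta(hours=2),
        "max_staleness": timedelta(hours=24),
        "allowed_audiences": {"analytics", "ai-agent"},
    },
}

def discover(name: str, needed_fields: set[str], audience: str) -> dict:
    """Return a dataset entry only if governance, schema, and freshness
    checks all pass; otherwise raise."""
    entry = CATALOG[name]
    if audience not in entry["allowed_audiences"]:
        raise PermissionError(f"{audience} may not read {name}")
    if not needed_fields <= entry["schema"]:
        raise ValueError(f"missing fields: {needed_fields - entry['schema']}")
    age = datetime.now(timezone.utc) - entry["last_refreshed"]
    if age > entry["max_staleness"]:
        raise ValueError(f"{name} is stale ({age})")
    return entry
```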

Our role as Customer Zero

In Microsoft Digital, we operate as Customer Zero for the company’s enterprise solutions, encountering issues first so that our customers don’t have to.

That means we do more than adopt new products early. We deploy them at enterprise-scale, operate them under real‑world constraints, and hold them to the same standards our customers expect. The result is more resilient, ready‑to‑use solutions and a higher quality bar for every enterprise customer we serve.


Our data council embodies this Customer Zero mindset through its Enterprise Readiness initiative. By engaging product engineering as a unified enterprise voice, the council drives strategic conversations that surface operational blockers, influence roadmap prioritization, and ensure new and existing data solutions are truly ready for enterprise use.

These learnings are then shared broadly across Microsoft Digital to accelerate adoption, reduce duplication, and scale proven patterns across teams.

“When we engage product teams with real telemetry from how data is created, governed, and consumed at scale, we move the conversation from theory to execution,” says Diego Baccino, a principal software engineering manager in Microsoft Digital and a member of the council. “That’s how enterprise readiness becomes real.”

This work is deeply integrated with our AI Center of Excellence (CoE), where Customer Zero principles are applied to accelerate AI outcomes responsibly. Together, the AI CoE and the data council focus on improving data documentation and quality—foundational capabilities that are required to make AI feasible, trustworthy, and scalable across the enterprise.

By grounding AI innovation in measurable data quality and governance standards, Microsoft Digital ensures that experimentation can safely mature into production‑ready solutions. The partnership between our data council, our AI CoE, and our Responsible AI (RAI) Council is essential to our broader data and AI strategy.

“AI readiness isn’t aspirational—it’s operational,” Baccino says. “By measuring the health of our data, setting clear quality baselines, and using those signals to guide product and platform decisions, we turn data into a strategic asset and AI into a repeatable capability.”

Together, these teams exemplify what it means to be Customer Zero: Transforming enterprise experience into action, governance into acceleration, and data into durable competitive advantage.

Advancing our data culture

Our data council plays a pivotal role in advancing the organization’s transition from data literacy to enterprise data and AI capability. In conjunction with our AI CoE, it creates curricula and sponsors learning pathways, operational practices, and community programs to equip our employees with the skills and mindset required to thrive in a data- and AI-centric world.

While early efforts focused on improving data literacy, our data council’s mission has evolved to enable data and AI capability at scale together with our AI CoE—where employees not only understand data but can effectively apply it to build, operate, and govern intelligent solutions.


Our curriculum includes high-level courses on data concepts, applications, and extensibility of AI tools like Microsoft 365 Copilot, as well as data products like Microsoft Purview and Microsoft Fabric.

By facilitating AI and data training, offering internally focused data and AI certifications, and fostering internal community engagement, our council ensures that employees develop the capabilities required to responsibly build and operate AI-powered solutions. Achieving data and AI certifications not only promotes career development through improved data literacy, it also enhances the broader data-driven culture within our organization.

“We recognize that AI capability is built when data skills are applied directly to real AI scenarios and business outcomes—not when learning exists in isolation,” Uribe says. “Our focus is not just teaching our teams about data; it is enabling employees to apply data to create AI‑driven outcomes. When teams understand how data powers AI systems, they can make better decisions, design better products, and build more responsible AI experiences.”

Lessons learned

Our data council was created to develop and execute a cohesive data strategy across Microsoft Digital and to foster a strong data culture within our organization. Over time, several critical lessons have emerged.

Executive sponsorship enables transformation

Executive sponsorship is a key element to ensure implementation and adoption of a data strategy. Our leaders are committed to delivering and sustaining a robust data strategy and culture and have been effective champions of the council’s work.

“Leadership provides support and reinforcement of the council’s mission, as well as guidance and clarity related to diverse organizational priorities,” Baccino says.

Cross-functional collaboration accelerates impact

Our council’s work has also benefited from the diverse representation offered by different disciplines across our organization. Embracing diverse perspectives and understanding various organizational priorities is critical to implementing a successful data strategy and culture in a large and complex organization like Microsoft Digital.

Modern platforms allow for scalable AI productivity

Technology and architecture also play a critical role in enabling enterprise data and AI capability. Platforms like Microsoft Purview and Microsoft Fabric provide the governance, discovery, and analytics infrastructure required to create trusted, AI-ready data ecosystems.

Combined with strong leadership support and community engagement, these platforms allow our organization to move beyond isolated data projects toward connected, enterprise-wide intelligence.

As our organization continues to evolve, our data council’s strategic work and valuable insights will be crucial in shaping the future of data-driven decision making and AI transformation at Microsoft.

Key takeaways

Here are some things to keep in mind as you contemplate forming a data council to help you manage and scale AI impacts responsibly at your own organization:

  • A data mesh strikes the balance enterprises have been chasing. By formalizing domain ownership while enforcing standards through shared platforms, you avoid both chaotic decentralization and slow, over-centralized control.
  • Governance is an accelerator when it’s automated and embedded. Using platforms like Microsoft Purview and Microsoft Fabric, governance shifts from a manual gatekeeping function to a built‑in capability that enables faster, trusted analytics and AI.
  • AI systems are only as strong as their discovery layer. A unified enterprise data catalog allows both people and AI agents to find, trust, and use data consistently—turning standards into operational speed.
  • Customer Zero turns theory into enterprise‑ready execution. By operating its own data and AI platforms at scale, Microsoft Digital provides real telemetry and practical feedback that directly shapes product readiness.
  • Building AI capability is a cultural effort, not just a technical one. Our data council’s focus on applied learning, certification, and real-world AI scenarios ensures data skills translate into durable business outcomes.
  • AI scale exposes the cost of fragmented data ownership. A data council cuts through silos by aligning priorities, resolving tradeoffs, and concentrating investment on the data assets that matter most for AI impact.
  • Shared metrics create shared ownership. Publishing data quality and AI‑readiness scores at the leadership level reinforces accountability and positions data as a core enterprise asset.

The post Harnessing AI: How a data council is powering our unified data strategy at Microsoft appeared first on Inside Track Blog.

Microsoft CISO advice: The most important thing to know about securing AI
http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-the-most-important-thing-to-know-about-securing-ai/
Thu, 02 Apr 2026 16:00:00 +0000

Using AI comes with inherent risks. In a recent video, Yonatan Zunger, CVP and deputy CISO for Microsoft, suggests that thinking about AI as a new intern will help you naturally take the right approach to AI security.

Zunger and his team focus on AI safety and security. They consider all the different ways anything involving working with AI can go wrong.

“An important thing to know about AI is that AIs make mistakes,” Zunger says. “You already know how to work with systems that make mistakes and get tricked.”

Watch this video to see Yonatan Zunger discuss his advice for working with AI. (For a transcript, please view the video on YouTube: https://youtu.be/b1x6gDbSWVY)

The post Microsoft CISO advice: The most important thing to know about securing AI appeared first on Inside Track Blog.

Deploying Microsoft Baseline Security Mode at Microsoft: Our virtuous learning cycle
http://approjects.co.za/?big=insidetrack/blog/deploying-microsoft-baseline-security-mode-at-microsoft-our-virtuous-learning-cycle/
Thu, 26 Mar 2026 16:05:00 +0000

The enterprise security frontier isn’t just evolving. It’s accelerating beyond the limits of traditional security models.

AI acceleration, cloud adoption, and rapid growth of enterprise apps have dramatically expanded the attack surface. Every new app introduces a new identity. Every identity carries permissions. Over time, those permissions accumulate, often without clear ownership or regular review.


Inside Microsoft Digital—the company’s IT organization—we recognized this early. Many of our highest‑risk security scenarios didn’t start with malware or phishing. They started with access. Specifically, apps running with permissions beyond what they required.

“An app is another form of identity,” says B. Ganti, principal architect in Microsoft Digital. “In a cloud-first, Zero Trust world, identity becomes the primary security perimeter, and access is governed by the principle of least privilege. Whether it is a user, an app, or an agent, when permissions are overly broad or elevated, the blast radius expands dramatically, increasing risk exponentially.”

Traditional security approaches such as periodic reviews, best‑practice guidance, and point‑in‑time hardening weren’t enough in an environment that changes daily. Configurations drift, new apps appear, and risk grows quietly in places that are hard to see at scale.

That reality forced a mindset shift internally here at Microsoft. Security couldn’t be optional. It couldn’t be advisory. And it couldn’t be static.

Our team operates one of the largest enterprise environments in the world, with tens of thousands of apps and a culture built on self‑service and autonomy. That scale drives innovation, but it also amplifies risk.

Application identities became one of the most complex governance challenges we faced. Ownership wasn’t always clear. Permissions were often granted broadly to avoid disruption. And once approved, access rarely came under scrutiny again.

“As a self‑service organization, we empower people to move fast,” Ganti says. “But that also means apps get created, permissions get granted, and not everyone always remembers why.”
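The drift Ganti describes can be surfaced with a simple inventory audit. The app records, scope names, and thresholds below are illustrative; a real audit would query your identity platform rather than a hard-coded list.

```python
from datetime import date

# Scopes treated as "broad" for this sketch; tune to your environment.
BROAD_SCOPES = {"Directory.ReadWrite.All", "Mail.ReadWrite"}

# Hypothetical app-identity inventory records.
apps = [
    {"name": "ticket-bot", "owner": "helpdesk", "scopes": {"Mail.Read"},
     "last_reviewed": date(2026, 1, 10)},
    {"name": "legacy-sync", "owner": None,
     "scopes": {"Directory.ReadWrite.All"}, "last_reviewed": date(2023, 5, 1)},
]

def flag_risky(apps: list[dict], today: date, review_after_days: int = 365):
    """Flag apps with broad scopes, no clear owner, or an overdue review."""
    flagged = []
    for app in apps:
        reasons = []
        if app["scopes"] & BROAD_SCOPES:
            reasons.append("broad permissions")
        if app["owner"] is None:
            reasons.append("no owner")
        if (today - app["last_reviewed"]).days > review_after_days:
            reasons.append("review overdue")
        if reasons:
            flagged.append((app["name"], reasons))
    return flagged
```

Running a check like this on a schedule, rather than as a one-time review, is what keeps accumulated permissions from going unnoticed.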

The rise of AI‑powered apps and agents—often requiring access to large volumes of data—increased our risk further.


We needed a system to reduce that risk systematically, not one app at a time.

Microsoft Baseline Security Mode (BSM) became that system—a prescriptive, enforceable baseline that defines what “secure” means and keeps it that way.

“We’re using Microsoft Baseline Security Mode to move security from guidance to enforcement,” says Brian Fielder, vice president of Microsoft Digital. “It establishes secure‑by‑default configurations that scale across our environment, so teams can innovate quickly without inheriting unnecessary risk.”

Defining Microsoft Baseline Security Mode

BSM is more than just a checklist of recommended settings. It’s an enforced security baseline built directly into the Microsoft 365 admin center, designed to reduce attack surface by default across core Microsoft 365 workloads.

It was developed and then deployed internally at Microsoft, with our team in Microsoft Digital serving as a close design and deployment partner throughout the process.


At a technical level, BSM establishes a minimum required security posture by applying Microsoft‑managed policies and configuration states across services including Exchange Online, SharePoint Online, OneDrive, Teams, and Entra ID. The focus is on eliminating common misconfigurations, rather than theoretical or edge‑case risks.

“The settings in the Microsoft Baseline Security Mode were informed by years of experience in running our planet-scale services, and by analyzing historical security incidents across Microsoft to harden the security posture of tenants,” says Adriana Wood, a principal product manager for Microsoft 365 security. “The team identified concrete security settings that would prevent or significantly reduce known security vulnerabilities. The resulting mitigation controls were implemented and validated in Microsoft’s enterprise tenant, with Microsoft Digital evaluating operational impact, rollout characteristics, and failure modes before making it more broadly available to our customers.”

Legacy baselines rely on documentation and manual implementation. Administrators interpret guidance, apply settings where feasible, and revisit them periodically. In dynamic cloud environments, that model breaks down fast. Configurations drift, exceptions accumulate, and security degrades.


“Before enforcement, administrators can use reporting and simulation tools to understand how a baseline will affect users, apps, and workflows. That visibility allows teams to identify noncompliant assets, prioritize remediation by risk, and avoid unexpected disruptions.”

Keith Bunge, principal software engineer, Microsoft Digital

BSM replaces that approach with policy‑driven enforcement.

Now our controls are applied consistently across the tenant and continuously validated. When our configurations fall out of compliance, the risk surfaces immediately—it’s not discovered months later in an audit. The model is simple: get clean, stay clean.

Another key capability of BSM is impact awareness.

“Before enforcement, administrators can use reporting and simulation tools to understand how a baseline will affect users, apps, and workflows,” says Keith Bunge, a principal software engineer in Microsoft Digital. “That visibility allows teams to identify noncompliant assets, prioritize remediation by risk, and avoid unexpected disruptions. Our team in Microsoft Digital partnered closely with the product group to ensure these capabilities were practical for real enterprise deployments, not just greenfield environments.”

BSM is also not static.

The baseline evolves on a regular cadence to reflect changes in the threat landscape, new Microsoft 365 capabilities, and lessons learned from operating at scale.

From our perspective, BSM is not just a feature. It’s a security operating model. It shifts the default from “secure if configured correctly” to “secure by default.” Security decisions move out of individual teams and into a consistent, centrally enforced baseline. The question is no longer whether a control should be applied, but whether an exception is truly necessary—and how the associated risk will be mitigated.

That shift is what makes BSM sustainable at scale. And it’s why apps—where identities, permissions, and data access converge—became the next focus area for us in Microsoft Digital.

Addressing apps and high-risk surfaces

When we evaluated risk across our environment, one pattern was clear: Our apps represented both our most concentrated and least governed attack surface.

Apps are identities. They authenticate. They’re granted permissions. And unlike human users, they often operate continuously, without reassessment or visibility.

In a large, self‑service environment like ours, apps are created constantly by engineering teams, business groups, and automation workflows. Over time, many of those apps accumulated permissions beyond what they actually needed, particularly within Microsoft Graph. Delegated permissions were especially risky, because they allow apps to act on our employees’ behalf at machine speed across massive data sets.

“As a user, I might not know where all my data lives,” Ganti says. “But an app with delegated permissions doesn’t have that limitation. It can search everything, everywhere, all at once.”

The challenge wasn’t just volume—it was inconsistency.

Our ownership was often unclear. Our permission reviews were infrequent or manual. And once we granted elevated access, we had few systemic controls in place requiring it to be revisited.

Microsoft Baseline Security Mode addresses this directly by treating apps explicitly as identities that must conform to least‑privilege principles.

We started with visibility. We inventoried apps and analyzed permission scopes, authentication models, and potential blast radius. Our apps with broad Microsoft Graph permissions, access to large volumes of unstructured data, or unclear ownership were prioritized. In some cases, we reduced permissions to more granular scopes. In others, we rearchitected apps to use delegated access more safely—or we retired them altogether.
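The burndown approach described above can be sketched as a simple scoring pass over an app inventory: score each app by blast radius, then sort so review effort lands on material risk first. This is an illustrative sketch only, not Microsoft’s actual tooling—the field names, scope list, and weights are all assumptions made for the example.

```python
# Illustrative sketch of risk-based app prioritization.
# Field names, scopes, and weights are hypothetical, not real Microsoft tooling.

# Microsoft Graph scopes commonly treated as broad/high-risk (example subset).
HIGH_RISK_SCOPES = {"Mail.ReadWrite", "Files.ReadWrite.All", "Directory.ReadWrite.All"}

def risk_score(app: dict) -> int:
    """Score an app by blast radius: broad scopes, data volume, ownership, auth model."""
    score = 0
    score += 10 * len(set(app.get("scopes", [])) & HIGH_RISK_SCOPES)
    if app.get("data_volume_gb", 0) > 100:   # access to large unstructured data sets
        score += 5
    if not app.get("owner"):                 # unclear ownership
        score += 5
    if app.get("auth") == "delegated":       # acts on users' behalf
        score += 3
    return score

def prioritize(apps: list[dict]) -> list[dict]:
    """Return apps highest-risk first, so reviews target material risk."""
    return sorted(apps, key=risk_score, reverse=True)
```

In practice, the inputs would come from a tenant-wide inventory rather than hand-built dictionaries, and the scoring would be far richer; the point is the shape of the prioritization, not the weights.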

This work was intentionally structured as a burndown, not a one‑time cleanup.

Removing our excess permissions was only half the equation. Preventing them from coming back was just as critical. BSM introduced guardrails earlier in the app lifecycle, to surface and control elevated permission requests before they reached production. New or updated apps requesting high‑risk permissions now trigger consistent review, and in many cases are blocked outright unless they meet strict criteria.

Moving from ‘get clean’ to ‘stay clean’

Reducing risk once is hard. Keeping it reduced is harder.

After our initial application burndown, we quickly learned that cleanup alone wouldn’t scale. Even as we reduced permissions and remediated high‑risk apps, new apps continued to appear. Existing apps evolved, teams changed, and without structural controls, the same risks would inevitably return.

BSM enabled us to shift from remediation to sustainability.

It started with visibility.

We needed a reliable way to detect when apps drifted out of compliance. That meant continuously monitoring permission changes, new consent grants, and scope expansions across our tenant. Instead of periodic reviews, we moved to continuous validation tied directly to the baseline.
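The continuous-validation idea can be illustrated with a minimal drift check: compare current settings against the required baseline and surface every deviation. The setting names here are hypothetical, and real BSM enforcement is policy-driven inside Microsoft 365, not a script—this is only a sketch of the concept.

```python
# Minimal sketch of baseline drift detection.
# Setting names and values are hypothetical examples.

BASELINE = {
    "legacy_auth_enabled": False,
    "external_sharing": "existing_guests",
    "mfa_required": True,
}

def detect_drift(current: dict) -> list[str]:
    """Return the settings whose current value deviates from the baseline.

    A missing setting counts as drift, since the baseline requires a known state.
    """
    return [
        key for key, required in BASELINE.items()
        if current.get(key) != required
    ]
```

Run continuously against tenant state, a check like this turns “discovered in an audit months later” into “surfaced immediately, while the blast radius is still small.”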

Next came risk‑based prioritization.

Not every instance of noncompliance carries equal impact. Our apps with broad Microsoft Graph permissions, access to large volumes of data, or unclear ownership were surfaced first. This ensured our security teams focused on material risk, rather than treating every deviation as equal.

It was equally important for us to control how new risk entered the system.

BSM introduces guardrails earlier in the application lifecycle. Our elevated permission requests are surfaced sooner and reviewed more consistently. In many cases, high‑risk permissions are blocked by default unless clear justification and mitigation are in place. Known‑bad patterns are stopped before our teams build or update apps.
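A lifecycle guardrail of this kind boils down to a gate: low-risk requests pass, high-risk requests with documented justification and mitigation go to human review, and everything else is blocked by default. The sketch below is hypothetical—the scope names and three-way outcome are illustrative, not the actual BSM decision logic.

```python
# Hypothetical sketch of a permission-request gate in the app lifecycle.
# Scope names and outcomes are illustrative, not BSM's actual logic.

HIGH_RISK_SCOPES = {"Directory.ReadWrite.All", "Mail.ReadWrite", "Files.ReadWrite.All"}

def review_permission_request(requested_scopes, justification=None, mitigation=None):
    """Block high-risk scopes by default; route justified requests to review."""
    risky = set(requested_scopes) & HIGH_RISK_SCOPES
    if not risky:
        return "approved"        # low-risk scopes pass through
    if justification and mitigation:
        return "manual_review"   # high-risk but documented: a human decides
    return "blocked"             # high-risk with no justification or mitigation
```

The design choice worth noting is the default: the burden of proof sits with the requester, so known-bad patterns are stopped before an app ever reaches production.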

Over time, this enforcement model fundamentally changed the operating posture.

Instead of recurring cleanup campaigns, we moved to continuous alignment. Our environment stays closer to the baseline by default. Our deviations are treated as exceptions that require explicit action, not silent drift.

This “stay clean” capability also reduced operational overhead.

As enforcement and validation moved into Microsoft Baseline Security Mode, we retired custom scripts, dashboards, and manual review processes that were difficult to maintain at scale. Our baseline became the source of truth for application security posture, not a snapshot taken after the fact.

Most importantly, we proved that BSM could scale.

“This isn’t limited to Microsoft 365. This is Microsoft, and it expands over time as more services come into scope.”

Jeff McDowell, principal program manager, OneDrive and SharePoint product group

By combining continuous validation, risk‑based prioritization, and enforced guardrails, we established a repeatable model for sustaining security improvements over time.

That model now serves as our foundation for extending BSM to additional workloads and security surfaces across the enterprise.

“This isn’t limited to Microsoft 365,” says Jeff McDowell, a principal program manager in the OneDrive and SharePoint product group. “This is Microsoft, and it expands over time as more services come into scope.”

Operationalizing Microsoft Baseline Security Mode

Defining a baseline is only the first step. Making it work day‑to‑day is the real challenge.

For us in Microsoft Digital, operationalizing BSM meant embedding it directly into how we run security. That required clear ownership, repeatable processes, and tight integration with our existing workflows.

Governance came first.

BSM creates a clear line between what is centrally enforced and what individual teams can influence. The baseline is owned and managed centrally to ensure consistency across the tenant. Our application owners and engineering teams still make design decisions, but within defined guardrails aligned to enterprise risk tolerance.

This clarity reduces friction.

Instead of debating security settings app by app, our teams start from a shared default. Our security conversations shift away from “Can we make an exception?” to “How do we meet the baseline with the least disruption?”

Operationally, BSM is integrated into our application lifecycle.

New apps are evaluated against baseline requirements early, before permissions are broadly granted or dependencies are established. Changes to existing apps, such as new permission requests or expanded scopes, are surfaced automatically and reviewed in context, rather than discovered months later during audits.

In an environment where apps are constantly being created, updated, and retired, automation is essential. Without policy‑driven enforcement, our security teams would be managing a perpetual backlog of reviews. BSM allows us to focus on true exceptions instead of revalidating the baseline itself.

That baseline is also embedded into our ongoing operations.

Our security posture is monitored continuously, not through periodic snapshots. When our configurations drift or new risks appear, we identify them early and address them while the blast radius is still small. Over time, this reduces both our operational effort and incident response overhead.

Perhaps our most important change was cultural.

BSM normalizes the idea that security defaults are foundational. Our teams still innovate and move quickly—but they do so in an environment where secure is expected, enforced, and sustained.

Embracing the feedback loop as Customer Zero

From the start, our team in Microsoft Digital deployed Microsoft Baseline Security Mode as Customer Zero: We applied early versions in our live, large‑scale enterprise environment, where we fed our real‑world learnings back to the product group. That feedback loop became central to how the platform evolved.

Running BSM at Microsoft scale quickly exposed challenges that don’t appear in smaller tenants. Visibility was one of the first. With thousands of apps and constantly changing permissions, it was difficult to pinpoint which apps violated least‑privilege principles and where security teams should focus first.

Those gaps directly shaped the product. Reporting and analytics were refined to better surface elevated permissions, risky scopes, and noncompliant apps, helping teams move from investigation to action more quickly.

Scalability was another critical lesson.

Controls that worked for dozens of apps didn’t automatically work for thousands. Our team needed policies that were opinionated, enforceable, and operationally sustainable without constant adjustment. That pushed BSM toward clearer defaults and stronger enforcement boundaries.

“What made the collaboration work is that Microsoft Digital was deploying this in a real tenant with real consequences,” Wood says. “Their feedback helped us understand what enterprises actually need to adopt these controls successfully, not just what looks good on paper.”

Over time, this became a virtuous cycle. Our team surfaced friction and risk through deployment. The product group translated those insights into product improvements. We then adopted those same improvements to replace custom tooling and manual processes.

For customers, this matters. The controls in BSM are shaped by operational reality, tested under scale and refined so other organizations don’t have to learn the same lessons the hard way.

What’s next for Microsoft Baseline Security Mode

Future iterations of BSM will expand coverage beyond traditional collaboration services to additional platforms and services, while maintaining the same opinionated approach. The goal is not to restrict environments indiscriminately, but to ensure new capabilities are introduced with security baked in from the start.

As compliance requirements grow more complex and more global, organizations need a consistent, defensible security baseline. BSM provides a Microsoft‑managed standard informed by real‑world attack patterns and enterprise deployment realities.

Controls evolve. Scope expands. Feedback loops remain active. As new risks emerge, the baseline adapts, without requiring organizations to redefine their security posture from scratch.

It’s a foundation designed to support whatever comes next.

Key takeaways

If you’re ready to strengthen your organization’s security posture with Microsoft Baseline Security Mode, consider these immediate actions:

  • Establish clear ownership. Assign responsibility for baseline security management to ensure consistency and accountability.
  • Implement repeatable processes. Develop standardized procedures to evaluate and enforce baseline requirements throughout the app lifecycle.
  • Integrate with existing workflows. Embed security controls into daily operations to reduce friction and streamline compliance.
  • Prioritize automation and monitoring. Use automated enforcement and continuous validation for early risk detection and response.
  • Foster a security-first culture. Normalize secure defaults and encourage teams to innovate within defined guardrails.
  • Design for evolution. Build your baseline to adapt as new services, platforms, and compliance needs arise.

The post Deploying Microsoft Baseline Security Mode at Microsoft: Our virtuous learning cycle appeared first on Inside Track Blog.

]]>
Microsoft CISO advice: Read our four tips for securing your network http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-read-our-four-tips-for-securing-your-network/ Thu, 19 Mar 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22779 Geoff Belknap, CVP and operating CISO for Core and Enterprise, shares four key practices your business can use to be prepared for managing network security incidents. Learn from our experience Network isolation (Secure Future Initiative) “Knowing where devices are, who owns them, and what they’re supposed to be doing is pretty important in the middle […]

The post Microsoft CISO advice: Read our four tips for securing your network appeared first on Inside Track Blog.

]]>
Geoff Belknap, CVP and operating CISO for Core and Enterprise, shares four key practices your business can use to be prepared for managing network security incidents.

“Knowing where devices are, who owns them, and what they’re supposed to be doing is pretty important in the middle of an incident,” Belknap says.

Watch this video to see Geoff Belknap discuss how we’re securing our network at Microsoft. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=nWPaaTHGE-M.)

Key takeaways

Here are best practices you can use to secure your network:

  • Build a complete inventory. Keep track of what your network devices are, who owns them, and what they do.
  • Capture robust telemetry. Make sure your operational teams have the tools they need to see and analyze access and authentication logs.
  • Use dynamic access control. Manage who can send packets on the corporate network by applying policies.
  • Deprecate old network assets. Cyberattackers know to look for older, unpatched network devices. You can reduce the attack surface by replacing older devices.
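The inventory and deprecation points above lend themselves to a simple recurring audit pass: flag any device with no owner, an overdue patch, or a past end-of-support date. This is a hedged sketch with hypothetical inventory fields, not a real Microsoft tool.

```python
# Illustrative audit pass over a network device inventory.
# Inventory field names and the 90-day threshold are assumptions for the example.
from datetime import date, timedelta

def flag_risky_devices(inventory: list[dict], today: date,
                       max_patch_age_days: int = 90) -> list[str]:
    """Return names of devices missing an owner, unpatched too long, or past EOL."""
    flagged = []
    for device in inventory:
        overdue = (today - device["last_patched"]) > timedelta(days=max_patch_age_days)
        past_eol = device.get("end_of_support") and device["end_of_support"] <= today
        if not device.get("owner") or overdue or past_eol:
            flagged.append(device["name"])
    return flagged
```

The value of a pass like this isn’t the code—it’s running it on a schedule, so the ownerless and unpatched devices cyberattackers look for never linger on the network.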

The post Microsoft CISO advice: Read our four tips for securing your network appeared first on Inside Track Blog.

]]>
Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-explore-our-four-tips-for-securing-your-customer-support-ecosystem/ Thu, 12 Mar 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22635 Microsoft business operations teams know all too well that cyberattackers seek to exploit customer support pathways. Tools that can unlock customer accounts or aid in troubleshooting issues in complex environments are a rich target. “The path attackers really like to use is to compromise support tooling and laterally move to your core tooling,” says Raji […]

The post Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem appeared first on Inside Track Blog.

]]>
Microsoft business operations teams know all too well that cyberattackers seek to exploit customer support pathways. Tools that can unlock customer accounts or aid in troubleshooting issues in complex environments are a rich target.

“The path attackers really like to use is to compromise support tooling and laterally move to your core tooling,” says Raji Dani, Deputy Chief Information Security Officer (CISO) for Microsoft business operations.

Dani and her team focus on understanding and mitigating the risks within customer support operations. In this video, she shares principles and practices for every business that relies on online tools in their customer support ecosystem.

Watch this video to see Raji Dani discuss four customer support ecosystem security principles. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=rJ87jjz3vvo.)

Key takeaways

Here are best practices you can apply to your customer support ecosystem:

  • Create dedicated and isolated support identities. Use standardized support identities with phish-resistant multifactor authentication based in a separate identity ecosystem.
  • Implement least privilege and enforce device protection. Only grant the access needed for a given task and nothing more.
  • Ensure tooling does not have high privilege access to customer data. Architect secure tools and manage service-to-service trust and high privileged access.
  • Implement strong telemetry. Anomalous patterns in logs and telemetry data are often the first clue a cyberattack is underway.
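The telemetry point can be illustrated with a toy anomaly check: flag a count—say, a day’s support-tool accesses—that sits far outside the recent norm. Real detection pipelines are far richer; the threshold and the z-score approach here are illustrative assumptions only.

```python
# Toy anomaly check over telemetry counts (illustrative only; real
# detection uses much richer signals than a single z-score).
import statistics

def is_anomalous(history: list[int], latest: int, threshold: float = 3.0) -> bool:
    """Flag `latest` if it sits more than `threshold` standard deviations
    from the mean of recent history (e.g., daily support-tool access counts)."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        # Perfectly flat history: any change at all is worth a look.
        return latest != mean
    return abs(latest - mean) / stdev > threshold
```

Even a crude check like this captures the principle: anomalous patterns in logs and telemetry are often the first clue a cyberattack is underway, so the signal has to be collected and watched continuously.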

The post Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem appeared first on Inside Track Blog.

]]>
Powering the new age of AI-led engineering in IT at Microsoft http://approjects.co.za/?big=insidetrack/blog/powering-the-new-age-of-ai-led-engineering-in-it-at-microsoft/ Thu, 05 Mar 2026 17:05:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22539 When generative AI burst into the mainstream, it landed in our IT engineering organization like a shockwave. There was excitement, curiosity, skepticism, and no shortage of questions about what this technology meant for the future of IT. At Microsoft Digital—the company’s IT organization—we didn’t start with a grand transformation plan. Instead, we started with a […]

The post Powering the new age of AI-led engineering in IT at Microsoft appeared first on Inside Track Blog.

]]>
When generative AI burst into the mainstream, it landed in our IT engineering organization like a shockwave.

There was excitement, curiosity, skepticism, and no shortage of questions about what this technology meant for the future of IT.

At Microsoft Digital—the company’s IT organization—we didn’t start with a grand transformation plan. Instead, we started with a realization: AI wasn’t just another tool to roll out. It was a fundamental shift in how engineering work could happen.

For years, our IT teams have been focused on scale, reliability, and operational excellence. Those priorities didn’t change. What changed were the possibilities.

Suddenly, engineers could draft code in seconds, summarize complex systems instantly, or automate work that had once consumed hours or days. It was an opportunity to take the skills and capabilities of our people and amplify them with AI.

That realization forced us to step back and ask harder questions.

How do you help thousands of engineers understand what AI can actually do to impact their day-to-day work? How do you move from experimentation to trust? And how do you adopt AI in a way that strengthens engineering fundamentals instead of eroding them?

The answer came in the form of a phased journey grounded in people, culture, and continuous learning.

Phase 1: Awareness and access

It might sound surprising when speaking about engineering processes, but our first challenge wasn’t technology; it was understanding.

When generative AI entered the conversation, most engineers saw the headlines and dabbled in various tools, but few understood fully what it meant for their work. Some were excited, others were wary. Many simply didn’t know where to start. That gap between awareness and practical value was the first barrier we had to address.

We realized early that top-down mandates wouldn’t work. Telling engineers to “use AI” without context or relevance would only deepen skepticism. Instead, we focused on something both simpler and more difficult: Exposure.

We started by making AI visible and accessible in the tools engineers already used. GitHub Copilot. Microsoft 365 Copilot. Early copilots embedded directly into engineering workflows. The goal wasn’t immediate productivity gains. It was familiarity. Letting engineers see, firsthand, what AI could and couldn’t do.


“We encouraged tool usage and adoption so people would at least play around with AI. And once they did, they started seeing the value. That’s when the mindset shifted from ‘AI might replace me’ to ‘AI can be my companion.’”

Mukul Singhal, partner group engineering manager, Microsoft Digital

Just as important, we talked openly about limitations.

AI wasn’t perfect. It hallucinated. It made confident mistakes. And that honesty mattered. By framing AI as an assistant, we reinforced the role of engineering judgment. Engineers didn’t need to fear losing control. They needed to understand how to stay in control.

We also made experimentation safe.

No quotas. No forced adoption metrics. Engineers were encouraged to try AI on low‑risk tasks: summarizing documentation, generating test cases, or exploring unfamiliar codebases. Small wins built confidence, confidence built curiosity, and curiosity drove organic adoption.

As that experimentation took hold, the mindset began to shift.

“We encouraged tool usage and adoption so people would at least play around with AI,” says Mukul Singhal, a partner group engineering manager in Microsoft Digital. “And once they did, they started seeing the value. That’s when the mindset shifted from ‘AI might replace me’ to ‘AI can be my companion.’”

Over time, conversations changed from ‘Should we use AI?’ to ‘Where does AI help most?’

Engineers began sharing prompts, tips, and lessons learned with one another. What started as individual exploration turned into community learning. Awareness gave way to momentum.

Phase one was about providing access to explore, to question, and to learn. And that foundation made everything that followed possible.

Phase 2: Culture shift

Access created awareness and awareness created curiosity.

As more engineers began experimenting with AI, we noticed a pattern. Some teams were moving faster, learning faster, and reducing friction in their day‑to‑day work. Others stalled after initial trials. The difference wasn’t technical skill or capability; it was mindset.


“People started shifting from the mindset of ‘Will AI work?’ to ‘AI is working for me.’ I think that was a very transformational shift, to where I believe a lot of engineers in the organization started believing in AI.”

Veera Mamilla, principal group engineering manager, Microsoft Digital

To move forward, we had to shift how AI was perceived from something optional or experimental to something that was simply part of how modern engineering gets done.

That meant normalizing AI as a trusted partner in the engineering process.

Leaders played a critical role in that shift. Rather than positioning AI as a productivity shortcut, they framed it as a way to strengthen engineering fundamentals: clearer design discussions, better documentation, faster feedback loops, and more time for deep problem‑solving. The message was intentional and consistent. Using AI wasn’t about cutting corners; it was about reimagining how work gets done.

We also had to address a fear that surfaced early: that AI adoption was a signal of replacement rather than empowerment.

“People started shifting from the mindset of ‘Will AI work?’ to ‘AI is working for me,’” says Veera Mamilla, a principal group engineering manager in Microsoft Digital. “I think that was a very transformational shift, to where I believe a lot of engineers in the organization started believing in AI.”

That framing mattered.

As engineers incorporated AI into their workflows, success stopped being measured by output alone. The focus shifted to outcomes. Did AI help you understand a system faster? Did it surface risks earlier? Did it free up time to focus on higher‑value work?

Over time, AI stopped feeling like a novelty. It became part of the engineering fabric. We reinforced it through leadership modeling, peer learning, and shared success stories. Teams no longer asked whether AI belonged in their workflows. They asked how to use it responsibly and effectively.

Phase 3: Upskilling and role evolution

Once AI moved from curiosity to expectation, the challenge of skill building became unavoidable.

From the start, we made a deliberate choice: This would be an upskilling and reskilling journey, not a wholesale replacement of roles. The goal wasn’t a new workforce. It was an investment in the one we had.

That decision shaped everything that followed.

Early upskilling efforts focused on practical entry points. Prompt engineering. Tool literacy. Understanding how copilots and early agents behaved in real engineering workflows. We treated these as something every engineer needed to experiment with, regardless of discipline.

But it quickly became clear that skills alone weren’t the full story. Roles themselves were starting to evolve.


“Your title might still be software engineer or principal engineer. But if you’re acting like an AI engineer, what does that actually mean? That question helped us start defining how these roles were evolving.”

Ragini Singh, partner group engineering manager, Microsoft Digital

Across software development, service engineering, and cloud network engineering, the work was shifting from manual execution toward orchestration and oversight. Engineers were no longer expected to do every task end‑to‑end by hand. Instead, they were learning how to guide AI, review its output, and decide where automation made sense and where it didn’t.

As part of this shift, we began researching how the industry itself was redefining engineering roles. Leaders examined emerging job descriptions from across the market and compared them with Microsoft’s own role frameworks. At the time, there was no formal “AI engineer” role in the internal job library. Rather than creating a new title, the focus stayed on evolving expectations within existing roles.

The idea of an “AI‑native engineer” emerged not as a job description, but as a mindset.

An AI‑native engineer still understands systems, architecture, and risk. What’s different is how that expertise gets applied. Routine tasks are delegated to AI. Judgment, design, and accountability stay with the human. Engineers move from doing all the work themselves to supervising work done in partnership with AI.

“Your title might still be software engineer or principal engineer,” says Ragini Singh, a partner group engineering manager in Microsoft Digital. “But if you’re acting like an AI engineer, what does that actually mean? That question helped us start defining how these roles were evolving.”

This evolution looked different across disciplines. Software engineers focused on AI‑assisted coding, test generation, and spec‑driven development. Service engineers leaned into AI for incident response, knowledge capture, and operational decision support. Cloud network engineers began moving from manual intervention toward intelligent orchestration and agent‑assisted troubleshooting. The common thread wasn’t identical tooling; it was a shared shift toward higher‑order work and reduced toil.

Phase 4: Embedding AI across the engineering lifecycle

By this phase, we knew individual productivity gains were simply the starting point for larger and broader benefits.

Early on, most AI usage showed up in familiar places: Code suggestions, documentation summaries, quick answers. Useful, but fragmented. The bigger opportunity emerged when we stepped back and asked a harder question: What would it look like if AI were embedded across the entire engineering lifecycle, not just used at isolated moments?

We stopped thinking in terms of tools and started thinking in terms of flow. Design. Build. Test. Deploy. Operate. Improve. AI needed to show up across all of it, in ways that reinforced how engineers already worked.


“If AI is only showing up at one step, you don’t get the full value. The real impact comes when it’s integrated across the lifecycle, where engineers can design, build, operate, and learn faster as a system.”

Sudhakar Sadasivuni, principal group engineering manager, Microsoft Digital

In software engineering, that meant pulling AI earlier into the process. We began using it to help draft requirements, reason through design options, and review code with broader system context to accelerate how quickly we could get to informed decisions. Coding assistance mattered, but it was no longer the center of gravity.

Testing and quality followed a similar pattern. AI supported test generation, defect analysis, and code review, reducing repetitive effort and helping issues surface sooner. That gave engineers more time to focus on quality and architecture instead of cleanup.

In service engineering, we embedded AI into incident management and operational workflows. Engineers used it to summarize incidents, surface relevant knowledge, and analyze signals across systems. In cloud network engineering, AI helped shift work away from manual intervention toward orchestration and intelligent troubleshooting. Across disciplines, the principle stayed the same: AI should reduce friction, not introduce it.

As we scaled this approach, one thing became clear. Embedding AI wasn’t just a technical exercise. It was a systems change.

“If AI is only showing up at one step, you don’t get the full value,” says Sudhakar Sadasivuni, a principal group engineering manager in Microsoft Digital. “The real impact comes when it’s integrated across the lifecycle, where engineers can design, build, operate, and learn faster as a system.”

As AI became part of core workflows, engineers remained accountable for outcomes. AI output was reviewed, tested, and validated like any other engineering input. Embedding AI didn’t lower the bar for rigor. It raised expectations around judgment, oversight, and data quality. We became more deliberate about responsibility and governance.

Over time, these integrations created compound benefits.

Faster design cycles reduced downstream rework. Better testing lowered operational noise. Improved operational insight shortened recovery times. AI stopped being something we used occasionally and became something the engineering system itself was built around.

Phase 5: Eliminating toil and accelerating outcomes

At some point, every AI story hits the same test. Does it actually make engineers’ days better? For us, that proof showed up fastest in elimination of toil.

Across Microsoft Digital, engineers have always spent time on work that was necessary but draining: manual troubleshooting, repetitive diagnostics, log analysis, and routine operational tasks that kept systems running but didn’t move the organization forward.

AI gave us a chance to change that.


“Toil reduction is the biggest thing. That’s where engineers’ eyes light up. If we can eliminate toil, engineers will flock to use AI. I really believe it.”

Beth Garrison, principal cloud network engineer, Microsoft Digital

In cloud network engineering, for example, troubleshooting used to require manually reconstructing what happened, such as logging into devices, chasing configurations, and piecing together context after the fact. As we began introducing agents and machine learning into these workflows, that work shifted. Instead of spending time assembling the picture, engineers could generate the views they needed faster and focus on resolving issues.

The same shift showed up in how we used operational data.

Rather than reacting to incidents after impact, we started using machine learning to analyze logs, identify patterns, and surface anomalies earlier. That moved teams from reactive response toward proactive monitoring and prevention.
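To make the pattern-detection idea concrete, here is a minimal sketch (not our production tooling) of how a simple statistical check over per-window log counts can surface anomalies before they become incidents. The function name, threshold, and sample data are all illustrative:

```python
from statistics import mean, stdev

def find_anomalies(event_counts, threshold=2.5):
    """Flag time windows whose event count deviates sharply from the baseline.

    event_counts: per-window log event counts (e.g., errors per minute).
    Returns the indices of windows whose z-score exceeds the threshold.
    The 2.5 threshold is illustrative; real systems tune this per signal.
    """
    mu = mean(event_counts)
    sigma = stdev(event_counts)
    if sigma == 0:
        return []  # a perfectly flat series has no anomalies to flag
    return [i for i, count in enumerate(event_counts)
            if abs(count - mu) / sigma > threshold]

# A spike at index 5 stands out against an otherwise steady baseline.
counts = [12, 11, 13, 12, 11, 95, 12, 13, 11, 12]
print(find_anomalies(counts))  # → [5]
```

Production systems use far richer models, but the shift is the same: the signal reaches the team before the incident does.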

One thing became clear very quickly: Toil reduction wasn’t just a benefit; it was the catalyst for adoption.

“Toil reduction is the biggest thing. That’s where engineers’ eyes light up,” says Beth Garrison, a principal cloud network engineer at Microsoft Digital. “If we can eliminate toil, engineers will flock to use AI. I really believe it.”

Service engineering followed a similar arc.

Across governance, operations, productivity, and cost management, we began applying agents and automation to simplify complex work and reduce manual review cycles. Governance and compliance workflows became faster and more consistent. Operational processes benefited from guided remediation and earlier insight. Knowledge capture improved as documentation and remediation guidance could be generated and updated automatically.

When we removed repetitive work such as manual triage, rote diagnostics, and endless documentation cleanup, we transformed how engineers spent their time. More focus on design. More proactive problem‑solving. More energy directed toward improving systems instead of just maintaining them.

Toil reduction made the value of AI tangible. It was the moment AI stopped being interesting and became indispensable, and our engineering teams started asking where else we could apply it.

Measuring what matters

By the time AI was embedded across our engineering lifecycle, a new question came into focus: “How do we know it’s working?”

In the early days, we paid close attention to usage: which tools engineers were trying, where adoption was growing, and where it stalled. Those signals mattered, and adoption was the leading indicator that people were getting comfortable and starting to integrate AI into real work.

“Adoption was always the starting point. But we were clear from the beginning that usage isn’t the destination. The real goal is impact: more time for engineers to focus on the work that truly matters.”

Ullas Kumble, principal group software engineering manager, Microsoft Digital

But using AI doesn’t automatically mean better outcomes. So, we shifted the conversation and started asking, “What’s different now that our engineers are using AI?”

That change reframed how we thought about measurement. We began looking beyond tool activity to understand impact across the engineering system. Faster design cycles. Earlier defect detection. Reduced time spent on repetitive operational work. Shorter incident resolution. Clearer documentation. Fewer handoffs. Less rework.

These weren’t abstract metrics. They showed up in the flow of work.

We were intentional about not forcing a single definition of value across every role. Software engineers, service engineers, and cloud network engineers experience impact differently. What mattered was that each team could point to tangible improvements in how work moved through the system.

That perspective shaped how leadership talked about success.

“Adoption was always the starting point,” says Ullas Kumble, a principal group software engineering manager at Microsoft Digital. “But we were clear from the beginning that usage isn’t the destination. The real goal is impact: more time for engineers to focus on the work that truly matters.”

Over time, this approach changed the quality of our conversations. Instead of debating whether AI was worth the investment, teams talked about where it was removing friction and where it still wasn’t delivering enough value. Measurement became a tool for learning and prioritization.

Moving forward

Looking ahead, one lesson stands out: this journey isn’t complete.

AI tools will continue to evolve. Agents will become more capable. Roles will keep shifting. What it means to be an engineer will continue to change. And that means our approach must stay grounded in the same principles that guided us from the start: invest in people, reinforce fundamentals, embed AI into real workflows, and stay honest about what’s working and what isn’t.

We didn’t set out to build an AI‑driven engineering organization overnight; we built it phase by phase.

By meeting engineers where they were.
By reshaping culture before redefining roles.
By embedding AI across the lifecycle, not bolting it on.
By reducing toil and measuring impact where it mattered most.

The result is better engineering: powered by AI, guided by human judgment, and built to keep evolving.

Key takeaways

Here’s a set of approaches you can take to establish AI-led engineering for your organization:

  • Start with access and understanding. Give engineers safe, easy access to AI in the tools they already use so curiosity and confidence can develop organically before you push for outcomes.
  • Frame AI as a partner, not a replacement. Position AI as an assistant that strengthens engineering judgment and fundamentals rather than a shortcut or a threat to roles.
  • Normalize experimentation without pressure. Encourage low‑risk experimentation and peer sharing instead of mandates, allowing adoption to grow through visible, practical wins.
  • Invest in upskilling. Focus on evolving skills and expectations within existing roles so engineers learn how to guide, review, and stay accountable for AI‑assisted work.
  • Embed AI across the full engineering lifecycle. Look beyond isolated productivity gains and integrate AI into design, build, test, operate, and improve workflows to unlock system‑level impact.
  • Measure impact where engineers feel it. Move past usage metrics and track outcomes like reduced toil, faster feedback, and improved flow so teams can see where AI is truly making work better.

Try it out

Try GitHub Copilot.

The post Powering the new age of AI-led engineering in IT at Microsoft appeared first on Inside Track Blog.

]]>
22539
Getting started with Windows Hello for Business and Day 1 authentication at Microsoft http://approjects.co.za/?big=insidetrack/blog/getting-started-with-windows-hello-for-business-and-day-1-authentication-at-microsoft/ Thu, 05 Mar 2026 17:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22530 At Microsoft, we’re relentlessly focused on modernizing our passwordless protections in ways that strengthen our identity and security for everyone at the company. At an organization the size of ours—with a global workforce, massive cloud footprint, and millions of identities to protect—relying on passwords wasn’t a sustainable security posture. We needed something stronger, simpler, and […]

The post Getting started with Windows Hello for Business and Day 1 authentication at Microsoft appeared first on Inside Track Blog.

]]>
At Microsoft, we’re relentlessly focused on modernizing our passwordless protections in ways that strengthen our identity and security for everyone at the company.

At an organization the size of ours—with a global workforce, massive cloud footprint, and millions of identities to protect—relying on passwords wasn’t a sustainable security posture. We needed something stronger, simpler, and more secure.

This led to the introduction of Windows Hello for Business, which was first built into Windows 10 and then Windows 11. Windows Hello for Business replaces traditional passwords with hardware‑backed keys tied to a user’s device.

So, instead of typing a “secret phrase” that can be phished or leaked, our employees authenticate with biometrics or a PIN that never leaves the device. It’s fast, intuitive, and—most importantly—resistant to the kinds of attacks that plague password‑based systems.

A photo of Kabir.

“This wasn’t just a technology shift—it was a structural change in how we establish trust across the organization. The lessons we learned offer a practical blueprint for any organization looking to strengthen their security while also reducing friction for their workforce.”

Abu Kabir, director of IT service management, Microsoft Digital

Rolling out passwordless authentication at a large company like ours took more than just introducing new technology. It also required that we come up with a new way to onboard our employees securely, no matter where they work.  

The first step we took toward passwordless credentials was to create Identity Pass, which included an emphasis on Day 1 authentication (on a new employee’s first day at Microsoft). By combining strong identity proofing, a Temporary Access Pass (TAP), and automated onboarding workflows, we forged an identification system where employees could unbox their device, sign in securely, and register their credentials without ever needing a password.

The result wasn’t just a smoother user experience.

“This wasn’t just a technology shift—it was a structural change in how we establish trust across the organization,” says Abu Kabir, a director of IT service management in Microsoft Digital, the company’s IT organization. “The lessons we learned offer a practical blueprint for any organization looking to strengthen their security while also reducing friction for their workforce.”

How we launched passwordless authentication

To understand how we worked through the details of passwordless authentication, it’s helpful to explain how it was implemented in the first place.

Our passwordless security system includes several components—face or fingerprint recognition, a PIN tied to the user’s device, and a physical security key (like a YubiKey)—but this story focuses on these two:

  • Identity Pass: the internal system for secure, passwordless onboarding and recovery
  • Windows Hello for Business: the phishing‑resistant credential that Identity Pass helps users register

Identity Pass

Identity Pass, which is only used internally here at Microsoft, uses several tools to “bootstrap” the user, which is the first step in establishing trust among a user, a device, and an identity system. It’s the moment when you go from “nothing trusted” to “something trusted.” Everything that happens afterward depends on getting that moment right.

Identity Pass relies on three core elements:

  • Verified ID is what we use internally to establish proof of identity. It’s an initial step and is valid for 30 days.
  • Temporary Access Pass (TAP) establishes authentication.
  • Conditional access enforces policy.

Identity Pass is where risk signals matter most, because onboarding and recovery are the moments when identity assurance is weakest. Those risk signals include:

  • Authentication behavior detection: If a user tries to redeem a TAP or Verified ID from an unusual location, device, or pattern, Authentication Behavior Detection can flag a sign‑in as risky. Identity Pass can then require stronger identity proofing or block the flow.
  • Global high‑risk detection: If our threat intelligence determines the user is likely compromised, Identity Pass will not allow TAP issuance or passwordless registration until the risk is remediated.
  • Strong fraud indicators: If the user’s session or token shows signs of fraud (token replay, hijacking, malicious infrastructure), Identity Pass will force remediation and block bootstrap flows.
  • Risk‑based identity assurance: This is the decision engine that takes security signals and determines what level of assurance is required. For example:
    • Low risk = allow TAP issuance
    • Medium risk = require Verified ID reproofing
    • High risk = block and escalate

Identity Pass is essentially the front door where these signals decide whether a user can even begin the passwordless journey.
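The tiered policy above can be sketched as a small decision function. This is a toy model of the idea, not Identity Pass internals; the enum, function, and action names are invented for illustration:

```python
from enum import Enum

class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

def bootstrap_decision(risk: Risk) -> str:
    """Map an assessed risk level to the assurance action described above.

    Mirrors the article's example policy; names are illustrative only.
    """
    if risk is Risk.LOW:
        return "issue_tap"            # low risk: allow TAP issuance
    if risk is Risk.MEDIUM:
        return "require_verified_id"  # medium risk: re-proof before TAP
    return "block_and_escalate"       # high risk: stop the bootstrap flow

print(bootstrap_decision(Risk.MEDIUM))  # → require_verified_id
```

The point of the pattern is that the decision engine is a single chokepoint: every bootstrap path has to pass through it, so the policy can tighten without changing the flows around it.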

Windows Hello for Business

Windows Hello for Business is the strong, phishing‑resistant credential that Identity Pass helps users register. Once this is in place, the risk signals listed above continue to influence authentication.

  • Authentication behavior detection: Windows Hello for Business sign‑ins are evaluated like any other. If the user suddenly authenticates from an impossible location or unusual device, this system flags it as a sign‑in risk.
  • Global high‑risk detection: If our threat intelligence detects a high‑confidence compromise, Windows Hello for Business sessions can be revoked via Continuous Access Evaluation. The user then reregisters through Identity Pass.
  • Strong fraud indicators: If a Windows Hello for Business token is replayed or misused, this system triggers immediate revocation and forces secure recovery.
  • Risk‑based identity assurance: This determines whether Windows Hello for Business alone is sufficient, or whether the user must step up to a stronger method based on risk.

Windows Hello for Business is the credential, but the risk signals determine whether that credential is trusted at any given moment.
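That "trusted at any given moment" idea can be sketched as a continuous evaluation function over the risk signals listed above. This is an illustrative simplification, not the actual evaluation logic; the signal keys and return values are invented:

```python
def evaluate_session(signals: dict) -> str:
    """Decide whether an authenticated session stays trusted.

    signals: illustrative booleans for the risk categories described above.
    Returns "revoke" (session ends; user recovers through Identity Pass),
    "step_up" (a stronger method is required), or "allow".
    """
    if signals.get("high_confidence_compromise") or signals.get("token_replay"):
        return "revoke"   # continuous access evaluation revokes the session
    if signals.get("unusual_location") or signals.get("unusual_device"):
        return "step_up"  # the credential alone is no longer sufficient
    return "allow"

print(evaluate_session({"unusual_location": True}))  # → step_up
```

Because the check runs on every evaluation rather than only at sign-in, a credential that was trusted this morning can stop being trusted this afternoon without any user-visible policy change.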

What we learned: Rollout and implementation

While our toolsets and protocols offer a clear path for any organization moving toward passwordless authentication, transferring users from a typical user/password security setup can have a variety of challenges—especially at the outset.

Devices, environments, and remote work all matter

When an organization adopts identity‑based, passwordless authentication, one of the first realities it confronts is that the onboarding experience isn’t uniform. Employees don’t all show up with the same hardware, the same operating system version, or the same security capabilities. That diversity has a direct impact on how smoothly a user can complete the initial Day 1 setup and register a strong, phishing‑resistant credential.

A photo of Scott.

“It’s not one-size-fits-all. The onboarding experience can be different by platform, version, and device. The further away you get from a homogenized environment, the more complexity you introduce.”

Matt Scott, senior IT service manager, Microsoft Digital

Device and platform diversity is one of the defining factors in designing a successful passwordless onboarding experience. Any organization adopting identity‑based authentication needs an onboarding system that can adapt to a wide range of hardware, OS versions, and security capabilities while still enforcing a consistent, high‑assurance security model.

Identity proofing and credential registration don’t look the same across platforms. A laptop might support credential setup directly at the login screen, while a mobile device might require an app‑based flow, and a non‑traditional platform might rely entirely on browser‑based enrollment. The underlying model stays consistent, but the user experience varies depending on where the user begins.
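One way to picture that consistency-with-variation is a simple dispatch table: every platform gets its own entry point, but all paths converge on the same registration model. The platform keys and flow names here are invented for illustration:

```python
# Illustrative mapping: the enrollment entry point varies by platform,
# but every path ends in the same credential-registration step.
ENROLLMENT_FLOWS = {
    "windows_laptop": "login_screen_setup",
    "mobile": "authenticator_app_flow",
    "other": "browser_based_enrollment",
}

def enrollment_flow(platform: str) -> str:
    # Fall back to browser enrollment for platforms without native support.
    return ENROLLMENT_FLOWS.get(platform, ENROLLMENT_FLOWS["other"])

print(enrollment_flow("mobile"))  # → authenticator_app_flow
```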

“It’s not one-size-fits-all,” says Matt Scott, a senior IT service manager in Microsoft Digital. “The onboarding experience can be different by platform, version, and device. The further away you get from a homogenized environment, the more complexity you introduce.”

Support volume

With Identity Pass in place, we have seen dramatic reductions in password reset volume (80%), onboarding delays, and help desk tickets related to account access. At the initial rollout stage, however, most organizations should anticipate a temporary spike in support needs.

“We expected an increase in volume, because we had recently gotten to 99% in terms of users being identified through Phish-Resistant Multi-Factor Authentication,” Scott says. “In reality, what’s happening is you have a lot of users who are unhappy with the experience as part of the move to a passwordless environment.”

No matter how solid the argument is for a passwordless approach or how cleanly an organization implements it, our experience shows that organizations should expect initial confusion from employees and increased pressure on support teams.

“Moving into a passwordless environment is obviously good for everyone, but we needed to make it easier for users to get the information they needed,” Scott says. “It’s not just one fell swoop of moving from password to passwordless. It’s truly a journey. And it’s very important that change management is part of that journey.”

Helping employees help themselves

Another key learning during our implementation of passwordless authentication was the importance of accessible documentation. This gives users who have yet to establish their identity credentials a way to get unblocked without having to immediately call IT support.

That documentation must stay accurate over time, so it’s crucial to build a governance strategy that ensures updates are made quickly as new devices, platforms, and scenarios emerge.

“During onboarding, if there’s a problem and a user is locked out, they may not have access to the corporate network,” Kabir says. “Having a site that they could access, with actual instruction based on which device they’re using and that shows them how to get past key blockers, was very helpful.”

Maintaining a direct line to leadership in order to help unblock lingering change requests also proved to be essential. In one case, bugs lingered in the engineering queue for days, even weeks, because the escalation path was limited (by design).

“Approval requests were blocked, and so approvals needed to be accelerated to the skip-level approver,” Kabir says. “We were able to move fast to fix that, because we had a clear understanding of the pain that folks were feeling on our side and could effectively communicate that to leadership.”

Short-term pain, long-term gain

The impact has been significant. Instead of spending long cycles troubleshooting forgotten passwords or manually verifying user identities, IT teams can focus on higher‑value work: strengthening identity protection, refining automation, and improving the user experience. This shift not only reduces operational overhead, it also aligns with our Zero Trust principles by removing weak authentication steps from the identity lifecycle.

For employees, the experience is equally transformative. New hires can unbox a device, authenticate using a TAP delivered through a secure Verified ID workflow, and immediately register passwordless methods like Windows Hello for Business. Although the onboarding journey may vary across platforms and devices, the process is fast and intuitive.

For existing users who lose access—whether due to a forgotten PIN, a lost device, or a credential reset—Identity Pass provides a self‑service recovery path that avoids the delays and security risks of traditional reset processes.

Our experience demonstrates that when these processes are redesigned around strong, hardware‑backed, phishing‑resistant credentials, organizations gain both security and efficiency. The result is a more resilient identity foundation that supports the realities of modern work.

Key takeaways

Here are some suggestions for getting started with Windows Hello for Business and Day 1 onboarding:

  • Passwordless authentication starts with strong identity proofing. Establishing user identity up front is essential to creating a secure foundation for all future authentication.
  • Day 1 onboarding is the riskiest moment. The initial bootstrap step is where trust is first established, and risk signals matter most.
  • Temporary Access Pass replaces temporary passwords. TAP provides a secure, time‑bound way for users to authenticate and register passwordless credentials without exposing the network to attack.
  • Device and platform diversity shapes the user experience. Different hardware, operating systems, and compute environments require flexible onboarding paths that still enforce consistent security.
  • Support demand spikes before it drops. Organizations should expect short‑term confusion and increased help‑desk volume before passwordless security benefits fully materialize.
  • Long‑term gains are significant. Once deployed, passwordless authentication reduces operational overhead, strengthens security, and improves the user experience across the identity lifecycle.

The post Getting started with Windows Hello for Business and Day 1 authentication at Microsoft appeared first on Inside Track Blog.

]]>
22530
Protecting anonymity at scale: How we built cloud-first hidden membership groups at Microsoft http://approjects.co.za/?big=insidetrack/blog/protecting-anonymity-at-scale-how-we-built-cloud-first-hidden-membership-groups-at-microsoft/ Thu, 26 Feb 2026 17:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22465 Some Microsoft employee groups can’t afford to be visible. For years, we supported email‑based communities internally here at Microsoft whose very existence depends on anonymity. These include employee resource groups, confidential project teams, and other sensitive audiences where simply revealing who belongs can create real‑world risk. Traditional distribution groups make membership discoverable by default. Owners […]

The post Protecting anonymity at scale: How we built cloud-first hidden membership groups at Microsoft appeared first on Inside Track Blog.

]]>
Some Microsoft employee groups can’t afford to be visible.

For years, we supported email‑based communities internally here at Microsoft whose very existence depends on anonymity. These include employee resource groups, confidential project teams, and other sensitive audiences where simply revealing who belongs can create real‑world risk.

Traditional distribution groups make membership discoverable by default. Owners can see members. Admins can see members. In some cases, other users can infer membership through directory queries or tooling.

That model doesn’t work when anonymity is a requirement.

A photo of Reifers.

“When the SFI wave hit, it was made clear to us that we needed to keep our people safe, and to do that, we needed to build a new hidden memberships group MVP. We needed to raise the bar with modern groups, and we needed to do it in six months or miss meeting our goals.”

Brett Reifers, senior product manager, Microsoft Digital

For over 15 years, we relied on a custom, on‑premises solution that enabled employees to send and receive messages through groups with fully hidden memberships.

The system worked, but we were deprecating the Microsoft Exchange servers that it ran on. At the same time, we were also deploying our Secure Future Initiative (SFI), which required us to reassess legacy systems that could expose sensitive data or slow incident response, including hidden membership groups.

The system wasn’t broken, but it represented concentrated risk simply by existing outside our modern cloud controls and monitoring.

“When the SFI wave hit, it was made clear to us that we needed to keep our people safe, and to do that, we needed to build a new hidden memberships group MVP,” says Brett Reifers, a senior product manager in Microsoft Digital, the company’s IT organization. “We needed to raise the bar with modern groups, and we needed to do it in six months or miss meeting our goals.”

The mandate was clear. Preserve anonymity, eliminate on‑premises dependencies, and do it quickly.

A photo of Carson.

“Our solution would enable us to deprecate our legacy on-premises Exchange hardware while maintaining the privacy of our employee groups, and it would do so in a cloud-first manner.”

Nate Carson, principal service engineer, Microsoft Digital

Instead of retrofitting hidden membership into standard Microsoft 365 groups, we asked a different question: What if the group lived somewhere else entirely? What if users interacted with a simple, secure front end, while all membership expansion and mail flow occurred in a locked‑down tenant built specifically for this purpose?

That idea became the foundation for Hidden Membership Groups: A new cloud‑first architecture that would separate user experience, leverage first‑party Microsoft services, and keep our group memberships hidden from everyone—including owners and administrators—by design.

“Our solution would enable us to deprecate our legacy on-premises Exchange hardware while maintaining the privacy of our employee groups, and it would do so in a cloud-first manner,” says Nate Carson, a principal service engineer in Microsoft Digital.

Once we settled on a solution, our next step was to get support for solving a problem not many people thought much about.

“Not everyone was aware of how serious of a situation we were in,” Carson says. “We had to show everyone what was at stake, and to share our solution with them.”

After taking their plan on the road, the team got the buy-in it needed, and that’s when the real work started.

Planning to solve business problems with security built-in

Before we designed anything, we had to be clear about what success meant.

Hidden Membership Groups aren’t just another collaboration feature. They support scenarios where anonymity isn’t optional—it’s foundational. That reality shaped every requirement that we built into our solution, including:

1. Absolute privacy

Group membership couldn’t be visible to users, group owners, or administrators—under any circumstances. That requirement immediately ruled out standard group models.

2. Cloud only

Any new solution had to live entirely in our cloud, use first‑party services, and align with modern identity, security, and compliance practices. On‑premises infrastructure wasn’t an option.

3. Scale

Some groups had a handful of members. Others had tens of thousands. Membership changed frequently, and those changes had to propagate safely and predictably without exposing data or degrading performance.

4. Separation of concerns

User interaction and membership truth couldn’t live in the same place. Employees needed a simple way to discover groups, request access, and manage participation, without ever interacting with the system that stored or expanded membership.

5. Self‑service with guardrails

The solution needed to reduce operational overhead, not introduce a new bottleneck. Group lifecycle management had to be automated, auditable, and secure, while still giving teams flexibility.

6. Simple to use

Employees shouldn’t need special training. They shouldn’t need to understand tenants, identity synchronization, or mail routing. The experience needed to be intuitive, consistent, and accessible—without compromising security.

Once those requirements were clear, our solution started to emerge. Incremental changes wouldn’t be enough. A traditional group model wouldn’t work. The solution required a new architecture—one designed around isolation, automation, and intentional limitation.

That’s when we started the engineering work.

Creating a cloud-first architecture

Designing for hidden membership meant eliminating ambiguity. If any surface could reveal membership, even indirectly, it didn’t belong in the design.

That constraint led us toward a model built on strict isolation, explicit APIs, and intentionally narrow interfaces. The result is straightforward to use, but deliberately difficult to interrogate.

Two tenants, with sharply separated responsibilities

At the foundation of the solution is a two‑tenant model.

Our primary Microsoft 365 tenant is where employees authenticate, discover groups, and initiate actions. A secondary, isolated tenant hosts the distribution lists and performs mail expansion for Hidden Membership Groups.

A photo of Mace.

“Tenant isolation is what makes the privacy guarantee real. By moving membership expansion to a tenant that users and owners can’t access, we removed the possibility of accidental exposure. The system simply doesn’t give you a place where membership can be seen.”

Chad Mace, principal architect, Microsoft Digital

That separation matters because the secondary tenant isn’t designed for interactive use. Only Exchange and the minimum directory constructs required for mail routing and expansion are enabled.

Operationally, when an employee sends email to a Hidden Membership Group, they send to a mail contact visible in the corporate tenant. That contact routes to the corresponding distribution group in the isolated tenant, where membership expansion occurs. Expanded messages are then delivered back to recipients’ inboxes in the corporate tenant, so sent and received mail lives where users already work.
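The routing hops above can be traced in a toy model. Everything here—tenant domains, group addresses, member names—is invented for illustration; the real flow runs through Exchange in both tenants:

```python
# Toy model of the two-tenant routing described above.
# The corporate tenant only holds a mail contact; membership lives
# ONLY in the isolated tenant's distribution lists.
corporate_contacts = {
    "erg-group@corp.example.com": "erg-group@isolated.example.com",
}
isolated_distribution_lists = {
    "erg-group@isolated.example.com": ["alice@corp.example.com",
                                       "bob@corp.example.com"],
}

def send_to_hidden_group(contact_address: str, message: str) -> dict:
    """Route mail via the corporate contact, expand in the isolated tenant,
    and deliver back to recipients' corporate inboxes."""
    dl = corporate_contacts[contact_address]      # hop 1: corporate tenant
    recipients = isolated_distribution_lists[dl]  # hop 2: expansion, isolated
    return {user: message for user in recipients} # hop 3: delivery back

inboxes = send_to_hidden_group("erg-group@corp.example.com", "Meeting at 3")
print(sorted(inboxes))  # → ['alice@corp.example.com', 'bob@corp.example.com']
```

Note what the corporate tenant never holds: the member list. A sender only ever sees the contact address, so there is no surface to query.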

“Tenant isolation is what makes the privacy guarantee real,” says Chad Mace, a principal architect in Microsoft Digital. “By moving membership expansion to a tenant that users and owners can’t access, we removed the possibility of accidental exposure. The system simply doesn’t give you a place where membership can be seen.”

Identity without interactive access

This isolated tenant only works if it can resolve recipients. To enable that, our development team used Microsoft Entra ID multi‑tenant organization identity sync to represent corporate users in the secondary tenant.

These identities are treated as business guest identities, and we disable sign‑in to prevent interactive access. The tenant can perform expansion, but nothing more.
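The "resolvable but not sign-in capable" posture can be sketched with a small data model. Real implementations use Microsoft Entra ID cross-tenant synchronization and directory attributes; the class and function names below are purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class SyncedIdentity:
    """Toy model of a corporate user projected into the isolated tenant."""
    upn: str
    sign_in_enabled: bool = False  # interactive access is disabled by design

def can_resolve_for_expansion(identity: SyncedIdentity) -> bool:
    # Mail expansion only needs the identity to exist and be addressable.
    return True

def can_sign_in(identity: SyncedIdentity) -> bool:
    # The tenant can perform expansion, but nothing more.
    return identity.sign_in_enabled

guest = SyncedIdentity("alice@corp.example.com")
print(can_resolve_for_expansion(guest), can_sign_in(guest))  # → True False
```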

However, complete isolation wasn’t technically possible. Privileged access always exists at some level. The design response was to minimize that exposure. Access to the isolated tenant is tightly restricted, and membership changes flow through automation rather than broad UI-based administration.

The goal: reduce exposure to the smallest viable operational group.

API-first automation as the control plane

With the tenancy and identity model established, the team needed a single, consistent way to create groups, connect objects across tenants, and manage changes without introducing new administrative workflows. That’s where the APIs come in.

A photo of Pena II.

“We split the backend into multiple APIs so the system could scale without becoming fragile. That let us separate everyday operations from high-volume membership work and keep performance predictable.”

John Pena II, principal software engineer, Microsoft Digital

The backend is intentionally modular, split into three distinct APIs:

  • The control API handles group creation, configuration, and cross‑tenant coordination.
  • The membership API handles standard add and remove operations.
  • The bulk membership APIs handle large‑scale operations involving tens of thousands of users, with services designed to run long‑lived jobs, manage throttling, and recover from partial failures.

“We split the backend into multiple APIs so the system could scale without becoming fragile,” says John Pena II, a principal software engineer in Microsoft Digital. “That let us separate everyday operations from high-volume membership work and keep performance predictable.”

The APIs run as PowerShell-based Azure Functions and use managed identity patterns, including federated identity credentials, to securely connect across tenants.
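A rough shape of that three-way split, in Python rather than the PowerShell the real services use. The function names, signatures, and batch size are invented; the sketch only shows why bulk work is kept separate from everyday operations:

```python
def control_api(action: str, group: str) -> str:
    # Group creation, configuration, and cross-tenant coordination.
    return f"{action}:{group}"

def membership_api(group: str, add: list, remove: list) -> int:
    # Small, everyday add/remove operations, applied synchronously.
    return len(add) + len(remove)

def bulk_membership_api(group: str, changes: list, batch_size: int = 1000):
    # Large jobs run as batched, resumable work so throttling and partial
    # failures stay isolated from the everyday membership path.
    for i in range(0, len(changes), batch_size):
        yield changes[i:i + batch_size]

batches = list(bulk_membership_api("erg", list(range(2500))))
print([len(b) for b in batches])  # → [1000, 1000, 500]
```

Separating the bulk path means a 50,000-member import can throttle, retry, or partially fail without ever blocking a routine single-user add.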

Creating the user experience with PowerApps

For the front end, we built a Canvas app in Power Apps, backed by Dataverse. The goal was speed and flexibility, without compromising strict privacy boundaries.

By using Power Apps as the primary interaction layer, we deliver a secure, modern experience without unnecessary custom infrastructure. The Canvas app provides a single, focused surface for discovering, joining, and managing hidden membership groups, while all sensitive operations remain behind controlled APIs and tenant boundaries. This separation allows the team to iterate quickly on experience design without weakening the privacy guarantees that the solution depends on.

Power Platform also simplifies how security is enforced across the solution. Dataverse enables fine‑grained, role‑based access, ensuring users only see data they’re entitled to see—while keeping sensitive membership information entirely out of the client layer. That reduces long‑term maintenance overhead and makes it easier to evolve the solution as requirements change.

“From the beginning, we designed everything with security roles and workflows in mind,” says Shiva Krishna Gollapelly, senior software engineer in Microsoft Digital. “Dataverse let us control who could see or change data without building additional APIs or storage layers, and keeping everything inside the Power Apps ecosystem saved us a lot of maintenance over time.”

Dataverse plays a precise role here: it maintains the datastore the app needs to function without becoming a secondary membership repository.

A photo of Amanishahrak.

“Using the Power Platform let us move fast, integrate deeply with Microsoft identity, and enforce security without building a full web stack from scratch.”

Bita Amanishahrak, software engineer II, Microsoft Digital

From a security posture perspective, Dataverse security is used intentionally to restrict what different users can see and do, and the Power App was developed with security roles and workflows in mind.

Short version: the app brokers intent, the APIs execute it, and all the pieces that need to stay separate do exactly that.

“Using the Power Platform let us move fast, integrate deeply with Microsoft identity, and enforce security without building a full web stack from scratch,” says Bita Amanishahrak, a software engineer in Microsoft Digital.

The architectural intent is consistent throughout—isolate the sensitive plane and ensure the user plane operates only through controlled interfaces.

Benefits and impact

The most important outcome of the new architecture is also the simplest: Hidden membership stays hidden.

Anonymity isn’t enforced by policy. It’s enforced by architecture. Membership data never appears in the user experience or administrative tooling, and it doesn’t surface as a side effect of scale.

“We’re no longer asking people to trust that we’ll handle sensitive membership carefully through process,” Reifers says. “The system makes exposure structurally impossible.”

The impact was immediate.

At launch, we migrated more than 2,200 hidden membership groups, representing over 200,000 users, from the legacy on‑premises system into the new cloud‑first architecture. Groups ranged from small, tightly controlled communities to audiences with tens of thousands of members, all supported without special handling.

“Some of these groups are massive,” Pena says. “We knew from the beginning we were dealing with memberships in the tens of thousands, which is why we designed bulk operations as a first‑class capability instead of an afterthought.”

The separation between routine APIs and bulk‑membership APIs proved critical, enabling large migrations and ongoing changes without degrading day-to-day performance.

Operationally, moving to a cloud‑only model reduced both risk and complexity. Decommissioning the on‑premises Exchange infrastructure eliminated specialized maintenance requirements and brought monitoring, auditing, and access controls into alignment with our modern cloud standards.

Delivery speed also mattered. Driven by Secure Future Initiative urgency and strong executive sponsorship, the team designed and delivered a minimum viable product in less than six months.

“That timeline forced discipline,” Reifers says. “We focused on what mattered: Security, privacy guarantees, scale, and a UX that wouldn’t disrupt the group owners and members who had relied on a 15-year-old tool.”

Everything else was secondary.

A photo of Gollapelly.

“Most users never think about tenants or APIs. They just see a clean experience that does what they need, without exposing anything it shouldn’t.”

Shiva Krishna Gollapelly, senior software engineer, Microsoft Digital

From an employee perspective, the experience became simpler and safer. Users now interact through a Power Platform app consistent with the rest of Microsoft 365.

Discovering a group, requesting access, or leaving a group no longer requires understanding the architecture behind it.

“Most users never think about tenants or APIs,” Gollapelly says. “They just see a clean experience that does what they need, without exposing anything it shouldn’t.”

The result is sustainable. The platform protects anonymity at scale, simplifies operations, boosts resiliency, and can evolve without reopening core privacy questions.

Moving forward

Delivering the initial solution was only the beginning.

The team sees Hidden Membership Groups as more than a single solution. It’s a reusable pattern for sensitive collaboration in a cloud‑first world: isolate what matters most, automate everything else, and design experiences that don’t require trust to be safe.

As adoption grows, the team plans to support additional anonymity-sensitive scenarios while maintaining the same underlying model.

“We don’t want every sensitive scenario inventing its own workaround,” Mace says. “This gives us a pattern we can reuse confidently.”

Future priorities include improving lifecycle and ownership experiences, strengthening auditing and reporting for approved administrators, and enhancing self‑service workflows—without compromising membership privacy. If it risks exposing membership, it doesn’t ship.

With the legacy system fully retired, Reifers reflects on what the team accomplished to get here.

“We shipped a new enterprise pattern in six months using our first-party tools,” Reifers says. “We achieved this because a stellar team cared about the mission. That’s the takeaway.”

Key takeaways

Use these tips to strengthen your privacy, simplify your operations, and future-proof your organization’s collaboration systems:

  • Prioritize privacy by design. Embed privacy considerations from the start to protect sensitive information in all collaboration scenarios.
  • Architect for scale. Treat bulk operations as a first-class capability so large groups are supported efficiently.
  • Automate and modernize workflows. Replace legacy systems with cloud-native solutions to reduce risk, improve transparency, and enable continuous improvement.
  • Streamline user experience. Provide intuitive, consistent interfaces that make it easy for users to access, join, or leave groups without requiring technical knowledge.
  • Enforce strict access and auditing controls. Align monitoring and administration with modern cloud standards to maintain security and accountability.
  • Create reusable patterns. Establish and share successful privacy patterns to avoid reinventing solutions for each new case.
  • Focus on operational simplicity and resilience. Design systems that are easy to maintain and improve, freeing up teams to concentrate on innovation rather than upkeep.

The post Protecting anonymity at scale: How we built cloud-first hidden membership groups at Microsoft appeared first on Inside Track Blog.

]]>
22465
Protecting AI conversations at Microsoft with Model Context Protocol security and governance http://approjects.co.za/?big=insidetrack/blog/protecting-ai-conversations-at-microsoft-with-model-context-protocol-security-and-governance/ Thu, 12 Feb 2026 17:05:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22324 When we gave our Microsoft 365 Copilot agents a simple way to connect to tools and data with Model Context Protocol (MCP), the work spoke for itself. Answers got sharper. Delivery sped up. New patterns of development emerged across teams working with Copilot agents. That ease of communication, however, comes with a responsibility: Protect the […]

The post Protecting AI conversations at Microsoft with Model Context Protocol security and governance appeared first on Inside Track Blog.

]]>
When we gave our Microsoft 365 Copilot agents a simple way to connect to tools and data with Model Context Protocol (MCP), the work spoke for itself.

Answers got sharper. Delivery sped up. New patterns of development emerged across teams working with Copilot agents.

That ease of communication, however, comes with a responsibility: Protect the conversation.

Questions came up: Who’s allowed to speak? What can they say? And what should never leave the room?

Microsoft Digital, the company’s IT organization, and the Chief Information Security Officer (CISO) team, our internal security organization, are leaning on those questions to help us shape our strategy and tooling around MCP internally at Microsoft.

A photo of Kumar.

“With MCP, the problem is not the inherent design; it’s that every improper server implementation becomes a potential vulnerability. Even one misconfigured server can give the AI the keys to your data.”

Swetha Kumar, security assurance engineer, Microsoft CISO

Our approach is intentionally straightforward.

Start secure by default. Use trusted servers. Keep a living catalog so we always know which voices are in the room. Shape how agents communicate by requiring consent before making changes.

We minimize what’s shared outside our walls, watch for drift, and act when something looks off. Our goal is practical governance that lets builders move fast while keeping our data safe.

Misconfiguration is the risk we design for, and it’s why our controls prioritize clear ownership, simple choices, and visible guardrails.

“With MCP, the problem is not the inherent design; it’s that every improper server implementation becomes a potential vulnerability,” says Swetha Kumar, a security assurance engineer in the Microsoft CISO organization. “Even one misconfigured server can give the AI the keys to your data.”

Understanding MCP and the need for security

MCP is a simple standard that lets AI systems “talk” to the right tools and data without custom integration work. Think of it like USB‑C for AI. Instead of building a new connection every time, teams plug into a common pattern. That standardization delivers speed and flexibility—but it also changes the security equation.

Before MCP, every integration was its own isolated conversation.

“Now, one pattern can unlock many systems,” Kumar says. “It’s a win and a risk. When AI can reach more systems with less effort, we must be precise about who’s allowed to speak, what they can say, and how much gets shared.”

We frame this as communications security.

The question isn’t just, “Is this API secure?” It’s “Is this a conversation we trust?” We want to know which servers are in the room, what actions they’re permitted to take, and how we’ll notice if something changes. At the same time, we keep the cognitive load low for builders. They choose from trusted options, see clear prompts before an agent makes edits, and move on. Simple choices lead to safer outcomes.

“MCP enables granular control over the tools and resources exposed to the Large Language Model,” Kumar says. “But that means the developer is responsible for configuring it correctly—which tools an agent can see, what actions a server can take, and what context is shared.”
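
Kumar’s point about configuring which tools an agent can see reduces to an allowlist applied at connect time. A minimal sketch, assuming tool metadata arrives as a name-to-description map (the names below are hypothetical):

```python
def filter_tools(advertised_tools, allowlist):
    """Return only the advertised tools that policy permits the agent to see.

    `advertised_tools` maps tool name -> metadata as announced by an MCP server;
    `allowlist` is the set of tool names approved for this agent. Anything not
    explicitly allowed is dropped before the model ever sees it.
    """
    return {name: meta for name, meta in advertised_tools.items() if name in allowlist}
```

Deny-by-default keeps the decision simple: the model can only select from tools that survived the filter.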

This approach helps both sides.

Product teams get a consistent way to extend their agents while security teams get consistent places to add guardrails—at discovery, access, and throughout the flow of requests and responses. Everyone operates from the same playbook.

When we treat MCP this way, we protect the conversation without slowing it down. We know who’s speaking. We know what they can do. And we can prove it.

Assessing MCP security across four layers

Every MCP session creates a conversation graph. An agent discovers a server, ingests its tool descriptions, adds credentials and context, and starts sending requests. Each step—metadata, identity, content, and code—introduces potential risk.

We evaluate those risks across four layers so we can catch failures early, contain blast radius, and keep conversations in bounds.

However, the big picture is just as important as the details.

“We take a holistic view of MCP security: start with the ecosystem, then specify controls across the four layers,” Kumar says. “The layers make the work concrete, but the goal stays the same—unified governance, shared education, and faster detect-and-mitigate when a server is at risk.”

Applications and agents layer

This is where user intent meets execution. Agents parse prompts, discover tools, select actions, and request changes. MCP clients live here, deciding which servers to trust and when to ask for user consent.

  • What can go wrong
    • Tool poisoning or shadowing. A server advertises safe‑looking actions but performs something else.
    • Silent swaps. A tool’s metadata changes and the client keeps trusting an altered “voice.”
    • No sandbox. The agent can request edits or run code without strong guardrails.
  • What we watch for
    • Unexpected tool descriptions or capabilities at connect time.
    • Edit attempts on critical resources without explicit user consent.
    • Abnormal tool‑selection patterns across sessions.

AI platform layer

The AI platform layer includes the AI models and runtimes that interpret prompts and call tools, along with orchestration logic and safety features.

  • What can go wrong
    • Model supply‑chain drift. Unvetted models, unsafe updates, or compromised fine‑tunes change behavior.
    • Prompt injection via tool text. Descriptions and responses steer the model toward unsafe actions.
  • What we watch for
    • Model provenance and update cadence tied to agent behavior changes.
    • Signals of jailbreaks or instruction overrides in prompts and intermediate messages.
    • Output drift linked to specific tools or servers.

Data layer

This layer covers business data, files, and secrets the conversation can touch.

  • What can go wrong
    • Context oversharing. Session data, files, or secrets get packed into the model’s context and leak to a third‑party server.
    • Over‑scoped credentials. Long‑lived tokens, broad scopes, or wrong audience claims enable lateral movement.
  • What we watch for
    • Size and sensitivity of context passed to tools.
    • Token hygiene, including short lifetimes, least‑privilege scopes, and correct audience claims.
    • Data egress patterns that don’t match a tool’s declared purpose.

Infrastructure layer

The infrastructure layer includes compute, network, and runtime environments.

  • What can go wrong
    • Local servers with too much reach. Excessive access to environment variables, file systems, or system processes.
    • Cloud endpoints without a gateway. No TLS enforcement, rate limiting, or centralized logging.
    • Open egress. Servers call out to the internet where they shouldn’t.
  • What we watch for
    • All remote MCP servers registered behind the API gateway.
    • Runtime signals, such as authentication failures, burst traffic, or unusual geographies.
    • Network policies that restrict outbound calls to certain targets.

Across all four layers, the throughline is AI communications security. We decide who can speak and verify what was said—and keep listening for change.

Establishing a secure-by-default strategy

We start by closing the front door. We recommend every remote MCP server sits behind our API gateway, giving us a single place to authenticate, authorize, rate‑limit, and log. There are no direct calls and no blind spots.

A photo of Enjeti.

“Everything we do starts with securing the MCP server by default and that begins by registering it in API Center for easier discovery. We rely solely on vetted and attested MCP servers, ensuring every call comes from a trusted footprint.”

Prathiba Enjeti, principal PM manager, Microsoft CISO

Next, we decide who gets a voice.

Teams choose from a vetted list of MCP servers. If someone connects to an unapproved endpoint, they receive a friendly nudge and a clear path to register it. No shaming—just fast correction and a better inventory the next time around.

Identity comes next. Servers expect short‑lived, least‑privilege tokens with the right scopes and audience. Admin paths require strong authentication, and where possible, we use proof‑of‑possession to bind tokens to the client and reduce replay risk. Secrets don’t live in code, keys rotate, and audit trails are in place.
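
Those token-hygiene rules (short lifetime, least-privilege scopes, correct audience) are straightforward to express as checks over decoded claims. A minimal sketch, with signature verification assumed to happen elsewhere in a real validator:

```python
def check_token_hygiene(claims, expected_audience, allowed_scopes, max_lifetime_s=3600):
    """Reject tokens that are mis-audienced, over-scoped, or too long-lived.

    `claims` is a decoded JWT payload (e.g. aud, scp, iat, exp); cryptographic
    signature validation is assumed to have happened before this point.
    Returns a list of policy problems; an empty list means the token passes.
    """
    problems = []
    if claims.get("aud") != expected_audience:
        problems.append("wrong audience")
    granted = set(claims.get("scp", "").split())
    if not granted or not granted <= allowed_scopes:
        problems.append("over-scoped or empty scopes")
    lifetime = claims.get("exp", 0) - claims.get("iat", 0)
    if lifetime > max_lifetime_s:
        problems.append("lifetime exceeds policy")
    return problems
```

Returning the full list of problems, rather than failing on the first, makes the audit trail more useful when a server is flagged.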

“Everything we do starts with making the MCP server secure by default and that begins by registering it in API Center for easier discovery,” says Prathiba Enjeti, a principal product manager in the Microsoft CISO organization. “We only use vetted and attested MCP servers. That’s how we keep the conversation safe without slowing it down.”

On the client side, we slow agents at the right moments. Agents can’t touch high‑risk tools without explicit consent. Tool descriptions are verified on connection and compared to approved contracts. If a tool’s “voice” drifts, we block the call.

We also minimize what’s shared.

Context is trimmed to what the task requires. Sensitive data isn’t included by default, and third‑party servers get only what they need—not the whole transcript. Output filters and prompt shields sit alongside the model to prevent risky inputs from becoming risky actions.
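
Trimming context to what the task requires can be as simple as projecting an allowlist of fields before anything leaves the boundary. A hedged sketch (the field names are hypothetical):

```python
def minimize_context(context, required_fields):
    """Send a third-party MCP server only the fields the task needs.

    Everything else in the session context, including anything sensitive,
    stays behind; the server never receives the full transcript.
    """
    return {k: context[k] for k in required_fields if k in context}
```

Because inclusion is opt-in per field, a newly added sensitive field in the session is withheld by default rather than leaking silently.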

Isolation completes the design. Local servers run in containers with tight file and network permissions. Hosted servers allow only the outbound calls they need, and inbound traffic flows through the gateway, with TLS and logging enforced.

Simple rules with visible guardrails.

“We only use vetted MCP servers,” Enjeti says. “That’s how we keep the conversation safe without slowing it down.”

How we run MCP at scale: architecture, vetting, and inventory

We keep MCP safe by making three things intentionally boring: architecture, vetting, and inventory. One defined path. One vetting flow. One living catalog.

Architecture

We recommend remote MCP servers sit behind an API gateway, giving us a single place to authenticate, authorize, validate, rate‑limit, and log. Transport Layer Security (TLS) is required by default, and for sensitive endpoints, we can require mutual TLS. Outbound egress is pinned to approved destinations using private endpoints and firewall rules, so servers can’t “call anywhere.” Runtime protection continuously watches for credential abuse, injection patterns, burst traffic, and odd geographies.

Identity is established up front. We issue short‑lived, least‑privilege tokens with the correct audience and scopes, and admin paths require strong authentication. Where supported, tokens are bound to the client to reduce replay risk. Services use managed identities or signed credentials; secrets don’t live in code, and keys rotate on schedule.

Model‑side safety travels with every conversation. Content safety and prompt shields help models ignore risky inputs, while orchestration enforces a per‑tool allowlist, so an agent can’t call tools that aren’t in policy—even if the model suggests it. We also track model versions, allowing behavior changes to be correlated with updates.

Clients enforce consent at the edge. “Ask before edits” is enabled by default for write, delete, and configuration changes. When an agent connects, it verifies tool descriptions against the approved contract.
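
One way to implement that connect-time verification is to fingerprint a server’s tool descriptions at approval time and compare on every connection. This sketch uses a SHA-256 hash over canonicalized metadata; it illustrates the idea rather than our exact mechanism:

```python
import hashlib
import json

def contract_fingerprint(tool_metadata):
    """Stable hash of a server's tool descriptions, taken at approval time."""
    canonical = json.dumps(tool_metadata, sort_keys=True)  # key order must not change the hash
    return hashlib.sha256(canonical.encode()).hexdigest()

def check_for_drift(current_metadata, approved_fingerprint):
    """True if the tool's 'voice' has changed since it was approved.

    On drift, a client following this pattern would block the call and
    route the server for re-review rather than keep trusting it.
    """
    return contract_fingerprint(current_metadata) != approved_fingerprint
```

Even a one-character change to a tool description produces a different fingerprint, which is exactly the property that catches silent swaps and tool poisoning.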

Observability ties it all together. We’re working toward logging tool calls, resource access, and authorization decisions end‑to‑end with correlation IDs. Detections flag abnormal tool selection, unexpected data egress, or edits without consent. Every server has an owner, a contract, and an approval record, and metadata changes automatically trigger re‑review. Kill switches live at both the client and the gateway when we need them.

Vetting

We don’t “connect and hope.”

Before any MCP server can speak in our environment, it earns trust. Owners declare what the server does (tools and actions), what it touches (data categories and exports), how callers authenticate (scopes and audience), and where it runs (runtime and on‑call ownership).

We start with static checks: manifests must match the contract, side‑effecting actions must be consent‑gated, tokens must be short‑lived and properly scoped. An SBOM (Software Bill of Materials) must be present, dependencies must be current, and no credentials can be embedded in code.

Then we test like a client would. We snapshot tool metadata on connect and compare it to the approved contract, probe for prompt‑injection and tool‑poisoning, and verify that “ask before edits” triggers for destructive actions.

We also confirm context minimization, validate that egress is pinned to approved hosts, and test resilience under load, including health checks, retry behavior, and isolation using containers with least‑privilege file and network access. Servers are published only when security, privacy, and responsible AI reviews are complete, runbooks and on‑call are in place, and the registry entry is created and pinned.

Inventory

A photo of Janardhanan.

“Inventory is the foundation—if we miss a server, we miss the conversation. Every server, regardless of where it’s running or how it’s deployed, must be accounted for in our system.”

Priya Janardhanan, principal security assurance engineering manager, Microsoft CISO

You can’t govern what you can’t see, and MCP shows up in more places than a single system of record. To solve that, we’re building the map from signals and stitching them into one catalog.

“Inventory is the foundation—if we miss a server, we miss the conversation,” says Priya Janardhanan, a principal security assurance engineering manager at Microsoft CISO Operations. “Every server, regardless of where it’s running or how it’s deployed, must be accounted for in our system. Without a complete inventory, we lose visibility into critical operations, risk exposing sensitive data, and undermine our ability to ensure compliance and security.”

Our goal state is that endpoint telemetry catches developer‑run servers on laptops and workstations. Repos and CI pipelines reveal intent before anything ships. IDEs (Integrated Development Environments) surface local extensions and configured endpoints. The gateway and our registries anchor what’s approved for business data, while low‑code environments tell us which connectors are in use and where they point.

We normalize and correlate those signals with stable IDs for servers, tools, and owners. Ownership is proven through repositories, gateway services, and environment administrators—on‑call contacts included. Exposure is scored based on data touches, scopes requested, egress rules, and change history, so high‑risk items rise to the top of the queue.
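
A simple scoring function shows how those exposure signals might combine. The weights here are purely illustrative; a real model would be tuned against incident data:

```python
def exposure_score(server):
    """Rank inventory entries so the riskiest servers rise to the top.

    `server` carries data sensitivity (0-3), the number of scopes requested,
    whether outbound egress is pinned to approved hosts, and how many
    metadata changes landed since the last review.
    """
    score = server["data_sensitivity"] * 10   # what the server can touch
    score += server["scope_count"] * 2        # how broadly it can act
    if not server["egress_pinned"]:
        score += 15                           # open egress is a major exposure
    score += server["recent_changes"] * 5     # churn since last approval
    return score
```

Sorting the catalog by this score puts the review queue in risk order instead of arrival order.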

Freshness is tracked with last‑seen timestamps, and stale entries are retired over time. Builders can discover and reuse approved servers; reviewers can see what changed since the last approval, and admins get instant visibility into coverage and hotspots.

We’re working toward automated identification and notification for unknown servers. In the ideal state, a registration stub is created when we detect an unknown server on an endpoint. Then, the likely owner is notified, and direct calls are blocked until the server is vetted through an automated process. If tool metadata changes after approval, high-risk actions are paused and routed for re-review, then auto-resumed once approved.

“It all revolves around inventory as the foundation,” Janardhanan says. “If we miss a server, we miss the conversation.”

A photo of Hasan.

“Agent 365 tooling servers will allow centralized governance for IT admins. That means a single pane where they can see what’s approved, who owns it, what data it touches, and then apply policy.”

Aisha Hasan, principal product manager, Microsoft Digital

Architecture gives us stable choke points. Vetting keeps weak servers out. Inventory keeps our map current. It’s a single pattern for builders and a unified playbook for security.

Governing agents in low‑code and pro-code scenarios

Makers move fast—that’s the point. A Customer Support team needed a Copilot action to pull case history, so they opened Copilot Studio, selected an approved MCP connector, and shipped a first version before lunch. No tickets. No detours. Governance showed up in the flow, not as a blocker.

“Agent 365 tooling servers will allow centralized governance for IT admins,” says Aisha Hasan, a principal product manager at Microsoft Digital. “That means a single pane where they can see what’s approved, who owns it, what data it touches, and then apply policy. We’re moving toward that consolidation so innovation continues while governance gets simpler and more consistent.”

We place guardrails where makers already work. In Copilot Studio, trusted and verified first-party MCP servers are allowed in developer environments to accelerate innovation and encourage experimentation. Riskier or more complex MCP integration is available in Copilot Studio custom environments and other pro-code tools such as the Microsoft 365 Agent Toolkit in VS Code and Microsoft Foundry, but only with clear checks: service ownership, security and privacy review, responsible AI assessment, and consent gating for high‑impact actions.

The allowlist is our north star.

Approved MCP servers and connectors live in one catalog with documented owners, scopes, and data boundaries. Makers choose from that shelf. If an MCP server uses an unverified tool, we enforce endpoint filtering. If there’s a misconfiguration, we open a task for the owner and help them build securely.

Permissions stay tight without adding cognitive load. Tokens are short‑lived and scoped to the task. Context is trimmed so only the necessary fields flow to the tool. Third‑party servers never get the full transcript. If a connector’s capabilities change, the runtime compares the new “voice” to what we approved. MCP clients should pause risky actions, notify the owner, and resume automatically once reviewed.

With agent inventory in Power Platform Admin Center and registry in Agent 365, admins get a clean view of which connectors are active, who owns them, what data they touch, and how often they’re called. Organization policies such as DLP and MIP can be enforced in a unified way, with a re‑review when capabilities change. The goal is simple: let builders innovate confidently while maintaining security and compliance.

“MCP servers are powerful AI tools that enable agents to seamlessly integrate and interact with enterprise data and transform business workflows,” Hasan says. “That means the same enterprise data and governance principles are applied equally to MCP servers and other connectors. A robust inventory, an agile policy framework, and an automated workflow for enforcement are cornerstones for successfully governing agents at scale.”

Securing MCP at scale: Operating, monitoring, and enabling

Our work doesn’t stop at go‑live. Once an MCP server is in the catalog, we operate the conversation like a service: measurable, observable, and responsive. Identity and policy guard the front door, but runtime is where we prove the controls work without slowing anyone down.

In practice, operating MCP at scale comes down to four motions:

Observe every tool call end to end. We make the flow observable. Every tool call carries a correlation ID from client to gateway to server and back. Prompts, tool selections, authorization decisions, and resource access should be logged with consistent schemas. Golden signals—latency, errors, saturation—sit alongside safety signals like unexpected egress or edits without consent. Owners and security teams see the same dashboards.
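
The correlation-ID pattern is simple to sketch: mint one ID per conversation and attach it to every structured record at every hop. The function names here are illustrative:

```python
import json
import uuid

def new_correlation_id():
    """Mint one ID per conversation; every hop reuses it."""
    return str(uuid.uuid4())

def log_tool_call(stage, correlation_id, **fields):
    """Emit one structured record per hop (client, gateway, server).

    Because the same correlation_id travels with the request end to end,
    a single conversation can be reassembled across all three logs.
    """
    record = {"stage": stage, "correlation_id": correlation_id, **fields}
    return json.dumps(record, sort_keys=True)
```

A consistent schema is what makes the records joinable: any hop’s log line can be matched to the rest of the conversation on the shared ID.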

Detect drift and abnormal behavior early. Detection lives close to the work. We flag abnormal tool patterns, spikes in write operations, burst traffic from new geographies, and context sizes that don’t fit a task. We continuously compare a tool’s “voice” at connect time to the approved version; drift automatically pauses risky actions and pings the owner. Cost controls double as guardrails, using rate limits and budgets to cap blast radius and surface runaway loops early.
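
Rate limits that double as guardrails are often implemented as a token bucket. This sketch uses an injected clock so the throttling behavior is deterministic and easy to test; it’s an illustration of the pattern, not our production limiter:

```python
class CallBudget:
    """Token bucket that caps how fast an agent can invoke tools.

    A runaway loop exhausts the bucket and gets throttled instead of
    hammering downstream systems. Time is passed in explicitly rather
    than read from a wall clock, which keeps the behavior testable.
    """
    def __init__(self, capacity, refill_per_s, now=0.0):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_s = refill_per_s
        self.last = now

    def allow(self, now):
        """Return True if one more call fits the budget at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Capacity bounds the burst, and the refill rate bounds the sustained call volume, so the blast radius of a misbehaving agent is capped on both axes.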

Respond with precision instead of blunt shutdowns. Response is graded, not binary. We can block destructive actions and allow reads, or throttle a noisy client without killing the session. Kill switches exist at both the client and the gateway. Playbooks are pre‑approved and integrated into the consoles owners already use, and dry runs are part of muscle memory, so the first switch flip doesn’t happen during an incident.

We treat model behavior as part of operations. Content safety and prompt shields run in production, not just in tests. We pin model versions and watch for output drift after updates. If a model starts suggesting tools out of character, the owner gets paged with the exact prompts and calls that triggered it.

Telemetry respects privacy. Logs avoid sensitive payloads by default and mask what must pass through for forensics. Access is role‑based, retention follows policy, and audit readiness is designed in on day one.

Enable builders through templates, education, and reuse. Adoption and education run in parallel. Builders get templates that enable best practices: sample manifests with consent gates, CI checks for token scope and SBOMs, and gateway stubs with sane defaults. A “ten‑minute preflight” runs locally to verify contracts, test consent flows, and check egress before a pull request is opened. IDE lint rules catch common issues early.

“This is how we operate MCP at scale,” says Janardhanan. “Observe the conversation, detect drift early, respond with precision, and teach habits that make the right path the easy path. We run it like a product because that’s what it is.”

Measuring results and moving forward

This program has changed how we build. Reviews move faster because every server follows the same path. Drift is caught early because clients compare a tool’s “voice” on connection. Shadow servers decline as inventory fills in from endpoint, repo, IDE, and gateway signals. Reuse increases because teams can discover trusted servers instead of creating new ones. Incidents resolve faster with correlation IDs across the conversation and kill switches at both the client and the gateway.

It’s also changed how our admins work. One gateway means one perimeter to manage. Policies land once and apply everywhere. Owners see the same telemetry security sees, so fixes happen where the work happens.

Going forward, we’re focused on more consolidation and automation. We’re moving toward a single pane for MCP governance—approve, monitor, and pause from one place. Policy-as-code will keep allowlists, consent rules, and egress boundaries versioned and testable in CI.

Our preflight checks will get smarter, with stronger injection tests, automatic egress validation, and environment‑aware templates. We’ll expand consent patterns so high‑impact actions remain explicit and auditable, even across multi‑tool chains. And we’ll keep shrinking re‑review time, so drift is measured in minutes, not days.

AI conversations are now part of how we build every day. MCP standardizes how agents talk to tools and data. Secure‑by‑default architecture, rigorous vetting, and a living inventory ensure the right voices stay in the room, only what’s needed is shared, and drift is caught early.

The result is simple: teams ship faster with fewer surprises, and governance stays visible without getting in the way. We’ll keep tightening the loop, so saying yes remains both easy and safe.

Key takeaways

If you’re implementing MCP security, consider these key actions to ensure secure, efficient adoption in your organization:

  • Build governance into the maker flow. Embed security, consent, and responsible AI checks directly where teams build—so protection shows up by default, not as an afterthought.
  • Maintain a single allowlist and catalog. Centralize approved MCP servers and connectors with clear ownership, scope, and data boundaries.
  • Enforce scoped, short-lived permissions by default. Automatically limit token scope and duration to minimize risk and exposure.
  • Monitor continuously and detect drift early. Observe activity, flag deviations, and pause risky actions until reviewed and approved by owners.
  • Automate incident response and controls. Leverage pre-approved playbooks, kill switches, and rate limits for fast, precise action.
  • Design for privacy and auditability from day one. Mask sensitive data, restrict log access by role, and ensure audit readiness.
  • Promote education and reuse. Provide templates, training, and feedback loops to encourage safe development and adoption of trusted servers.
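As one example, the "scoped, short-lived permissions" takeaway above can be sketched as follows: every grant carries an explicit scope and expiry, and checks fail closed. The names and default TTL are illustrative assumptions, not a real token service API.

```python
# Minimal sketch of scoped, short-lived permissions. A grant carries an
# explicit scope set and expiry time; any check denies by default.
import time
from dataclasses import dataclass

DEFAULT_TTL_SECONDS = 15 * 60  # short-lived by default (hypothetical value)

@dataclass(frozen=True)
class Grant:
    scopes: frozenset
    expires_at: float

def issue_grant(scopes, ttl=DEFAULT_TTL_SECONDS, now=None):
    """Issue a grant that expires after a short TTL unless overridden."""
    now = time.time() if now is None else now
    return Grant(scopes=frozenset(scopes), expires_at=now + ttl)

def is_allowed(grant, scope, now=None):
    """Fail closed: deny if the grant has expired or lacks the scope."""
    now = time.time() if now is None else now
    return now < grant.expires_at and scope in grant.scopes
```

The design choice worth noting is the fail-closed default: an expired or under-scoped grant is denied without any special casing, which keeps the risky paths short.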

The post Protecting AI conversations at Microsoft with Model Context Protocol security and governance appeared first on Inside Track Blog.

]]>
22324
Powering data governance at Microsoft with Purview Unified Catalog http://approjects.co.za/?big=insidetrack/blog/powering-data-governance-at-microsoft-with-purview-unified-catalog/ Thu, 05 Feb 2026 17:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22272 Data fuels everything that we do here at Microsoft, from the daily operations that keep the business running to the innovations that shape the future. But as data sprawls across teams, systems, and borders, the task of ensuring that it remains secure, accurate, and well-governed is a daunting one. A sound approach to data governance […]

The post Powering data governance at Microsoft with Purview Unified Catalog appeared first on Inside Track Blog.

]]>
Data fuels everything that we do here at Microsoft, from the daily operations that keep the business running to the innovations that shape the future.

But as data sprawls across teams, systems, and borders, the task of ensuring that it remains secure, accurate, and well-governed is a daunting one. A sound approach to data governance is the backbone of responsible data use across the enterprise, creating clarity around data ownership and access.

In an organization the size of Microsoft, no single team can carry this responsibility on its own. Effective data governance must be a distributed effort across all departments and functions.

This story explains how our marketing organization uses the Microsoft Purview Unified Catalog to organize and standardize the data we rely on daily. By putting clear ownership, consistent definitions, and reliable governance in place, we’re turning fragmented, unreliable data into an advantage that supports faster decisions and more effective campaigns.

Data governance at scale

As companies grow, their data governance becomes increasingly complex, with different teams creating their own versions of key data concepts, often without realizing it. The complexity is most visible in the way users across an organization define foundational terms.

“We found adoption to be much easier when helping teams focus on building more value in their data instead of driving governance like a compliance effort.”

Nick Doughty, senior product manager, Microsoft Purview Unified Catalog

Examples in marketing include what counts as a customer (active vs. inactive, marketing- or sales-qualified), what constitutes sensitive data (personally identifiable information, behavioral data, partner data), and what a metric means (conversion, engagement, attribution windows).

When inconsistent practices take hold, ownership becomes murky. With the increasing demands that managing data quality and integrity put on our leaders and their teams, effective data governance becomes one more hurdle to productivity.

“We started off implementing data governance like an issue register,” says Nick Doughty, a senior product manager within Microsoft Purview Unified Catalog. “Then we progressed to more of an enforcement method, similar to how we were doing security at the time. We found that when we started to push really hard on teams, similar to how we drove other compliance efforts, it was difficult for them to justify or understand why they would want the added governance.”

The introduction of Microsoft Azure Purview in 2020 marked a turning point.

A united platform for data governance, security, and compliance, Purview helps organizations understand, protect, and manage data across environments. It also addresses fragmented data, lack of visibility into where sensitive data lives and how it moves, compliance complexity with regulations (including GDPR and HIPAA), and security risks.

“Our marketing teams used to spend hours hunting for the right customer list because multiple versions lived in different locations, each with unclear owners and inconsistent labels. Now our marketers can trust they are working from current information, while avoiding compliance risks associated with incorrect or unauthorized data.”

Sourabh Mathur, principal engineering lead, Global Marketing Engines and Experiences

The Purview Unified Catalog serves as the AI-powered backbone, automatically discovering, classifying, and organizing information so users can easily find and trust the data they need.

By launching the unified catalog, we gave our users a consistent way to understand and use their data, while reinforcing strong governance and compliance practices. The result is data that’s more discoverable, reliable, and actionable. (The product was renamed Microsoft Purview in 2022 and became part of Microsoft 365 compliance tools.)

“Our marketing teams used to spend hours hunting for the right customer list because multiple versions lived in different locations, each with unclear owners and inconsistent labels,” says Sourabh Mathur, a principal engineering lead in Global Marketing Engines and Experiences, who helped set up Purview for our marketing organization.

With the unified catalog in place, Purview surfaces the dataset, shows its lineage, and applies the correct sensitivity classifications.

“Now our marketers can trust they are working from current information, while avoiding compliance risks associated with incorrect or unauthorized customer data,” Mathur says.

Powering marketing at Microsoft with Purview

With more than 200 Microsoft Azure subscriptions, our marketing organization manages one of the largest data estates at the company. The team faces the constant challenge of scattered data, unclear data ownership, and inconsistent governance practices that slow down campaigns and increase compliance risk.

“Marketing can now scale governance across hundreds of data products, support self-service data collection with guardrails, automate access decisions, and enable AI workloads on trusted data.”

Deepak Kumar Biswal, principal software engineering lead, Global Marketing Engines and Experiences

By adopting Purview, our marketing team gained unified visibility, clearer classification standards, and smoother collaboration with other departments, like IT and legal. This reduces friction while strengthening data protection.

The result is an organization that moves faster with greater confidence in how it handles customer and campaign data.

Instead of relying on legacy knowledge, forcing users to dig through different servers and SharePoint sites, or constantly sending queries to the engineering teams, our marketing professionals can now explore the curated Purview Unified Catalog, making streamlined, efficient data discovery possible.

“Marketing can now scale governance across hundreds of data products, support self-service data collection with guardrails, automate access decisions, and enable AI workloads on trusted data,” says Deepak Kumar Biswal, a principal software engineering lead in Global Marketing Engines and Experiences. “Purview turns responsible data use into everyday practice, not extra work.”

Data governance and security: Two sides of the same coin

For our marketing organization, data governance and security are inseparable concepts. As soon as you have customer information, you need to make sure it’s secure—sensitive data must be carefully defined, consistently managed, and protected from misuse or breach.

Purview supports this goal by combining governance capabilities with security and compliance controls that provide added layers of protection.

Within marketing, the governance and security teams work closely together. Good governance measures ensure our data is properly defined and standardized, while strong security policies ensure it’s handled with proper safeguards. By pairing governance with strong security practices, our marketing team can remain compliant with data privacy laws, prevent misuse of sensitive information, and foster trust across their organization.

When our marketing team began its Purview journey five years ago, it adopted a centralized governance model. Much like the structure of a government—where federal, state, and local entities each play a role—our approach allows both centralized standards and local autonomy. This creates consistency across the organization without stifling agility.

Our Data Governance team took on the role of steward, defining standards, onboarding systems, and collaborating with its IT partners to connect data environments. Existing assets like data dictionaries and process flows were used to seed the catalog, ensuring the team started from known ground rather than reinventing definitions from scratch.

This deliberate, incremental approach allowed our marketing team to thoughtfully build out healthy governance practices. By moving slowly, the team learned from each step on its journey, refining processes and establishing consistent practices as it moved along.

For example, working closely with our team in Microsoft Digital allowed them to experiment with different ways of discovering and cataloging their data. This gave them time to learn and refine how Purview handled their data before they rolled anything out broadly.

Our goal is to transition to a completely federated model in which responsibility shifts outward. Rather than the marketing governance team doing all the stewardship, individual groups will take ownership of their data within Purview. This shift distributes accountability, embeds governance deeper into daily operations, and makes it easier for teams to monitor data quality and enforce standards on their own.

Impact across the enterprise

Since adopting the Purview Unified Catalog, we’ve seen tangible results across our data estate and our data governance practices, both in marketing and across all verticals within the company. Here are some companywide highlights:

  • Better consolidation: We’ve unified five catalogs into one.
  • Increased scale: We onboarded 250 data sources in six months, representing roughly 10 million assets.
  • Higher internal adoption: We set up more than 50 governance domains, an effort we supported with reusable training assets, guides, and onboarding materials.

The benefits extend beyond marketing, too:

  • Teams across the company are gaining increased confidence in their data definitions.
  • Compliance and privacy obligations are being met more effectively.
  • Business value is being generated through better, more trusted use of data.
  • Organizations are benefiting from faster time-to-insight.

Launching the marketing governance domain

We’re using Purview to combine essential capabilities like data governance, classification, and quality checks across our Microsoft services, which creates a unified foundation for our enterprise-wide metadata management. These unified capabilities make Purview an indispensable tool for us, and for large-scale enterprises.

“With various role types like data curator and data reader, we can add more visibility into our data—where it lives, how it’s being used, and who are its primary owners. Clearly defining these parameters helps us use the data governance framework as a starting point and improve our data governance capabilities.”

Vinny Singh, principal program manager, Global Marketing Engines and Experiences

As early adopters of the Purview Unified Catalog, our marketing team launched the Marketing Governance domain, registering more than 200 data products using the Unified Catalog’s data map.

The products, spanning various datasets, are aligned with strict internal governance standards. This gives marketing the ability to govern, classify, and track data across its ecosystem—ensuring adherence to GDPR and other regulatory compliance measures.

“With various role types like data curator and data reader, we can add more visibility into our data—where it lives, how it’s being used, and who are its primary owners,” says Vinny Singh, a principal program manager in Global Marketing Engines and Experiences. “Clearly defining these parameters helps us use the data governance framework as a starting point and improve our data governance capabilities.”
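As a rough sketch of how such role types translate into visibility and control, the snippet below maps reader and curator roles to simplified capability sets. The role names come from the story above; the capability sets are assumptions for illustration, not Purview's actual permission model.

```python
# Illustrative mapping of catalog-style roles to capabilities. The role
# names (data reader, data curator) come from the story; the capability
# sets are simplified assumptions, not Purview's real model.

ROLE_CAPABILITIES = {
    "data_reader": {"browse_catalog", "view_metadata"},
    "data_curator": {"browse_catalog", "view_metadata",
                     "edit_metadata", "assign_owners"},
}

def can(role: str, capability: str) -> bool:
    """Fail closed for unknown roles or capabilities."""
    return capability in ROLE_CAPABILITIES.get(role, set())
```

Keeping the role-to-capability mapping explicit like this is what makes ownership and usage questions answerable: you can always say who can see a dataset and who can change its metadata.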

Key takeaways

Our journey with Microsoft Purview Unified Catalog has generated key insights that you can apply to your own data governance efforts. These include:

  • Start small: Don’t try to “boil the ocean.” Begin with three to five governance domains and scale from there.
  • Leverage what you have: Data dictionaries, glossaries, and existing documentation provide a strong starting point for a governance platform founded on the Purview Unified Catalog.
  • Focus on value, not enforcement: Governance resonates when teams see how it helps them, not when it’s mandated.
  • Adapt to your organization: Each team at your company will use Purview differently. Flexibility helps encourage adoption.
  • Build community: Data governance is not a solo effort. Collaboration among stakeholders produces stronger standards and better results.

The post Powering data governance at Microsoft with Purview Unified Catalog appeared first on Inside Track Blog.

]]>
22272