Zero Trust Archives - Inside Track Blog http://approjects.co.za/?big=insidetrack/blog/tag/zero-trust/ How Microsoft does IT Wed, 10 Jun 2026 23:57:01 +0000

Streamlining finance cash collection at Microsoft with AI http://approjects.co.za/?big=insidetrack/blog/streamlining-finance-cash-collection-at-microsoft-with-ai/ Thu, 04 Jun 2026 15:45:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23944 When it comes to running a business, getting paid on time is critical. Our Global Collection team in the Microsoft Treasury division makes sure payments are seamlessly executed in our fast-moving global enterprise environment. However, our case managers were often losing valuable time figuring out things like who the right contact was for a given […]

The post Streamlining finance cash collection at Microsoft with AI appeared first on Inside Track Blog.

When it comes to running a business, getting paid on time is critical.

Our Global Collection team in the Microsoft Treasury division makes sure payments are seamlessly executed in our fast-moving global enterprise environment. However, our case managers were often losing valuable time figuring out things like who the right contact was for a given customer, which issues were likely to be challenged by a customer, and where an exception should be routed next. This information was spread across systems or buried in handoffs.

To solve these challenges, our team built a human-led, AI agent-assisted support system to reduce preparation time and streamline their processes.

“Building the AI assistance wasn’t the hard part,” says Kathy Brustad, a director in the Global Treasury and Financial Services division at Microsoft. “The hard part was reimagining the collection experience with AI front and center, and bringing the underlying infrastructure up to speed to get it there.”

In this post, we explain how we did it so you can learn from our experience.

Stitching together information across systems

Our AI agent is focused on helping our case managers prioritize high-value work by:

  • Predicting late payments and possible customer disputes
  • Summarizing customer case interactions for use by case managers
  • Routing customer emails to the right collections manager faster and with greater precision
  • Automatically matching payments to invoices
  • Automatically responding to customer inquiries

“We have over 1,000 collectors around the world who perform collections for Microsoft,” Brustad says. “They had multiple systems they had to go to in order to find out things like the totality of the customer’s invoice and what conversations a different team had with the customer. All of this information was fragmented. We didn’t have a single view of how much a customer owed us.”

We started by consolidating these dispersed tools and systems into an SAP and Microsoft Dynamics 365 environment, creating a single source of truth for all relevant customer, invoice, and payment data.

On that foundation, we layered on Microsoft’s IQ intelligence platform to infuse semantic understanding and business context. That standardized our workflows by simplifying templates and worklists to reduce complexity and put consistent global practices into place. Routine communications became fully automated.

We then applied AI to improve payment matching accuracy from 40% to 90%, generate customer response drafts, and intelligently route cases to reduce time-consuming back‑and‑forth.

Copilot assistance was embedded directly into the daily workflow of our case managers to reduce administrative load by providing inline knowledge suggestions, summarizing calls, and automatically drafting replies. With these standardized automated workflows, we could apply 98% of payments within 48 hours.

Moving faster on ‘act ready’ work

Deploying the agent was only the starting point. The harder work was helping our collection team change established ways of working. Brustad described the shift as learning to “run it in a different way,” moving from manual, fragmented preparation toward workflows where prioritization, context gathering, and routing were increasingly supported within the system.

To make that shift possible, the team introduced a change management workstream and role-based training focused on real, day-to-day scenarios alongside the rollout. By anchoring the work in clear business pain points and showing tangible improvements, our team saw how the new approach made their work easier. Each morning, the agent prioritized each case manager’s workload according to urgency and past client behavior so case managers could immediately focus on the accounts that were the most pressing.

We reduced repetitive communications using automatically drafted responses and automated statements.

“In a nutshell, this is the collection story: We have various agents and models deployed to assist our human agents with all the activities they have to do, saving hundreds of thousands of hours that we spent on manually tracking things before,” Brustad says.

After deploying this system to our case managers, we saw measurable improvements in both productivity and speed, including:

  • Hundreds of thousands of hours unlocked annually in order to do more human-led high-value work rather than routine administrative tasks
  • 40% reduction in call preparation time
  • 2X growth in automatic cash applications
  • 2.5X acceleration of customer inquiry resolution time

Operationally, the team also saw up to a 60% reduction in inquiry handling time through inline suggestions, summarized calls, and automatically drafted replies. To ensure these improvements were real and repeatable, we emphasized observability in our evaluation approach. Our team tracked dollars collected and hours worked to create productivity metrics.
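
One simple way to picture a productivity metric built from those two inputs is dollars collected per hour worked, compared across reporting periods. The Python sketch below is illustrative only: the `PeriodStats` type, the function names, and the sample figures are hypothetical, and none of it reflects Microsoft data or the team's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodStats:
    """Totals for one reporting period (hypothetical structure)."""
    dollars_collected: float
    hours_worked: float


def dollars_per_hour(stats: PeriodStats) -> float:
    """Productivity expressed as dollars collected per hour worked."""
    if stats.hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return stats.dollars_collected / stats.hours_worked


def relative_change(before: PeriodStats, after: PeriodStats) -> float:
    """Fractional change in productivity between two periods."""
    baseline = dollars_per_hour(before)
    if baseline == 0:
        raise ValueError("baseline productivity must be nonzero")
    return (dollars_per_hour(after) - baseline) / baseline


# Made-up sample figures, for illustration only.
before = PeriodStats(dollars_collected=1_000_000.0, hours_worked=500.0)
after = PeriodStats(dollars_collected=1_200_000.0, hours_worked=400.0)
print(f"Change in dollars per hour: {relative_change(before, after):.0%}")
```

Tracking the ratio rather than either total on its own helps separate real efficiency gains from changes in volume or staffing.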

Data, trust, and good governance

When introducing AI systems or agents into finance workflows, leaders often ask two questions:

  1. Can we trust the outputs?
  2. Can we govern the process?

For us, trust came from getting the basics right in the form of right-sizing our enterprise data, standardizing our workflows, and establishing clear ownership for each part of the work. When we tested early and included frontline users throughout the process, outcomes improved.

“The biggest takeaway is to know your own process very, very well,” Brustad says. “You need to understand where all the bottlenecks and pain points are. Start from there to design the new agent-enabled process instead of saying, ‘I’m going to just inject the agent into my existing process.’”

Embed custom agent assistance directly into the moments where time disappears, such as prioritization, preparation, routing, and drafting, so adoption feels natural and can be measured. You can prove impact with a small set of metrics like cycle time, throughput, dollars collected, and hours saved, and iterate from there.

Key takeaways

Modernizing collections is about fixing the fundamentals first, before you add AI into the mix. As you begin to streamline your own finance workflows, keep these lessons in mind:

  • Fix fragmented workflows before adding intelligence: AI delivers the most value when it’s layered on top of standardized processes and a unified data foundation rather than disconnected systems and ad hoc handoffs.
  • Embed assistance where time is actually lost: Copilot-style support works best when it shows up directly in prioritization, preparation, routing, and drafting to reduce friction without changing how people work.
  • Focus AI on high-ROI decisions, not just automation: Predicting late payments, flagging likely invoice disputes, and surfacing context can help teams spend time where it matters.
  • Design around the practitioner’s day: When work arrives prioritized and prepped, case managers spend less time chasing context and more time resolving exceptions.
  • Measure what matters to prove impact: Cycle time, dollars collected, throughput, and hours saved provide a clear, repeatable way to track productivity gains and cashflow velocity.
  • Pair generative AI with strong governance: Trust comes from clear ownership, standardized workflows, quality data, and ongoing human oversight.

Editor’s notes:

  • SAP is an enterprise finance system that many organizations use to manage invoices, payments, and financial records in a single, centralized platform.
  • All metrics cited are based on Microsoft internal data gathered during the writing of this article. They’re best read as directional signals from that period, and they may change as systems, processes, and behaviors evolve. Microsoft makes no warranties, express, implied, or statutory.

Staying human: How we’re using AI to transform the sales experience at Microsoft http://approjects.co.za/?big=insidetrack/blog/staying-human-how-were-using-ai-to-transform-the-sales-experience-at-microsoft/ Thu, 21 May 2026 15:15:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23718 At first glance, AI transformation can look like a technology deployment project: New tools arrive, training programs launch, dashboards go live, and leaders focus on speed, scale, and rollout discipline. But in practice, the technical side of transformation is only part of the story. The missing piece is us humans. When we encounter these kinds […]

The post Staying human: How we’re using AI to transform the sales experience at Microsoft appeared first on Inside Track Blog.

At first glance, AI transformation can look like a technology deployment project: New tools arrive, training programs launch, dashboards go live, and leaders focus on speed, scale, and rollout discipline.

But in practice, the technical side of transformation is only part of the story. The missing piece is us humans.

When we encounter these kinds of challenges internally at Microsoft, we think of ourselves as “Customer Zero.” We roll out our technology across our own organization first, learning what works and what doesn’t in real time and at scale so we can pass our lessons on to you.

We learned valuable lessons about AI adoption and sustainable change when we deployed Microsoft 365 Copilot across our Microsoft Commercial organization, one of the company’s largest sales and service organizations. What we observed led us to reset our strategy and build a more human-centered process for deploying and driving adoption of our AI technology.

Driving AI adoption with role relevance and daily habits

Here on the Customer Zero team in Microsoft Customer and Partner Solutions (MCAPS), our 60,000-employee-strong sales organization, we saw that getting access to Copilot didn’t automatically result in widespread AI adoption.

“After an early wave of enthusiasm for Copilot, adoption declined,” says Daniel Bertrand, a senior director on the AI Transformation Office team in MCAPS. “People questioned whether AI was relevant to their role, worried about what it might mean for their work, and disengaged when the change they experienced didn’t match the change they imagined.”

Initially, people used Copilot like a search engine and expected it to make work go away. When that didn’t happen automatically, they didn’t know how to approach prompting the AI, or how to create value with it. The gap between access and know‑how is where adoption slowed.

We reframed the problem from “How do we scale the technology?” to, “What does this change feel like for people doing the work every day?”

By talking to people in our larger organization about why they were reluctant to work with Copilot, we discovered the adoption barrier was less about the technology being available and more about whether people trusted it, understood how it fit their role, and felt confident enough to build new habits around it.

The ‘Adoption-in-a-Box’ approach

After these conversations, we changed our strategy across the board.

“I knew from experience that people prefer to hear from—and learn alongside—those closest to their day‑to‑day work, to build trust and confidence,” says Susan Neece Robien, a senior director of adoption and change on the AI Transformation Office team. “That led me to conceptualize Adoption‑in‑a‑Box—a repeatable approach that combines behavior‑change guidance, peer influence, habit‑forming activities, and light gamification so people can experiment with AI in a non‑threatening way and build confidence over time.”

We rolled out the Adoption-in-a-Box concept across the team in the following ways:

  • Emphasized visible leadership support: We circulated videos and “day in the life” PowerPoint 1-pagers of how our leaders were using Copilot.
  • Formed a community of early adopters: They became peer champions for adoption, evangelizing best practices and leading workshops.
  • Created a Role Hub: The hub contained practical, role-specific learning about how to use Copilot rather than doing high-level general trainings.
  • Ran prompt campaigns: To get our team started with habitually using AI in their daily roles, we ran prompt campaigns to make prompt learning accessible and actionable.
  • Created the Copilot Cup: We encouraged friendly competitions with leadership support. We also ran hackathons and prompt-based scavenger hunts to gamify learning about and using the AI for our team.
  • Created ongoing measurement mechanisms: We stood up dashboards with monthly, weekly, and daily average usage reports. We also ran quarterly surveys to track sentiment around AI adoption on the team.

After our initial success with Adoption-in-a-Box, we scaled it to adoption leads, who brought the model to life within their teams.

When people feel safe in experimenting with AI and incorporating it into their day-to-day work, that’s when it provides real value for the organization and the individual. We’ve learned that sustainable, scalable AI transformation succeeds when we put people first.

Key takeaways

If you’re wondering how to encourage your own team to adopt new AI technology into their workflows, you can learn from our experience:

  • Prioritize visible leadership participation. Leaders set the tone of any transformation, and AI adoption is no exception.
  • Roll out for role relevance. Specificity is the key here: How does AI relate to each person’s individual role? If the tool provides value and saves time, people will incorporate it into their workflow.
  • Establishing habits is crucial. Sustainable adoption means people use the tool on a daily basis in the natural flow of their work. Give them low-friction opportunities to learn the ropes.
  • Encourage peer-to-peer experimentation. Early adopters can be a valuable resource for showing others the way. Lowering the stakes by having a peer guide employees in a workshop or one-on-one can take the pressure off as they experiment with the tech.

Microsoft CISO advice: Consider the risks of early integration with mergers and acquisitions http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-consider-the-risks-of-early-integration-with-mergers-and-acquisitions/ Thu, 14 May 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23592 When considering mergers and acquisitions (M&A), security needs to be an important part of the financial and operational due diligence process. At Microsoft, the security organization does more than fulfill the traditional role of assessing risk. It seeks also to address questions about the speed and costs of integrating new resources and capabilities. Geoff Belknap, […]

The post Microsoft CISO advice: Consider the risks of early integration with mergers and acquisitions appeared first on Inside Track Blog.

When considering mergers and acquisitions (M&A), security needs to be an important part of the financial and operational due diligence process. At Microsoft, the security organization does more than fulfill the traditional role of assessing risk. It seeks also to address questions about the speed and costs of integrating new resources and capabilities.

Geoff Belknap, CVP and operating CISO, shares the questions he asks when considering when and how to integrate technologies with a merged or acquired company.

Watch this video to see Geoff Belknap share questions about integration with M&A. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=mrE2FSXZ-ss.)

Key takeaways

Consider moving slowly on early integration during M&A. Here are some key questions to consider:

  • What do we risk by combining tools or technical capabilities too quickly?
  • Is the deal still valuable if we do not integrate systems?
  • What operational safeguards and governance are needed?

Microsoft CISO advice: Apply engineering fundamentals to securing AI http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-apply-engineering-fundamentals-to-securing-ai/ Thu, 30 Apr 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23334 Agentic AI, like any software, is just one part of a business solution. It is not the only element that needs to be secured. Engineers need to approach securing agentic AI in the corporate IT ecosystem the same way they would consider any security problem—from end to end. Yonatan Zunger, CVP and deputy CISO for […]

The post Microsoft CISO advice: Apply engineering fundamentals to securing AI appeared first on Inside Track Blog.

Agentic AI, like any software, is just one part of a business solution. It is not the only element that needs to be secured. Engineers need to approach securing agentic AI in the corporate IT ecosystem the same way they would consider any security problem—from end to end.

Yonatan Zunger, CVP and deputy CISO for Microsoft, suggests that focusing exclusively on hardening a piece of software against security threats can make it difficult to use, which introduces a new risk when frustrated users try to bypass controls. This is why engineers need to consider not just individual components but how they work together to maintain productivity.

“Think of every system as a socio-technical system containing many parts, and all of them working together in unison have to be secured,” Zunger says.

Watch this video to see Yonatan Zunger explain why engineering fundamentals are critical to building resilient AI systems. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=YU-8lpwPtm0.)

Microsoft CISO advice: How to build trustworthy agentic AI http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-how-to-build-trustworthy-agentic-ai/ Thu, 16 Apr 2026 15:15:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23173 Building production-ready solutions with agentic AI comes with inherent risks. When agents make mistakes or hallucinate, the potential impacts can multiply rapidly. “It turns out that it’s very easy to write AI-powered software, but it’s very hard to write AI-powered software that works right in real-world cases,” says Yonatan Zunger, CVP and deputy CISO for […]

The post Microsoft CISO advice: How to build trustworthy agentic AI appeared first on Inside Track Blog.

Building production-ready solutions with agentic AI comes with inherent risks. When agents make mistakes or hallucinate, the potential impacts can multiply rapidly.

“It turns out that it’s very easy to write AI-powered software, but it’s very hard to write AI-powered software that works right in real-world cases,” says Yonatan Zunger, CVP and deputy CISO for Microsoft.

Zunger explains how important it is to test if you want to build trustworthy agentic AI.

Watch this video to see Yonatan Zunger explain how to build trustworthy agentic AI. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=eNU7c48541M)

Key takeaways

Here are best practices to apply while building trustworthy agentic AI:

  • Prototype. Test. Iterate. Think of and try prompts your real users might give your agentic AI. Use real data. From those trials, build a set of test cases and keep testing.
  • Use AI tools to amplify testing. Evaluating agents requires a “try it and repeat it” mindset. Using AI Foundry with tools such as the Python Risk Identification Tool amplifies these assessment capabilities.
  • Record your tests. Applying this practice, as you would with unit testing, enables you to repeat evaluations as your data models and agents evolve.
  • Don’t skimp on testing. Test early, test often, test with real data. This is the best way to understand what your agent might do when it encounters the unexpected.
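
As a rough illustration of recording tests so they can be repeated, the Python sketch below stores prompt-based test cases as JSON and replays them against an agent. Everything in it is an assumption made for illustration: `AgentTestCase`, `run_cases`, `dump_cases`, `load_cases`, and the `stub_agent` stand-in are hypothetical names, the sample prompts are invented, and the code does not use AI Foundry or the Python Risk Identification Tool.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentTestCase:
    """One recorded prompt plus phrases the reply must and must not contain."""
    name: str
    prompt: str
    must_contain: tuple[str, ...] = ()
    must_not_contain: tuple[str, ...] = ()


def run_cases(agent: Callable[[str], str], cases: list[AgentTestCase]) -> list[str]:
    """Run every case against the agent and return the names of failing cases."""
    failures = []
    for case in cases:
        reply = agent(case.prompt).lower()
        has_required = all(p.lower() in reply for p in case.must_contain)
        has_forbidden = any(p.lower() in reply for p in case.must_not_contain)
        if not has_required or has_forbidden:
            failures.append(case.name)
    return failures


def dump_cases(cases: list[AgentTestCase]) -> str:
    """Serialize cases to JSON so they can be stored alongside the code."""
    return json.dumps([asdict(c) for c in cases], indent=2)


def load_cases(text: str) -> list[AgentTestCase]:
    """Rebuild cases from JSON produced by dump_cases."""
    return [
        AgentTestCase(
            name=item["name"],
            prompt=item["prompt"],
            must_contain=tuple(item["must_contain"]),
            must_not_contain=tuple(item["must_not_contain"]),
        )
        for item in json.loads(text)
    ]


def stub_agent(prompt: str) -> str:
    """Stand-in for a real agent call; swap in your own client here."""
    if "refund" in prompt.lower():
        return "I can't issue refunds, but I can route you to a case manager."
    return "Here is a summary of your open invoices."


cases = [
    AgentTestCase(
        name="refund-is-routed",
        prompt="Please refund invoice 123",
        must_contain=("case manager",),
        must_not_contain=("refund issued",),
    ),
    AgentTestCase(
        name="summary-request",
        prompt="What do I owe?",
        must_contain=("invoices",),
    ),
]

recorded = dump_cases(cases)
failures = run_cases(stub_agent, load_cases(recorded))
print("Failing cases:", failures)
```

Substring checks are deliberately crude. The point is that the cases live in a file you can rerun whenever the model, data, or prompts change, the same way you would rerun unit tests.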

Microsoft CISO advice: The importance of a written AI safety plan http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-the-importance-of-a-written-ai-safety-plan/ Thu, 09 Apr 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=23016 Yonatan Zunger, CVP and Deputy CISO for Microsoft, has spent his career considering complex questions with security and privacy while building platform infrastructure and solutions. His experience underpins his advice on how to build a safety plan for working with AI. First and foremost, his advice is to have a written plan. “Make it an […]

The post Microsoft CISO advice: The importance of a written AI safety plan appeared first on Inside Track Blog.

Yonatan Zunger, CVP and deputy CISO for Microsoft, has spent his career considering complex questions about security and privacy while building platform infrastructure and solutions. His experience underpins his advice on how to build a safety plan for working with AI. First and foremost, his advice is to have a written plan.

“Make it an expectation in your organization that people will create safety plans and have them for everything,” Zunger says. “People get so excited about having clarity in front of them that they end up making much more systematic, careful plans, and the rate of errors goes down dramatically.”

Watch this video to see Yonatan Zunger discuss his advice for creating an AI safety plan. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=H5reZ0uw0EA.)

Key takeaways

Here are questions and ideas to consider as you create a safety plan for your AI systems:

  • Define the problem. What problem are you trying to solve? A simple and clear problem statement is always a great starting point before building anything, including an AI agent.
  • Outline the solution. What is the basis of your solution? Can you explain your solution to an end user? What does a developer or administrative user of your solution need to know about what it is and does?
  • List the things that can go wrong. What can go wrong with your solution? Creating this list is the first step to figuring out how to deal with those issues.
  • Document your plan. What is your plan to address identified concerns? Identify the process you will follow when something goes wrong.
  • Draft your plan early and update it as your solution matures. Your safety plan can be as simple as a list or outline and should evolve as you prepare to build your solution.
  • Get feedback and buy-in. When you review the plan with stakeholders and leaders in your team and organization, you may uncover risks or issues you had not thought of. You also build awareness and agreement on what to do when something goes wrong.
  • Make a template and build its use into your processes. This tip is for anyone who leads a team or influences process development. Encourage using a safety template in all your projects to bring clarity and structure to how you work with AI.

Microsoft CISO advice: The most important thing to know about securing AI http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-the-most-important-thing-to-know-about-securing-ai/ Thu, 02 Apr 2026 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22863 Using AI comes with inherent risks. In a recent video, Yonatan Zunger, CVP and deputy CISO for Microsoft, suggests thinking about AI as a new intern will help you naturally take the right approach to AI security.  Zunger and his team focus on AI safety and security. They consider all the different ways anything involving […]

The post Microsoft CISO advice: The most important thing to know about securing AI appeared first on Inside Track Blog.

Using AI comes with inherent risks. In a recent video, Yonatan Zunger, CVP and deputy CISO for Microsoft, suggests that thinking about AI as a new intern will help you naturally take the right approach to AI security.

Zunger and his team focus on AI safety and security. They consider all the different ways anything involving working with AI can go wrong.

“An important thing to know about AI is that AIs make mistakes,” Zunger says. “You already know how to work with systems that make mistakes, get tricked.”

Watch this video to see Yonatan Zunger discuss his advice for working with AI. (For a transcript, please view the video on YouTube: https://youtu.be/b1x6gDbSWVY.)

Deploying Microsoft Baseline Security Mode at Microsoft: Our virtuous learning cycle http://approjects.co.za/?big=insidetrack/blog/deploying-microsoft-baseline-security-mode-at-microsoft-our-virtuous-learning-cycle/ Thu, 26 Mar 2026 16:05:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=22829 The enterprise security frontier isn’t just evolving. It’s accelerating beyond the limits of traditional security models. AI acceleration, cloud adoption, and rapid growth of enterprise apps have dramatically expanded the attack surface. Every new app introduces a new identity. Every identity carries permissions. Over time, those permissions accumulate, often without clear ownership or regular review. […]

The post Deploying Microsoft Baseline Security Mode at Microsoft: Our virtuous learning cycle appeared first on Inside Track Blog.

The enterprise security frontier isn’t just evolving. It’s accelerating beyond the limits of traditional security models.

AI acceleration, cloud adoption, and rapid growth of enterprise apps have dramatically expanded the attack surface. Every new app introduces a new identity. Every identity carries permissions. Over time, those permissions accumulate, often without clear ownership or regular review.

Inside Microsoft Digital—the company’s IT organization—we recognized this early. Many of our highest‑risk security scenarios didn’t start with malware or phishing. They started with access. Specifically, apps running with permissions beyond what they required.

“An app is another form of identity,” says B. Ganti, principal architect in Microsoft Digital. “In a cloud-first, Zero Trust world, identity becomes the primary security perimeter, and access is governed by the principle of least privilege. Whether it is a user, an app, or an agent, when permissions are overly broad or elevated the blast radius expands dramatically, increasing risk exponentially.”
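
To make the least-privilege idea concrete, the sketch below compares the permissions each app has been granted with the permissions it actually needs and ranks apps by how much excess access they carry. It is a minimal Python illustration under stated assumptions: the app names, permission labels, and function names are hypothetical, and it does not call any Microsoft API or represent how Microsoft Digital performs this analysis.

```python
def excess_permissions(granted: set[str], required: set[str]) -> set[str]:
    """Permissions an app holds but does not need."""
    return granted - required


def rank_by_excess(
    apps: dict[str, tuple[set[str], set[str]]]
) -> list[tuple[str, list[str]]]:
    """Return (app, sorted excess permissions) pairs, most over-privileged first.

    `apps` maps an app name to a (granted, required) pair of permission sets.
    Apps with no excess permissions are omitted.
    """
    findings = []
    for name, (granted, required) in apps.items():
        extra = excess_permissions(granted, required)
        if extra:
            findings.append((name, sorted(extra)))
    # Sort by number of excess permissions (descending), then by app name.
    findings.sort(key=lambda item: (-len(item[1]), item[0]))
    return findings


# Hypothetical inventory, for illustration only.
apps = {
    "invoice-bot": ({"files.read", "files.write", "mail.send"}, {"files.read"}),
    "report-viewer": ({"files.read"}, {"files.read"}),
    "sync-agent": ({"files.read", "directory.write"}, {"files.read"}),
}

for name, extra in rank_by_excess(apps):
    print(f"{name}: {', '.join(extra)}")
```

In practice the hard part is knowing the `required` set for each app, which is why clear ownership and regular review matter as much as the comparison itself.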

Traditional security approaches such as periodic reviews, best‑practice guidance, and point‑in‑time hardening weren’t enough in an environment that changes daily. Configurations drift, new apps appear, and risk grows quietly in places that are hard to see at scale.

That reality forced a mindset shift internally here at Microsoft. Security couldn’t be optional. It couldn’t be advisory. And it couldn’t be static.

Our team operates one of the largest enterprise environments in the world, with tens of thousands of apps and a culture built on self‑service and autonomy. That scale drives innovation, but it also amplifies risk.

Our application identities became one of the most complex governance challenges we faced. Ownership wasn’t always clear. Permissions were often granted broadly to avoid disruption. And once approved, access rarely came under scrutiny again.

“As a self‑service organization, we empower people to move fast,” Ganti says. “But that also means apps get created, permissions get granted, and not everyone always remembers why.”

The rise of AI‑powered apps and agents—often requiring access to large volumes of data—increased our risk further.

We needed a system to reduce that risk systematically, not one app at a time.

Microsoft Baseline Security Mode (BSM) became that system—a prescriptive, enforceable baseline that defines what “secure” means and keeps it that way.

“We’re using Microsoft Baseline Security Mode to move security from guidance to enforcement,” says Brian Fielder, vice president of Microsoft Digital. “It establishes secure‑by‑default configurations that scale across our environment, so teams can innovate quickly without inheriting unnecessary risk.”

Defining Microsoft Baseline Security Mode

BSM is more than just a checklist of recommended settings. It’s an enforced security baseline built directly into the Microsoft 365 admin center, designed to reduce attack surface by default across core Microsoft 365 workloads.

It was developed and then deployed internally at Microsoft, with our team in Microsoft Digital serving as a close design and deployment partner throughout the process.

At a technical level, BSM establishes a minimum required security posture by applying Microsoft‑managed policies and configuration states across services including Exchange Online, SharePoint Online, OneDrive, Teams, and Entra ID. The focus is on eliminating common misconfigurations, rather than theoretical or edge‑case risks.

“The settings in the Microsoft Baseline Security Mode were informed by years of experience in running our planet-scale services, and by analyzing historical security incidents across Microsoft to harden the security posture of tenants,” says Adriana Wood, a principal product manager for Microsoft 365 security. “The team identified concrete security settings that would prevent or significantly reduce known security vulnerabilities. The resulting mitigation controls were implemented and validated in Microsoft’s enterprise tenant, with Microsoft Digital evaluating operational impact, rollout characteristics, and failure modes before making it more broadly available to our customers.”

Legacy baselines rely on documentation and manual implementation. Administrators interpret guidance, apply settings where feasible, and revisit them periodically. In dynamic cloud environments, that model breaks down fast. Configurations drift, exceptions accumulate, and security degrades.

BSM replaces that approach with policy‑driven enforcement.

Now our controls are applied consistently across the tenant and continuously validated. When our configurations fall out of compliance, our risk surfaces immediately—it’s not discovered months later in an audit. The model is simple: get clean, stay clean.

Another key capability of BSM is impact awareness.

“Before enforcement, administrators can use reporting and simulation tools to understand how a baseline will affect users, apps, and workflows,” says Keith Bunge, a principal software engineer in Microsoft Digital. “That visibility allows teams to identify noncompliant assets, prioritize remediation by risk, and avoid unexpected disruptions. Our team in Microsoft Digital partnered closely with the product group to ensure these capabilities were practical for real enterprise deployments, not just greenfield environments.”

BSM is also not static.

The baseline evolves on a regular cadence to reflect changes in the threat landscape, new Microsoft 365 capabilities, and lessons learned from operating at scale.

From our perspective, BSM is not just a feature. It’s a security operating model. It shifts the default from “secure if configured correctly” to “secure by default.” Security decisions move out of individual teams and into a consistent, centrally enforced baseline. The question is no longer whether a control should be applied, but whether an exception is truly necessary—and how the associated risk will be mitigated.

That shift is what makes BSM sustainable at scale. And it’s why apps—where identities, permissions, and data access converge—became the next focus area for us in Microsoft Digital.

Addressing apps and high-risk surfaces

When we evaluated risk across our environment, one pattern was clear: Our apps represented both our most concentrated and least governed attack surface.

Apps are identities. They authenticate. They’re granted permissions. And unlike human users, they often operate continuously, without reassessment or visibility.

In a large, self‑service environment like ours, apps are created constantly by engineering teams, business groups, and automation workflows. Over time, many of those apps could accumulate permissions beyond what they actually needed, particularly in Microsoft Graph. Delegated permissions were especially risky, because they allow apps to act on our employees’ behalf at machine speed across massive data sets.

“As a user, I might not know where all my data lives,” Ganti says. “But an app with delegated permissions doesn’t have that limitation. It can search everything, everywhere, all at once.”

The challenge wasn’t just volume—it was inconsistency.

Our ownership was often unclear. Our permission reviews were infrequent or manual. And once we granted elevated access, we had few systemic controls in place requiring it to be revisited.

Microsoft Baseline Security Mode addresses this directly by treating apps explicitly as identities that must conform to least‑privilege principles.

We started with visibility. We inventoried apps and analyzed permission scopes, authentication models, and potential blast radius. Our apps with broad Microsoft Graph permissions, access to large volumes of unstructured data, or unclear ownership were prioritized. In some cases, we reduced permissions to more granular scopes. In others, we rearchitected apps to use delegated access more safely—or we retired them altogether.
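
One way to picture that triage is as a scoring pass over an app inventory. The sketch below is purely illustrative: the `AppRecord` fields, the weights, and the set of scopes treated as "broad" are assumptions made for this example, not how Microsoft Baseline Security Mode or our internal tooling actually ranks apps.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical selection of Microsoft Graph scopes treated as "broad" here.
BROAD_SCOPES = {"Files.Read.All", "Sites.Read.All", "Mail.Read", "Directory.ReadWrite.All"}


@dataclass
class AppRecord:
    """Illustrative inventory entry; the field names are assumptions."""
    name: str
    scopes: Set[str] = field(default_factory=set)
    owner: Optional[str] = None            # None means ownership is unclear
    reads_unstructured_data: bool = False


def risk_score(app: AppRecord) -> int:
    """Return a rough priority score; higher means review sooner."""
    score = 2 * len(app.scopes & BROAD_SCOPES)   # broad Graph permissions
    if app.reads_unstructured_data:
        score += 2                               # large unstructured data access
    if app.owner is None:
        score += 3                               # unclear ownership
    return score


def prioritize(apps):
    """Order apps so the highest-scoring ones are reviewed first."""
    return sorted(apps, key=risk_score, reverse=True)


inventory = [
    AppRecord("expense-bot", {"User.Read"}, owner="finance-eng"),
    AppRecord("legacy-crawler", {"Files.Read.All", "Sites.Read.All"},
              owner=None, reads_unstructured_data=True),
    AppRecord("mail-digest", {"Mail.Read"}, owner="comms-team"),
]

for app in prioritize(inventory):
    print(app.name, risk_score(app))
```

The weights matter less than the ordering they produce: an unowned app with broad scopes rises to the top of the burndown list, while a narrowly scoped, clearly owned app falls to the bottom.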

This work was intentionally structured as a burndown, not a one‑time cleanup.

Removing our excess permissions was only half the equation. Preventing them from coming back was just as critical. BSM introduced guardrails earlier in the app lifecycle, to surface and control elevated permission requests before they reached production. New or updated apps requesting high‑risk permissions now trigger consistent review, and in many cases are blocked outright unless they meet strict criteria.

Moving from ‘get clean’ to ‘stay clean’

Reducing risk once is hard. Keeping it reduced is harder.

After our initial application burndown, we quickly learned that cleanup alone wouldn’t scale. Even as we reduced permissions and remediated high‑risk apps, new apps continued to appear. Existing apps evolved, teams changed, and without structural controls, the same risks would inevitably return.

BSM enabled us to shift from remediation to sustainability.

It started with visibility.

We needed a reliable way to detect when apps drifted out of compliance. That meant continuously monitoring permission changes, new consent grants, and scope expansions across our tenant. Instead of periodic reviews, we moved to continuous validation tied directly to the baseline.
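
Conceptually, this kind of drift detection is a comparison between the scopes an app was approved for and the scopes it holds now. The following is a minimal sketch under that assumption; the dictionary shapes, app names, and the `detect_drift` helper are hypothetical and do not represent a BSM or Microsoft Graph API.

```python
def detect_drift(baseline: dict, current: dict) -> dict:
    """Return scopes each app holds now that its approved baseline lacks.

    Both arguments map an app name to a set of permission scopes. Apps that
    are missing from the baseline are treated as having no approved scopes.
    """
    drift = {}
    for app, scopes in current.items():
        added = scopes - baseline.get(app, set())
        if added:
            drift[app] = added
    return drift


approved = {
    "mail-digest": {"Mail.Read"},
    "expense-bot": {"User.Read"},
}
observed = {
    "mail-digest": {"Mail.Read", "Files.Read.All"},   # scope expansion
    "expense-bot": {"User.Read"},                     # unchanged
    "new-agent": {"Sites.Read.All"},                  # new consent grant
}

for app, added in detect_drift(approved, observed).items():
    print(app, sorted(added))
```

Running a comparison like this on every change, rather than on a review calendar, is what turns a periodic audit into continuous validation.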

Next came risk‑based prioritization.

Not every noncompliance carries equal impact. Our apps with broad Microsoft Graph permissions, access to large volumes of data, or unclear ownership were surfaced first. This ensured our security teams focused on material risk, rather than treating every deviation as equal.

It was equally important for us to control how new risk entered the system.

BSM introduces guardrails earlier in the application lifecycle. Our elevated permission requests are surfaced sooner and reviewed more consistently. In many cases, high‑risk permissions are blocked by default unless clear justification and mitigation are in place. Known‑bad patterns are stopped before our teams build or update apps.
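
A block-by-default guardrail of this kind can be sketched as a small decision function. Everything here is an assumption for illustration: the `PermissionRequest` shape, the list of high‑risk scopes, and the three outcomes are hypothetical and are not the criteria BSM applies.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical selection of scopes treated as high risk in this example.
HIGH_RISK_SCOPES = {"Directory.ReadWrite.All", "Files.ReadWrite.All", "Mail.ReadWrite"}


@dataclass
class PermissionRequest:
    """Illustrative request record; the field names are assumptions."""
    app: str
    scope: str
    justification: Optional[str] = None
    mitigation: Optional[str] = None


def evaluate(request: PermissionRequest) -> str:
    """Return 'allow', 'review', or 'block' for a permission request."""
    if request.scope not in HIGH_RISK_SCOPES:
        return "allow"
    # High-risk scopes are blocked unless both a justification and a
    # mitigation are supplied, in which case a person reviews the request.
    if request.justification and request.mitigation:
        return "review"
    return "block"


print(evaluate(PermissionRequest("expense-bot", "User.Read")))
print(evaluate(PermissionRequest("sync-agent", "Files.ReadWrite.All")))
print(evaluate(PermissionRequest(
    "sync-agent", "Files.ReadWrite.All",
    justification="Nightly archive job",
    mitigation="Scoped to one site collection",
)))
```

The design choice worth noting is the default: a high‑risk request with missing context is refused rather than queued, so the burden of proof sits with the request, not with the reviewer.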

Over time, this enforcement model fundamentally changed the operating posture.

Instead of recurring cleanup campaigns, we moved to continuous alignment. Our environment stays closer to the baseline by default. Our deviations are treated as exceptions that require explicit action, not silent drift.

This “stay clean” capability also reduced operational overhead.

As enforcement and validation moved into Microsoft Baseline Security Mode, we retired custom scripts, dashboards, and manual review processes that were difficult to maintain at scale. Our baseline became the source of truth for application security posture, not a snapshot taken after the fact.

Most importantly, we proved that BSM could scale.

By combining continuous validation, risk‑based prioritization, and enforced guardrails, we established a repeatable model for sustaining security improvements over time.

That model now serves as our foundation for extending BSM to additional workloads and security surfaces across the enterprise.

“This isn’t limited to Microsoft 365,” says Jeff McDowell, a principal program manager in the OneDrive and SharePoint product group. “This is Microsoft, and it expands over time as more services come into scope.”

Operationalizing Microsoft Baseline Security Mode

Defining a baseline is only the first step. Making it work day‑to‑day is the real challenge.

For us in Microsoft Digital, operationalizing BSM meant embedding it directly into how we run security. That required clear ownership, repeatable processes, and tight integration with our existing workflows.

Governance came first.

BSM creates a clear line between what is centrally enforced and what individual teams can influence. The baseline is owned and managed centrally to ensure consistency across the tenant. Our application owners and engineering teams still make design decisions, but within defined guardrails aligned to enterprise risk tolerance.

This clarity reduces friction.

Instead of debating security settings app by app, our teams start from a shared default. Our security conversations shift away from “Can we make an exception?” to “How do we meet the baseline with the least disruption?”

Operationally, BSM is integrated into our application lifecycle.

New apps are evaluated against baseline requirements early, before permissions are broadly granted or dependencies are established. Changes to existing apps, such as new permission requests or expanded scopes, are surfaced automatically and reviewed in context, rather than discovered months later during audits.

In an environment where apps are constantly being created, updated, and retired, automation is essential. Without policy‑driven enforcement, our security teams would be managing a perpetual backlog of reviews. BSM allows us to focus on true exceptions instead of revalidating the baseline itself.

That baseline is also embedded into our ongoing operations.

Our security posture is monitored continuously, not through periodic snapshots. When our configurations drift or new risks appear, we identify them early and address them while the blast radius is still small. Over time, this reduces both our operational effort and incident response overhead.

Perhaps our most important change was cultural.

BSM normalizes the idea that security defaults are foundational. Our teams still innovate and move quickly—but they do so in an environment where secure is expected, enforced, and sustained.

Embracing the feedback loop as Customer Zero

From the start, our team in Microsoft Digital deployed Microsoft Baseline Security Mode as Customer Zero: We applied early versions in our live, large‑scale enterprise environment and fed our real‑world learnings back to the product group. That feedback loop became central to how the platform evolved.

Running BSM at Microsoft scale quickly exposed challenges that don’t appear in smaller tenants. Visibility was one of the first. With thousands of apps and constantly changing permissions, it was difficult to pinpoint which apps violated least‑privilege principles and where security teams should focus first.

Those gaps directly shaped the product. Reporting and analytics were refined to better surface elevated permissions, risky scopes, and noncompliant apps, helping teams move from investigation to action more quickly.

Scalability was another critical lesson.

Controls that worked for dozens of apps didn’t automatically work for thousands. Our team needed policies that were opinionated, enforceable, and operationally sustainable without constant adjustment. That pushed BSM toward clearer defaults and stronger enforcement boundaries.

“What made the collaboration work is that Microsoft Digital was deploying this in a real tenant with real consequences,” Wood says. “Their feedback helped us understand what enterprises actually need to adopt these controls successfully, not just what looks good on paper.”

Over time, this became a virtuous cycle. Our team surfaced friction and risk through deployment. The product group translated those insights into product improvements. We then adopted those same improvements to replace custom tooling and manual processes.

For customers, this matters. The controls in BSM are shaped by operational reality, tested at scale, and refined so other organizations don’t have to learn the same lessons the hard way.

What’s next for Microsoft Baseline Security Mode

Future iterations of BSM will expand coverage beyond traditional collaboration services to additional platforms and services, while maintaining the same opinionated approach. The goal is not to restrict environments indiscriminately, but to ensure new capabilities are introduced with security baked in from the start.

As compliance requirements grow more complex and more global, organizations need a consistent, defensible security baseline. BSM provides a Microsoft‑managed standard informed by real‑world attack patterns and enterprise deployment realities.

Controls evolve. Scope expands. Feedback loops remain active. As new risks emerge, the baseline adapts, without requiring organizations to redefine their security posture from scratch.

It’s a foundation designed to support whatever comes next.

Key takeaways

If you’re ready to strengthen your organization’s security posture with Microsoft Baseline Security Mode, consider these immediate actions:

  • Establish clear ownership. Assign responsibility for baseline security management to ensure consistency and accountability.
  • Implement repeatable processes. Develop standardized procedures to evaluate and enforce baseline requirements throughout the app lifecycle.
  • Integrate with existing workflows. Embed security controls into daily operations to reduce friction and streamline compliance.
  • Prioritize automation and monitoring. Use automated enforcement and continuous validation for early risk detection and response.
  • Foster a security-first culture. Normalize secure defaults and encourage teams to innovate within defined guardrails.
  • Design for evolution. Design your baseline to adapt as new services, platforms, and compliance needs arise.

Microsoft CISO advice: Read our four tips for securing your network
http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-read-our-four-tips-for-securing-your-network/
Thu, 19 Mar 2026 16:00:00 +0000

Geoff Belknap, CVP and operating CISO for Core and Enterprise, shares four key practices your business can use to be prepared for managing network security incidents.

“Knowing where devices are, who owns them, and what they’re supposed to be doing is pretty important in the middle of an incident,” Belknap says.

Watch this video to see Geoff Belknap discuss how we’re securing our network at Microsoft. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=nWPaaTHGE-M.)

Key takeaways

Here are best practices you can use to secure your network:

  • Build a complete inventory. Keep track of what your network devices are, who owns them, and what they do.
  • Capture robust telemetry. Make sure your operational teams have the tools they need to see and analyze access and authentication logs.
  • Use dynamic access control. Manage who can send packets on the corporate network by applying policies.
  • Deprecate old network assets. Cyberattackers know to look for older, unpatched network devices. You can reduce the attack surface by replacing older devices.

Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem
http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-explore-our-four-tips-for-securing-your-customer-support-ecosystem/
Thu, 12 Mar 2026 16:00:00 +0000

Microsoft business operations teams know all too well that cyberattackers seek to exploit customer support pathways. Tools that can unlock customer accounts or aid in troubleshooting issues in complex environments are a rich target.

“The path attackers really like to use is to compromise support tooling and laterally move to your core tooling,” says Raji Dani, Deputy Chief Information Security Officer (CISO) for Microsoft business operations.

Dani and her team focus on understanding and mitigating the risks within customer support operations. In this video, she shares principles and practices for every business that relies on online tools in their customer support ecosystem.

Watch this video to see Raji Dani discuss four customer support ecosystem security principles. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=rJ87jjz3vvo.)

Key takeaways

Here are best practices you can apply to your customer support ecosystem:

  • Create dedicated and isolated support identities. Use standardized support identities with phish-resistant multifactor authentication based in a separate identity ecosystem.
  • Implement least privilege and enforce device protection. Only grant the access needed for a given task and nothing more.
  • Ensure tooling does not have high‑privilege access to customer data. Architect secure tools and manage service-to-service trust and high‑privilege access.
  • Implement strong telemetry. Anomalous patterns in logs and telemetry data are often the first clue a cyberattack is underway.

Getting started with Windows Hello for Business and Day 1 authentication at Microsoft
http://approjects.co.za/?big=insidetrack/blog/getting-started-with-windows-hello-for-business-and-day-1-authentication-at-microsoft/
Thu, 05 Mar 2026 17:00:00 +0000

At Microsoft, we’re relentlessly focused on modernizing our passwordless protections in ways that strengthen our identity and security for everyone at the company.

At an organization the size of ours—with a global workforce, massive cloud footprint, and millions of identities to protect—relying on passwords wasn’t a sustainable security posture. We needed something stronger, simpler, and more secure.

This led to the introduction of Windows Hello for Business, which was first built into Windows 10 and then Windows 11. Windows Hello for Business replaces traditional passwords with hardware‑backed keys tied to a user’s device.

So, instead of typing a “secret phrase” that can be phished or leaked, our employees authenticate with biometrics or a PIN that never leaves the device. It’s fast, intuitive, and—most importantly—resistant to the kinds of attacks that plague password‑based systems.

Rolling out passwordless authentication at a large company like ours took more than just introducing new technology. It also required that we come up with a new way to onboard our employees securely, no matter where they work.  

The first step we took toward passwordless credentials was to create Identity Pass, which included an emphasis on Day 1 authentication (on a new employee’s first day at Microsoft). By combining strong identity proofing, a Temporary Access Pass (TAP), and automated onboarding workflows, we forged an identification system where employees could unbox their device, sign in securely, and register their credentials without ever needing a password.

The result wasn’t just a smoother user experience.

“This wasn’t just a technology shift—it was a structural change in how we establish trust across the organization,” says Abu Kabir, a director of IT service management in Microsoft Digital, the company’s IT organization. “The lessons we learned offer a practical blueprint for any organization looking to strengthen their security while also reducing friction for their workforce.”

How we launched passwordless authentication

To understand how we worked through the details of passwordless authentication, it’s helpful to explain how it was implemented in the first place.

Our passwordless security system has several components, such as face or fingerprint recognition, a PIN tied to the user’s device, and a physical security key (like a YubiKey), but this story focuses on these two:

  • Identity Pass: the internal system for secure, passwordless onboarding and recovery
  • Windows Hello for Business: the phishing‑resistant credential that Identity Pass helps users register

Identity Pass

Identity Pass, which is only used internally here at Microsoft, uses several tools to “bootstrap” the user, which is the first step in establishing trust among a user, a device, and an identity system. It’s the moment when you go from “nothing trusted” to “something trusted.” Everything that happens afterward depends on getting that moment right.

Identity Pass relies on three core elements:

  • Verified ID is what we use internally to establish proof of identity. It’s an initial step and is valid for 30 days.
  • Temporary Access Pass (TAP) establishes authentication.
  • Conditional access enforces policy.

Identity Pass is where risk signals matter most, because onboarding and recovery are the moments when identity assurance is weakest. Those risk signals include:

  • Authentication behavior detection: If a user tries to redeem a TAP or Verified ID from an unusual location, device, or pattern, authentication behavior detection can flag the sign‑in as risky. Identity Pass can then require stronger identity proofing or block the flow.
  • Global high‑risk detection: If our threat intelligence determines the user is likely compromised, Identity Pass will not allow TAP issuance or passwordless registration until the risk is remediated.
  • Strong fraud indicators: If the user’s session or token shows signs of fraud (token replay, hijacking, malicious infrastructure), Identity Pass will force remediation and block bootstrap flows.
  • Risk‑based identity assurance: This is the decision engine that takes security signals and determines what level of assurance is required. For example:
    • Low risk = allow TAP issuance
    • Medium risk = require Verified ID reproofing
    • High risk = block and escalate
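
The mapping above can be sketched as a small lookup. This is a minimal illustration, not the Identity Pass decision engine: the `Risk` enum, the `required_action` helper, and the rule of acting on the highest risk level among several signals are all assumptions made for this example.

```python
from enum import Enum
from typing import Iterable


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Mapping taken from the list above; the strings are labels only.
ACTIONS = {
    Risk.LOW: "allow TAP issuance",
    Risk.MEDIUM: "require Verified ID reproofing",
    Risk.HIGH: "block and escalate",
}


def required_action(signals: Iterable[Risk]) -> str:
    """Pick the action for the highest risk level among the signals.

    Acting on the worst signal is an assumption made for this sketch.
    Raises ValueError if no signals are supplied, so the flow never
    proceeds without an explicit risk evaluation.
    """
    worst = max(signals, key=lambda risk: risk.value)
    return ACTIONS[worst]


print(required_action([Risk.LOW]))
print(required_action([Risk.LOW, Risk.MEDIUM]))
print(required_action([Risk.MEDIUM, Risk.HIGH, Risk.LOW]))
```

In this sketch, a single high‑risk signal is enough to block the bootstrap flow, regardless of how many low‑risk signals accompany it.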

Identity Pass is essentially the front door where these signals decide whether a user can even begin the passwordless journey.

Windows Hello for Business

Windows Hello for Business is the strong, phishing‑resistant credential that Identity Pass helps users register. Once this is in place, the risk signals listed above continue to influence authentication.

  • Authentication behavior detection: Windows Hello for Business sign‑ins are evaluated like any other. If the user suddenly authenticates from an impossible location or unusual device, this system flags it as a sign‑in risk.
  • Global high‑risk detection: If our threat intelligence detects a high‑confidence compromise, Windows Hello for Business sessions can be revoked via Continuous Access Evaluation. The user then reregisters through Identity Pass.
  • Strong fraud indicators: If a Windows Hello for Business token is replayed or misused, this system triggers immediate revocation and forces secure recovery.
  • Risk‑based identity assurance: This determines whether Windows Hello for Business alone is sufficient, or whether the user must step up to a stronger method based on risk.

Windows Hello for Business is the credential, but the risk signals determine whether that credential is trusted at any given moment.

What we learned: Rollout and implementation

While our toolsets and protocols offer a clear path for any organization moving toward passwordless authentication, moving users off a typical username‑and‑password setup can present a variety of challenges, especially at the outset.

Devices, environments, and remote work all matter

When an organization adopts identity‑based, passwordless authentication, one of the first realities it confronts is that the onboarding experience isn’t uniform. Employees don’t all show up with the same hardware, the same operating system version, or the same security capabilities. That diversity has a direct impact on how smoothly a user can complete the initial Day 1 setup and register a strong, phishing‑resistant credential.

Device and platform diversity is one of the defining factors in designing a successful passwordless onboarding experience. Any organization adopting identity‑based authentication needs an onboarding system that can adapt to a wide range of hardware, OS versions, and security capabilities while still enforcing a consistent, high‑assurance security model.

Identity proofing and credential registration don’t look the same across platforms. A laptop might support credential setup directly at the login screen, while a mobile device might require an app‑based flow, and a non‑traditional platform might rely entirely on browser‑based enrollment. The underlying model stays consistent, but the user experience varies depending on where the user begins.

“It’s not one-size-fits-all,” says Matt Scott, a senior IT service manager in Microsoft Digital. “The onboarding experience can be different by platform, version, and device. The further away you get from a homogenized environment, the more complexity you introduce.”

Support volume

With Identity Pass in place, we have seen dramatic reductions in password reset volume (80%), onboarding delays, and help desk tickets related to account access. At the initial rollout stage, however, most organizations should anticipate a temporary spike in support needs.

“We expected an increase in volume, because we had recently gotten to 99% in terms of users being identified through Phish-Resistant Multi-Factor Authentication,” Scott says. “In reality, what’s happening is you have a lot of users who are unhappy with the experience as part of the move to a passwordless environment.”

No matter how solid the argument is for a passwordless approach or how cleanly an organization implements it, our experience shows that organizations should expect initial confusion from employees and increased pressure on support teams.

“Moving into a passwordless environment is obviously good for everyone, but we needed to make it easier for users to get the information they needed,” Scott says. “It’s not just one fell swoop of moving from password to passwordless. It’s truly a journey. And it’s very important that change management is part of that journey.”

Helping employees help themselves

Another key learning during our implementation of passwordless authentication was the importance of accessible documentation. This gives users who have yet to establish their identity credentials a way to get unblocked without having to immediately call IT support.

That documentation must stay accurate over time, so it’s crucial to build a governance strategy that ensures updates are made quickly as new devices, platforms, and scenarios emerge.

“During onboarding, if there’s a problem and a user is locked out, they may not have access to the corporate network,” Kabir says. “Having a site that they could access, with actual instruction based on which device they’re using and that shows them how to get past key blockers, was very helpful.”

Maintaining a direct line to leadership in order to help unblock lingering change requests also proved to be essential. In one case, bugs lingered in the engineering queue for days, even weeks, because the escalation path was limited (by design).

“Approval requests were blocked, and so approvals needed to be accelerated to the skip-level approver,” Kabir says. “We were able to move fast to fix that, because we had a clear understanding of the pain that folks were feeling on our side and could effectively communicate that to leadership.”

Short-term pain, long-term gain

The impact has been significant. Instead of spending long cycles troubleshooting forgotten passwords or manually verifying user identities, IT teams can focus on higher‑value work: strengthening identity protection, refining automation, and improving the user experience. This shift not only reduces operational overhead, it also aligns with our Zero Trust principles by removing weak authentication steps from the identity lifecycle.

For employees, the experience is equally transformative. New hires can unbox a device, authenticate using a TAP delivered through a secure Verified ID workflow, and immediately register passwordless methods like Windows Hello for Business. Although the onboarding journey may vary across platforms and devices, the process is fast and intuitive.

For existing users who lose access—whether due to a forgotten PIN, a lost device, or a credential reset—Identity Pass provides a self‑service recovery path that avoids the delays and security risks of traditional reset processes.

Our experience demonstrates that when these processes are redesigned around strong, hardware‑backed, phishing‑resistant credentials, organizations gain both security and efficiency. The result is a more resilient identity foundation that supports the realities of modern work.

Key takeaways

Here are some suggestions for getting started with Windows Hello for Business and Day 1 onboarding:

  • Passwordless authentication starts with strong identity proofing. Establishing user identity up front is essential to creating a secure foundation for all future authentication.
  • Day 1 onboarding is the riskiest moment. The initial bootstrap step is where trust is first established, and risk signals matter most.
  • Temporary Access Pass replaces temporary passwords. TAP provides a secure, time‑bound way for users to authenticate and register passwordless credentials without exposing the network to attack.
  • Device and platform diversity shapes the user experience. Different hardware, operating systems, and compute environments require flexible onboarding paths that still enforce consistent security.
  • Support demand spikes before it drops. Organizations should expect short‑term confusion and increased help‑desk volume before passwordless security benefits fully materialize.
  • Long‑term gains are significant. Once deployed, passwordless authentication reduces operational overhead, strengthens security, and improves the user experience across the identity lifecycle.
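To illustrate the "time-bound" idea behind a Temporary Access Pass, here is a minimal sketch of a credential that expires and can be redeemed only once. It is not how Microsoft Entra implements TAP. The `TemporaryPass`, `issue_pass`, and `redeem` names, the 60-minute lifetime, and the one-time-use default are all assumptions made for illustration.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TemporaryPass:
    """Hypothetical stand-in for a time-bound, limited-use credential."""
    code: str
    expires_at: datetime
    one_time: bool = True
    used: bool = False


def issue_pass(lifetime_minutes: int, now: datetime) -> TemporaryPass:
    # A random code takes the place of a human-chosen temporary password.
    return TemporaryPass(
        code=secrets.token_urlsafe(12),
        expires_at=now + timedelta(minutes=lifetime_minutes),
    )


def redeem(p: TemporaryPass, presented: str, now: datetime) -> bool:
    """Accept the pass only if it is unexpired, unused, and matches."""
    if now >= p.expires_at:
        return False
    if p.one_time and p.used:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    if not secrets.compare_digest(p.code, presented):
        return False
    p.used = True
    return True


issued_at = datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc)

tap = issue_pass(lifetime_minutes=60, now=issued_at)
first_try = redeem(tap, tap.code, issued_at + timedelta(minutes=10))
second_try = redeem(tap, tap.code, issued_at + timedelta(minutes=11))

fresh = issue_pass(lifetime_minutes=60, now=issued_at)
late_try = redeem(fresh, fresh.code, issued_at + timedelta(minutes=61))
```

In this sketch the first redemption succeeds, the replay is rejected, and a pass presented after its lifetime is rejected. That is the property that makes such a pass safer than a long-lived temporary password.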

The post Getting started with Windows Hello for Business and Day 1 authentication at Microsoft appeared first on Inside Track Blog.

Keeping our in-house optical network safe with a Zero Trust mentality
http://approjects.co.za/?big=insidetrack/blog/keeping-our-in-house-optical-network-safe-with-a-zero-trust-mentality/
Thu, 16 Oct 2025 16:00:00 +0000

The post Keeping our in-house optical network safe with a Zero Trust mentality appeared first on Inside Track Blog.

When it comes to corporate connectivity at Microsoft, a minute of lost connection can lead to catastrophic disruptions for our product teams, sleepless nights for our network engineers, and millions of dollars of lost value for the company.

That’s why we built our own optical network at our headquarters in Washington state, and that’s why we’re building similar networks at other regional campuses around the United States and the rest of the world.

With so much on the line, we need to make sure these in-house networks never go down.

But how are we doing that?

We’re applying the same robust Zero Trust approach we take to security and identity. While our optical networks are extremely reliable, any complex system can be knocked offline. In keeping with the Zero Trust mentality we have as a company, we trust the integrity of what we’ve built, but we still needed a backup system that goes beyond redundancy to provide true resilience.

Driven by this goal, we created a Zero Trust Optical Business Continuity Disaster Recovery (BCDR) network that combines two fully independent optical systems designed to sustain uninterrupted services, even during systemic failures. The result is more confidence for our employees and vendors, less pressure on our network engineers, and comprehensive network resilience that will protect us against a major outage.

The urgency of resilience

In 2021, our team in Microsoft Digital, the company’s IT organization, deployed our first next-generation optical network to serve the exclusive network needs of our Puget Sound metro campuses. It offers more bandwidth on less fiber for a lower operational cost than leasing from traditional carriers.

“Puget Sound is a highly concentrated developer network where we need to provide very high throughput,” says Patrick Alverio, principal group software engineering manager for Infrastructure and Engineering Services within Microsoft Digital. “Our optical system is the backbone of all that traffic.”

Our state-of-the-art optical network fulfills our need for fast and reliable connectivity at up to 400 Gbps between core sites, labs, data centers, and the internet edge. We built this network on Reconfigurable Optical Add/Drop Multiplexer (ROADM) technology, which delivers dynamic reconfiguration; colorless, directionless, and contentionless (CDC) capabilities; flexible grid support; remote provisioning; and automation. It also features a full-mesh topology that provides a layer of redundancy.

But what if the entire ROADM-based system fails?

There are plenty of operational risks that can derail even the most robust network. Anything from misconfigured automation scripts to policy changes to misaligned software versioning to simple human error can cause outages.

A photo of Elangovan

“We don’t want even a second of downtime. We needed a life raft for when failures occur that could also function as a standby network for core site migrations or platform upgrades.”

Vinoth Elangovan, senior network engineer, Hybrid Core Network Services, Microsoft Digital

To some degree, those kinds of minor disruptions are inevitable. But catastrophic events like fiber cuts, failures in the ROADM operating system, or even natural disasters have the potential for even more wide-ranging disruption.

During a catastrophic outage, thousands of engineers, developers, researchers, and other technical employees who need access to crucial lab environments and data centers could lose connectivity. That can sabotage feature delivery, disrupt product patches, interrupt updates, and halt all kinds of core product functions.

On top of normal software development operations, new AI tools demand massive bandwidth and consistent uptime. Finally, our hybrid networks feature paths integrated with Microsoft Azure that consume on-premises resources, so they also stand to benefit from increased resilience.

A catastrophic network outage can cause incredible damage to all of these business functions. In fact, we experienced exactly that in 2022.

A fiber cut combined with a ROADM system hardware reboot caused a five-minute outage in our Puget Sound metro region. In this environment, every minute of lost connectivity can result in significant financial impact, making network resilience absolutely essential.

“We don’t want even a second of downtime,” says Vinoth Elangovan, senior network engineer, who designed and implemented the Zero Trust Optical BCDR network for Microsoft. “We needed a life raft for when failures occur that could also function as a standby network for core site migrations or platform upgrades.”

Delivering greater network resilience

To ensure we could deliver uninterrupted network connectivity even in the midst of a catastrophic outage, we needed to consider the technical demands of a truly resilient system. Five design pillars helped us assemble our architectural criteria:

  1. Independent optical systems: To provide true resilience, our primary and BCDR platforms needed to operate autonomously.
  2. Physically independent paths: Circuits should avoid shared conduits, fibers, and splices to operate completely independently.
  3. Separate control software: The primary and backup networks should operate through dedicated network management systems (NMSs), automation, and provisioning domains.
  4. Unified client interface: Both systems needed to terminate into the same interface to unify service for clients and applications.
  5. Survivability by design: We couldn’t assume that any system would be immune to failure. Instead, we built for the best possible outcomes.
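
The second pillar, physically independent paths, can be checked mechanically if each circuit is inventoried by the conduits, fibers, and splices it traverses. The sketch below is one way to express that check. The `OpticalPath` type, the `shared_risks` helper, and every identifier such as `C1` or `F10` are hypothetical and do not come from the network described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpticalPath:
    """Hypothetical inventory record for one end-to-end circuit."""
    name: str
    conduits: frozenset[str]
    fibers: frozenset[str]
    splices: frozenset[str]


def shared_risks(a: OpticalPath, b: OpticalPath) -> dict[str, frozenset[str]]:
    """Return every physical element that both paths depend on."""
    overlaps = {
        "conduits": a.conduits & b.conduits,
        "fibers": a.fibers & b.fibers,
        "splices": a.splices & b.splices,
    }
    # Keep only the categories where something is actually shared.
    return {kind: items for kind, items in overlaps.items() if items}


primary = OpticalPath(
    "primary", frozenset({"C1", "C2"}), frozenset({"F10"}), frozenset({"S100"})
)
backup = OpticalPath(
    "backup", frozenset({"C7"}), frozenset({"F22"}), frozenset({"S340"})
)
# A candidate backup route that reuses conduit C2 and so fails the check.
candidate = OpticalPath(
    "candidate", frozenset({"C2", "C9"}), frozenset({"F31"}), frozenset({"S512"})
)

independent = shared_risks(primary, backup)
conflict = shared_risks(primary, candidate)
```

An empty result means the two circuits have no common physical failure point in the inventory. Any non-empty result names the element that a single cut could take out on both paths.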

The result was the Zero Trust Optical BCDR architecture, a layered approach to optical networking. It consists of our primary, ROADM-based transport layer and a secondary, MUX-based transport layer, both terminating into a single logical port channel.

“Our core responsibility is the employee experience, so our main design thrust was making sure service is seamless and uninterrupted—even during an outage.”

Vinoth Elangovan, senior network engineer, Hybrid Core Network Services, Microsoft Digital

Both systems are live and active, which means they deliver production services through their own independent fibers, power supplies, and software stacks. By layering fully independent optical domains and logically unifying them at the Ethernet edge, the network can sustain a complete failure of one system and maintain continuity.

That physical and operational independence is the difference between simple redundancy and robust resilience.
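
A short calculation shows why independence matters more than simply adding a second link. It assumes the two systems fail independently, and the availability figures are made-up inputs, not measurements from this network.

```python
def combined_availability(a_primary: float, a_backup: float) -> float:
    """Availability of two systems in parallel, assuming independent failures.

    Service is lost only when both systems are down at the same time.
    """
    return 1.0 - (1.0 - a_primary) * (1.0 - a_backup)


def with_shared_dependency(
    a_primary: float, a_backup: float, a_shared: float
) -> float:
    """Same pair, but both rely on one shared element (for example, a conduit)."""
    return a_shared * combined_availability(a_primary, a_backup)


# Illustrative inputs only: each system is assumed to be up 99.9% of the time.
independent_pair = combined_availability(0.999, 0.999)
shared_pair = with_shared_dependency(0.999, 0.999, a_shared=0.9995)
```

Under these assumptions, two independent 99.9% systems combine to roughly 99.9999%. If both depend on a shared element, the pair can never be more available than that element, which is why the design avoids shared fibers, power, and control software.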

“Our core responsibility is the employee experience, so our main design thrust was making sure it’s seamless and uninterrupted—even during an outage,” Elangovan says.

Optical network backed by a BCDR network

A schematic of an optical network running between different nodes and backed up by a BCDR network.
The optical network in our Puget Sound region connects core sites to labs, datacenters, and the internet edge, while the BCDR network provides backup connections to deliver resilience in case of a catastrophic network failure.

A typical ROADM optical network connects campus and data center sites to the internet edge. Our design features three interconnected optical rings, with two internet edges as multi-directional nodes, while other sites operate as dual-degree nodes with bidirectional redundancy. Meanwhile, our campuses and datacenters are designated as critical sites and equipped with Optical BCDR links to ensure enhanced resiliency. In the event of a complete Optical ROADM line failure, these critical sites retain connectivity.

In the event of an outage on the primary network, the port channel maintains forwarding continuity automatically, shifting WAN traffic between optical paths in real time.

The transition occurs seamlessly and transparently, with no noticeable impact to clients.
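
The failover behavior can be pictured with a small simulation of a port channel that hashes each flow onto one of its active member links. This is a simplified sketch, not the switch configuration used here. Real link aggregation runs in hardware, and the member names, the flow tuple, and the CRC32 hash are assumptions chosen for illustration.

```python
import zlib

Flow = tuple[str, str, int]  # (source IP, destination IP, destination port)


def pick_member(flow: Flow, active_members: list[str]) -> str:
    """Hash a flow onto one of the currently active member links."""
    if not active_members:
        raise RuntimeError("no active member links in the port channel")
    key = "|".join(str(part) for part in flow).encode()
    # The same flow always maps to the same link while membership is stable.
    return active_members[zlib.crc32(key) % len(active_members)]


# Hypothetical member links, one per optical domain.
members = ["roadm-link", "mux-link"]
flows: list[Flow] = [(f"10.0.0.{i}", "10.1.0.1", 443) for i in range(8)]

# Normal operation: flows are spread over whichever members are active.
before = {flow: pick_member(flow, members) for flow in flows}

# Primary optical domain fails: its member link leaves the bundle.
after = {flow: pick_member(flow, ["mux-link"]) for flow in flows}
```

Clients keep addressing the same logical interface throughout. Only the member list changes, so every flow lands on the surviving link without any change on the client side.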

Coupling at the Ethernet layer provides clients and applications with one logical interface, automatic load balancing and traffic distribution, and seamless failover, regardless of which optical domain is providing service.

“Our initial goal was to provide high-throughput connectivity for major labs, with less than six minutes of downtime per year,” says Blaine Martin, principal engineering manager for Hybrid Core Network Services in Microsoft Digital. “That represents a service level of 99.999% network continuity, and we’re aiming for even better moving forward.”
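
The relationship between the downtime target and the service level in the quote is simple arithmetic, sketched below. The function name is illustrative, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per year a service may be down at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = downtime_budget_minutes(0.99999)  # about 5.26 minutes
four_nines = downtime_budget_minutes(0.9999)   # about 52.56 minutes
```

At 99.999%, the yearly budget is about 5.26 minutes, which matches the "less than six minutes" target. By the same arithmetic, a single five-minute outage like the one in 2022 would use roughly 95% of a year's budget.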

A new era of confidence for network engineers

For the network engineers who keep Microsoft employees and resources connected, the Zero Trust Optical BCDR network relieves much of the pressure that comes from resolving outages.

When a network goes down, engineers have an enormous set of responsibilities to manage: processing the incident report, assigning severity, performing checks, notifying internal teams, providing updates, and engaging with physical support teams—all with a profound urgency to restore productivity.

Dialing those pressures back has been a huge benefit.

“Before, we were dependent on a single system, even with redundancies, so the human experience was like firefighting,” says Kevin Bullard, Microsoft Digital principal cloud network engineering manager responsible for maintaining WAN interconnectivity between labs. “Now, if the primary optical network is having a problem, I don’t even see it.”

There will always be pressure on network engineers to restore connectivity during an outage, but they can breathe easier knowing it won’t cost the company millions of dollars as the time to resolve ticks away. And in non-emergency situations like core site migrations, the BCDR network provides a much easier way to shunt services while the main network is offline.

“Our internal users have become more confident that they can stay connected, no matter what,” says Chakri Thammineni, principal cloud network engineer for Infrastructure and Engineering Services in Microsoft Digital. “That gives the people responsible for maintaining our enterprise networks incredible peace of mind.”

Fortunately, there hasn’t been a substantial network outage in the Puget Sound metro area since 2022. But our network engineering teams know that if and when it happens, the BCDR network will be ready to maintain service continuity.

With our Puget Sound network protected, we have plans in place to extend this model to other metro areas. Naturally, we have to balance population, criticality, and the knowledge that elevated reliability and availability come with a cost.

Our selection criteria for new BCDR networks have largely centered around two factors: expansions of AI-critical infrastructure and concentrations of secure access workspaces (SAWs) for technical employees. With these criteria in mind, we’re planning new BCDR networks first in the Bay Area and Dublin, then in Virginia, Atlanta, and London.

Zero Trust optical BCDR architecture represents a paradigm shift in enterprise network resilience, and we’re committed to expanding the model to benefit both conventional workloads and the expanding infrastructure demands of AI.

“We’re always looking ahead into industry trends to stay at the bleeding edge, whether that’s in the technology we provide for our customers or the networks we use to do our own work,” Alverio says. “We refuse to accept the status quo, and we’re elevating the experience for employees across Puget Sound and Microsoft as a whole.”

Driving AI innovation in optical network resilience

Our journey towards an AI-driven optical network is gaining momentum.

As part of our Secure Future initiative, we’ve automated our Optical Management Platform credential rotation and are actively developing intelligent incident management ticket enrichment, auto-remediation, link provisioning, deployment validation, and capacity planning.

AI plays a central role in this transformation.

With Microsoft 365 Copilot and GitHub Copilot integrated into our engineering workflows, we’re accelerating development cycles, improving code accuracy, and uncovering optimization opportunities that would otherwise take hours of manual effort.

These Copilots are also helping our engineers analyze network patterns, simulate outcomes, and validate deployment logic before execution, reducing human error and strengthening our Zero Trust posture. Over time, we’re evolving toward a system where AI not only assists but proactively predicts potential disruptions, recommends remediations, and continuously learns from operational telemetry.

These advancements are paving the way for a future where our optical infrastructure can anticipate issues, recover faster, and operate with the agility and assurance expected in a Zero Trust environment.

Key takeaways

If you’re considering implementing your own optical and BCDR networks, here are some tips:

  • Understand the technical components of resilience: Independent optical systems, physically independent paths, separate control software, a unified client interface, and survivability by design are the key technical components of true resilience.
  • Plan from a preparedness and value perspective: Evaluate the critical points in your infrastructure and determine where you can get the most value out of resilient connectivity.
  • Ensure your teams have the right skillset: Carefully consider the right workforce to run those systems and be accountable for their operation.
