Zero Trust Archives - Inside Track Blog http://approjects.co.za/?big=insidetrack/blog/tag/zero-trust/ How Microsoft does IT Fri, 10 Apr 2026 21:04:08 +0000

Microsoft CISO advice: The importance of a written AI safety plan http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-the-importance-of-a-written-ai-safety-plan/ Thu, 09 Apr 2026 16:00:00 +0000

The post Microsoft CISO advice: The importance of a written AI safety plan appeared first on Inside Track Blog.

Yonatan Zunger, CVP and Deputy CISO for Microsoft, has spent his career considering complex security and privacy questions while building platform infrastructure and solutions. His experience underpins his advice on how to build a safety plan for working with AI. First and foremost, his advice is to have a written plan.

“Make it an expectation in your organization that people will create safety plans and have them for everything,” Zunger says. “People get so excited about having clarity in front of them that they end up making much more systematic, careful plans, and the rate of errors goes down dramatically.”

Watch this video to see Yonatan Zunger discuss his advice for creating an AI safety plan. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=H5reZ0uw0EA.)

Key takeaways

Here are questions and ideas to consider as you create a safety plan for your AI systems:

  • Define the problem. What problem are you trying to solve? A simple and clear problem statement is always a great starting point before building anything, including an AI agent.
  • Outline the solution. What is the basis of your solution? Can you explain your solution to an end user? What does a developer or administrative user of your solution need to know about what it is and does?
  • List the things that can go wrong. What can go wrong with your solution? Creating this list is the first step to figuring out how to deal with those issues.
  • Document your plan. What is your plan to address identified concerns? Identify the process you will follow when something goes wrong.
  • Draft your plan early and update it as your solution matures. Your safety plan can be as simple as a list or outline and should evolve as you prepare to build your solution.
  • Get feedback and buy-in. When you review the plan with stakeholders and leaders in your team and organization, you may uncover risks or issues you had not thought of. You also build awareness and agreement on what to do when something goes wrong.
  • Make a template and build its use into your processes. This tip is for anyone who leads a team or influences process development. Encourage using a safety template in all your projects to bring clarity and structure to how you work with AI.
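A safety plan can stay lightweight and still be systematic. As a sketch of the ideas above (the structure and field names here are illustrative, not a Microsoft template), a team could capture its plan as structured data so it can be reviewed, versioned, and checked for completeness alongside the project:

```python
from dataclasses import dataclass, field

@dataclass
class Risk:
    description: str  # what can go wrong
    mitigation: str   # how we reduce the likelihood or impact
    response: str     # the process we follow when it happens anyway

@dataclass
class SafetyPlan:
    problem: str                # the problem the AI solution solves
    solution: str               # plain-language description for end users
    risks: list = field(default_factory=list)
    reviewers: list = field(default_factory=list)  # stakeholders who gave feedback

    def is_ready_for_review(self) -> bool:
        # A plan is reviewable once the basics and at least one risk are written down.
        return bool(self.problem and self.solution and self.risks)

plan = SafetyPlan(
    problem="Summarize support tickets for triage",
    solution="An agent drafts a summary; a human approves it before routing",
)
plan.risks.append(Risk(
    description="Summary omits a critical detail",
    mitigation="Require human approval before routing",
    response="Reroute the ticket and log the failure for review",
))
print(plan.is_ready_for_review())  # True once problem, solution, and a risk exist
```

Because the template is just a list or outline, it can start as a draft on day one and grow with the solution, which is the point of writing it down early.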

Microsoft CISO advice: The most important thing to know about securing AI http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-the-most-important-thing-to-know-about-securing-ai/ Thu, 02 Apr 2026 16:00:00 +0000

The post Microsoft CISO advice: The most important thing to know about securing AI appeared first on Inside Track Blog.

Using AI comes with inherent risks. In a recent video, Yonatan Zunger, CVP and deputy CISO for Microsoft, suggests that thinking about AI as a new intern will help you naturally take the right approach to AI security.

Zunger and his team focus on AI safety and security. They consider all the different ways anything involving working with AI can go wrong.

“An important thing to know about AI is that AIs make mistakes,” Zunger says. “You already know how to work with systems that make mistakes and get tricked.”

Watch this video to see Yonatan Zunger discuss his advice for working with AI. (For a transcript, please view the video on YouTube: https://youtu.be/b1x6gDbSWVY.)

Deploying Microsoft Baseline Security Mode at Microsoft: Our virtuous learning cycle http://approjects.co.za/?big=insidetrack/blog/deploying-microsoft-baseline-security-mode-at-microsoft-our-virtuous-learning-cycle/ Thu, 26 Mar 2026 16:05:00 +0000

The post Deploying Microsoft Baseline Security Mode at Microsoft: Our virtuous learning cycle appeared first on Inside Track Blog.

The enterprise security frontier isn’t just evolving. It’s accelerating beyond the limits of traditional security models.

AI acceleration, cloud adoption, and rapid growth of enterprise apps have dramatically expanded the attack surface. Every new app introduces a new identity. Every identity carries permissions. Over time, those permissions accumulate, often without clear ownership or regular review.


Inside Microsoft Digital—the company’s IT organization—we recognized this early. Many of our highest‑risk security scenarios didn’t start with malware or phishing. They started with access. Specifically, apps running with permissions beyond what they required.

“An app is another form of identity,” says B. Ganti, principal architect in Microsoft Digital. “In a cloud-first, Zero Trust world, identity becomes the primary security perimeter, and access is governed by the principle of least privilege. Whether it is a user, an app, or an agent, when permissions are overly broad or elevated the blast radius expands dramatically, increasing risk exponentially.”

Traditional security approaches such as periodic reviews, best‑practice guidance, and point‑in‑time hardening weren’t enough in an environment that changes daily. Configurations drift, new apps appear, and risk grows quietly in places that are hard to see at scale.

That reality forced a mindset shift internally here at Microsoft. Security couldn’t be optional. It couldn’t be advisory. And it couldn’t be static.

Our team operates one of the largest enterprise environments in the world, with tens of thousands of apps and a culture built on self‑service and autonomy. That scale drives innovation, but it also amplifies risk.

Application identities became one of the most complex governance challenges we faced. Ownership wasn’t always clear. Permissions were often granted broadly to avoid disruption. And once approved, access rarely came under scrutiny again.

“As a self‑service organization, we empower people to move fast,” Ganti says. “But that also means apps get created, permissions get granted, and not everyone always remembers why.”

The rise of AI‑powered apps and agents—often requiring access to large volumes of data—increased our risk further.


We needed a system to reduce that risk systematically, not one app at a time.

Microsoft Baseline Security Mode (BSM) became that system—a prescriptive, enforceable baseline that defines what “secure” means and keeps it that way.

“We’re using Microsoft Baseline Security Mode to move security from guidance to enforcement,” says Brian Fielder, vice president of Microsoft Digital. “It establishes secure‑by‑default configurations that scale across our environment, so teams can innovate quickly without inheriting unnecessary risk.”

Defining Microsoft Baseline Security Mode

BSM is more than just a checklist of recommended settings. It’s an enforced security baseline built directly into the Microsoft 365 admin center, designed to reduce attack surface by default across core Microsoft 365 workloads.

It was developed and then deployed internally at Microsoft, with our team in Microsoft Digital serving as a close design and deployment partner throughout the process.


At a technical level, BSM establishes a minimum required security posture by applying Microsoft‑managed policies and configuration states across services including Exchange Online, SharePoint Online, OneDrive, Teams, and Entra ID. The focus is on eliminating common misconfigurations, rather than theoretical or edge‑case risks.

“The settings in the Microsoft Baseline Security Mode were informed by years of experience in running our planet-scale services, and by analyzing historical security incidents across Microsoft to harden the security posture of tenants,” says Adriana Wood, a principal product manager for Microsoft 365 security. “The team identified concrete security settings that would prevent or significantly reduce known security vulnerabilities. The resulting mitigation controls were implemented and validated in Microsoft’s enterprise tenant, with Microsoft Digital evaluating operational impact, rollout characteristics, and failure modes before making it more broadly available to our customers.”

Legacy baselines rely on documentation and manual implementation. Administrators interpret guidance, apply settings where feasible, and revisit them periodically. In dynamic cloud environments, that model breaks down fast. Configurations drift, exceptions accumulate, and security degrades.


BSM replaces that approach with policy‑driven enforcement.

Now our controls are applied consistently across the tenant and continuously validated. When our configurations fall out of compliance, our risk surfaces immediately—it’s not discovered months later in an audit. The model is simple: get clean, stay clean.

Another key capability of BSM is impact awareness.

“Before enforcement, administrators can use reporting and simulation tools to understand how a baseline will affect users, apps, and workflows,” says Keith Bunge, a principal software engineer in Microsoft Digital. “That visibility allows teams to identify noncompliant assets, prioritize remediation by risk, and avoid unexpected disruptions. Our team in Microsoft Digital partnered closely with the product group to ensure these capabilities were practical for real enterprise deployments, not just greenfield environments.”

BSM is also not static.

The baseline evolves on a regular cadence to reflect changes in the threat landscape, new Microsoft 365 capabilities, and lessons learned from operating at scale.

From our perspective, BSM is not just a feature. It’s a security operating model. It shifts the default from “secure if configured correctly” to “secure by default.” Security decisions move out of individual teams and into a consistent, centrally enforced baseline. The question is no longer whether a control should be applied, but whether an exception is truly necessary—and how the associated risk will be mitigated.

That shift is what makes BSM sustainable at scale. And it’s why apps—where identities, permissions, and data access converge—became the next focus area for us in Microsoft Digital.

Addressing apps and high-risk surfaces

When we evaluated risk across our environment, one pattern was clear: Our apps represented both our most concentrated and least governed attack surface.

Apps are identities. They authenticate. They’re granted permissions. And unlike human users, they often operate continuously, without reassessment or visibility.

In a large, self‑service environment like ours, apps are created constantly by engineering teams, business groups, and automation workflows. Over time, many of those apps could accumulate permissions beyond what they actually needed, particularly within Microsoft Graph. Delegated permissions were especially risky because they allow apps to act on our employees’ behalf at machine speed across massive data sets.

“As a user, I might not know where all my data lives,” Ganti says. “But an app with delegated permissions doesn’t have that limitation. It can search everything, everywhere, all at once.”

The challenge wasn’t just volume—it was inconsistency.

Ownership was often unclear. Permission reviews were infrequent or manual. And once we granted elevated access, we had few systemic controls in place requiring it to be revisited.

Microsoft Baseline Security Mode addresses this directly by treating apps explicitly as identities that must conform to least‑privilege principles.

We started with visibility. We inventoried apps and analyzed permission scopes, authentication models, and potential blast radius. Our apps with broad Microsoft Graph permissions, access to large volumes of unstructured data, or unclear ownership were prioritized. In some cases, we reduced permissions to more granular scopes. In others, we rearchitected apps to use delegated access more safely—or we retired them altogether.
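As a rough sketch of that prioritization step (the scope names mirror common Microsoft Graph permission strings, but the scoring weights and inventory fields here are illustrative, not our internal model), a burndown could rank apps by combining broad scopes, unclear ownership, and delegated access:

```python
# Hypothetical high-risk scopes; real prioritization would use a richer catalog.
HIGH_RISK_SCOPES = {"Mail.ReadWrite", "Files.Read.All", "Directory.ReadWrite.All"}

def risk_score(app: dict) -> int:
    """Rank apps for burndown: broad scopes, missing owners, and delegated
    access each add to the score, so the riskiest apps surface first."""
    score = sum(3 for scope in app["scopes"] if scope in HIGH_RISK_SCOPES)
    if not app.get("owner"):
        score += 2  # unclear ownership makes review and remediation harder
    if app.get("delegated"):
        score += 2  # delegated access acts on users' behalf across their data
    return score

apps = [
    {"name": "ticket-bot", "scopes": ["Mail.ReadWrite"], "owner": None, "delegated": True},
    {"name": "status-page", "scopes": ["User.Read"], "owner": "web-team", "delegated": False},
]
for app in sorted(apps, key=risk_score, reverse=True):
    print(app["name"], risk_score(app))  # ticket-bot scores highest
```

The value of a score like this is not precision; it is giving security teams a consistent order of attack across thousands of apps.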

This work was intentionally structured as a burndown, not a one‑time cleanup.

Removing our excess permissions was only half the equation. Preventing them from coming back was just as critical. BSM introduced guardrails earlier in the app lifecycle, to surface and control elevated permission requests before they reached production. New or updated apps requesting high‑risk permissions now trigger consistent review, and in many cases are blocked outright unless they meet strict criteria.

Moving from ‘get clean’ to ‘stay clean’

Reducing risk once is hard. Keeping it reduced is harder.

After our initial application burndown, we quickly learned that cleanup alone wouldn’t scale. Even as we reduced permissions and remediated high‑risk apps, new apps continued to appear. Existing apps evolved, teams changed, and without structural controls, the same risks would inevitably return.

BSM enabled us to shift from remediation to sustainability.

It started with visibility.

We needed a reliable way to detect when apps drifted out of compliance. That meant continuously monitoring permission changes, new consent grants, and scope expansions across our tenant. Instead of periodic reviews, we moved to continuous validation tied directly to the baseline.
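Continuous validation of this kind boils down to a diff between the approved baseline and the current state. As a minimal sketch (the app names and scopes are made up; a real system would read them from the tenant), drift detection could flag any scope present now that the baseline never approved:

```python
def find_drift(baseline: dict, current: dict) -> dict:
    """Return, per app, permission scopes present now that the baseline never
    approved. Apps absent from the baseline have all of their scopes flagged."""
    drift = {}
    for app, scopes in current.items():
        extra = scopes - baseline.get(app, set())
        if extra:
            drift[app] = extra
    return drift

baseline = {"reporting-app": {"User.Read"}}
current = {
    "reporting-app": {"User.Read", "Files.Read.All"},  # scope expanded since last check
    "new-agent": {"Mail.Read"},                        # consent granted outside the baseline
}
print(find_drift(baseline, current))  # both deviations surface immediately
```

Run on every permission change or consent grant rather than on a review calendar, a check like this is what turns periodic cleanup into continuous alignment.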

Next came risk‑based prioritization.

Not every instance of noncompliance carries equal impact. Our apps with broad Microsoft Graph permissions, access to large volumes of data, or unclear ownership were surfaced first. This ensured our security teams focused on material risk, rather than treating every deviation as equal.

It was equally important for us to control how new risk entered the system.

BSM introduces guardrails earlier in the application lifecycle. Our elevated permission requests are surfaced sooner and reviewed more consistently. In many cases, high‑risk permissions are blocked by default unless clear justification and mitigation are in place. Known‑bad patterns are stopped before our teams build or update apps.
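The guardrail logic can be thought of as a gate in front of every elevated permission request. This is a hedged sketch, not the BSM implementation: the scope names and decision states are illustrative, and the real criteria are richer than a justification-and-mitigation check.

```python
# Hypothetical set of scopes blocked by default; illustrative only.
BLOCKED_BY_DEFAULT = {"Directory.ReadWrite.All", "Mail.ReadWrite"}

def review_request(scope: str, justification: str = "", mitigation: str = "") -> str:
    """Gate elevated permission requests before they reach production.
    High-risk scopes are blocked unless a justification and a mitigation
    are recorded, in which case they go to a consistent human review."""
    if scope not in BLOCKED_BY_DEFAULT:
        return "approved"
    if justification and mitigation:
        return "needs-review"  # surfaced to security for an explicit decision
    return "blocked"

print(review_request("User.Read"))       # low-risk scope passes through
print(review_request("Mail.ReadWrite"))  # high-risk scope with no case is blocked
print(review_request("Mail.ReadWrite", "mailbox migration", "scoped to one mailbox"))
```

Putting the gate at request time, rather than at audit time, is what stops known‑bad patterns before teams build on top of them.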

Over time, this enforcement model fundamentally changed the operating posture.

Instead of recurring cleanup campaigns, we moved to continuous alignment. Our environment stays closer to the baseline by default. Our deviations are treated as exceptions that require explicit action, not silent drift.

This “stay clean” capability also reduced operational overhead.

As enforcement and validation moved into Microsoft Baseline Security Mode, we retired custom scripts, dashboards, and manual review processes that were difficult to maintain at scale. Our baseline became the source of truth for application security posture, not a snapshot taken after the fact.

Most importantly, we proved that BSM could scale.


By combining continuous validation, risk‑based prioritization, and enforced guardrails, we established a repeatable model for sustaining security improvements over time.

That model now serves as our foundation for extending BSM to additional workloads and security surfaces across the enterprise.

“This isn’t limited to Microsoft 365,” says Jeff McDowell, a principal program manager in the OneDrive and SharePoint product group. “This is Microsoft, and it expands over time as more services come into scope.”

Operationalizing Microsoft Baseline Security Mode

Defining a baseline is only the first step. Making it work day‑to‑day is the real challenge.

For us in Microsoft Digital, operationalizing BSM meant embedding it directly into how we run security. That required clear ownership, repeatable processes, and tight integration with our existing workflows.

Governance came first.

BSM creates a clear line between what is centrally enforced and what individual teams can influence. The baseline is owned and managed centrally to ensure consistency across the tenant. Our application owners and engineering teams still make design decisions, but within defined guardrails aligned to enterprise risk tolerance.

This clarity reduces friction.

Instead of debating security settings app by app, our teams start from a shared default. Our security conversations shift away from “Can we make an exception?” to “How do we meet the baseline with the least disruption?”

Operationally, BSM is integrated into our application lifecycle.

New apps are evaluated against baseline requirements early, before permissions are broadly granted or dependencies are established. Changes to existing apps, such as new permission requests or expanded scopes, are surfaced automatically and reviewed in context, rather than discovered months later during audits.

In an environment where apps are constantly being created, updated, and retired, automation is essential. Without policy‑driven enforcement, our security teams would be managing a perpetual backlog of reviews. BSM allows us to focus on true exceptions instead of revalidating the baseline itself.

That baseline is also embedded into our ongoing operations.

Our security posture is monitored continuously, not through periodic snapshots. When our configurations drift or new risks appear, we identify them early and address them while the blast radius is still small. Over time, this reduces both our operational effort and incident response overhead.

Perhaps our most important change was cultural.

BSM normalizes the idea that security defaults are foundational. Our teams still innovate and move quickly—but they do so in an environment where secure is expected, enforced, and sustained.

Embracing the feedback loop as Customer Zero

From the start, our team in Microsoft Digital deployed Microsoft Baseline Security Mode as Customer Zero: We applied early versions in our live, large‑scale enterprise environment, where we fed our real‑world learnings back to the product group. That feedback loop became central to how the platform evolved.

Running BSM at Microsoft scale quickly exposed challenges that don’t appear in smaller tenants. Visibility was one of the first. With thousands of apps and constantly changing permissions, it was difficult to pinpoint which apps violated least‑privilege principles and where security teams should focus first.

Those gaps directly shaped the product. Reporting and analytics were refined to better surface elevated permissions, risky scopes, and noncompliant apps, helping teams move from investigation to action more quickly.

Scalability was another critical lesson.

Controls that worked for dozens of apps didn’t automatically work for thousands. Our team needed policies that were opinionated, enforceable, and operationally sustainable without constant adjustment. That pushed BSM toward clearer defaults and stronger enforcement boundaries.

“What made the collaboration work is that Microsoft Digital was deploying this in a real tenant with real consequences,” Wood says. “Their feedback helped us understand what enterprises actually need to adopt these controls successfully, not just what looks good on paper.”

Over time, this became a virtuous cycle. Our team surfaced friction and risk through deployment. The product group translated those insights into product improvements. We then adopted those same improvements to replace custom tooling and manual processes.

For customers, this matters. The controls in BSM are shaped by operational reality, tested under scale and refined so other organizations don’t have to learn the same lessons the hard way.

What’s next for Microsoft Baseline Security Mode

Future iterations of BSM will expand coverage beyond traditional collaboration services to additional platforms and services, while maintaining the same opinionated approach. The goal is not to restrict environments indiscriminately, but to ensure new capabilities are introduced with security baked in from the start.

As compliance requirements grow more complex and more global, organizations need a consistent, defensible security baseline. BSM provides a Microsoft‑managed standard informed by real‑world attack patterns and enterprise deployment realities.

Controls evolve. Scope expands. Feedback loops remain active. As new risks emerge, the baseline adapts, without requiring organizations to redefine their security posture from scratch.

It’s a foundation designed to support whatever comes next.

Key takeaways

If you’re ready to strengthen your organization’s security posture with Microsoft Baseline Security Mode, consider these immediate actions:

  • Establish clear ownership. Assign responsibility for baseline security management to ensure consistency and accountability.
  • Implement repeatable processes. Develop standardized procedures to evaluate and enforce baseline requirements throughout the app lifecycle.
  • Integrate with existing workflows. Embed security controls into daily operations to reduce friction and streamline compliance.
  • Prioritize automation and monitoring. Use automated enforcement and continuous validation for early risk detection and response.
  • Foster a security-first culture. Normalize secure defaults and encourage teams to innovate within defined guardrails.
  • Design for evolution. Design your baseline to adapt as new services, platforms, and compliance needs arise.

Microsoft CISO advice: Read our four tips for securing your network http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-read-our-four-tips-for-securing-your-network/ Thu, 19 Mar 2026 16:00:00 +0000

The post Microsoft CISO advice: Read our four tips for securing your network appeared first on Inside Track Blog.

Geoff Belknap, CVP and operating CISO for Core and Enterprise, shares four key practices your business can use to be prepared for managing network security incidents.

“Knowing where devices are, who owns them, and what they’re supposed to be doing is pretty important in the middle of an incident,” Belknap says.

Watch this video to see Geoff Belknap discuss how we’re securing our network at Microsoft. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=nWPaaTHGE-M.)

Key takeaways

Here are best practices you can use to secure your network:

  • Build a complete inventory. Keep track of what your network devices are, who owns them, and what they do.
  • Capture robust telemetry. Make sure your operational teams have the tools they need to see and analyze access and authentication logs.
  • Use dynamic access control. Manage who can send packets on the corporate network by applying policies.
  • Deprecate old network assets. Cyberattackers know to look for older, unpatched network devices. You can reduce the attack surface by replacing older devices.
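These four practices reinforce each other: the inventory feeds the access decision, and patch age decides what gets deprecated. As a toy sketch (the device records, field names, and 180-day patch window are illustrative, not Microsoft policy), a dynamic access check could combine them like this:

```python
from datetime import date

# Hypothetical device inventory; the fields mirror the practices above.
inventory = {
    "switch-042": {"owner": "netops", "last_patched": date(2026, 1, 10)},
    "ap-legacy-7": {"owner": None, "last_patched": date(2022, 6, 1)},
}

def may_send_packets(device_id: str, today: date, max_patch_age_days: int = 180) -> bool:
    """Dynamic access control sketch: unknown, ownerless, or stale devices
    are denied rather than allowed onto the corporate network."""
    device = inventory.get(device_id)
    if device is None or device["owner"] is None:
        return False  # not in inventory, or nobody accountable for it
    return (today - device["last_patched"]).days <= max_patch_age_days

today = date(2026, 3, 19)
print(may_send_packets("switch-042", today))   # owned and recently patched
print(may_send_packets("ap-legacy-7", today))  # ownerless and long unpatched: denied
```

The telemetry practice closes the loop: the same inventory lookup is what makes access and authentication logs interpretable mid-incident.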

Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem http://approjects.co.za/?big=insidetrack/blog/microsoft-ciso-advice-explore-our-four-tips-for-securing-your-customer-support-ecosystem/ Thu, 12 Mar 2026 16:00:00 +0000

The post Microsoft CISO advice: Explore our four tips for securing your customer support ecosystem appeared first on Inside Track Blog.

Microsoft business operations teams know all too well that cyberattackers seek to exploit customer support pathways. Tools that can unlock customer accounts or aid in troubleshooting issues in complex environments are a rich target.

“The path attackers really like to use is to compromise support tooling and laterally move to your core tooling,” says Raji Dani, Deputy Chief Information Security Officer (CISO) for Microsoft business operations.

Dani and her team focus on understanding and mitigating the risks within customer support operations. In this video, she shares principles and practices for every business that relies on online tools in their customer support ecosystem.

Watch this video to see Raji Dani discuss four customer support ecosystem security principles. (For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=rJ87jjz3vvo.)

Key takeaways

Here are best practices you can apply to your customer support ecosystem:

  • Create dedicated and isolated support identities. Use standardized support identities with phish-resistant multifactor authentication based in a separate identity ecosystem.
  • Implement least privilege and enforce device protection. Only grant the access needed for a given task and nothing more.
  • Ensure tooling does not have high privilege access to customer data. Architect secure tools and manage service-to-service trust and high privileged access.
  • Implement strong telemetry. Anomalous patterns in logs and telemetry data are often the first clue a cyberattack is underway.
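To make the telemetry point concrete, here is a deliberately simple sketch (the identity names, log shape, and frequency threshold are illustrative; production detection uses far richer signals) of surfacing anomalous sign-in patterns for support identities:

```python
from collections import Counter

def flag_anomalies(logins: list, min_seen: int = 3) -> list:
    """Flag support-identity sign-ins from locations rarely seen for that
    identity. Frequency is the simplest possible proxy for 'anomalous'."""
    seen = Counter((e["identity"], e["location"]) for e in logins)
    return [e for e in logins if seen[(e["identity"], e["location"])] < min_seen]

logins = (
    [{"identity": "support-svc-1", "location": "Redmond"}] * 5
    + [{"identity": "support-svc-1", "location": "Unknown-VPN"}]
)
print(flag_anomalies(logins))  # the single unfamiliar sign-in is surfaced
```

The point is less the algorithm than the plumbing: if support tooling does not emit identity and location with every action, there is nothing for detection like this to run on.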

Getting started with Windows Hello for Business and Day 1 authentication at Microsoft http://approjects.co.za/?big=insidetrack/blog/getting-started-with-windows-hello-for-business-and-day-1-authentication-at-microsoft/ Thu, 05 Mar 2026 17:00:00 +0000

The post Getting started with Windows Hello for Business and Day 1 authentication at Microsoft appeared first on Inside Track Blog.

At Microsoft, we’re relentlessly focused on modernizing our passwordless protections in ways that strengthen our identity and security for everyone at the company.

At an organization the size of ours—with a global workforce, massive cloud footprint, and millions of identities to protect—relying on passwords wasn’t a sustainable security posture. We needed something stronger, simpler, and more secure.

This led to the introduction of Windows Hello for Business, which was first built into Windows 10 and then Windows 11. Windows Hello for Business replaces traditional passwords with hardware‑backed keys tied to a user’s device.

So, instead of typing a “secret phrase” that can be phished or leaked, our employees authenticate with biometrics or a PIN that never leaves the device. It’s fast, intuitive, and—most importantly—resistant to the kinds of attacks that plague password‑based systems.


Rolling out passwordless authentication at a large company like ours took more than just introducing new technology. It also required that we come up with a new way to onboard our employees securely, no matter where they work.  

The first step we took toward passwordless credentials was to create Identity Pass, which included an emphasis on Day 1 authentication (on a new employee’s first day at Microsoft). By combining strong identity proofing, a Temporary Access Pass (TAP), and automated onboarding workflows, we forged an identification system where employees could unbox their device, sign in securely, and register their credentials without ever needing a password.

The result wasn’t just a smoother user experience.

“This wasn’t just a technology shift—it was a structural change in how we establish trust across the organization,” says Abu Kabir, a director of IT service management in Microsoft Digital, the company’s IT organization. “The lessons we learned offer a practical blueprint for any organization looking to strengthen their security while also reducing friction for their workforce.”

How we launched passwordless authentication

To understand how we worked through the details of passwordless authentication, it’s helpful to explain how it was implemented in the first place.

Our passwordless security system includes several components, such as face or fingerprint recognition, a PIN tied to the user’s device, and a physical security key (like a YubiKey), but this story focuses on two:

  • Identity Pass: the internal system for secure, passwordless onboarding and recovery
  • Windows Hello for Business: the phishing‑resistant credential that Identity Pass helps users register

Identity Pass

Identity Pass, which is only used internally here at Microsoft, uses several tools to “bootstrap” the user, which is the first step in establishing trust among a user, a device, and an identity system. It’s the moment when you go from “nothing trusted” to “something trusted.” Everything that happens afterward depends on getting that moment right.

Identity Pass relies on three core elements:

  • Verified ID is what we use internally to establish proof of identity. It’s an initial step and is valid for 30 days.
  • Temporary Access Pass (TAP) establishes authentication.
  • Conditional access enforces policy.

Identity Pass is where risk signals matter most, because onboarding and recovery are the moments when identity assurance is weakest. Those risk signals include:

  • Authentication behavior detection: If a user tries to redeem a TAP or Verified ID from an unusual location, device, or pattern, Authentication Behavior Detection can flag a sign‑in as risky. Identity Pass can then require stronger identity proofing or block the flow.
  • Global high‑risk detection: If our threat intelligence determines the user is likely compromised, Identity Pass will not allow TAP issuance or passwordless registration until the risk is remediated.
  • Strong fraud indicators: If the user’s session or token shows signs of fraud (token replay, hijacking, malicious infrastructure), Identity Pass will force remediation and block bootstrap flows.
  • Risk‑based identity assurance: This is the decision engine that takes security signals and determines what level of assurance is required. For example:
    • Low risk = allow TAP issuance
    • Medium risk = require Verified ID reproofing
    • High risk = block and escalate

Identity Pass is essentially the front door where these signals decide whether a user can even begin the passwordless journey.
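That tiered decision logic can be sketched as a simple policy function. This is a hypothetical illustration only; the function name, risk tiers, and action strings are assumptions for the sketch, not our production decision engine, which weighs many live signals:

```python
from enum import Enum

class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

def assurance_decision(risk: Risk) -> str:
    # Map an aggregated risk level to the required assurance action,
    # mirroring the tiers described above. A real decision engine
    # evaluates location, device, and token signals before deciding.
    if risk is Risk.LOW:
        return "allow_tap_issuance"
    if risk is Risk.MEDIUM:
        return "require_verified_id_reproofing"
    return "block_and_escalate"

print(assurance_decision(Risk.LOW))   # allow_tap_issuance
print(assurance_decision(Risk.HIGH))  # block_and_escalate
```

The key design point is that the action escalates with risk: low-risk users proceed automatically, medium-risk users reprove their identity, and high-risk users are blocked until the risk is remediated.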

Windows Hello for Business

Windows Hello for Business is the strong, phishing‑resistant credential that Identity Pass helps users register. Once this is in place, the risk signals listed above continue to influence authentication.

  • Authentication behavior detection: Windows Hello for Business sign‑ins are evaluated like any other. If the user suddenly authenticates from an impossible location or unusual device, this system flags it as a sign‑in risk.
  • Global high‑risk detection: If our threat intelligence detects a high‑confidence compromise, Windows Hello for Business sessions can be revoked via Continuous Access Evaluation. The user then reregisters through Identity Pass.
  • Strong fraud indicators: If a Windows Hello for Business token is replayed or misused, this system triggers immediate revocation and forces secure recovery.
  • Risk‑based identity assurance: This determines whether Windows Hello for Business alone is sufficient, or whether the user must step up to a stronger method based on risk.

Windows Hello for Business is the credential, but the risk signals determine whether that credential is trusted at any given moment.

What we learned: Rollout and implementation

While our toolsets and protocols offer a clear path for any organization moving toward passwordless authentication, transferring users from a typical user/password security setup can have a variety of challenges—especially at the outset.

Devices, environments, and remote work all matter

When an organization adopts identity‑based, passwordless authentication, one of the first realities it confronts is that the onboarding experience isn’t uniform. Employees don’t all show up with the same hardware, the same operating system version, or the same security capabilities. That diversity has a direct impact on how smoothly a user can complete the initial Day 1 setup and register a strong, phishing‑resistant credential.

A photo of Scott.

“It’s not one-size-fits-all. The onboarding experience can be different by platform, version, and device. The further away you get from a homogenized environment, the more complexity you introduce.”

Matt Scott, senior IT service manager, Microsoft Digital

Device and platform diversity is one of the defining factors in designing a successful passwordless onboarding experience. Any organization adopting identity‑based authentication needs an onboarding system that can adapt to a wide range of hardware, OS versions, and security capabilities while still enforcing a consistent, high‑assurance security model.

Identity proofing and credential registration don’t look the same across platforms. A laptop might support credential setup directly at the login screen, while a mobile device might require an app‑based flow, and a non‑traditional platform might rely entirely on browser‑based enrollment. The underlying model stays consistent, but the user experience varies depending on where the user begins.

“It’s not one-size-fits-all,” says Matt Scott, a senior IT service manager in Microsoft Digital. “The onboarding experience can be different by platform, version, and device. The further away you get from a homogenized environment, the more complexity you introduce.”

Support volume

With Identity Pass in place, we have seen dramatic reductions in password reset volume (80%), onboarding delays, and help desk tickets related to account access. At the initial rollout stage, however, most organizations should anticipate a temporary spike in support needs.

“We expected an increase in volume, because we had recently gotten to 99% in terms of users being identified through Phish-Resistant Multi-Factor Authentication,” Scott says. “In reality, what’s happening is you have a lot of users who are unhappy with the experience as part of the move to a passwordless environment.”

No matter how solid the argument is for a passwordless approach or how cleanly an organization implements it, our experience shows that organizations should expect initial confusion from employees and increased pressure on support teams.

“Moving into a passwordless environment is obviously good for everyone, but we needed to make it easier for users to get the information they needed,” Scott says. “It’s not just one fell swoop of moving from password to passwordless. It’s truly a journey. And it’s very important that change management is part of that journey.”

Helping employees help themselves

Another key learning during our implementation of passwordless authentication was the importance of accessible documentation. This gives users who have yet to establish their identity credentials a way to get unblocked without having to immediately call IT support.

That documentation must stay accurate over time, so it’s crucial to build a governance strategy that ensures updates are made quickly as new devices, platforms, and scenarios emerge.

“During onboarding, if there’s a problem and a user is locked out, they may not have access to the corporate network,” Kabir says. “Having a site that they could access, with actual instruction based on which device they’re using and that shows them how to get past key blockers, was very helpful.”

Maintaining a direct line to leadership in order to help unblock lingering change requests also proved to be essential. In one case, bugs lingered in the engineering queue for days, even weeks, because the escalation path was limited (by design).

“Approval requests were blocked, and so approvals needed to be accelerated to the skip-level approver,” Kabir says. “We were able to move fast to fix that, because we had a clear understanding of the pain that folks were feeling on our side and could effectively communicate that to leadership.”

Short-term pain, long-term gain

The impact has been significant. Instead of spending long cycles troubleshooting forgotten passwords or manually verifying user identities, IT teams can focus on higher‑value work: strengthening identity protection, refining automation, and improving the user experience. This shift not only reduces operational overhead, it also aligns with our Zero Trust principles by removing weak authentication steps from the identity lifecycle.

For employees, the experience is equally transformative. New hires can unbox a device, authenticate using a TAP delivered through a secure Verified ID workflow, and immediately register passwordless methods like Windows Hello for Business. Although the onboarding journey may vary across platforms and devices, the process is fast and intuitive.

For existing users who lose access—whether due to a forgotten PIN, a lost device, or a credential reset—Identity Pass provides a self‑service recovery path that avoids the delays and security risks of traditional reset processes.

Our experience demonstrates that when these processes are redesigned around strong, hardware‑backed, phishing‑resistant credentials, organizations gain both security and efficiency. The result is a more resilient identity foundation that supports the realities of modern work.

Key takeaways

Here are some suggestions for getting started with Windows Hello for Business and Day 1 onboarding:

  • Passwordless authentication starts with strong identity proofing. Establishing user identity up front is essential to creating a secure foundation for all future authentication.
  • Day 1 onboarding is the riskiest moment. The initial bootstrap step is where trust is first established, and risk signals matter most.
  • Temporary Access Pass replaces temporary passwords. TAP provides a secure, time‑bound way for users to authenticate and register passwordless credentials without exposing the network to attack.
  • Device and platform diversity shapes the user experience. Different hardware, operating systems, and compute environments require flexible onboarding paths that still enforce consistent security.
  • Support demand spikes before it drops. Organizations should expect short‑term confusion and increased help‑desk volume before passwordless security benefits fully materialize.
  • Long‑term gains are significant. Once deployed, passwordless authentication reduces operational overhead, strengthens security, and improves the user experience across the identity lifecycle.


]]>
Keeping our in-house optical network safe with a Zero Trust mentality http://approjects.co.za/?big=insidetrack/blog/keeping-our-in-house-optical-network-safe-with-a-zero-trust-mentality/ Thu, 16 Oct 2025 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=20611 When it comes to corporate connectivity at Microsoft, a minute of lost connection can lead to catastrophic disruptions for our product teams, sleepless nights for our network engineers, and millions of dollars of lost value for the company. That’s why we built our own optical network at our headquarters in Washington state, and that’s why […]

The post Keeping our in-house optical network safe with a Zero Trust mentality appeared first on Inside Track Blog.

]]>
When it comes to corporate connectivity at Microsoft, a minute of lost connection can lead to catastrophic disruptions for our product teams, sleepless nights for our network engineers, and millions of dollars of lost value for the company.

That’s why we built our own optical network at our headquarters in Washington state, and that’s why we’re building similar networks at other regional campuses around the United States and the rest of the world.

With so much on the line, we need to make sure these in-house networks never go down.

But how are we doing that?

We’re applying the same robust Zero Trust approach we take to security and identity. While our optical networks are extremely reliable, any complex system can be knocked offline. In alignment with the Zero Trust mentality we have as a company, we couldn’t simply trust the integrity of what we’d built; we needed a resilient backup system that went beyond redundancy to provide true resilience.

Driven by this goal, we created a Zero Trust Optical Business Continuity Disaster Recovery (BCDR) network that combines two fully independent optical systems designed to sustain uninterrupted services, even during systemic failures. The result is more confidence for our employees and vendors, less pressure on our network engineers, and comprehensive network resilience that will protect us against a major outage.

The urgency of resilience

In 2021, our team in Microsoft Digital, the company’s IT organization, deployed our first next-generation optical network to serve the exclusive network needs of our Puget Sound metro campuses. It offers more bandwidth on less fiber for a lower operational cost than leasing from traditional carriers.

“Puget Sound is a highly concentrated developer network where we need to provide very high throughput,” says Patrick Alverio, principal group software engineering manager for Infrastructure and Engineering Services within Microsoft Digital. “Our optical system is the backbone of all that traffic.”

Our state-of-the-art optical network fulfills our need for fast and reliable connectivity at up to 400 Gbps between core sites, labs, data centers, and the internet edge. We built this network on the Reconfigurable Optical Add/Drop Multiplexer (ROADM) technology, delivering dynamic reconfiguration, colorless, directionless, contentionless (CDC) capabilities, flexible grid support, remote provisioning, and automation. It also features a full-mesh topology that provides a layer of redundancy.

But what if the entire ROADM-based system fails?

There are plenty of operational risks that can derail even the most robust network. Anything from misconfigured automation scripts to policy changes to misaligned software versioning to simple human error can cause outages.

A photo of Elangovan

“We don’t want even a second of downtime. We needed a life raft for when failures occur that could also function as a standby network for core site migrations or platform upgrades.”

Vinoth Elangovan, senior network engineer, Hybrid Core Network Services, Microsoft Digital

To some degree, those kinds of minor disruptions are inevitable. But catastrophic events like fiber cuts, failures in the ROADM operating system, or even natural disasters have the potential for even more wide-ranging disruption.

During a catastrophic outage, thousands of engineers, developers, researchers, and other technical employees who need access to crucial lab environments and data centers could lose connectivity. That can sabotage feature delivery, disrupt product patches, interrupt updates, and halt all kinds of core product functions.

On top of normal software development operations, new AI tools demand massive bandwidth and consistent uptime. Finally, our hybrid networks feature paths integrated with Microsoft Azure that consume on-premises resources, so they also stand to benefit from increased resilience.

A catastrophic network outage can cause incredible damage to all of these business functions. In fact, we experienced exactly that in 2022.

A fiber cut combined with a ROADM system hardware reboot caused a five-minute outage at our Puget Sound metro region. In this environment, every minute of lost connectivity can result in significant financial impact, making network resilience absolutely essential.

“We don’t want even a second of downtime,” says Vinoth Elangovan, senior network engineer, who designed and implemented the Zero Trust Optical BCDR network for Microsoft. “We needed a life raft for when failures occur that could also function as a standby network for core site migrations or platform upgrades.”

Delivering greater network resilience

To ensure we could deliver uninterrupted network connectivity even in the midst of a catastrophic outage, we needed to consider the technical demands of a truly resilient system. Five design pillars helped us assemble our architectural criteria:

  1. Independent optical systems: To provide true resilience, our primary and BCDR platforms needed to operate autonomously.
  2. Physically independent paths: Circuits should avoid shared conduits, fibers, and splices to operate completely independently.
  3. Separate control software: The primary and backup networks should operate through dedicated network management systems (NMSs), automation, and provisioning domains.
  4. Unified client interface: Both systems needed to terminate into the same interface to unify service for clients and applications.
  5. Survivability by design: We couldn’t assume that any system would be immune to failure. Instead, we built for the best possible outcomes.

The result was the Zero Trust Optical BCDR architecture, a layered approach to optical networking. It consists of our primary, ROADM-based transport layer and a secondary, MUX-based transport layer, both terminating into a single logical port channel.

“Our core responsibility is the employee experience, so our main design thrust was making sure service is seamless and uninterrupted—even during an outage.”

Vinoth Elangovan, senior network engineer, Hybrid Core Network Services, Microsoft Digital

Both systems are live and active, which means they deliver production services through their own independent fibers, power supplies, and software stacks. By layering fully independent optical domains and logically unifying them at the Ethernet edge, the network can sustain a complete failure of one system and maintain continuity.

That physical and operational independence is the difference between simple redundancy and robust resilience.

“Our core responsibility is the employee experience, so our main design thrust was making sure it’s seamless and uninterrupted—even during an outage,” Elangovan says.

Optical network backed by a BCDR network

A schematic of an optical network running between different nodes and backed up by a BCDR network.
The optical network in our Puget Sound region connects core sites to labs, datacenters, and the internet edge, while the BCDR network provides backup connections to deliver resilience in case of a catastrophic network failure.

A typical ROADM optical network connects campus and data center sites to the internet edge. Our design features three interconnected optical rings, with two internet edges as multi-directional nodes, while other sites operate as dual-degree nodes with bidirectional redundancy. Meanwhile, our campuses and datacenters are designated as critical sites and equipped with Optical BCDR links to ensure enhanced resiliency. In the event of a complete Optical ROADM line failure, these critical sites retain connectivity.

In the event of an outage on the primary network, the port channel handles forward continuity automatically, shifting WAN traffic between optical paths in real time.

The transition occurs seamlessly and transparently, with no noticeable impact to clients.
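To make the failover behavior concrete, here is a minimal sketch under stated assumptions: the link names are hypothetical, and the first-healthy-link logic is a simplification (real port channels hash traffic across all healthy member links rather than preferring one):

```python
def forward(frame: str, links: list[dict]) -> str:
    # Illustrative failover: send the frame over the first member link
    # whose optical domain is still up. In production, a port channel
    # load-balances across all healthy members simultaneously.
    for link in links:
        if link["up"]:
            return f"{frame} sent via {link['name']}"
    raise ConnectionError("both optical domains are down")

links = [
    {"name": "roadm_primary", "up": True},  # primary ROADM transport
    {"name": "mux_bcdr", "up": True},       # independent BCDR transport
]
print(forward("frame-1", links))  # frame-1 sent via roadm_primary
links[0]["up"] = False            # simulate a catastrophic ROADM outage
print(forward("frame-2", links))  # frame-2 sent via mux_bcdr
```

Because both transport layers terminate into the same logical port channel, clients never see which optical domain carried their traffic.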

A photo of Martin

“Our initial goal was to provide high-throughput connectivity for major labs, with less than six minutes of downtime per year. That represents a service level of 99.999% network continuity, and we’re aiming for even better moving forward.”

Blaine Martin, principal engineering manager, Hybrid Core Network Services, Microsoft Digital

Coupling at the Ethernet layer provides clients and applications with one logical interface, automatic load balancing and traffic distribution, and seamless failover, regardless of which optical domain is providing service.

“Our initial goal was to provide high-throughput connectivity for major labs, with less than six minutes of downtime per year,” says Blaine Martin, principal engineering manager for Hybrid Core Network Services in Microsoft Digital. “That represents a service level of 99.999% network continuity, and we’re aiming for even better moving forward.”
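As a quick sanity check on those numbers, the five-nines arithmetic works out as follows (a back-of-the-envelope calculation, not a formal SLA definition):

```python
minutes_per_year = 365 * 24 * 60               # 525,600 minutes
availability = 0.99999                         # "five nines"
allowed_downtime = minutes_per_year * (1 - availability)
print(f"{allowed_downtime:.2f} minutes/year")  # 5.26 minutes/year
```

So 99.999% continuity permits roughly five and a quarter minutes of downtime per year, comfortably within the stated six-minute goal.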

A new era of confidence for network engineers

For the network engineers who keep Microsoft employees and resources connected, the Zero Trust Optical BCDR network relieves much of the pressure that comes from resolving outages.

“Before, we were dependent on a single system, even with redundancies, so the human experience was like firefighting. Now, if the primary optical network is having a problem, I don’t even see it.”

Kevin Bullard, principal cloud network engineering manager, Microsoft Digital

When a network goes down, engineers have an enormous set of responsibilities to manage: processing the incident report, assigning severity, performing checks, notifying internal teams, providing updates, and engaging with physical support teams—all with a profound urgency to restore productivity.

Dialing those pressures back has been a huge benefit.

“Before, we were dependent on a single system, even with redundancies, so the human experience was like firefighting,” says Kevin Bullard, Microsoft Digital principal cloud network engineering manager responsible for maintaining WAN interconnectivity between labs. “Now, if the primary optical network is having a problem, I don’t even see it.”

There will always be pressure on network engineers to restore connectivity during an outage, but they can breathe easier knowing it won’t cost the company millions of dollars as the time to resolve ticks away. And in non-emergency situations like core site migrations, the BCDR network provides a much easier way to shunt services while the main network is offline.

“Our internal users have become more confident that they can stay connected, no matter what,” says Chakri Thammineni, principal cloud network engineer for Infrastructure and Engineering Services in Microsoft Digital. “That gives the people responsible for maintaining our enterprise networks incredible peace of mind.”

Fortunately, there hasn’t been a substantial network outage in the Puget Sound metro area since 2022. But our network engineering teams know that if and when it happens, the BCDR network will be ready to maintain service continuity.

A photo of Alverio.

“We’re always looking ahead into industry trends to stay at the bleeding edge, whether that’s in the technology we provide for our customers or the networks we use to do our own work.”

Patrick Alverio, principal group software engineering manager, Infrastructure and Engineering Services, Microsoft Digital

With our Puget Sound network protected, we have plans in place to extend this model to other metro areas. Naturally, we have to balance population, criticality, and the knowledge that elevated reliability and availability come with a cost.

Our selection criteria for new BCDR networks have largely centered around two factors: expansions of AI-critical infrastructure and concentrations of secure access workspaces (SAWs) for technical employees. With these criteria in mind, we’re planning new BCDR networks first in the Bay Area and Dublin, then in Virginia, Atlanta, and London.

Zero Trust optical BCDR architecture represents a paradigm shift in enterprise network resilience, and we’re committed to expanding the model to benefit both conventional workloads and the expanding infrastructure demands of AI.

“We’re always looking ahead into industry trends to stay at the bleeding edge, whether that’s in the technology we provide for our customers or the networks we use to do our own work,” Alverio says. “We refuse to accept the status quo, and we’re elevating the experience for employees across Puget Sound and Microsoft as a whole.”

Driving AI innovation in optical network resilience

Our journey towards an AI-driven optical network is gaining momentum.

As part of our Secure Future initiative, we’ve automated our Optical Management Platform credential rotation and are actively developing intelligent incident management ticket enrichment, auto-remediation, link provisioning, deployment validation, and capacity planning.

AI plays a central role in this transformation.

With Microsoft 365 Copilot and GitHub Copilot integrated into our engineering workflows, we’re accelerating development cycles, improving code accuracy, and uncovering optimization opportunities that would otherwise take hours of manual effort.

These Copilots are also helping our engineers analyze network patterns, simulate outcomes, and validate deployment logic before execution, reducing human error and strengthening our Zero Trust posture. Over time, we’re evolving toward a system where AI not only assists but proactively predicts potential disruptions, recommends remediations, and continuously learns from operational telemetry.

These advancements are paving the way for a future where our optical infrastructure can anticipate issues, recover faster, and operate with the agility and assurance expected in a Zero Trust environment.

Key takeaways

If you’re considering implementing your own optical and BCDR networks, consider these tips:

  • Understand the technical components of resilience: Independent optical systems, physically independent paths, separate control software, a unified client interface, and survivability by design are the key technical components of true resilience.
  • Plan from a preparedness and value perspective: Evaluate the critical points in your infrastructure and determine where you can get the most value out of resilient connectivity.
  • Ensure your teams have the right skillset: Carefully consider the right workforce to run those systems and be accountable for their operation.


]]>
Securing the borderless enterprise: How we’re using AI to reinvent our network security http://approjects.co.za/?big=insidetrack/blog/securing-the-borderless-enterprise-how-were-using-ai-to-reinvent-our-network-security/ Thu, 10 Jul 2025 16:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=19504 The modern enterprise network is complex, to say the least. Enterprises like ours are increasingly adopting hybrid infrastructures that span on-premises data centers, multiple cloud environments, and a diverse array of remote users. In this context, traditional security tools are still playing checkers while the malicious actors are playing chess. To make matters worse, attacks […]

The post Securing the borderless enterprise: How we’re using AI to reinvent our network security appeared first on Inside Track Blog.

]]>
The modern enterprise network is complex, to say the least.

Enterprises like ours are increasingly adopting hybrid infrastructures that span on-premises data centers, multiple cloud environments, and a diverse array of remote users. In this context, traditional security tools are still playing checkers while the malicious actors are playing chess. To make matters worse, attacks are increasingly enabled by AI tools.

That’s why here in Microsoft Digital, the company’s IT organization, we’re using a modern approach and toolset—including AI—to secure our network environment, turning complexity into clarity, one approach, tool, and insight at a time.

Leaving traditional network security behind

For years, traditional network security relied on a simple but increasingly outdated assumption: everything inside the corporate perimeter can be trusted. This model made sense when networks were static, users were on-premises, and applications lived in a centralized data center.

But that world is gone.

A photo of Venkatraman.

“Implicit trust must be replaced with explicit verification. That means rethinking how we monitor, how we respond, and how we design for resilience from the start.”

Raghavendran Venkatraman, principal cloud network engineering manager, Microsoft Digital

Today’s enterprise is dynamic, decentralized, and borderless. Hybrid work has become the norm. Cloud adoption is accelerating. Teams are globally distributed. Devices and data move constantly across environments. In this new reality, the network perimeter hasn’t just shifted—it has effectively vanished.

That’s where the cracks in legacy security models become impossible to ignore.

Visibility becomes fragmented. Security teams struggle to track what’s happening across a sprawling digital estate. Traditional monitoring tools focus on infrastructure uptime or device health—not on the actual experience of the people using the network. That disconnect creates blind spots, and blind spots create risk.

We know that this model no longer meets the needs of a modern, AI-powered enterprise. Every enterprise needs a new approach—one that assumes breach, enforces least-privilege access, and continuously verifies trust.

“Implicit trust must be replaced with explicit verification,” says Raghavendran Venkatraman, a principal cloud network engineering manager in Microsoft Digital. “That means rethinking how we monitor, how we respond, and how we design for resilience from the start.”

This shift is foundational to our security strategy. It’s not just about securing infrastructure—it’s about securing the experience. Because in a world where users, data, and threats are everywhere, trust has to be proved, not assumed.

Building a resilient and adaptive security strategy

To secure hybrid corporate networks effectively, organizations must go beyond traditional perimeter defenses. They need a comprehensive and adaptive security strategy—one that evolves with the threat landscape and aligns with the complexity of modern enterprise environments. The diversity of hybrid networks introduces new vulnerabilities and expands the attack surface. A static, one-size-fits-all approach simply doesn’t work anymore.

At Microsoft Digital, we’ve embraced a layered, cloud-first security model that integrates identity, access, encryption, and monitoring across every layer of the network. It’s embedded in everything we do. This model includes these key strategies, which we’ll expand upon in the following sections:

  • Adopting Zero Trust principles
  • Establishing identity as the new perimeter 
  • Integrating AI and machine learning
  • Enforcing network segmentation
  • Embracing continuous monitoring

Adopting Zero Trust principles

Zero Trust Architecture (ZTA) operates on a strict principle: “never trust, always verify.” That means no user, device, or application—whether it’s inside or outside the corporate network—is inherently trusted as they are in the traditional network security model.

A photo of McCleery.

“Zero Trust isn’t a product—it’s a mindset. It’s about assuming breach and designing defenses that minimize impact and maximize resilience.”

Tom McCleery, principal group cloud network engineer, Microsoft Digital

Every access request is evaluated against dynamic policies. These policies consider several factors—like user identity, device health, location, and how sensitive the data being accessed is. For example, if an employee tries to access a financial report from a corporate laptop at the office, they might get in, no problem. But that same request from a personal device in another country could get blocked or trigger extra authentication steps.

At the heart of ZTA are policy enforcement points that authorize every data flow. These checkpoints only grant access when all conditions are met, and they log every interaction for auditing and threat detection. This kind of granular control reduces the attack surface and limits lateral movement if there is a breach.

Adopting Zero Trust isn’t just a technical upgrade—it’s a strategic must. It boosts an organization’s ability to defend against modern threats like ransomware, insider attacks, and supply chain compromises.

“Zero Trust isn’t a product—it’s a mindset,” says Tom McCleery, a principal group cloud network engineer in Microsoft Digital. “It’s about assuming breach and designing defenses that minimize impact and maximize resilience.”

By embracing Zero Trust, we strengthen our security posture, lower the risk of data breaches, and respond more effectively to emerging threats.

Establishing identity as the new perimeter

Identity is no longer just a component of security—it has become the new perimeter. Traditional security models focused on defending the network edge, assuming that everything inside the perimeter could be trusted. But in today’s hybrid and cloud-first environments, the perimeter has dissolved and that assumption is outdated and dangerous. Users, devices, and applications now operate across diverse locations and platforms, making perimeter-based defenses insufficient.

Identity-first security shifts the focus from securing the physical network to securing the identities—both human and machine—that interact with the network. This means every access request is treated as though it originates from an untrusted source, regardless of where it comes from. Whether it’s a remote employee logging in from a personal device or an automated workload accessing cloud resources, the system must verify who or what is making the request, assess the risk, and enforce least-privilege access across the user experience.

This approach enables organizations to implement more granular access controls. For example, a developer might be allowed to access a code repository but not production systems, and only during business hours from a managed device. Similarly, a service account used by a continuous integration and continuous deployment (CI/CD) pipeline might be restricted to specific APIs and monitored for anomalous behavior. A CI/CD pipeline is an automated workflow that takes code from development through testing and into production.
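To make the developer and pipeline examples concrete, here is a minimal sketch of least-privilege checks keyed to identity rather than network location. The identities, scopes, and rules are invented for illustration and are not a real policy engine:

```python
# Hypothetical least-privilege policy: each identity (human or machine) is
# granted an explicit set of scopes; everything else is denied by default.
POLICIES = {
    "dev-alice": {
        "scopes": {"code-repo:read", "code-repo:write"},
        "requires_managed_device": True,
        "allowed_hours": range(8, 18),     # business hours only
    },
    "svc-cicd": {
        "scopes": {"build-api:invoke", "artifact-store:write"},
        "requires_managed_device": False,  # machine identity; monitored instead
        "allowed_hours": range(0, 24),
    },
}

def is_allowed(identity: str, scope: str, hour: int, managed_device: bool) -> bool:
    policy = POLICIES.get(identity)
    if policy is None:
        return False                       # unknown identity: deny
    if scope not in policy["scopes"]:
        return False                       # least privilege: deny by default
    if policy["requires_managed_device"] and not managed_device:
        return False
    return hour in policy["allowed_hours"]

print(is_allowed("dev-alice", "code-repo:write", hour=10, managed_device=True))   # True
print(is_allowed("dev-alice", "prod:deploy", hour=10, managed_device=True))       # False
print(is_allowed("svc-cicd", "build-api:invoke", hour=3, managed_device=False))   # True
```

The deny-by-default shape is what distinguishes identity-first control from perimeter control: nothing is reachable simply because of where the request comes from.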

By anchoring network security around verified identities, organizations reduce their attack surface and improve their ability to detect and respond to threats. This identity-centric model is not just a security enhancement—it’s a strategic shift that aligns with how modern enterprises operate.

Integrating AI and machine learning 

AI and machine learning (ML) are foundational pillars in our network security strategy. Intelligent automation and advanced analytics help us not only detect and respond to threats, but also continuously improve our security posture in an ever-changing landscape. Here’s how we’re using AI and ML in some critical aspects of our approach to modern network security:

  • Threat detection and intelligence. We deploy AI-powered monitoring tools that sift through billions of network signals and logs across our hybrid infrastructure. By applying sophisticated ML algorithms, we can identify abnormal behaviors such as unusual login attempts or unexpected data transfers that could indicate a potential breach. These insights allow our security teams to focus on the most critical alerts, reducing noise and accelerating incident investigation.
  • Automated response and containment. Through automation, our security systems can respond to threats in real time. For example, if our AI models detect suspicious activity on a device, automated workflows can immediately isolate the affected endpoint, block malicious traffic, or revoke access privileges, all without waiting for manual intervention. This rapid response capability is essential for minimizing the potential impact of attacks and protecting our critical assets.
  • Predictive analysis and proactive defense. We use predictive analytics to forecast emerging vulnerabilities before they can be exploited. By continuously training our models on the latest threat intelligence and attack patterns, we can anticipate risks and strengthen our defenses proactively—whether that means patching vulnerable systems, adjusting access controls, or updating our security policies.
  • User experience monitoring. We use AI to assess the real experience of our users, a critical measurement in a network environment where identity is the perimeter. By correlating performance metrics with security signals, we ensure that our security mechanisms don’t degrade productivity and that any anomalies impacting user experience are promptly addressed.
  • Continuous learning and improvement. Our AI and ML systems are designed to learn from every incident, adapt to new attack techniques, and evolve with the threat landscape. This continuous improvement loop enables our teams to stay ahead of sophisticated adversaries and maintain robust, resilient network security.
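The production ML systems described above are far more sophisticated, but the core idea of anomaly detection can be shown in a toy sketch: flag behavior that deviates sharply from a learned baseline. The metric and threshold here are invented for illustration:

```python
import statistics

def flag_anomaly(baseline: list[float], current: float, threshold: float = 3.0) -> bool:
    """Flag a signal (e.g., hourly login attempts for an account) whose z-score
    against the historical baseline exceeds the threshold."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline) or 1.0  # guard against zero variance
    z_score = abs(current - mean) / stdev
    return z_score > threshold

# A typical account averages ~5 logins/hour; a burst of 40 is flagged.
history = [4, 5, 6, 5, 4, 6, 5, 5]
print(flag_anomaly(history, 5))   # False
print(flag_anomaly(history, 40))  # True
```

Real detection pipelines correlate many such signals across users, devices, and services, but the statistical intuition is the same.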

Advanced threats require advanced responses. By integrating AI and ML into our network security strategies, we’re enhancing our ability to detect and respond to threats swiftly, minimize potential damage, and foster a secure environment for innovation and collaboration across our global hybrid infrastructure.

Isolating networks to minimize risk

In a hybrid infrastructure, isolating network segments is a foundational security principle. By segmenting networks, we limit the scope of potential breaches and reduce the risk of lateral movement by attackers. For example, separating employee productivity networks from customer-facing systems ensures that if a vulnerability is exploited in one area, it doesn’t cascade across the entire environment.

This is especially critical in environments where sensitive customer data and internal development systems coexist. Our testing and development environments must remain completely isolated—not only from customer-facing services but also from internal productivity tools like email, collaboration platforms, and identity systems. This prevents test code or experimental configurations from inadvertently exposing production systems to risk.

We also establish policy enforcement points (PEPs) within each network segment. These act as control gates, inspecting and filtering traffic between zones. By placing PEPs at strategic boundaries, we can tightly control what moves between segments and detect anomalies early. This architecture ensures that, if a breach occurs, the “blast radius”—the scope of impact—is minimal and contained.
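A policy enforcement point of this kind can be thought of as a default-deny allowlist on inter-segment flows. The segment names and permitted flows below are hypothetical:

```python
# Hypothetical default-deny allowlist for traffic crossing segment boundaries.
# Any flow not explicitly listed is denied (and would be logged for detection).
ALLOWED_FLOWS = {
    ("corp-productivity", "internet"),         # employees browsing out
    ("customer-facing", "internet"),
    ("corp-productivity", "customer-facing"),  # e.g., support tooling
}

def pep_decision(src_segment: str, dst_segment: str) -> str:
    """Permit only explicitly allowed segment-to-segment flows."""
    if (src_segment, dst_segment) in ALLOWED_FLOWS:
        return "permit"
    return "deny"

print(pep_decision("corp-productivity", "internet"))  # permit
# Test/dev is isolated: it reaches neither production nor productivity systems.
print(pep_decision("test-dev", "customer-facing"))    # deny
print(pep_decision("test-dev", "corp-productivity"))  # deny
```

Because the default is deny, a compromised test environment has no allowed path into production, which is exactly the “blast radius” containment the text describes.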

This layered approach to segmentation and isolation is essential for maintaining the integrity of our production systems, minimizing risk, and ensuring that our hybrid infrastructure remains resilient in the face of evolving threats.

Embracing continuous monitoring 

We’ve stopped thinking of monitoring as a one-time check. Now, it’s a continuous conversation with our network.


“Conventional network performance monitoring—monitoring the systems and infrastructure that support our network—can only tell part of the story. To truly understand and meet our requirements, we must monitor user experiences directly.”

Ragini Singh, partner group engineering manager in Microsoft Digital

Continuous monitoring is how we stay ahead of issues before they impact our people. It’s how we keep our hybrid infrastructure resilient, performant, and secure—every second of every day.

We’ve built a monitoring ecosystem that spans our entire global network from on-premises offices to cloud-based services in Azure and software-as-a-service (SaaS) platforms. With the mindset that identity is the new perimeter, we’re using signals from all aspects of our environment and focusing on the user experience.

“Conventional network performance monitoring—monitoring the systems and infrastructure that support our network—can only tell part of the story,” says Ragini Singh, a partner group engineering manager in Microsoft Digital. “To truly understand and meet our requirements, we must monitor user experiences directly.”

This isn’t just about tools and dashboards. It’s about insight. We’re using synthetic and native metrics to build a hop-by-hop view of the user experience. That lets us pinpoint where things go wrong—and fix them fast. We’re even layering in automation to enable self-healing responses when thresholds are breached.
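At their simplest, the self-healing responses described here are threshold checks over a sliding window of measurements. A minimal sketch, where the 150 ms latency threshold, the metric, and the remediation action are all invented for illustration:

```python
# Illustrative self-healing check: when a network hop's latency breaches its
# threshold for several consecutive samples, trigger automated remediation.
THRESHOLD_MS = 150
CONSECUTIVE_BREACHES = 3

def check_hop(samples_ms: list[float]) -> str:
    """Return 'healthy', or 'remediate' if the last N samples all breach."""
    recent = samples_ms[-CONSECUTIVE_BREACHES:]
    if len(recent) == CONSECUTIVE_BREACHES and all(s > THRESHOLD_MS for s in recent):
        return "remediate"   # e.g., reroute traffic or open an automated incident
    return "healthy"

print(check_hop([40, 55, 48, 52]))        # healthy
print(check_hop([40, 180, 210, 195]))     # remediate: three breaches in a row
```

Requiring several consecutive breaches before acting is a common way to avoid self-healing loops reacting to a single noisy sample.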

Continuous monitoring is a strategic shift that helps us protect our people, power our services, and deliver the seamless experience our employees expect.

Looking to the future

As enterprises continue to navigate the complexities of hybrid infrastructures, securing enterprise networks requires an agile, multifaceted approach that integrates Zero Trust principles, identity-first security, and advanced technologies like AI and ML. By shifting the focus from traditional perimeter defenses to a more holistic and adaptive security model, organizations can better protect their assets, maintain operational continuity, and foster innovation in an increasingly interconnected world.

Implementing these strategies not only enhances security but also positions organizations to leverage the full potential of their hybrid infrastructures, driving growth and success in the digital age.

Key takeaways

Here are five key actions you can take to strengthen your organization’s defenses and embrace a modern approach to network security:

  • Adopt an identity-first security model. Shift your focus from traditional perimeter-based defenses to verifying and securing every user and device identity—regardless of location or network.
  • Integrate AI and machine learning into your security strategy. Continuously improve your security posture by using intelligent automation and analytics to detect, respond to, and predict threats more effectively.
  • Isolate network segments to minimize risk. Separate critical business functions, customer-facing services, and development environments to contain threats and ensure that any potential breach remains limited in scope.
  • Implement continuous monitoring across your hybrid infrastructure. Move beyond periodic checks by establishing real-time, user-centric monitoring to maintain resilience, performance, and rapid incident response.
  • Embrace a proactive, adaptive mindset. Regularly update your security policies, train your teams, and stay agile to address emerging threats and support innovation as your organization evolves.

The post Securing the borderless enterprise: How we’re using AI to reinvent our network security appeared first on Inside Track Blog.

]]>
19504
Implementing a Zero Trust security model at Microsoft http://approjects.co.za/?big=insidetrack/blog/implementing-a-zero-trust-security-model-at-microsoft/ Thu, 24 Apr 2025 18:30:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9344 At Microsoft, our shift to a Zero Trust security model—which began more than seven years ago—has helped us navigate many challenges. Engage with our experts! Customers or Microsoft account team representatives from Fortune 500 companies are welcome to request a virtual engagement on this topic with experts from our Microsoft Digital team. The increasing prevalence […]

The post Implementing a Zero Trust security model at Microsoft appeared first on Inside Track Blog.

]]>
At Microsoft, our shift to a Zero Trust security model—which began more than seven years ago—has helped us navigate many challenges.

The increasing prevalence of cloud-based services, mobile computing, internet of things (IoT), and bring your own device (BYOD) practices in the workforce has changed the technology landscape for the modern enterprise. Security architectures that rely on network firewalls and virtual private networks (VPNs) to isolate and restrict access to corporate technology resources and services are no longer sufficient for a workforce that regularly requires access to applications and resources that exist beyond traditional corporate network boundaries.

The shift to the internet as the network of choice and the continuously evolving threats led us to adopt a Zero Trust security model internally here at Microsoft. Though our journey began many years ago, we expect that it will continue to evolve for years to come.

For a transcript, please view the video on YouTube and select “Show transcript” at the bottom of the description pane.

Carmichael Patton, a principal security architect at Microsoft, shares the work that his team in the Chief Information Security Office (CISO) organization has been doing to support a Zero Trust security model.

The Zero Trust model

Based on the principle of verified trust—in order to trust, you must first verify—Zero Trust eliminates the inherent trust that is assumed inside the traditional corporate network. Zero Trust architecture reduces risk across all environments by establishing strong identity verification, validating device compliance prior to granting access, and ensuring least privilege access to only explicitly authorized resources.

Zero Trust requires that every transaction between systems (user identity, device, network, and applications) be validated and proven trustworthy before the transaction can occur. In an ideal Zero Trust environment, the following behaviors are required:

  • Identities are validated and secure with phishing-resistant multifactor authentication (MFA) everywhere. Using phishing-resistant authentication eliminates password expirations and eventually will eliminate passwords. The added use of biometrics ensures strong authentication for user-backed identities.
  • Devices are managed and validated as healthy. Device health validation is required. All device types and operating systems must meet a required minimum health state as a condition of access to any Microsoft resource.
  • Telemetry is pervasive. Pervasive data and telemetry are used to understand the current security state, identify gaps in coverage, validate the impact of new controls, and correlate data across all applications and services in the environment. Robust and standardized auditing, monitoring, and telemetry capabilities are core requirements across users, devices, applications, services, and access patterns.
  • Least privilege access is enforced. Limit access to only the applications, services, and infrastructure required to perform the job function. Access solutions that provide broad access to networks without segmentation or are scoped to specific resources, such as broad access VPN, must be eliminated.
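Taken together, the four behaviors amount to a conjunction of checks with mandatory auditing: a transaction proceeds only when every condition holds, and every attempt is logged either way. A minimal illustrative sketch, with field names that are ours rather than Microsoft’s:

```python
def zero_trust_gate(identity_mfa_verified: bool,
                    device_healthy: bool,
                    scope_authorized: bool,
                    audit_log: list[str],
                    transaction: str) -> bool:
    """All conditions must hold; every attempt is logged regardless of outcome."""
    allowed = identity_mfa_verified and device_healthy and scope_authorized
    audit_log.append(f"{transaction}: {'granted' if allowed else 'denied'}")
    return allowed

log: list[str] = []
print(zero_trust_gate(True, True, True, log, "read payroll report"))   # True
print(zero_trust_gate(True, False, True, log, "read payroll report"))  # False
print(log)  # both attempts recorded, satisfying the pervasive-telemetry behavior
```

The unconditional logging line is deliberate: denied attempts are often the most valuable telemetry for spotting an attack in progress.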

Zero Trust scenarios

We have identified four core scenarios at Microsoft to help achieve Zero Trust. These scenarios satisfy the requirements for strong identity, enrollment in device management and device-health validation, alternative access for unmanaged devices, and validation of application health. The core scenarios are described here:

  • Scenario 1: Applications and services have the mechanisms to validate multifactor authentication and device health.
  • Scenario 2: Employees can enroll devices into a modern management system which guarantees the health of the device to control access to company resources.
  • Scenario 3: Employees and business guests have a method to access corporate resources when not using a managed device.
  • Scenario 4: Access to resources is limited to the minimum required—least privilege access—to perform a specified function.

Zero Trust scope and phases

We’re taking a structured approach toward Zero Trust, an effort that spans many technologies and organizations and requires investments that will carry over multiple years. The graphic below represents a high-level view of the Zero Trust goals—grouped into our core Zero Trust pillars—that we continually work toward.

While these goals don’t represent the full scope of the Zero Trust efforts and work streams, they capture the most significant areas of Zero Trust effort at Microsoft.

Pillars of the Microsoft Zero Trust model

Graphic showing the four main pillars of our Zero Trust security model: Verify identity, Verify device, Verify Access, and Verify Services.
The major goals for each Zero Trust pillar that we work toward at Microsoft.

Scope

Our initial scope for implementing Zero Trust focused on common corporate services used across our enterprise—our employees, partners, and vendors. Our Zero Trust implementation targeted the core set of applications that Microsoft employees use daily (e.g., Microsoft 365 apps, line-of-business apps) on platforms like iOS, Android, MacOS, Linux, and Windows. As we have progressed, our focus has expanded to include all applications used across Microsoft. Any corporate-owned or personal device that accesses company resources must be managed through our device management systems.

Verify identity

To begin enhancing security for the environment, we implemented MFA using smart cards to control administrative access to servers. We later expanded the multifactor authentication requirement to include all users accessing resources from outside the corporate network. The massive increase in mobile devices connecting to corporate resources pushed us to evolve our multifactor authentication system from physical smart cards to a phone-based challenge (phone-factor) and later into a more modern experience using the Microsoft Authenticator application.

The next step in this area is the widespread deployment of Windows Hello for Business for biometric authentication. While Windows Hello hasn’t completely eliminated passwords in our environment, it has significantly reduced password usage and enabled us to remove our password-expiration policy. Additionally, multifactor authentication validation is required for all accounts, including guest accounts, when accessing Microsoft resources.

Our most recent efforts involve rolling out phishing-resistant authentication credentials through passkey options in the Microsoft Authenticator app, with YubiKeys as an option for limited-scale use cases. Additionally, all new employee onboarding now includes passkey configuration, so new hires never use a password from day one.

Verify device

Our first step toward device verification was enrolling devices into a device-management system. We have since completed the rollout of device management for Windows, Mac, Linux, iOS, and Android. Many of our high-traffic applications and services, such as Microsoft 365 and VPN, enforce device health for user access.

Additionally, we’ve started using device management to enable proper device health validation, a foundational component that allows us to set and enforce health policies for devices accessing Microsoft resources. We’re using Windows Autopilot for device provisioning, which ensures that all new Windows devices delivered to employees are already enrolled in our modern device management system.

Devices accessing the corporate network must also be enrolled in the device-management system. This includes both Microsoft-owned devices and personal BYOD devices. If employees want to use their personal devices to access Microsoft resources, the devices must be enrolled and adhere to the same device-health policies that govern corporate-owned devices.

For devices where enrollment in device management isn’t an option, we’ve created a secure access model called Microsoft Azure Virtual Desktop. Virtual Desktop creates a session with a virtual machine that meets the device-management requirements. This allows individuals using unmanaged devices to securely access select Microsoft resources.

There is still work remaining within the verify device pillar. We’re in the process of maturing device management for Linux devices and expanding the number of applications enforcing device management to eventually include all applications and services. We’re expanding the number of resources available when connecting through the Virtual Desktop service. We’re also expanding to other devices, such as the Meta Quest headsets, conference room devices, and kiosks. Finally, we’re making device-health policies more robust and enabling validation across all applications and services.

Verify access

In the verify access pillar, we focused on segmenting users and devices across purpose-built networks, migrating all Microsoft employees to use the internet as the default network, and automatically routing users and devices to appropriate network segments. We successfully deployed several network segments, both for users and devices, including internet-default wired and wireless networks across all Microsoft buildings. All users received policy updates to their systems, thus making this internet-based network their new default.

As part of this network rollout, we deployed a device-registration portal. This portal allows users to self-identify, register, or modify devices to ensure that the devices connect to the appropriate network segment. Through this portal, users can register guest devices, user devices, and IoT devices.

We also created specialized segments, including purpose-built segments for the various IoT devices and scenarios used throughout the organization. We completed the migration of our highest-priority IoT devices in Microsoft offices into the appropriate segments.

Verify services

In the verify services pillar, our efforts center on enabling conditional access across all applications and services. To achieve full conditional access validation, a key effort requires modernizing legacy applications or implementing solutions for applications and services that can’t natively support conditional access systems. This has the added benefit of reducing the dependency on VPN and the corporate network.

Microsoft has adopted a hybrid workplace and a large percentage of our employees have transitioned to work from home. This shift has meant greatly increased use of remote network connectivity. Gradually, we have been able to successfully engage application owners in our plans to make applications and services accessible over the internet without VPN, and we’ve been able to transition 98% of our workloads to internet-facing services.

For those services that remain on-premises or are behind Azure Private Endpoints, we have enabled Azure VPN, which we’ve migrated from “always on” to manual access when a VPN is required. Our goal is to further reduce dependency on VPNs in order to restrict access to only required services, rather than the broader access that VPNs provide. We also further reduced the risk of lateral movement by implementing the Microsoft Entra Security Service Edge (SSE) solution.

Implementing Entra SSE allows us to provide secure tunnel access through Private Access and Internet Access for Microsoft Services. For Microsoft-specific SaaS solutions like Microsoft 365 and Microsoft Dynamics, the Internet Access for Microsoft Services gives us important functionality, including token protection and the ability to prevent man-in-the-middle (MitM) attacks.

We are also working on onboarding our on-premises and Private Endpoints through Private Access. In addition to helping deal with MitM attacks and token protection, this allows for direct service connections from the client to the service, without allowing broader access to other services that an employee should not have direct access to.

Zero Trust architecture with Microsoft services

The graphic below provides a simplified reference architecture for our approach to implementing Zero Trust. The primary components of this process are Intune for device management and device security policy configuration, Microsoft Entra Conditional Access for device health validation, and Microsoft Entra ID for user and device inventory.

The system works with Intune, by pushing device configuration requirements to the managed devices. The device then generates a statement of health, which is stored in Microsoft Entra ID. When the device user requests access to a resource, the device health state is verified as part of the authentication exchange with Microsoft Entra ID.
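A simplified sketch of that three-step exchange, with toy data shapes standing in for the real Intune and Microsoft Entra ID protocols:

```python
# Simplified stand-in for the flow described above:
# 1) the management service pushes required configuration to the device,
# 2) the device generates a statement of health against those requirements,
# 3) the identity provider checks the statement during authentication.
REQUIRED_CONFIG = {"disk_encrypted": True, "os_patch_level": 2025}

def statement_of_health(device_state: dict) -> dict:
    """Step 2: report whether the device meets each pushed requirement."""
    return {key: device_state.get(key) == value
            for key, value in REQUIRED_CONFIG.items()}

def authenticate(user_verified: bool, health: dict) -> bool:
    """Step 3: access requires a verified user *and* a fully compliant device."""
    return user_verified and all(health.values())

healthy = statement_of_health({"disk_encrypted": True, "os_patch_level": 2025})
stale   = statement_of_health({"disk_encrypted": True, "os_patch_level": 2023})
print(authenticate(True, healthy))  # True
print(authenticate(True, stale))    # False: out-of-date device is blocked
```

The point of the split is that identity and device health are independent checks: a fully trusted user on an out-of-policy device still fails the exchange.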

Microsoft Security Zero Trust access model

Zero Trust access diagram: Intune enrollment (mobile devices, employees and guest users and desktop) and Internet access for Microsoft Services (Microsoft 365 Dynamics, Microsoft Cloud SaaS apps and On-premises/legacy).
Microsoft’s internal Zero Trust architecture.

A transition that’s paying off

In our transition to a Zero Trust model, we continue to make consistent progress. Over the last several years, we’ve increased identity-authentication strength with expanded coverage of strong authentication, a transition to biometrics-based authentication by using Windows Hello for Business, and phishing-resistant credentials for all supported platforms. We’ve deployed device management and device-health validation capabilities across all major platforms. We’ve also launched an Azure Virtual Desktop system that provides secure access to company resources from unmanaged devices and is Zero Trust compliant by design.

As we continue our progress, we’re making ongoing investments in Zero Trust. We’re expanding health-validation capabilities across devices and applications, increasing the Virtual Desktop features to cover more use cases, and implementing better controls on our network. After reducing (and eliminating when possible) our dependencies on VPN, our next chapter is to migrate to a more modern secure tunnel per application.

Each enterprise that adopts Zero Trust will need to determine what approach best suits their unique environment. This includes balancing risk profiles with access methods, defining the scope for the implementation of Zero Trust in their environments, and determining what specific verifications they want to require for users to gain access to their company resources. In all of this, encouraging the organization-wide embrace of Zero Trust is critical to success, no matter where you decide to begin your transition.

Key takeaways

Here are some tips for moving to a Zero Trust security model at your company:

  • Collect telemetry and evaluate risks, then set goals.​
  • Get to modern identity and MFA—then onboard to Microsoft Entra ID.​
  • For conditional access enforcement, focus on your most-used applications to ensure maximum coverage.​
  • Start with simple policies for device health enforcement, such as device lock or password complexity. ​
  • Run pilots and ringed rollouts. Slow and steady wins the race. ​
  • Migrate your users to the internet and monitor VPN traffic to understand internal dependencies.​
  • Focus on the user experience, which is critical to employee productivity and morale. Without adoption, your program won’t be successful.​
  • Communication is key—bring your employees on the journey with you! ​
  • Assign performance indicators and goals for all workstreams and elements, including employee sentiment.

The post Implementing a Zero Trust security model at Microsoft appeared first on Inside Track Blog.

]]>
9344
Boosting efficiency with SharePoint agents: How our Microsoft legal team is helping clients find answers faster http://approjects.co.za/?big=insidetrack/blog/boosting-efficiency-with-sharepoint-agents-how-our-microsoft-legal-team-is-helping-clients-find-answers-faster/ Thu, 27 Feb 2025 17:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=18540 We all know the frustration of searching for answers we can’t find, and legal professionals often spend too much time answering the same questions repeatedly. Engage with our experts! Customers or Microsoft account team representatives from Fortune 500 companies are welcome to request a virtual engagement on this topic with experts from our Microsoft Digital […]

The post Boosting efficiency with SharePoint agents: How our Microsoft legal team is helping clients find answers faster appeared first on Inside Track Blog.

]]>
We all know the frustration of searching for answers we can’t find, and legal professionals often spend too much time answering the same questions repeatedly.

To address these challenges, knowledge must be captured, presented, and made accessible so that individuals can quickly find answers on their own. Our legal team supporting marketing at Microsoft developed a SharePoint agent to help achieve just that.

Hossein Nowbar spearheads the Microsoft AI integration and works on enhancing our legal team’s efficiency.

Over the years, our Microsoft legal team, Corporate, External, and Legal Affairs, has developed rich, comprehensive, and curated content accessible through SharePoint. This includes guidelines, policies, summaries of laws, self-service tools, and more, all presented in a way that’s understandable for a non-legal audience. The marketing section of this SharePoint site alone drives approximately 8,000 page views per month, resulting in significant cost savings.

When Microsoft released SharePoint agents, it created an opportunity to do even more. Now, the marketing legal team’s newly developed SharePoint agent sits on top of its robust SharePoint site, adding the power of AI to answer legal questions and further unlocking the value of the existing resources in an elegant and streamlined way.

SharePoint agents are natural language AI assistants tailored to specific tasks and subject matter, providing trusted and precise answers and insights to support informed decision-making. Each SharePoint site includes an agent based on the site’s content. Or, with a single click users can create and share a custom agent that accesses only the information they select. 

“At Microsoft, AI is transforming how our legal teams operate, creating new opportunities to enhance workflow efficiency,” says Hossein Nowbar, chief legal officer and corporate vice president for Microsoft. “We’ve used SharePoint agents to improve the discoverability and delivery of legal resources, scale our legal advice, and gain critical insights into content usage. This saves considerable time for teams that need advice and those that provide it, all the while driving greater legal compliance and consistency.”

Watch this demo of the SharePoint agent we built to supply the legal team’s internal clients with answers faster and more efficiently.  
CJ Tan and her team build easily customizable agents that enable the legal team and others at Microsoft to do routine work much faster and more efficiently.

To determine whether using the SharePoint agent shown in the demo was better than using search and navigation alone, the legal team ran a test consisting of six legal questions for which five participants were asked to find answers. For each question, the participants were timed using search and navigation alone, and then using the new SharePoint agent.

In timing each participant, we stopped the clock either when they were satisfied that they had found the correct answer, or at five minutes if they did not find the correct answer. In the first test, using search and navigation, participants only found the answer 83.3% of the time, leaving 16.7% of the questions unanswered. Using the SharePoint agent, participants found the correct answer 100% of the time.

Not only were participants more successful at finding correct answers, but they also found them much more quickly using the SharePoint agent. Participants found and confirmed the answer in under one minute 46.7% of the time, and in under two minutes 100% of the time. On average, participants found the correct answers 2.97 times faster using the SharePoint agent than with site search and navigation.
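The speedup figure reported here is simply the ratio of mean completion times between the two methods. A quick sketch of the calculation, using invented per-question timings rather than the study’s raw data:

```python
def average_speedup(baseline_secs: list[float], improved_secs: list[float]) -> float:
    """How many times faster the improved method is, as a ratio of mean times."""
    baseline_mean = sum(baseline_secs) / len(baseline_secs)
    improved_mean = sum(improved_secs) / len(improved_secs)
    return baseline_mean / improved_mean

# Hypothetical timings (seconds) for six questions; not the study's actual data.
search_and_nav = [300, 240, 180, 300, 150, 210]
sharepoint_agent = [70, 95, 60, 110, 50, 80]
print(round(average_speedup(search_and_nav, sharepoint_agent), 2))  # 2.97
```

Note that capping unanswered attempts at five minutes, as the study did, makes such a ratio conservative: the real gap is understated whenever a participant would have searched even longer.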

We know from experience and feedback that when people can find answers to their legal questions quickly and easily using self-service resources, the legal department can focus on more complex issues. A SharePoint agent is an essential tool for any organization seeking to harness the power of AI to make answers readily available, reduce the need for live support, and bring their existing content to life.

“The Microsoft Legal team was an ideal early adopter of SharePoint agents due to their well-curated content,” says CJ Tan, principal group product manager for SharePoint agents. “They recognized the value of an agent in scaling support and handling easily addressable questions, allowing the team to focus on more complex, unique business scenarios. Instead of learning how to build an agent, they could concentrate on helping marketers surface and use the right content for their business needs. As subject matter experts, they were also well-positioned to validate and test their agent before publishing it on their SharePoint site.”

Watch to see our legal team walk you through how you can create your own SharePoint agent.
Jared Spataro empowers employees to swiftly access a vast knowledge base by integrating agents into SharePoint sites.

As we build out our array of Microsoft 365 agents, we continue to look to our internal experiences to guide the product’s evolution for our customers. We are exploring new ways for SharePoint agents to be shared and made extensible across a variety of content sources. Lastly, we know that governance controls and analytics are critical as organizations introduce new features into their workflows, and we are excited about the roadmap of additional insights available now and coming soon from Copilot Analytics, SharePoint Advanced Management, and Microsoft Purview.

“Organizations rely on SharePoint, creating more than two million sites and uploading more than two billion files daily,” says Jared Spataro, chief marketing officer of AI at Work @ Microsoft. “By giving every SharePoint site an agent, employees can quickly tap into this massive knowledge base with a single click.”

As with any new product and technology innovation, we’re focused on education and customer learnings. At the Microsoft 365 Community Conference, we will host a variety of sessions on SharePoint agents, going deeper into business use cases and best practices for creation and usage.

Connect with author Brent Sanders on LinkedIn.

Key Takeaways

Here are some of our top tips for getting started with SharePoint agents at your company:

  • Prepare your content: Ensure your SharePoint content is highly curated, accurate, complete, and unique. This helps agents provide more accurate and relevant responses. Organize content into smaller, manageable sets to improve response accuracy (e.g., using smaller document libraries with fewer files and minimal graphics).
  • Maintain your content: Updates made to content sources are reflected in the SharePoint agent responses, so make sure that content sources are maintained. Also, be sure to regularly check that file permissions are accurate, based on the agent audience.
  • Use ready-made agents: Each SharePoint site comes with a ready-made agent scoped to the content of the site. SharePoint admins can approve this agent to help jump-start usage. Use our communication kit to help announce SharePoint agent availability and increase awareness.
  • Identify where custom SharePoint agents can add value: SharePoint agents can be grounded in specific sites, folders, or files. Collaborate with business stakeholders to identify business objectives and priorities to create specialized expert and informational agents.
  • Target no more than 20 content sources: If you are selecting a site or folder, you can have any number of files underneath. However, when selecting items individually, we recommend capping it at 20 sites, folders, or files for best results.
  • Encourage users to provide feedback: Your employees can use “thumbs up or thumbs down” to give feedback on the SharePoint agent’s response. This feedback can be used to continuously improve content and enhance response accuracy over time.
  • Measure the impact: We have a variety of analytics resources to help measure adoption and usage of SharePoint agents, including the SharePoint document library, SharePoint Advanced Management, Microsoft Purview, and additional reports coming to Copilot Analytics.
Try it out

For organizations with at least 50 Microsoft 365 Copilot licenses, any employee in the organization will be able to create, share, and interact with SharePoint agents. Learn more about SharePoint agents.

The post Boosting efficiency with SharePoint agents: How our Microsoft legal team is helping clients find answers faster appeared first on Inside Track Blog.

]]>
18540
Improving security by protecting elevated-privilege accounts at Microsoft http://approjects.co.za/?big=insidetrack/blog/improving-security-by-protecting-elevated-privilege-accounts-at-microsoft/ Tue, 25 Feb 2025 17:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9774 This story was first published in 2019. We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. An ever-evolving digital landscape is forcing organizations to […]

The post Improving security by protecting elevated-privilege accounts at Microsoft appeared first on Inside Track Blog.

]]>
Microsoft Digital technical stories

This story was first published in 2019. We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

An ever-evolving digital landscape is forcing organizations to adapt and expand to stay ahead of innovative and complex security risks. Increasingly sophisticated and targeted threats, including phishing campaigns and malware attacks, attempt to harvest credentials or exploit hardware vulnerabilities that allow movement to other parts of the network, where they can do more damage or gain access to unprotected information.

Like many organizations, Microsoft Digital—our company’s IT organization—used to employ a traditional IT approach to securing the enterprise. We now know that effective security calls for a defense-in-depth approach that requires us to look at the whole environment—and everyone that accesses it—to implement policies and standards that better address risks.

To dramatically limit our attack surface and protect our assets, we developed and implemented our own defense-in-depth approach. This includes new company standards, telemetry, monitoring, tools, and processes to protect administrators and other elevated-privilege accounts.

In an environment where there are too many administrators, or elevated-privilege accounts, there is an increased risk of compromise. When elevated access is persistent or elevated-privilege accounts use the same credentials to access multiple resources, a compromised account can become a major breach.

This story highlights the steps we are taking at Microsoft to protect our environment and administrators, including new programs, tools, and considerations, and the challenges we faced. We will provide some details about the new “Protect the Administrators” program that is positively impacting the Microsoft ecosystem. This program takes security to the next level across the entire enterprise, ultimately changing our digital-landscape security approach.

Understanding defense-in-depth protection

Information protection depicted as a stool with three legs that represent device health, identity management, and data and telemetry.
The three-legged-stool approach to information protection.

Securing all environments within your organization is a great first step in protecting your company. But there’s no silver-bullet solution that will magically counter all threats. At Microsoft, information protection rests on a defense-in-depth approach built on device health, identity management, and data and telemetry—a concept illustrated by the three-legged security stool in the graphic above. Getting security right is a balancing act. For a security solution to be effective, it must address all three aspects of risk mitigation on a base of risk management and assurance—or the stool topples over and information protection is at risk.

Risk-based approach

Though we would like to be able to fix everything at once, that simply isn’t feasible. We created a risk-based approach to help us prioritize every major initiative. We used a holistic strategy that evaluated all environments, administrative roles, and access points to help us define our most critical roles and resources within the Microsoft ecosystem. Once defined, we could identify the key initiatives that would help protect the areas that represent the highest levels of risk.

As illustrated in the graphic below, the access-level roles that pose a higher risk should have fewer accounts—helping reduce the impact to the organization and control entry.

The next sections focus primarily on protecting elevated user accounts and the “Protect the Administrators” program. We’ll also discuss key security initiatives that are relevant to other engineering organizations across Microsoft.

Implementing the Protect the Administrators program

Illustration of the risk-role pyramid we use to help prioritize security initiatives.
The risk-role pyramid.

After doing a deeper analysis of our environments, roles, and access points, we developed a multifaceted approach to protecting our administrators and other elevated-privilege accounts. Key solutions include:

  • Working to ensure that our standards and processes are current, and that the enterprise is compliant with them.
  • Creating a targeted reduction campaign to scale down the number of individuals with elevated-privilege accounts.
  • Auditing elevated-privilege accounts and role management to help ensure that only employees who need elevated access retain elevated-access privileges.
  • Creating a High Value Asset (HVA)—an isolated, high-risk environment—to host a secure infrastructure and help reduce the attack surface.
  • Providing secure devices to administrators. Secure admin workstations (SAWs) provide a “secure keyboard” in a locked-down environment that helps curb credential-theft and credential-reuse scenarios.
  • Reporting metrics and data that help us share our story with corporate leadership as well as getting buy-in from administrators and other users who have elevated-privilege accounts across the company.

Defining your corporate landscape

In the past, equipment was primarily on-premises, and it was easier to keep development, test, and production environments separate, secure, and well-isolated without a lot of crossover. Users often had access to more than one of these environments but used a persistent identity—a unique combination of username and password—to log into all three. After all, it’s easier to remember login information for a persistent identity than it is to create separate identities for each environment. But because we had strict network boundaries, this persistent identity wasn’t a source of concern.

Today, that’s not the case. The advent of the cloud has dissolved the classic network edge. The use of on-premises datacenters, cloud datacenters, and hybrid solutions are common in nearly every company. Using one persistent identity across all environments can increase the attack surface exposed to adversaries. If compromised, it can yield access to all company environments. That’s what makes identity today’s true new perimeter.

At Microsoft, we reviewed our ecosystem to analyze whether we could keep production and non-production environments separate. We used our Red Team/penetration (PEN) testers to help us validate our holistic approach to security, and they provided great guidance on how to further establish a secure ecosystem.

The graphic below illustrates the Microsoft ecosystem, past and present. We have three major types of environments in our ecosystem today: our Microsoft and Microsoft 365 tenants, Microsoft Azure subscriptions, and on-premises datacenters. We now treat them all like a production environment with no division between production and non-production (development and test) environments.

Microsoft ecosystem then and now. Three environment types now: Microsoft and Microsoft 365 tenants, Azure subscriptions, and on-premises datacenters.
Now, everything is considered a “production” environment. We treat our three major environments in the Microsoft ecosystem like production.

Refining roles to reduce attack surfaces

Prior to embarking on the “Protect the Administrators” program, we felt it was necessary to evaluate every role with elevated privileges to determine their level of access and capability within our landscape. Part of the process was to identify tooling that would also protect company security (identity, security, device, and non-persistent access).

Our goal was to provide administrators the means to perform their necessary duties in support of the technical operations of Microsoft with the necessary security tooling, processes, and access capabilities—but with the lowest level of access possible.

The top security threats that every organization faces stem from too many employees having too much persistent access. Every organization’s goal should be to dramatically limit their attack surface and reduce the amount of “traversing” (lateral movement across resources) a breach will allow, should a credential be compromised. This is done by limiting elevated-privilege accounts to employees whose roles require access and by ensuring that the access granted is commensurate with each role. This is known as “least-privileged access.” The first step in reaching this goal is understanding and redefining the roles in your company that require elevated privileges.

Defining roles

We started with basic definitions. An information-worker account does not allow elevated privileges, is connected to the corporate network, and has access to productivity tools that let the user do things like log into SharePoint, use applications like Microsoft Excel and Word, read and send email, and browse the web.

We defined an administrator as a person who is responsible for the development, build, configuration, maintenance, support, and reliable operations of applications, networks, systems, and/or environments (cloud or on-premises datacenters). In general terms, an administrator account is one of the elevated-privilege accounts that has more access than an information worker’s account.

Using role-based controls to establish elevated-privilege roles

We used a role-based access control (RBAC) model to establish which specific elevated-privilege roles were needed to perform the duties required within each line-of-business application in support of Microsoft operations. From there, we deduced a minimum number of accounts needed for each RBAC role and started the process of eliminating the excess accounts. Using the RBAC model, we went back and identified a variety of roles requiring elevated privileges in each environment.

For the Microsoft Azure environments, we used RBAC, built on Microsoft Azure Resource Manager, to manage who has access to Azure resources and to define what they can do with those resources and what areas they have access to. Using RBAC, you can segregate duties within your team and grant to users only the amount of access that they need to perform their jobs. Instead of giving everybody unrestricted permissions in our Azure subscription or resources, we allow only certain actions at a particular scope.
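As a rough illustration of the model (not our actual role or scope definitions), an RBAC check reduces to: does any assignment bind this user to a role whose actions include the requested action, at a scope that contains the target resource? Here's a minimal sketch in Python, with hypothetical role names, actions, and scopes:

```python
# Minimal RBAC sketch. Role names, actions, and scope paths below are
# illustrative assumptions, not Microsoft's or Azure's real definitions.
ROLE_DEFINITIONS = {
    "VMOperator": {"vm.start", "vm.stop"},
    "StorageReader": {"storage.read"},
}

# Assignments bind a user to a role at a particular scope.
ASSIGNMENTS = [
    {"user": "alice", "role": "VMOperator", "scope": "/subscriptions/dev-01"},
]

def is_authorized(user: str, action: str, scope: str) -> bool:
    """Allow an action only if some assignment grants it at a matching scope."""
    for assignment in ASSIGNMENTS:
        if assignment["user"] != user:
            continue
        if not scope.startswith(assignment["scope"]):
            continue  # an assignment applies only within its own scope
        if action in ROLE_DEFINITIONS[assignment["role"]]:
            return True
    return False  # default deny: no matching role, no access
```

The important design choice is the default-deny last line: a user with no matching assignment gets nothing, rather than everything.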

Performing role attestation

We explored role attestation for administrators who moved laterally within the company to make sure their elevated privileges didn’t move with them into the new roles. Limited checks and balances were in place to ensure that the right privileges were applied or removed when someone’s role changed. We fixed this immediately through a quarterly attestation process that required the individual, the manager, and the role owner to approve continued access to the role.
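The quarterly attestation rule can be sketched as a simple predicate; the approver set and the 90-day window below are illustrative assumptions, not our exact policy:

```python
from datetime import date, timedelta

ATTESTATION_PERIOD = timedelta(days=90)  # quarterly re-approval (assumed window)
REQUIRED_APPROVERS = {"individual", "manager", "role_owner"}

def access_still_valid(last_attested: date, approvals: set, today: date) -> bool:
    """Access to an elevated role persists only while the most recent
    attestation is inside the quarterly window AND all three parties
    (the individual, the manager, and the role owner) have approved it."""
    if today - last_attested > ATTESTATION_PERIOD:
        return False  # attestation lapsed: access is removed
    return REQUIRED_APPROVERS.issubset(approvals)
```

Either a lapsed window or a missing approver removes access, so a lateral move that isn't re-approved by the new role owner drops the old privileges automatically.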

Implementing least-privileged access

We identified those roles that absolutely required elevated access, but not all elevated-privilege accounts are created equal. Limiting the attack surface visible to potential aggressors depends not only on reducing the number of elevated-privilege accounts. It also relies on only providing elevated-privilege accounts with the least-privileged access needed to get their respective jobs done.

For example, consider the idea of crown jewels kept in the royal family’s castle. There are many roles within the operations of the castle, such as the king, the queen, the cook, the cleaning staff, and the royal guard. Not everyone can or should have access everywhere. The king and queen hold the only keys to the crown jewels. The cook needs access only to the kitchen, the larder, and the dining room. The cleaning staff needs limited access everywhere, but only to clean, and the royal guard needs access to areas where the king and queen are. No one other than the king and queen, however, needs access to the crown jewels. This system of restricted access provides two benefits:

  • Only those who absolutely require access to a castle area have keys, and only to perform their assigned jobs, nothing more. If the cook tries to access the crown jewels, security alarms notify the royal guard, along with the king and queen.
  • Only two people, the king and queen, have access to the crown jewels. Should anything happen to the crown jewels, a targeted evaluation of those two people takes place and doesn’t require involvement of the cook, the cleaning staff, or the royal guard because they don’t have access.

This is the concept of least-privileged access: We only allow you access to a specific role to perform a specific activity within a specific amount of time from a secure device while logged in from a secure identity.

Creating a secure high-risk environment

We can’t truly secure our devices without having a highly secure datacenter to build and house our infrastructure. We used HVA to implement a multitiered and highly secure high-risk environment (HRE) for isolated hosting. We treated our HRE as a private cloud that lives inside a secure datacenter and is isolated from dependencies on external systems, teams, and services. Our secure tools and services are built within the HRE.

Traditional corporate networks were typically walled only at the external perimeters. Once an attacker gained access, it was easier for a breach to move across systems and environments. Production servers often reside on the same segments or on the same levels of access as clients, so you inherently gain access to servers and systems. If you start building some of your systems but you’re still dependent on older tools and services that run in your production environment, it’s hard to break those dependencies. Each one increases your risk of compromise.

It’s important to remember that security awareness requires ongoing hygiene. New tools, resources, portals, and functionality are constantly coming online or being updated. For example, certain web browsers sometimes release updates weekly. We must continually review and approve the new releases, and then repackage and deploy the replacement to approved locations. Many companies don’t have a thorough application-review process, which increases their attack surface due to poor hygiene (for example, multiple versions, third-party and malware-infested application challenges, unrestricted URL access, and lack of awareness).

The initial challenge we faced was discovering all the applications and tools that administrators were using so we could review, certify, package, and sign them as approved applications for use in the HRE and on SAWs. We also needed to implement a thorough application-review process, specific to the applications in the HRE.

Our HRE was built as a trust-nothing environment. It’s isolated from other less-secure systems within the company and can only be accessed from a SAW—making it harder for adversaries to move laterally through the network looking for the weakest link. We use a combination of automation, identity isolation, and traditional firewall isolation techniques to maintain boundaries between servers, services, and the customers who use them. Admin identities are distinct from standard corporate identities and subject to more restrictive credential- and lifecycle-management practices. Admin access is scoped according to the principle of least privilege, with separate admin identities for each service. This isolation limits the scope that any one account could compromise. Additionally, every setting and configuration in the HRE must be explicitly reviewed and defined. The HRE provides a highly secure foundation that allows us to build protected solutions, services, and systems for our administrators.

Secure devices

Secure admin workstations (SAWs) are limited-use client machines that substantially reduce the risk of compromise. They are an important part of our layered, defense-in-depth approach to security. A SAW doesn’t grant rights to any actual resources—it provides a “secure keyboard” in which an administrator can connect to a secure server, which itself connects to the HRE.

A SAW is an administrative-and-productivity-device-in-one, designed and built by Microsoft for one of our most critical resources—our administrators. Each administrator has a single device, a SAW, where they have a hosted virtual machine (VM) to perform their administrative duties and a corporate VM for productivity work like email, Microsoft 365 products, and web browsing.

Administrators must keep their secure devices with them when working and are responsible for them at all times. This requirement mandated that the secure device be portable. As a result, we developed a laptop that’s a securely controlled and provisioned workstation. It’s designed for managing valuable production systems and performing daily activities like email, document editing, and development work. The administrative partition in the SAW curbs credential-theft and credential-reuse scenarios by locking down the environment. The productivity partition is a VM with access like any other corporate device.

The SAW host is a restricted environment:

  • It allows only signed or approved applications to run.
  • The user doesn’t have local administrative privileges on the device.
  • By design, the user can browse only a restricted set of web destinations.
  • All automatic updates from external parties and third-party add-ons or plug-ins are disabled.
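Conceptually, the first and third controls are allowlist checks. A toy sketch follows, with hypothetical application hashes and URLs; a real SAW enforces this through code signing and managed device policy, not application code:

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hypothetical allowlists for illustration only.
APPROVED_APP_HASHES = {sha256(b"approved-admin-tool-v1.0")}
APPROVED_URL_PREFIXES = ("https://admin.contoso.example/",)

def may_run(app_bytes: bytes) -> bool:
    """Only applications on the reviewed/approved list may execute."""
    return sha256(app_bytes) in APPROVED_APP_HASHES

def may_browse(url: str) -> bool:
    """Browsing is limited by design to a restricted set of destinations."""
    return url.startswith(APPROVED_URL_PREFIXES)
```

Anything not explicitly on a list, including a new version of an already-approved application, is denied until it passes review.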

Again, the SAW controls are only as good as the environment that holds them, which means that the SAW isn’t possible without the HRE. Maintaining adherence to SAW and HRE controls requires an ongoing operational investment, similar to any Infrastructure as a Service (IaaS). Our engineers code-review and code-sign all applications, scripts, tools, and any other software that operates or runs on top of the SAW. The administrator user has no ability to download new scripts, coding modules, or software outside of a formal software distribution system. Anything added to the SAW gets reviewed before it’s allowed on the device.

As we onboard an internal team onto SAW, we work with them to ensure that their services and endpoints are accessible using a SAW device. We also help them integrate their processes with SAW services.

Provisioning the administrator

Once a team has adopted the new company standard of requiring administrators to use a SAW, we deploy the Microsoft Azure-based Conditional Access (CA) policy. As part of CA policy enforcement, administrators can’t use their elevated privileges without a SAW. Between the time that an administrator places an order and receives the new SAW, we provide temporary access to a SAW device so they can still get their work done.

We ensure security at every step within our supply chain. That includes using a dedicated manufacturing line exclusive to SAWs, ensuring chain of custody from manufacturing to end-user validation. Since SAWs are built and configured for the specific user rather than pulling from existing inventory, the process is much different from how we provision standard corporate devices. The additional security controls in the SAW supply chain add complexity and can make scaling a challenge from the global-procurement perspective.

Supporting the administrator

SAWs come with dedicated, security-aware support services from our Secure Admin Services (SAS) team. The SAS team is responsible for the HRE and the critical SAW devices—providing around-the-clock role-service support to administrators.

The SAS team owns and supports a service portal that facilitates SAW ordering and fulfillment, role management for approved users, application and URL hosting, SAW assignment, and SAW reassignment. They’re also available in a development operations (DevOps) model to assist the teams that are adopting SAWs.

As different organizations within Microsoft choose to adopt SAWs, the SAS team works to ensure they understand what they are signing up for. The team provides an overview of their support and service structure and the HRE/SAW solution architecture, as illustrated in the graphic below.

A high-level overview of the HRE/SAW solution architecture, including SAS team and DevOps support services.
An overview of an isolated HRE, a SAW, and the services that help support administrators.

Today, the SAS team provides support service to more than 40,000 administrators across the company. We have more work to do as we enforce SAW usage across all teams in the company and stretch into different roles and responsibilities.

Password vaulting

The password-vaulting service allows passwords to be securely encrypted and stored for future retrieval. This eliminates the need for administrators to remember passwords, which has often resulted in passwords being written down, shared, and compromised.

SAS Password Vaulting is composed of two internal, custom services currently offered through our SAS team:

  • A custom solution to manage domain-based service accounts and shared password lists.
  • A local administrator password solution (LAPS) to manage server-local administrator and integrated Lights-Out (iLO) device accounts.

Password management is further enhanced by the service’s capability to automatically generate and roll complex random passwords. This ensures that privileged accounts have high-strength passwords that are changed regularly and reduces the risk of credential theft.
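The auto-generation idea can be sketched with Python's standard `secrets` module; the length and character-class policy here are illustrative assumptions, not our vault's actual policy:

```python
import secrets
import string

def generate_vault_password(length: int = 32) -> str:
    """Generate a high-strength random password for vault storage,
    guaranteeing at least one character from each character class.
    (Length and class policy are assumptions for this sketch.)"""
    classes = [string.ascii_lowercase, string.ascii_uppercase,
               string.digits, string.punctuation]
    alphabet = "".join(classes)
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Resample until every class is represented.
        if all(any(ch in cls for ch in candidate) for cls in classes):
            return candidate
```

Because the password is generated, vaulted, and rolled by the service, no human ever needs to see or remember it.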

Administrative policies

We’ve put administrative policies in place for privileged-account management. They’re designed to protect the enterprise from risks associated with elevated administrative rights. Microsoft Digital reduces attack vectors with an assortment of security services, including SAS and Identity and Access Management, that enhance the security posture of the business. Especially important is the implementation of usage metrics for threat and vulnerability management. When a threat or vulnerability is detected, we work with our Cyber Defense Operations Center (CDOC) team. Using a variety of monitoring systems through data and telemetry measures, we ensure that compliance and enforcement teams are notified immediately. Their engagement is key to keeping the ecosystem secure.

Just-in-time entitlement system

Least-privileged access paired with a just-in-time (JIT) entitlement system provides the least amount of access to administrators for the shortest period of time. A JIT entitlement system allows users to elevate their entitlements for limited periods of time to complete elevated-privilege and administrative duties. The elevated privileges normally last between four and eight hours.

JIT allows removal of users’ persistent administrative access (via Active Directory Security Groups) and replaces those entitlements with the ability to elevate into roles on-demand and just-in-time. We used proper RBAC approaches with an emphasis on providing access only to what is absolutely required. We also implemented access controls to remove excess access (for example, Global Administrator or Domain Administrator privileges). An example of how JIT is part of our overarching defense-in-depth strategy is a scenario in which an administrator’s smartcard and PIN are stolen. Even with the physical card and the PIN, an attacker would have to successfully navigate a JIT workflow process before the account would have any access rights.
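The time-boxing idea can be sketched as follows, with hypothetical names; the real JIT system is a full approval-workflow service backed by directory security groups, not an in-memory table:

```python
from datetime import datetime, timedelta

MIN_HOURS, MAX_HOURS = 4, 8   # elevations normally last four to eight hours
_active_grants = {}           # (user, role) -> expiry; nothing persistent

def request_elevation(user, role, hours, now, workflow_approved):
    """Grant a time-boxed elevation only after the JIT workflow approves it."""
    if not workflow_approved:
        raise PermissionError("JIT workflow approval is required")
    hours = max(MIN_HOURS, min(MAX_HOURS, hours))  # clamp to policy window
    _active_grants[(user, role)] = now + timedelta(hours=hours)

def is_elevated(user, role, now):
    """A user is elevated only while an unexpired grant exists."""
    expiry = _active_grants.get((user, role))
    return expiry is not None and now <= expiry
```

Expiry is the default state: once the grant lapses, the account falls back to having no elevated access at all, which is what blunts the stolen smartcard-and-PIN scenario above.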

Key Takeaways

In the three years this project has been going on, we have learned that an ongoing commitment and investment are critical to providing defense-in-depth protection in an ever-evolving work environment. We have learned a few things that could help other companies as they decide to better protect their administrators and, thus, their company assets:

  • Securing all environments. We needed to evolve the way we looked at our environments. Through evolving company strategy and our Red Team/PEN testing, it has been proven numerous times that successful system attacks take advantage of weak controls or bad hygiene in a development environment to access and cause havoc in production.
  • Influencing, rather than forcing, cultural change. Microsoft employees have historically had the flexibility and freedom to do amazing things with the products and technology they had on hand. Efforts to impose any structure, rigor, or limitation on that freedom can be challenging. Taking people’s flexibility away from them, even in the name of security, can generate friction. Inherently, employees want to do the right thing when it comes to security and will adopt new and better processes and tools as long as they understand the need for them. Full support of the leadership team is critical in persuading users to change how they think about security. It was important that we developed compelling narratives for areas of change, and had the data and metrics to reinforce our messaging.
  • Scaling SAW procurement. We secure every aspect of the end-to-end supply chain for SAWs. This level of diligence does result in more oversight and overhead. While there might be some traction around the concept of providing SAWs to all employees who have elevated-access roles, it would still be very challenging for us to scale to that level of demand. From a global perspective, it is also challenging to ensure the required chain of custody to get SAWs into the hands of administrators in more remote countries and regions. To help us overcome the challenges of scale, we used a phased approach to roll out the Admin SAW policy and provision SAWs.
  • Providing a performant SAW experience for the global workforce. We aim to provide a performant experience for all users, regardless of their location. We have users around the world, in most major countries and regions. Supporting our global workforce has required us to think through and deal with some interesting issues regarding the geo-distribution of services and resources. For instance, locations like China and some places in Europe are challenging because of connectivity requirements and performance limitations. Enforcing SAW in a global company has meant dealing with these issues so that an administrator, no matter where they are located, can effectively complete necessary work.

What’s next

As we stated before, there are no silver-bullet solutions when it comes to security. As part of our defense-in-depth approach to an ever-evolving threat landscape, there will always be new initiatives to drive.

Recently, we started exploring how to separate our administrators from our developers and using a different security approach for the developer roles. In general, developers require more flexibility than administrators.

There also continue to be many other security initiatives around device health, identity and access management, data loss protection, and corporate networking. We’re also working on the continued maturity of our compliance and governance policies and procedures.

Getting started

While it has taken us years to develop, implement, and refine our multitiered, defense-in-depth approach to security, there are some solutions that you can adopt now as you begin your journey toward improving the state of your organization’s security:

  • Design and enforce hygiene. Ensure that you have the governance in place to drive compliance. This includes controls, standards, and policies for the environment, applications, identity and access management, and elevated access. It’s also critical that standards and policies are continually refined to reflect changes in environments and security threats. Implement governance and compliance to enforce least-privileged access. Monitor resources and applications for ongoing compliance and ensure that your standards remain current as roles evolve.
  • Implement least-privileged access. Least-privileged access means using proper role-based access control (RBAC) with an emphasis on granting only the access that is absolutely required. Add the necessary access controls to remove the need for Global Administrator or Domain Administrator access. Provide everyone with only the access that they truly need. Build your applications, environments, and tools to use RBAC roles, and clearly define what each role can and can’t do.
  • Remove all persistent access. All elevated access should require just-in-time (JIT) elevation, which adds an extra step of obtaining temporary secure access before performing elevated-privilege work. Expiring elevated access when it’s no longer necessary narrows your exposed attack surface.
  • Provide isolated elevated-privilege credentials. Using an isolated identity substantially reduces the possibility of compromise after a successful phishing attack. Admin accounts without an inbox have no email to phish. Keeping the information-worker credential separate from the elevated-privilege credential reduces the attack surface.
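The JIT-elevation idea in the list above can be sketched as a time-bound grant that simply ceases to exist after its window closes. This is a minimal illustration under assumed names; the `JitAccessGrant` class, role, and account are hypothetical, not a Microsoft API:

```python
import datetime as dt

class JitAccessGrant:
    """Time-bound elevation: access exists only for an approved window."""

    def __init__(self, account, role, minutes):
        self.account = account
        self.role = role
        self.expires_at = dt.datetime.now(dt.timezone.utc) + dt.timedelta(minutes=minutes)

    def is_active(self, now=None):
        # No persistent state to revoke: once the window passes, the grant is dead.
        now = now or dt.datetime.now(dt.timezone.utc)
        return now < self.expires_at

# An administrator requests elevation just in time; nothing is standing.
grant = JitAccessGrant("admin-alt@contoso.example", "StorageOperator", minutes=60)
print(grant.is_active())  # True during the approved window
print(grant.is_active(grant.expires_at + dt.timedelta(seconds=1)))  # False after expiry
```

The design point is that expiry is the default, so forgetting to revoke access is no longer a failure mode.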

Microsoft Services can help

Customers interested in adopting a defense-in-depth approach to increase their security posture might want to consider implementing Privileged Access Workstations (PAWs). PAWs are a key element of the Enhanced Security Administrative Environment (ESAE) reference architecture deployed by the cybersecurity professional services teams at Microsoft to protect customers against cybersecurity attacks.

For more information about engaging Microsoft Services to deploy PAWs or ESAE for your environment, contact your Microsoft representative or visit the Microsoft Security page.

Reaping the rewards

Over the last two years, we’ve had an outside security audit expert perform a Cyber Essentials Plus certification process. In 2017, the security audit engineers couldn’t run most of their baseline tests because the SAW’s locked-down baseline configuration blocked them. They called it the “most secure administrative-client audit we’ve ever completed.”

In 2018, the security audit engineer said, “I had no chance; you have done everything right,” and added, “You are so far beyond what any other company in the industry is doing.”

Also, in 2018, our SAW project won a CSO50 Award, which recognizes security projects and initiatives that demonstrate outstanding business value and thought leadership. SAW was commended as an innovative practice and a core element of the network security strategy at Microsoft.

Ultimately, the certifications and awards help validate our defense-in-depth approach. We are building and deploying the correct solutions to support our ongoing commitment to securing Microsoft and our customers’ and partners’ information. It’s a pleasure to see that solution recognized as a leader in the industry.

The post Improving security by protecting elevated-privilege accounts at Microsoft appeared first on Inside Track Blog.

]]>
9774
Boosting our Secure Future Initiative at Microsoft with a transformed approach to wired network security http://approjects.co.za/?big=insidetrack/blog/boosting-our-secure-future-initiative-at-microsoft-with-a-transformed-approach-to-wired-network-security/ Thu, 30 Jan 2025 17:00:00 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=18066 If you asked Sean Adams, Justin Griffin, Sajith Balan, or Shyam Sunder Gogi to provide a one-word answer that describes their current focus, you’d get the same answer: “Security.” Adams, Griffin, Balan, and Gogi are all part of a team in Microsoft Digital, the company’s IT organization, that is implementing internet-first, policy-based security for every […]

The post Boosting our Secure Future Initiative at Microsoft with a transformed approach to wired network security appeared first on Inside Track Blog.

]]>
Microsoft Digital technical stories

If you asked Sean Adams, Justin Griffin, Sajith Balan, or Shyam Sunder Gogi to provide a one-word answer that describes their current focus, you’d get the same answer:

“Security.”

Adams, Griffin, Balan, and Gogi are all part of a team in Microsoft Digital, the company’s IT organization, that is implementing internet-first, policy-based security for every single wired network device here at Microsoft.

This immense effort spans our global network and ensures that every device connecting to our network—regardless of how or where—is identified, attested, authenticated, and placed on the proper network first.

“Our default network posture for any device that connects is internet-first,” Adams says. “The majority of tools Microsoft employees use are cloud-based and internet-friendly in our modern workplace, so it only makes sense. The concept of a corporate network where we inherently trust physically connected devices is long gone—and good riddance.”

It’s one example of how we’re demonstrating our organization-wide commitment to security.

In May 2024, Microsoft CEO Satya Nadella committed to the idea that we would prioritize security above all else here at Microsoft. At the center of that commitment is our Microsoft Secure Future Initiative (SFI), which brings together every part of the company to advance cybersecurity protection across new products and our legacy infrastructure.

The SFI provides Microsoft with an overarching set of principles and pillars that we’re building upon with everything we do, from the broadest reaches of our cloud networking infrastructure to each individual wired network port in our buildings and datacenters.

Secure Future Initiative commitment

The SFI is the single largest cybersecurity engineering project in our history, with more than 34,000 engineers committed to advancing its principles. Three principles define exactly how we’re prioritizing cybersecurity in our products and infrastructure:

  • Secure by design. Security comes first when designing any product or service.
  • Secure by default. Security protections are enabled and enforced by default, require no extra effort, and aren’t optional.
  • Secure operations. Security controls and monitoring will be continuously improved to meet current and future cyberthreats.

These principles anchor our approach to security internally at Microsoft. We’re continuously applying what we’ve learned from incidents to improve our methods and practices, ensuring that security is paramount in everything we do, create, and provide.

A photo of Balan.

“The SFI aligns seamlessly with Zero Trust principles. With Zero Trust, everything within the network is scrutinized and verified, which supports exactly how the SFI should impact our network.”

Sajith Balan, principal cloud network engineering manager, Microsoft Digital

Applying practical pillars

We apply these principles through our security pillars, which are to:

  • Protect identities and secrets. Reduce the risk of unauthorized access by implementing and enforcing best-in-class standards across all identity and secrets infrastructure, plus user and application authentication and authorization.
  • Protect tenants and isolate systems. Protect all our tenants and production environments using consistent, best-in-class security practices and strict isolation to minimize breadth of impact.
  • Protect engineering systems. Protect software assets and continuously improve code security through governance of the software supply chain and engineering systems infrastructure.
  • Monitor and detect cyberthreats. Provide comprehensive coverage and automatic detection of cyberthreats to our production infrastructure and services.
  • Accelerate response and remediation. Prevent exploitation of vulnerabilities discovered by external and internal entities through comprehensive and timely remediation.
  • Protect networks. Protect our production networks and implement network isolation of Microsoft and customer resources.

“The SFI aligns seamlessly with Zero Trust principles,” says Balan, a principal cloud network engineering manager in Microsoft Digital. “With Zero Trust, everything within the network is scrutinized and verified, which supports exactly how the SFI should impact our network. We started off with Zero Trust networking, which is now directly aligned with SFI. It’s about strengthening security while minimizing any employee disruption.”

Based on the principle of verified trust—to trust, you must first verify—Zero Trust eliminates the inherent trust that is assumed inside the traditional corporate network. Zero Trust architecture reduces risk across all environments by establishing strong identity verification, validating device compliance prior to granting access, and ensuring least privilege access to only explicitly authorized resources.

A photo of Griffin.

“Our wired network security puts physically connected devices in almost the exact same position as wireless devices. With Zero Trust, being physically connected means nothing, as far as security goes. Every device, every connection, every resource request is authenticated, authorized, and monitored, from end to end.”

Justin Griffin, principal group network engineering manager, Microsoft Digital

Zero Trust requires that every transaction between systems (user identity, device, network, and applications) be validated and proven trustworthy before the transaction can occur. In an ideal Zero Trust environment, the following behaviors are required:

  • Identities are validated and secure with multifactor authentication (MFA) everywhere. Using multifactor authentication eliminates password expirations and eventually will eliminate passwords.
  • Devices are managed and validated as healthy. Device health validation is required. All device types and operating systems must meet a required minimum health state as a condition of access to any Microsoft resource.
  • Telemetry is pervasive. Pervasive data and telemetry are used to understand the current security state, identify gaps in coverage, validate the impact of new controls, and correlate data across all applications and services in the environment.
  • Least privilege access is enforced. Limit access to only the applications, services, and infrastructure required to perform the job function.
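As a rough sketch, the required behaviors above amount to an access decision that every transaction must pass independently, with nothing trusted because of where the request came from. The names and checks here are illustrative, not Microsoft's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class AccessRequest:
    user_mfa_verified: bool   # identity validated with multifactor authentication
    device_healthy: bool      # device attested against the minimum health bar
    requested_resource: str

def decide(request, allowed_resources):
    """Every transaction must pass every check; network location grants nothing."""
    if not request.user_mfa_verified:
        return False
    if not request.device_healthy:
        return False
    # Least privilege: only explicitly authorized resources are reachable.
    return request.requested_resource in allowed_resources

print(decide(AccessRequest(True, True, "payroll-api"), {"payroll-api"}))   # True
print(decide(AccessRequest(True, False, "payroll-api"), {"payroll-api"}))  # False: unhealthy device
```

Note that a failure of any single check denies the request; there is no "trusted zone" that bypasses the function.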

Enforced security for wired networking

Our wired network connectivity policy is rooted in the SFI and Zero Trust. The security posture that this policy creates for every wired network device at Microsoft is critical to applying the principles of SFI and Zero Trust.

“Our wired network security puts physically connected devices in almost the exact same position as wireless devices,” says Griffin, a principal group network engineering manager in Microsoft Digital. “With Zero Trust, being physically connected means nothing, as far as security goes. Every device, every connection, every resource request is authenticated, authorized, and monitored, from end to end.”

Using the internet as the default network for devices is at the core of Microsoft’s wired network security. Unless the need is critical—and authorized—every device that connects to our network is routed to the internet, by default.

Griffin and the team have been working consistently for the past five years to implement comprehensive wired network security. The policy engines, networking hardware, and supporting technology for wired network security enforcement require time, effort, and—in many cases—physical presence to implement the solution properly.

The scope and impact are massive.

“This is probably the single largest network change our enterprise has ever seen,” Adams says.

With more than 700 buildings, 4,000 network switches, and almost 300,000 wired network devices, getting a device onto the appropriate network segment happens multiple times every second across our network.

The network segmentation strategy for wired network security is a critical component of the overall security framework. This strategy involves several key practices and principles to ensure robust security and efficient network management.

We use macro-segmentation to create distinct segments within the corporate network. This approach restricts access to only the necessary systems within each segment, thereby reducing the risk of unauthorized access and lateral movement within the network.

“Using a phased approach isn’t new at Microsoft. However, our success in rolling out wired port security depended directly on how we planned and structured our phases or, more accurately, our rings.”

Shyam Sunder Gogi, senior technical program manager, Microsoft Digital

Micro-segmentation is applied to further isolate network resources. Least-privilege access policies ensure that users and devices have only the minimum level of access required for their roles. This principle extends to both on-premises environments and cloud resources, including infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) resources.

Our layered defense approach includes using monitoring tools, access control lists (ACLs), network security groups, network address translation (NAT) gateways, and bastions to secure the network environment. These measures help to detect and prevent malicious activities, ensuring that the network remains secure even in the face of potential threats.
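One way to picture macro-segmentation backed by ACLs is a lookup from a device's network segment to the only systems that segment may reach, which is what blocks lateral movement. The prefixes and service names below are invented for illustration:

```python
import ipaddress

# Hypothetical macro-segments: each prefix maps to the systems reachable from it.
SEGMENT_ACLS = {
    ipaddress.ip_network("10.10.0.0/16"): {"print-services", "badge-readers"},
    ipaddress.ip_network("10.20.0.0/16"): {"build-agents"},
}

def reachable(src_ip, target):
    """A source may reach a target only if its segment's ACL explicitly allows it."""
    addr = ipaddress.ip_address(src_ip)
    for network, allowed in SEGMENT_ACLS.items():
        if addr in network:
            return target in allowed
    return False  # unknown sources reach nothing on the corporate side

print(reachable("10.10.4.7", "print-services"))  # True: allowed within its segment
print(reachable("10.10.4.7", "build-agents"))    # False: no lateral movement across segments
```

A compromised device in one segment can only see what its ACL enumerates, which is the "reducing lateral movement" property the strategy is after.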

Using iteration and consistency in implementation

Implementation of our wired network security used a phased approach, and it began more than five years ago.

During the COVID-19 pandemic, significant testing was conducted to ensure that the supporting network infrastructure could continue to function independently and support all devices and networks without interruption, even when connectivity issues arose. The team created robust policies that allowed for seamless re-authentication and re-attestation after connectivity was restored.

In those initial phases, confirming configuration and monitoring results were important, so the team started small and learned from their progress.

“Using a phased approach isn’t new at Microsoft,” says Gogi, a senior technical program manager in Microsoft Digital. “However, our success in rolling out wired port security depended directly on how we planned and structured our phases or, more accurately, our rings.”

The ring-based approach was designed to minimize disruptions and ensure that security measures were robust and reliable. Changes gradually rolled out in stages, starting with smaller, controlled environments before being expanded to the entire network. This approach allowed for continuous monitoring and adjustments, ensuring that any issues could be addressed promptly without affecting the entire network.
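The ring-based rollout described above can be sketched as a loop that expands to the next ring only when monitoring confirms the previous one is healthy. The switch names and the deploy/health callbacks here are hypothetical placeholders:

```python
def ring_rollout(rings, deploy, healthy):
    """Deploy ring by ring; halt before expanding if monitoring finds a problem,
    so an issue never reaches the whole network at once."""
    completed = []
    for ring in rings:
        for switch in ring:
            deploy(switch)
        if not all(healthy(switch) for switch in ring):
            return completed, ring  # stop here: fix this ring before going wider
        completed.append(ring)
    return completed, None

# Small, controlled environments first; the broad campus rings last.
rings = [
    ["lab-sw-1"],
    ["bldg7-sw-1", "bldg7-sw-2"],
    ["campus-sw-%d" % i for i in range(1, 9)],
]
done, halted_at = ring_rollout(rings, deploy=lambda s: None, healthy=lambda s: True)
print(len(done), halted_at)  # 3 None: all rings completed
```

Because each ring is a checkpoint, remediation happens in a small blast radius rather than across every site simultaneously.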

Adams highlights the importance of the iterative approach.

“At our scale, we had to be efficient and accurate,” he says. “Downtime was out of the question, and we certainly didn’t want gaps in service availability or applied security measures.”

Automation played a crucial role in the implementation process. Automated tools were developed to standardize configurations across all network switches, ensuring a consistent and predictable user experience. Standardizing through automation helped maintain adequate security measures while also making the deployment process more efficient.
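Standardizing through automation can be as simple as rendering every port on every switch from one canonical template, so behavior is identical at each site. The port-security stanza below is illustrative pseudo-syntax, not real switch commands:

```python
from string import Template

# One canonical template keeps every port configured identically.
# The stanza is made up for illustration; it is not vendor CLI syntax.
PORT_TEMPLATE = Template(
    "interface $port\n"
    "  authentication mode 802.1x\n"
    "  default-segment internet\n"
    "  description $building port $port\n"
)

def render_switch_config(building, port_count):
    """Render a full switch config from the single template."""
    return "".join(
        PORT_TEMPLATE.substitute(port="Gi1/0/%d" % n, building=building)
        for n in range(1, port_count + 1)
    )

config = render_switch_config("Building-7", port_count=48)
print(config.count("authentication mode 802.1x"))  # 48: every port gets the same policy
```

Drift becomes impossible by construction: a port's configuration can only be what the template emits.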

Automating user-initiated device onboarding

Our GetConnected portal is an essential tool for securely connecting devices to our corporate network. The GetConnected portal, hosted on our internal corporate intranet site, ensures that all devices meet necessary security standards to protect employees, customers, and data.

The portal provides a centralized location for all network access needs, allowing our employees to:

  • Register managed or personal devices to specific Microsoft networks
  • Delete devices from Microsoft networks
  • Move devices between different Microsoft networks
  • Manage changes to both managed and personal devices

When connecting a managed device to the wired network in one of our buildings, devices are placed into an internet-connected segment with enterprise-quality connection and bandwidth. To access tools or services on the corporate network, devices must be registered at the GetConnected portal.

“We’ve learned a lot. There are some best practices that can make implementation much more efficient and simplify the transition to a secure, internet-first posture.”

Sean Adams, principal network engineer, Microsoft Digital

The portal uses Microsoft Entra ID’s Conditional Access capabilities to enforce user access based on device groupings and user profiles, connecting the user-directed registration process to our cloud-based identity management systems.

Getting your wired network security right

Five years of network implementation comes with some lessons.

“We’ve learned a lot,” says Adams, a principal network engineer in Microsoft Digital. “There are some best practices that can make implementation much more efficient and simplify the transition to a secure, internet-first posture.”

These practices were not only instrumental in achieving the desired security outcomes but also in ensuring the seamless operation of the entire network infrastructure through implementation. Here are some key strategies and methodologies that proved to be critical in the successful deployment of wired network security at Microsoft:

Plan for global reach. Wired network security efforts must span across the entire infrastructure, encompassing data centers, offices, and remote locations. This ensures that all network segments, regardless of their geographical location, adhere to the same high standards of security.

Comprehensive asset management. Our teams have identified, inventoried, and attributed accountability for more than 99.3% of our physical assets. This foundational step is crucial for implementing effective network security measures.

“This is a critical component,” Balan says. “Device accountability is a first line of defense. Knowing who owns every device on the network ensures faster security response, targeted containment, and accountability. When incidents strike, attribution means quicker resolution and stronger protection.”

Service tagging and traffic identification. Service tagging for new IP address allocations helps to enable precise traffic identification across the network. This capability helps detect malicious activity and simplifies the management of ACLs for both infrastructure and services.
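Service tagging can be pictured as a mapping from allocated IP prefixes to the service that owns them, so any flow can be attributed at a glance. The allocations and service names below are made up for illustration:

```python
import ipaddress

# Hypothetical allocations: each new prefix carries a service tag at allocation time.
ALLOCATIONS = [
    (ipaddress.ip_network("10.50.0.0/20"), "portal-frontend"),
    (ipaddress.ip_network("10.50.16.0/20"), "device-attestation"),
]

def service_for(ip):
    """Attribute an address to its owning service, or flag it for investigation."""
    addr = ipaddress.ip_address(ip)
    for network, tag in ALLOCATIONS:
        if addr in network:
            return tag
    return "untagged"  # unattributed traffic is itself a useful signal

print(service_for("10.50.3.9"))   # portal-frontend
print(service_for("192.0.2.10"))  # untagged
```

With tags in place, an ACL can reference a service rather than a raw prefix list, which is what simplifies ACL management as allocations change.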

Harden network devices. We’ve put significant effort into hardening network devices and improving lifecycle management policies. This includes developing scalable and automated methods for secret rotation, making secrets unique per device, and implementing unique per-device authentication and one-time passwords for service accounts.
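Per-device, rotatable secrets might be sketched as follows. This is a generic illustration of the idea (issue a fresh random secret per device and store only its hash), not Microsoft's actual secret-management tooling:

```python
import hashlib
import secrets

def issue_device_secret(device_id):
    """Mint a unique secret for one device; the server keeps only a hash,
    so a leaked store never exposes usable credentials."""
    secret = secrets.token_urlsafe(32)
    stored_hash = hashlib.sha256(secret.encode()).hexdigest()
    return secret, stored_hash

# Rotation simply mints a fresh secret and retires the old hash.
s1, h1 = issue_device_secret("switch-0001")
s2, h2 = issue_device_secret("switch-0001")
print(s1 != s2, h1 != h2)  # True True: no reuse across rotations or devices
```

Because every device's secret is unique, compromising one credential never unlocks the fleet, which is the property the lifecycle policy is protecting.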

Use micro-segmentation and access controls. Implementing micro-segmentation ACLs further secures the management of the network. This approach limits access to a known scope of trusted, production-ready, locked-down machines, significantly reducing the impact of exposed secrets.

Embrace Zero Trust principles. Our entire network security strategy is aligned with Zero Trust principles, ensuring that every access request is thoroughly authenticated and authorized. This involves migrating resources to internet-facing environments and implementing strict access controls.

Scale efficiently with automation and standardization. Automation plays a critical role in maintaining a consistent and predictable user experience across all network switches. Standardizing configurations ensures that the network behaves uniformly at every site, facilitating efficient management and security.

“This is a significant advancement in our security posture and demonstrates our commitment to protecting our assets against unauthorized access. Our internet-first posture and alignment with Zero Trust principles ensure that we’ll continuously examine and iterate our network environment to improve our security posture and remain prepared for the future.”

Sajith Balan, principal cloud network engineering manager, Microsoft Digital

Looking forward

Our future efforts in wired network security will continue to evolve, focusing on supporting Zero Trust principles and the Secure Future Initiative (SFI), enhancing security, improving user experience, and ensuring the resilience of our network infrastructure as we go.

We’re continuously improving the employee experience, building on the success of the GetConnected portal. We want to maintain a balance between security and employee experience as we improve the security posture of our network, ensuring that security measures don’t hinder productivity.

The team is excited about the future of wired network security and the SFI at Microsoft.

“This is a significant advancement in our security posture and demonstrates our commitment to protecting our assets against unauthorized access,” Balan says. “Our internet-first posture and alignment with Zero Trust principles ensure that we’ll continuously examine and iterate our network environment to improve our security posture and remain prepared for the future.”

Key takeaways

Consider the following best practices when planning to implement wired network security at your own organization:

  • Plan for global reach by ensuring that your network security efforts span across all locations, including data centers, offices, and remote sites.
  • Conduct comprehensive asset management by identifying, inventorying, and attributing accountability for physical assets to implement effective security measures.
  • Use service tagging for new IP address allocations to enable precise traffic identification and simplify the management of ACLs.
  • Harden your network devices by developing scalable and automated methods for secret rotation, unique per-device authentication, and one-time passwords for service accounts.
  • Implement micro-segmentation and access controls to limit access to trusted, production-ready, locked-down machines, thereby reducing the impact of exposed secrets.
  • Embrace Zero Trust principles by thoroughly authenticating and authorizing every access request and migrating resources to internet-facing environments with strict access controls.
  • Scale efficiently with automation and standardization to maintain a consistent and predictable user experience across all network switches and ensure uniform behavior at every site.

The post Boosting our Secure Future Initiative at Microsoft with a transformed approach to wired network security appeared first on Inside Track Blog.

]]>
18066