Today we want to share the lessons we’re learning from deploying Zero Trust networking across Microsoft.
In many enterprises, network security has traditionally focused on strictly secured and monitored corporate network perimeters. Today, in a mobile-first and cloud-first world, business network traffic exists outside the corporate network as much as it does within. The rate and the sophistication level of security attacks are increasing. Organizations can no longer rely on the traditional model of simply protecting their remaining internal environments behind a firewall. Adopting a Zero Trust strategy can help to ensure optimal security without compromising end users’ experiences.
Our team in Microsoft Digital (MSD) is deploying Zero Trust networking across the enterprise to support the Zero Trust model that our internal security team is implementing across Microsoft.
The Zero Trust model centers on strong identity, least-privilege access, device health verification, and service level control and telemetry across the entire IT infrastructure. The network perimeter is no longer the primary method of defense for an enterprise.
At Microsoft’s scale, with more than 600 sites in 120 countries and regions, evolving our network strategy to embrace Zero Trust networking has required alignment across the entire organization.
Sharing leadership lessons
Throughout our journey toward Zero Trust networking, we’ve learned valuable lessons. We’ve experienced challenges in the various stages of implementation that forced us to reassess and adjust our tactics and methods. We hope that by sharing our experiences we can help other enterprises better prepare to adopt and implement a Zero Trust networking strategy and overcome similar obstacles.
To read more about the lessons that our engineers have learned from our Zero Trust networking deployment, visit Lessons learned in engineering Zero Trust networking.
Planning and design
Plan using a broad scope
The impact of implementing Zero Trust networking is significant because of its size and scope. At Microsoft, early and big-picture planning involved all relevant stakeholders, including network teams, security teams, user experience teams, team managers, infrastructure service providers, and compliance auditors. We started with a comprehensive plan and worked toward more specific plans and goals.
Establish goals
We established several primary goals that we used as targets for the implementation process. While each of these considerations involved a finite subset of goals and discrete features that informed the specifics of Zero Trust networking implementation, they also served as high-level signposts to provide the direction that best supported our business. Our primary goals included:
- Understand the environment and architecture. Zero Trust networking involved fundamental changes to our network and how our business used it. We needed to understand our existing infrastructure on several different levels: what equipment was in the field, how our network was supporting critical business processes, and what network aspects we must change to support Zero Trust networking properly. This included understanding wired and wireless user-experience scenarios, evaluating traffic patterns, and measuring inbound and outbound capacity utilization. Investing in data insights and visualizations was critical to measuring current usage and modeling future usage.
- Be deliberate about implementation scope. We established scope early. We set all corporate-managed user wired and wireless networks in scope for Zero Trust networking but left dedicated engineering and services environments, like labs and datacenters, out of scope. A well-defined scope set firm boundaries for implementation tasks that followed. If our engineers could examine our scope documentation and observe that a specific device class was out of scope, they could identify that equipment easily when they encountered it in the field during implementation. Technical and fiscal limitations also directly affected the scope. If we couldn’t replace 50,000 simple IP devices in a location or region because of cost or replacement availability, we knew that those devices must be isolated from the network and addressed when a replacement was viable.
- Establish staffing and knowledge base requirements. There were many aspects of Zero Trust networking that required specific knowledge, including network security expertise, firewall and policy management, network telemetry, and foundational knowledge of traditional network functions. In addition, we identified opportunities for more in-depth automation of network configuration and deployment activity to reduce workload for our engineers and reduce the need for staffing increases.
- Embrace the internet as transport. A Zero Trust networking implementation shifts to the internet as the default network of choice to get users and systems to their dominant cloud workloads. At Microsoft, we had been dependent on a traditional, flat corporate network model for decades. Embracing the internet meant rethinking and restructuring our network to best support the Zero Trust model. Internet-first thinking informed our decision making in all implementation areas, including perimeter security, segmentation and routing, device selection, security and policy standards, and user experience.
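To make the scope goal above concrete, here is a minimal Python sketch of how a scope rule might be written down so that engineers in the field get a consistent answer for each device. It is an illustration only, not a description of Microsoft's tooling. The environment labels, the `Device` fields, and the `classify` function are all hypothetical names; the only rules taken from this article are that labs and datacenters are out of scope and that devices that can't yet be replaced must be isolated.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    IN_SCOPE = "in scope"
    OUT_OF_SCOPE = "out of scope"
    ISOLATE = "isolate until replaceable"


# Hypothetical labels for the environments the article names as out of scope.
OUT_OF_SCOPE_ENVIRONMENTS = {"lab", "datacenter"}


@dataclass(frozen=True)
class Device:
    name: str
    environment: str           # hypothetical label, e.g. "corporate-wired" or "lab"
    supports_new_policy: bool  # can the device accept the new network policies?
    replaceable_now: bool      # is a replacement affordable and available today?


def classify(device: Device) -> Scope:
    """Apply the scope rules in order: out of scope, isolate, otherwise in scope."""
    if device.environment in OUT_OF_SCOPE_ENVIRONMENTS:
        return Scope.OUT_OF_SCOPE
    if not device.supports_new_policy and not device.replaceable_now:
        return Scope.ISOLATE
    return Scope.IN_SCOPE
```

Keeping the rules in one small function means that a change to the scope documentation becomes a single, reviewable change rather than a judgment call made separately at each site.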
Teams and organization
Leadership engagement
Across any large organization, individual lines of business and departments have varying requirements. We involved leadership at all levels in Microsoft to create transparency, collect information, and gain allies and sponsors for implementing a Zero Trust network. Because we’re also asking our customers to trust us with their critical data and workloads, our cloud offerings and services must also reflect and support Zero Trust principles.
We engaged executive leadership and obtained sponsorship as early in the process as possible. Visible leadership support helped drive the project forward. Effective sponsorship helped our teams overcome priority conflicts and other cultural and operational obstacles.
Governance and responsibilities
In partnership with our security team, we established governance to bring our leadership together. We addressed roles and responsibilities across teams, ensuring that we documented them. We established intake prioritization standards to ensure that our implementation teams worked on the most important tasks from a business perspective. To set these standards, we examined the business impact and implementation effort for new tasks as they arose by using an agile framework.
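One simple way to express an intake standard based on business impact and implementation effort is a ratio score. The Python sketch below is a hypothetical example, not the standard we used: the 1-to-5 scales, the `IntakeItem` type, and the impact-over-effort formula are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntakeItem:
    title: str
    business_impact: int        # 1 (low) to 5 (high); hypothetical scale
    implementation_effort: int  # 1 (low) to 5 (high); hypothetical scale


def priority_score(item: IntakeItem) -> float:
    """Higher impact and lower effort both raise the score."""
    return item.business_impact / item.implementation_effort


def prioritize(items: list[IntakeItem]) -> list[IntakeItem]:
    """Order items by score, highest first; ties go to the higher-impact item."""
    return sorted(
        items,
        key=lambda item: (priority_score(item), item.business_impact),
        reverse=True,
    )
```

A team could rerun `prioritize` on the backlog each time a new task arrives, which fits the agile practice of reassessing priorities as work comes in.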
Planning and implementation teams
We required a broad range of business and technical knowledge across planning and implementation teams. We needed security experts who could reevaluate our network policies and standards across the implementation. We included network experts from various areas, including wired and wireless networks, virtualization and network segmentation, traffic management, quality of service (QoS), and device configuration. Understanding the potential impact on existing network team members was also important. Fortunately, we had already added software engineering expertise to our network engineering teams to drive automation, and Zero Trust accelerated that. We ensured that we placed our experts in planning and decision-making roles, thereby making the best use of our internal intellectual property. Finally, we directed as many of our high-effort, low-impact tasks as possible, including physical infrastructure installation and maintenance, to outsourced providers.
Meetings and communication
Meeting cadence and scope
We based our meetings and communications strategy on agile methodology, a collaborative effort of self-organizing and cross-functional teams that could plan and iterate rapidly. Our core teams met briefly and often, while meetings including stakeholders and executive leadership occurred less frequently. We applied governance for how our teams worked together: how often we met, roles and responsibilities, and how we prioritized incoming work and changes to our plans. The following list reflects our meeting structure and frequency:
- Several times a week
  - Conduct brief stand-up meetings with our security, end-user experience, and business teams
  - Track feature progress
  - Track potential blockers
  - Assess overall project health
- Biweekly
  - Meet with stakeholders to provide progress updates in specific areas
- Monthly
  - Meet with steering committee, sponsors, and use-case scenario owners
- Quarterly
  - Meet with executive leadership to check alignment with business goals and plan for the future
  - Secure budget in time for fiscal planning to ensure that we could fund upcoming tasks
  - Align resource allocation for upcoming tasks
Measurement and assessment
We created and relied on process-monitoring systems and reporting dashboards to keep all team members informed on project status. We used Microsoft Power BI to build dashboards for teams at all levels, ensuring that each team, leader, stakeholder, or sponsor had an active overview of their relevant area. A partial list of useful dashboards included:
- Device inventory, to identify our install base, OS versions, and whether they could accept our new network policies and configuration standards.
- Configuration change tracking, to understand where we were on our Zero Trust journey, which devices were successfully onboarded, and which devices remained on legacy configurations.
- Usage monitoring, to understand our application patterns and help answer questions such as “Which applications still require VPN, and does each application have a roadmap to cloud adoption?”
- Internet of Things (IoT) inventory and network usage, to identify vulnerable devices such as conferencing kiosks, building-management systems, and life-safety systems. These are typically a primary focus area in a Zero Trust framework.
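The configuration change tracking dashboard described above rests on a simple calculation: the share of devices in each area that have moved off legacy configurations. Our dashboards were built in Power BI; the Python sketch below only illustrates the underlying arithmetic. The record layout (`region` and `config` keys) and the `"zero-trust"` label are hypothetical.

```python
from collections import defaultdict


def onboarding_progress(devices):
    """Return {region: percent of devices onboarded}, rounded to one decimal place.

    Each device is a dict with hypothetical keys:
      "region": a region label
      "config": "zero-trust" if onboarded, anything else counts as legacy
    """
    totals = defaultdict(int)
    onboarded = defaultdict(int)
    for device in devices:
        totals[device["region"]] += 1
        if device["config"] == "zero-trust":
            onboarded[device["region"]] += 1
    return {
        region: round(100 * onboarded[region] / totals[region], 1)
        for region in totals
    }
```

Feeding this function an inventory export would give one number per region, which is the kind of figure a leader or sponsor can scan quickly to see which areas still need attention.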
Deployment and execution
User experience assessment
User experience is one of our primary measures for organizational effectiveness across all Microsoft systems. Our users work in diverse locations, regions, and cultures, and a potentially different experience characterizes each location. Reaching out to our users and measuring experience and impact throughout the Zero Trust networking implementation helped us understand and avoid potential issues.
Situational dependencies, such as data-residency laws or telecom-systems capabilities, required us to change implementation plans. It was important to identify these dependencies and anomalies as early as possible in our deployment processes so that we could plan and adapt accordingly.
At Microsoft, a significant portion of our network workload comes from engineering teams with unique user experiences. Engineers who build and test software and hardware systems have very different network usage profiles from typical information workers. We reached out early and often to this community to understand their current and future needs and account for them in our deployment flight planning.
Involvement of local support and users
Local IT and leadership teams were also instrumental in implementing Zero Trust networking across Microsoft. We relied heavily on local IT staff to supply information about their environment and ensure that our solutions accounted for local functionality limitations and technical considerations. These included the inventory of network resources, the applications and services required for productivity, and the impact of network traffic and topology changes.
Staff members’ input reduced the engineering workload and increased the overall knowledge base that our engineers had when designing each regional implementation. We used our local and regional teams’ capabilities to collect and supply information. Local staff—including technical, support, and leadership teams in each location—who were informed about and included in the planning and design process helped prevent surprise obstacles. These individuals also served as valuable advocates and advisors when we deployed Zero Trust networking; when our deployment reached their building or region, we had local support to ensure a smooth transition.
Zero Trust networking supports a model that effectively adapts to the complexity of the modern corporate environment. It supports the mobile workforce and protects people, devices, apps, and data regardless of location. In sharing the lessons that we’ve learned so far, we hope to help other enterprises adopt Zero Trust networking effectively and efficiently. As we continue to deploy the Zero Trust model across the Microsoft enterprise, we’re learning from our experience and adapting our approach to achieve our goals.