Microsoft uses Azure to retire hundreds of physical branch-office servers

Feb 17, 2020


Microsoft migrated branch-office services to Microsoft Azure from more than 200 physical servers hosting more than 1,400 virtual machines in 86 countries and regions. The migration enabled Microsoft Core Services Engineering and Operations (CSEO) to decommission 95 percent of the physical servers and align the branch-office infrastructure with the rest of the organization. The new cloud-based services have created a more robust and internet-friendly environment for branch-office employees, giving them new ways to work in remote locations while maintaining connectivity to the rest of the organization.

Microsoft Core Services Engineering and Operations (CSEO) migrated branch-office services from more than 200 physical servers hosting more than 1,400 virtual machines to Microsoft Azure to centralize services and increase efficiency. The new cloud-based services have created a more robust and internet-friendly environment for our branch-office employees, giving them new ways to work in remote locations while maintaining connectivity to the rest of the Microsoft organization.

Understanding branch-office services at Microsoft

Microsoft has more than 148,000 employees in 120 countries and regions across the world. We support employees working in many environments, and one of our most common environments is the branch office. Our branch offices provide various business services and support employees in diverse roles—from sales staff to developers—each with unique needs. Our branch offices extend our physical presence throughout the world and provide our employees with a place to get their work done.

Our branch offices have historically been an extension of our corporate network, with the offices hosting a variety of services onsite, including print, file, deployment, and configuration management. The branch offices were connected to our corporate network via managed wide area network (WAN) links that enabled us to maintain service connectivity between branch-office services and our centralized services such as Active Directory Domain Services (AD DS) hosted in our regional datacenters.

Virtual branch office servers

Most of our branch offices were supported by an onsite server running Windows Server in a configuration that we refer to as a virtual branch office server. A virtual branch office server consists of a single physical host server running several Hyper-V virtual machines with each virtual machine hosting a discrete service on top of Windows Server. The virtual branch office server environment supported five major services provided at branch locations:

  • File services
  • Print services
  • System Center Configuration Manager
  • Windows Deployment Services
  • Read-only domain controllers

The virtual branch office server had been the standard configuration in our branch offices. The server configuration provided several advantages that we didn’t have in the physical-server environment that Microsoft previously maintained in our branch locations. The virtual branch-office server configuration put services closer to the user, minimized traffic between the branch offices and central datacenters, and allowed basic service functionality in the event of a broad network outage. The virtual branch office server environment consisted of the following:

  • 238 physical host servers in 86 countries and regions
  • Approximately 1,400 virtual machines
  • Branch-office locations connected to the corporate network by using a wide area network (WAN) in combination with Multiprotocol Label Switching (MPLS) networks and virtual private network (VPN) connections over the internet.

The following figure depicts the virtual branch office server architecture configuration.

Virtual branch office server architecture configuration. Branch office users accessing services including Print, File, Configuration Manager, Windows Deployment Services, and Read-only domain controller.
Figure 1. Virtual branch office server architecture configuration

Shifting to support cloud-first infrastructure

Although the virtual branch office server infrastructure provided important functionality for our branches in the past, the architecture was becoming less efficient and more cumbersome as we migrated our datacenter resources to the cloud in Microsoft Azure. Our impetus for introducing cloud-first infrastructure and services was our desire to be a more agile and streamlined IT organization—to provide the services that best support our employees. More of our corporate traffic was going directly to the internet or through Azure ExpressRoute connections, and we were relying less on our corporate datacenters and WAN links. We recognized the opportunity to consolidate and centralize our branch-office services to reduce costs and increase management efficiency. We needed to change the infrastructure model for our branch offices to better integrate with our cloud-first model and to better support those employees.

Migrating virtual branch office servers to Azure

Our migration process consisted of several key steps. These steps began with conducting a complete inventory of the environment, and progressed through to deploying services hosted on Azure.

Perform environment inventory

Taking an inventory of our virtual branch office server environment was a critical first step. The inventory gave us a complete picture of the environment and helped us identify potential problems or situations that would change the migration plan. For example, one of the major discoveries from our inventory was that our virtual branch office server environment was generally over-provisioned. We had several file servers that were not hosting active shared folders, and print servers that weren’t hosting active queues. We also had several underused servers in sites that had a small number of users or that had another server in close network proximity.
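The over-provisioning finding can be expressed as a simple filter over inventory data. The following is a minimal sketch in Python, not the tooling we used. The record fields, the sample sites, and the 25-user cutoff are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class InventoryRecord:
    """One virtual machine from a hypothetical inventory export."""
    site: str
    role: str            # "file" or "print" in this sketch
    active_items: int    # active shared folders or print queues
    site_users: int


def retirement_candidates(records, min_site_users=25):
    """Return records that host nothing active or serve a very small site.

    min_site_users is an assumed cutoff chosen for illustration only.
    """
    return [
        r for r in records
        if r.active_items == 0 or r.site_users < min_site_users
    ]


# Made-up sample data: one idle file server, one busy print server,
# and one file server at a very small site.
inventory = [
    InventoryRecord("site-a", "file", active_items=0, site_users=120),
    InventoryRecord("site-b", "print", active_items=14, site_users=400),
    InventoryRecord("site-c", "file", active_items=3, site_users=10),
]

for record in retirement_candidates(inventory):
    print(record.site, record.role)
```

A real inventory would also need a network-proximity signal to catch sites that have another server nearby, which this sketch leaves out.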

Map requirements to cloud services

We evaluated migration approaches separately for each service type. This model enabled us to develop migration strategies and implementation plans that would best suit each service and provide the best migration for its users.

File services

We used three primary criteria to evaluate our file services servers for migration:

  • Number of file shares
  • Volume of usage and traffic
  • Amount of changeover

Based on these criteria, we assessed the potential for consolidating multiple file servers into single Azure virtual machines (VMs). For most file servers, we re-created the existing file shares on newly deployed Azure virtual machines, and many file servers were consolidated to a few virtual machines in Azure. Our file services were hosted on two sets of virtual machines: one for file sharing of corporate (or group) data and another for personal data and settings using Microsoft IntelliMirror. During the migration process, we cleaned up obsolete file shares and removed unused and expired data. We also provided direction to users on how to use alternative file sharing and hosting methods such as SharePoint Online, Microsoft Teams, and OneDrive for Business for backing up personal data.

Print services

Print services servers were evaluated on key criteria in the same manner as the file servers. These criteria included:

  • Number of print queues
  • Print queue usage
  • Third-party software support for Azure
  • Type of print usage, such as SAP printing or complex print scenarios
  • Server-consolidation opportunities

We migrated printer queues to newly deployed Azure virtual machines. Print processes for users were largely unaltered, and we maintained naming conventions for servers and queues in many instances where user demand or other factors required it. Print servers were consolidated by country or region to Azure virtual machines.
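Consolidating print servers by country or region amounts to a grouping step. The sketch below assumes a hypothetical list of (region, old server, queue name) tuples; the server and queue names are invented. It keeps the original queue names, in line with the naming conventions we preserved in many cases.

```python
from collections import defaultdict


def consolidate_by_region(queues):
    """Group (region, old_server, queue_name) tuples under one target per region."""
    plan = defaultdict(list)
    for region, old_server, queue_name in queues:
        plan[region].append((old_server, queue_name))
    return dict(plan)


# Hypothetical queues on three branch print servers in two regions.
queues = [
    ("DE", "print-muc-01", "floor2-color"),
    ("DE", "print-ber-01", "lobby-bw"),
    ("JP", "print-tyo-01", "floor5-color"),
]

plan = consolidate_by_region(queues)
for region, items in sorted(plan.items()):
    source_servers = {server for server, _ in items}
    print(region, len(items), "queues from", len(source_servers), "servers")
```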

Microsoft Endpoint Manager – Configuration Manager

Our Configuration Manager implementation underwent a significant transformation during the migration. We re-evaluated the distribution points and secondary sites within our Configuration Manager infrastructure to determine optimal site placement. In many cases, we consolidated Configuration Manager virtual machines to simplify the virtual branch office server-based architecture further before migrating the VMs to Azure. Distribution points with low usage were considered first for ease of use and to improve service effectiveness. We engaged our network services teams to perform thorough network testing and impact evaluation before establishing an execution plan. In some locations, using the internet connection as the default rather than the VPN connection provided the best performance and user experience. Secondary sites required additional consideration due to more complicated network traffic flow.

We used cloud-based distribution points and Cloud Management Gateway to make our Configuration Manager environment cloud-ready. These components enabled us to take the next steps toward using modern management in our branch-office environment. We now had robust, internet-based touchpoints for our clients and devices.

Leveraging client caching

One of the most significant changes in the Configuration Manager environment was the transition to the Configuration Manager peer-caching feature. With peer caching, Configuration Manager clients in the same site could download updates and other Configuration Manager payloads directly from another client, instead of downloading the same content again from the distribution point in the cloud. This greatly reduced bandwidth usage for Configuration Manager clients across all sites. In parallel, we used Windows BranchCache to cache files outside of Configuration Manager to improve client performance in some scenarios.
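A rough model shows why peer caching reduces bandwidth. The sketch below is an idealized estimate, not a measurement. It assumes that only a fixed number of clients per site download a payload from the cloud and that every other client fetches it from a peer on the local network. The site size, payload size, and number of peer sources are made-up values.

```python
def wan_download_gb(clients, payload_gb, peer_sources=None):
    """Estimate internet download volume for one payload at one site.

    peer_sources=None models every client downloading from the cloud
    distribution point. Otherwise only `peer_sources` clients do, and the
    rest are assumed to fetch the content from a peer on the local network.
    """
    downloaders = clients if peer_sources is None else min(peer_sources, clients)
    return downloaders * payload_gb


site_clients = 200      # assumed site size
update_gb = 1.5         # assumed payload size

without_peers = wan_download_gb(site_clients, update_gb)
with_peers = wan_download_gb(site_clients, update_gb, peer_sources=2)
print(without_peers, with_peers)
```

The model ignores retries, partial downloads, and clients that are offline when a peer source is available, so real savings would be smaller than this ideal.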

Windows Deployment Services

Windows Deployment Services presented our largest departure from the traditional service. At the time of migration, we had no viable solution to replace Windows Deployment Services because it required corporate network connectivity to enable devices to boot from the network. Our Windows Deployment Services environment in the virtual branch office server environment consisted of two servers: one to host Windows Deployment Services functionality, and another to host image file libraries using Distributed File System (DFS) replication. Initially, we adopted local methods for deploying Windows 10 clients and retired most of our Windows Deployment Services instances. We’ve begun using Windows Autopilot as a cloud-based client deployment solution, and we’re investigating cloud-based options for Preboot Execution Environment (PXE) boot functionality in our branch-office environments. Within the Redmond area, where usage volume and criticality necessitated, we temporarily left several Windows Deployment Services servers in place until we developed cloud-based PXE solutions that could fully support this environment and our business requirements.

Read-only domain controllers

Several branch office sites hosted read-only domain controllers (RODCs) to provide localized AD DS presence. In the new cloud-first model, the main purpose of the RODC, namely facilitating AD DS authentication and services over a WAN, became obsolete. We were able to simply decommission all RODC servers as part of the migration cleanup process.

The following figure depicts the migration process from virtual branch office servers to Microsoft Azure.

Diagram depicting the process for virtual branch office server consolidation. Five services, including Print, File, Configuration Manager, Windows Deployment Services, and Read-only domain controllers, are migrated to Microsoft Azure virtual machines where they are accessed by branch office users.
Figure 2. Virtual branch office server consolidation

Evaluate potential blocks

Although much of the migration planning was straightforward, we also encountered and evaluated several potential blocks to the migration process, including:

  • Network latency. With several services, the added latency from placing services in the cloud created communication issues between the client and the service. In each case, we were able to isolate the problem and accommodate the latency while still meeting user needs.
  • Configuration Manager network impact. We needed to evaluate the viability of our Configuration Manager implementation to ensure that network security and the network volume that the new configuration generated wouldn’t negatively impact our network environment. We used small test rings to validate configuration and network impact before deploying the solution to the wider environment.
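One common way to build small test rings is to assign each device to a ring from a stable hash of its name, so that ring membership does not change between runs. The sketch below illustrates that general technique and is not a description of how our rings were defined. The ring names, the ring sizes (1 percent, 9 percent, and the remainder), and the device names are assumptions.

```python
import hashlib

# Assumed cumulative ring boundaries: 1% pilot, next 9% early, remainder broad.
RING_UPPER_BOUNDS = [(1, "pilot"), (10, "early"), (100, "broad")]


def ring_for(device_name):
    """Map a device name to a ring using a stable hash bucket from 0 to 99."""
    digest = hashlib.sha256(device_name.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    for upper, ring in RING_UPPER_BOUNDS:
        if bucket < upper:
            return ring
    raise AssertionError("bucket is always below 100")


# Hypothetical device names; count how many land in each ring.
devices = [f"client-{n:05d}" for n in range(10_000)]
counts = {}
for name in devices:
    ring = ring_for(name)
    counts[ring] = counts.get(ring, 0) + 1
print(counts)
```

Because the assignment depends only on the device name, a device stays in the same ring as the rollout widens from the pilot ring to the broader environment.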

Implementation and execution

The technical implementation plan for the migration was relatively simple: each service hosted in a virtual branch office server VM would migrate to a Microsoft Azure virtual machine according to the migration plan, consolidating virtual branch office server VMs into Azure VMs that supplied the same services. We executed the migration to Azure over three years, moving a major group of branch-office sites within each year:

  • Year 1 (55 percent migrated). In the first year, we focused on sites with fewer users and lower resource utilization. These locations provided low-impact migration scenarios in which we could refine our methods and develop best practices for larger sites. In many cases, we were able to consolidate services into fewer Azure VMs than we had deployed as virtual branch office server VMs. We migrated approximately 55 percent of our virtual branch office server environment in the first year.
  • Year 2 (30 percent migrated). In the second year, we focused on mid-sized sites with approximately 300 to 600 users. These sites typically required a more significant investment in planning and communication. We migrated approximately 30 percent of our virtual branch office server environment in the second year.
  • Year 3 (15 percent migrated). During our third year, we migrated the largest of our sites, including those surrounding our main campus in Redmond. Large sites involved multi-month planning and a high degree of change management to ensure that our users and business experienced minimal disruption. We migrated the remaining 15 percent of the virtual branch office server environment in the third year.
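The three-year phasing can be sketched as a rule that assigns a site to a migration year by user count. The 300 to 600 range for mid-sized sites comes from the description above; treating every smaller site as year 1 and every larger site as year 3 is a simplifying assumption. The per-year host counts are also only approximate, because they apply the stated shares evenly to the 238 physical hosts.

```python
def migration_wave(site_users, mid_min=300, mid_max=600):
    """Return the migration year (1, 2, or 3) for a site by user count.

    The 300 to 600 range for mid-sized sites comes from the text; the
    boundaries for small and large sites are simplifying assumptions.
    """
    if site_users < mid_min:
        return 1
    if site_users <= mid_max:
        return 2
    return 3


# Approximate host counts per year, applying the stated shares to 238 hosts.
TOTAL_HOSTS = 238
shares = {1: 0.55, 2: 0.30, 3: 0.15}
hosts_per_year = {year: round(TOTAL_HOSTS * share) for year, share in shares.items()}

print(migration_wave(80), migration_wave(450), migration_wave(5000))
print(hosts_per_year)
```

In practice, resource utilization and business criticality also influenced the order, so user count alone would not reproduce the actual schedule.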

We executed Azure VM deployment by using Azure Resource Manager (ARM) templates for reusability and repeatability. ARM templates enabled us to standardize our deployment process and better manage Azure virtual machines post-deployment.
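One way to get this kind of reuse is to keep a single shared template and generate a small parameters file for each virtual machine. The sketch below builds the content of an ARM parameters file in Python. The parameter names (vmName, vmSize, role), the naming pattern, and the sample values are hypothetical and would have to match whatever the shared template declares.

```python
import json

PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_parameters(site_code, role, vm_size):
    """Build the content of an ARM parameters file for one VM.

    The parameter names (vmName, vmSize, role) are hypothetical and must
    match whatever the shared template declares.
    """
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {
            "vmName": {"value": f"{site_code}-{role}-01"},
            "vmSize": {"value": vm_size},
            "role": {"value": role},
        },
    }


# Illustrative values only: a print server VM for a hypothetical region code.
params = build_parameters("emea", "print", "Standard_D2s_v3")
print(json.dumps(params, indent=2))
```

The generated JSON can be saved to a file and supplied together with the template at deployment time, so that each new VM differs only in its parameter values.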

Implementation considerations

We encountered several situations throughout the process that altered our migration approach or forced us to reexamine minor aspects of our migration strategy. These included:

  • Moving away from local traffic at branch sites and toward internet-based traffic provides a distributed-client approach to services. In the case of Configuration Manager, a distributed-client approach completely transformed our deployment model.
  • Obtaining user buy-in is critical. Although our technical implementation was relatively straightforward, working with the users of branch-office services required a considerable amount of effort and intentionality to ensure that they understood the migration process and how it would affect them.
  • Having mature, well-documented change management controls in place helps ensure a smooth migration process and provides an audit trail and tracking for the migration process. Change management was critical for our virtual branch office services owners and users.

Benefits

The transition of the virtual branch office server environment to Azure has provided several benefits for both CSEO and end users:

  • Reduced management complexity and cost. Consolidating our services in Azure removed a large amount of complexity from our management processes. We reduced our VM count in Azure to approximately 20 percent of the virtual machines we had used in the virtual branch office server configuration. Troubleshooting issues and maintaining updates are easier, and we use Azure Monitor alerts to stay aware of the overall system state.
  • Increased agility, performance, and scalability. In Azure, we can take advantage of the platform’s agility and scalability. Azure VMs provide a much more agile environment for updating, deployment, and disaster recovery than virtual branch-office servers.
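The scale of the consolidation follows from the figures in this article. The short calculation below uses the approximate counts given earlier (about 1,400 virtual machines and 238 physical hosts), so the results are estimates rather than exact totals.

```python
BRANCH_VMS = 1_400        # approximate, from the text
PHYSICAL_HOSTS = 238      # from the text

# About 20 percent of the original VM count remains in Azure.
azure_vms = BRANCH_VMS * 20 // 100
vms_eliminated = BRANCH_VMS - azure_vms

# About 95 percent of the physical hosts were decommissioned.
hosts_retired = round(PHYSICAL_HOSTS * 95 / 100)

print(azure_vms, vms_eliminated, hosts_retired)
```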

Best practices

Early in the migration, we established several best practices that helped us improve our processes for the larger migration scenarios that came later. These best practices include:

  • Build for the cloud. Take advantage of the scalability of Azure VMs and choose the most effective SKUs. Provisioning in the cloud is a dynamic, fluid process, and managing SKUs for efficiency helps ensure adequate performance and increased cost-effectiveness.
  • Plan for traffic-flow changes. Hosting services in the cloud requires a different pattern and flow of network traffic. Involve network support teams to plan network changes together. Ensure adequate traffic flow and proper network segmentation.
  • Involve local IT and support teams. We involved our IT and support teams in the branch-office locations to provide the best support for our users and ensure that we understood the nuances of the branch office server environment in each location. These teams helped us gather user feedback and identify potential usability issues. Early IT engagement and support created a better migration experience for users, local support teams, and the team implementing the migration.
  • Implement change-management controls. Any time that infrastructure changes affect the business and the user, change-management controls should be used to minimize miscommunication and ensure services continuity throughout the migration.

Conclusion

Microsoft Core Services Engineering and Operations (CSEO) migrated branch-office services to Microsoft Azure from more than 200 physical servers hosting more than 1,400 virtual machines in 86 countries and regions. The migration enabled CSEO to decommission 95 percent of the physical servers and align the branch-office infrastructure with the rest of the organization. The new cloud-based services have created a more robust and internet-friendly environment for branch-office employees, thus creating new ways to work in remote locations and maintaining connectivity to the rest of the organization.