Azure Monitor Archives - Inside Track Blog
http://approjects.co.za/?big=insidetrack/blog/tag/azure-monitor/
How Microsoft does IT
Wed, 05 Feb 2025 00:28:51 +0000

How we’re deploying our VWAN infrastructure using infrastructure as code and CI/CD
http://approjects.co.za/?big=insidetrack/blog/how-were-deploying-our-vwan-infrastructure-using-infrastructure-as-code-and-ci-cd/
Sun, 19 Jan 2025 21:48:18 +0000

The post How we’re deploying our VWAN infrastructure using infrastructure as code and CI/CD appeared first on Inside Track Blog.

Microsoft Digital stories

Editor’s note: This is the first in an ongoing series on moving our network to the cloud internally at Microsoft.

We’re building a more agile, resilient, and stable virtual wide-area network (VWAN) to create a better experience for our employees to connect and collaborate globally. By implementing a continuous integration/continuous deployment (CI/CD) approach to building our VWAN-based network infrastructure, we can automate the deployment and configuration processes to ensure rapid and reliable delivery of network changes. Here’s how we’re making that happen internally at Microsoft.

Infrastructure as code (IaC)

Juan Jimenez (left) and Eric Scheffler are part of the team in Microsoft Digital that is helping the company move its network to the cloud. Jimenez is a principal cloud network engineer and Scheffler is a senior cloud network engineer.

Infrastructure as code (IaC) is the fundamental principle underlying our entire VWAN infrastructure. Using IaC, we can develop and implement a descriptive model that defines and deploys VWAN components and determines how the components work together. IaC allows us to create and manage a massive network infrastructure with reusable, flexible, and rapid code deployments.

We created deployment templates and resource modules using the Bicep language in our implementation. These templates and modules describe the desired state of our VWAN infrastructure in a declarative manner. Bicep is a domain-specific language (DSL) that uses declarative syntax to deploy Microsoft Azure resources.

We maintain a primary Bicep template that calls separate modules—also maintained in Bicep templates—to create the desired resources for the deployment in alignment with Microsoft best practices. We use this modular approach to apply different deployment patterns to accommodate changes or new requirements.

With IaC, changes and redeployments are as quick as modifying templates and calling the associated modules. Additionally, parameters for each unique deployment are maintained in separate files from the templates so that different iterations of the same deployment pattern can be deployed without changing the source Bicep code.
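As a rough illustration of keeping parameters separate from templates, the sketch below builds one Azure CLI deployment command per parameter file while reusing a single template. The file names, resource group names, and the `deployment_command` helper are hypothetical and are not taken from our repository; only the general `az deployment group create` command shape is standard.

```python
from pathlib import Path


def deployment_command(resource_group: str, template: Path, parameters: Path) -> list[str]:
    """Build an Azure CLI command that deploys one template with one parameter file."""
    return [
        "az", "deployment", "group", "create",
        "--resource-group", resource_group,
        "--template-file", template.as_posix(),
        "--parameters", f"@{parameters.as_posix()}",
    ]


# Hypothetical layout: one shared template, one parameter file per environment.
TEMPLATE = Path("main.bicep")
ENVIRONMENTS = {
    "rg-vwan-dev": Path("parameters/dev.json"),
    "rg-vwan-prod": Path("parameters/prod.json"),
}

# The template never changes; only the parameter file differs per deployment.
commands = [deployment_command(rg, TEMPLATE, params) for rg, params in ENVIRONMENTS.items()]

for command in commands:
    print(" ".join(command))
```

The point of the design is that adding an environment means adding a parameter file, not editing Bicep code.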

Version control

We use Microsoft Azure DevOps, which provides Git-based source control, to track and manage our IaC templates, modules, and associated parameter files. With Azure DevOps, we can maintain a history of changes, collaborate within teams, and easily roll back to previous versions if necessary.

We’re also using pull requests to help track change ownership. Azure DevOps tracks changes and associates them with the engineer who made the change. Azure DevOps also helps with several other version control tasks, such as requiring peer reviews and approvals before code is merged into the main branch. Our code artifacts are published to (and consumed from) a Microsoft Azure Container Registry that allows role-based access control of modules. This enables version control throughout the module lifecycle, and it’s easy to share Azure Container Registry artifacts across multiple teams for collaboration.
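To make module versioning in a registry more concrete, here is a small sketch that picks the newest version tag and formats a Bicep registry reference of the form `br:<registry>.azurecr.io/<path>:<tag>`. The registry name, module path, tags, and helper functions are hypothetical, and the tag scheme (`vMAJOR.MINOR.PATCH`) is an assumption for illustration.

```python
def version_key(tag: str) -> tuple[int, ...]:
    """Turn a tag such as 'v1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def latest_version(tags: list[str]) -> str:
    """Return the highest version tag; a plain string sort would rank v1.9.3 above v1.10.0."""
    return max(tags, key=version_key)


def module_reference(registry: str, module_path: str, tag: str) -> str:
    """Format a Bicep module reference for a module stored in Azure Container Registry."""
    return f"br:{registry}.azurecr.io/{module_path}:{tag}"


# Hypothetical published tags for a virtual hub module.
published_tags = ["v1.2.0", "v1.10.0", "v1.9.3"]
newest = latest_version(published_tags)
reference = module_reference("contosonetwork", "bicep/modules/virtual-hub", newest)
print(reference)
```

Pinning a template to an explicit tag keeps a deployment reproducible even after newer module versions are published.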

Automated testing

Responsible deployment is essential with IaC because deploying a set of templates could radically alter critical network infrastructure. We’ve implemented safeguards and tests to validate the correctness and functionality of our code before deployment. These tests include executing the Bicep linter as part of the Azure DevOps deployment pipeline to ensure that all Bicep best practices are being followed and to find potential issues that could cause a deployment to fail.

We’re also running a test deployment to preview the proposed resource changes before the final deployment. As the process matures, we plan to integrate more testing, including network connectivity tests, security checks, performance benchmarks, and enterprise IP address management (IPAM) integration.
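One way to act on a change preview is to summarize it and block the pipeline when it contains destructive changes. The sketch below assumes a simplified preview structure loosely modeled on the output of `az deployment group what-if` (a list of changes, each with a `changeType`); the exact JSON shape, the resource names, and the zero-deletes threshold are assumptions for illustration.

```python
from collections import Counter


def summarize_changes(preview: dict) -> Counter:
    """Count proposed changes by type (for example Create, Modify, Delete, NoChange)."""
    return Counter(change["changeType"] for change in preview.get("changes", []))


def is_safe_to_deploy(summary: Counter, max_deletes: int = 0) -> bool:
    """Allow the deployment only if the preview deletes no more resources than permitted."""
    return summary.get("Delete", 0) <= max_deletes


# Hypothetical preview of a network change.
preview = {
    "changes": [
        {"resourceId": "virtualHubs/hub-westeurope", "changeType": "Modify"},
        {"resourceId": "vpnGateways/gw-westeurope", "changeType": "NoChange"},
        {"resourceId": "vpnSites/site-branch-01", "changeType": "Delete"},
        {"resourceId": "vpnSites/site-branch-02", "changeType": "Create"},
    ]
}

summary = summarize_changes(preview)
print(dict(summary))
print("safe to deploy:", is_safe_to_deploy(summary))
```

A gate like this turns the preview from something an engineer reads into something the pipeline can enforce.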

Configuration management

Azure DevOps and Bicep allow us to automate the configuration and provisioning of network objects and services within our VWAN infrastructure. These tools make it easy to define and enforce desired configurations and deployment patterns to ensure consistency across different network environments. Using separate parameter files, we can rapidly deploy new environments in minutes rather than hours without changing the deployment templates or signing in to the Microsoft Azure Portal.

Continuous deployment

The continuous integration (CI) pipeline automates the deployment process for our VWAN infrastructure when the infrastructure code passes all validation and tests. The CI pipeline triggers the deployment process automatically, which might involve deploying virtual machines, building and configuring cloud network objects, setting up VPN connections, or establishing network policies.
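The gating behavior described above, where deployment runs only after every validation step passes, can be sketched as an ordered list of stages that stops at the first failure. The stage names and the lambda placeholders are hypothetical stand-ins for real pipeline tasks; this is not how Azure DevOps pipelines are defined, only an illustration of the control flow.

```python
from typing import Callable

# A stage is a name plus a step that returns True on success.
Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> list[str]:
    """Run stages in order and stop at the first failure. Returns the stages that passed."""
    completed = []
    for name, step in stages:
        if not step():
            print(f"Stage '{name}' failed; later stages were skipped.")
            break
        completed.append(name)
    return completed


# Hypothetical stages; the failing approval check prevents the deploy stage from running.
stages: list[Stage] = [
    ("lint", lambda: True),
    ("what-if preview", lambda: True),
    ("approval check", lambda: False),
    ("deploy", lambda: True),
]

completed = run_pipeline(stages)
print(completed)
```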

Monitoring and observability

We’ve implemented robust monitoring and observability practices for how we deploy and manage our VWAN infrastructure. Monitoring and observability are helping us to ensure that our CI builds are successful, detect issues promptly, and maintain the health of our development process. Here’s how we’re building monitoring and observability in our Azure DevOps CI pipeline:

  • We’re creating built-in dashboards and reports that visualize pipeline status and metrics such as build success rates, durations, and failure details.
  • We’re generating and storing logs and artifacts during builds.
  • We’ve enabled real-time notifications to help us monitor build status for failures and critical events.
  • We’re building in pipeline monitoring review processes to identify areas for improvement, including optimizing build times, reducing failures, and enhancing the stability of our pipeline.
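The pipeline metrics in the list above (success rate, duration, and failure details) can be computed from build records along these lines. The `BuildRecord` type and the sample data are hypothetical; a real dashboard would read these values from the pipeline’s own reporting rather than from a hand-built list.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class BuildRecord:
    build_id: int
    succeeded: bool
    duration_seconds: float


def success_rate(builds: list[BuildRecord]) -> float:
    """Fraction of builds that succeeded, or 0.0 when there are no builds."""
    if not builds:
        return 0.0
    return sum(build.succeeded for build in builds) / len(builds)


def average_duration(builds: list[BuildRecord]) -> float:
    """Mean build duration in seconds, or 0.0 when there are no builds."""
    if not builds:
        return 0.0
    return mean(build.duration_seconds for build in builds)


# Hypothetical build history.
builds = [
    BuildRecord(1, True, 300.0),
    BuildRecord(2, True, 420.0),
    BuildRecord(3, False, 120.0),
    BuildRecord(4, True, 360.0),
]

failed_ids = [build.build_id for build in builds if not build.succeeded]
print(f"success rate: {success_rate(builds):.0%}")
print(f"average duration: {average_duration(builds):.0f}s")
print(f"failed builds: {failed_ids}")
```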

We’re continuing to iterate and optimize our monitoring practices. We’ve created a feedback loop to review the results of our monitoring. This feedback provides the information we need to adjust build scripts, optimize dependencies, automate certain tasks, and further enhance our pipeline.

By implementing comprehensive monitoring and observability practices in our Azure DevOps CI pipeline, we can maintain a healthy development process, catch issues early, and continuously improve the quality of our code and builds.

Rollback and rollforward

We’ve built the ability to roll back or roll forward changes in case of any issues or unexpected outcomes. This is achieved through infrastructure snapshots, version-controlled configuration files, or features provided by our IaC tool.
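With version-controlled configuration, a rollback amounts to redeploying an earlier known-good version. The sketch below shows one way to choose that version from a release history; the `Release` type, the tags, and the health flags are hypothetical and only illustrate the selection logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Release:
    tag: str
    healthy: bool


def rollback_target(history: list[Release]) -> Optional[str]:
    """Return the most recent healthy release before the current one.

    `history` is ordered oldest first, and its last entry is the current release.
    Returns None when no earlier healthy release exists.
    """
    for release in reversed(history[:-1]):
        if release.healthy:
            return release.tag
    return None


# Hypothetical history: the current release (v1.3.0) and the one before it both had issues.
history = [
    Release("v1.0.0", healthy=True),
    Release("v1.1.0", healthy=True),
    Release("v1.2.0", healthy=False),
    Release("v1.3.0", healthy=False),
]

target = rollback_target(history)
print(target)
```

Rolling forward follows the same idea in the other direction: fix the issue in a new version and deploy that instead of returning to an old one.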

Improving through iteration

We’re continuously improving our VWAN infrastructure using information from monitoring data and user experience feedback. We’re also continually assessing new requirements, newly added Azure features, and operational insights. We iterate on our infrastructure code and configuration to enhance security, performance, and reliability.

By following these steps and using CI/CD practices, we can build, test, and deploy our VWAN network infrastructure in a controlled and automated manner, creating a better employee experience by ensuring faster delivery, increased stability, and more effortless scalability.

Key Takeaways
Here are some tips on how you can start tackling some of the same challenges at your company:

  • You can use infrastructure as code (IaC) to create and manage a massive network infrastructure with reusable, flexible, and rapid code deployments.
  • Using IaC, you can make changes and redeployments quickly by modifying templates and calling the associated modules.
  • Don’t overlook version control. Tracking and managing IaC templates, modules, and associated parameter files is essential.
  • Perform automated testing. It’s necessary to validate the correctness and functionality of the code before deployment.
  • Use configuration management tools to simplify defining and enforcing desired configurations and deployment patterns. This ensures consistency across different network environments.
  • Implement continuous deployment to automate the deployment process for network infrastructure after the code passes all validation and tests.
  • Use monitoring and observability best practices to help identify issues, track performance, troubleshoot problems, and ensure the health and availability of the network infrastructure.
  • Build rollback and roll-forward capabilities so that you can quickly respond to issues or unexpected outcomes.

Try it out
Try using a Bicep template to manage your Microsoft Azure resources.

Microsoft moves IT infrastructure management to the cloud with Microsoft Azure
http://approjects.co.za/?big=insidetrack/blog/microsoft-moves-it-infrastructure-management-to-the-cloud-with-microsoft-azure/
Tue, 03 Sep 2024 16:08:04 +0000

The post Microsoft moves IT infrastructure management to the cloud with Microsoft Azure appeared first on Inside Track Blog.

Microsoft Digital technical stories

We’re transforming our IT infrastructure management internally here at Microsoft.

At Microsoft Digital Employee Experience (MDEE), we’re embracing our digital transformation and the culture changes that come with it. With over 98 percent of our IT infrastructure in the cloud, we’re adopting Microsoft Azure monitoring, patching, backup, and security tools to create a customer-focused self-service management environment centered around Microsoft Azure DevOps and modern engineering principles. As we continue to benefit from the growing feature set of Azure management tools, we’ll deliver a fully automated, self-service management solution that gives us visibility over our entire IT environment.

The result?

Business groups at Microsoft will be able to adapt IT services to best fit their needs.

For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=C1PEhAT1Cns, select the “More actions” button (three dots icon) below the video, and then select “Show transcript.”

Microsoft experts share the processes and tools used to move our monitoring services into Azure. They discuss how we utilized solutions that use native Azure functionality to recreate certain SCOM functions and views in Azure Monitor. You will also learn how DevOps teams use log analytics to gain more visibility into end-to-end application performance.

Digital transformation at Microsoft

Our MDEE team is a global IT organization that strives to meet Microsoft business needs. Microsoft Azure is the default platform for our IT infrastructure. We host 98 percent of our IT infrastructure in the cloud. Here are a few details:

  • More than 220,000 employees
  • 150 countries
  • 587 locations
  • 1,400 Azure subscriptions
  • 1,600 Azure-based applications
  • 17,000 Azure infrastructure-as-a-service (IaaS) virtual machines
  • 643,000 managed devices

Like most IT organizations, we have our roots in the datacenter. In the past, our traditional hosting services were mostly physical, on-premises environments that consisted of servers, storage, and network devices. Most of the devices were owned and maintained for specific business functions. Technologies were very diverse and needed people with specialized skills to design, deploy, and run them. Our achievements were limited by the time required to plan and implement the infrastructure to support the business.

As technology evolved, we began to move out of the datacenter and into the cloud. Cloud-based infrastructure created new opportunities for us and has transformed the IT infrastructure we manage. We continue to grow and adapt in a constantly changing IT landscape.

Traditional IT technologies, processes, and teams

Our traditional datacenters were managed by a legion of IT pros, who supported the diverse platforms and systems that made up our infrastructure. Physical servers, and later virtual servers, numbered in the tens of thousands, spanning multiple datacenters and comprising a mass of metal and silicon to be managed and maintained. Platform technologies ranged from Windows, SQL Server, BizTalk, and SharePoint farms to third-party solutions such as SAP and other information security-related tool sets. Server virtualization evolved from Hyper-V to System Center Virtual Machine Manager and System Center Orchestrator.

To provide a stable infrastructure, we used structured frameworks, such as IT Infrastructure Library/Microsoft Operations Framework (ITIL/MOF). Policies, processes, and procedures in the framework helped to enforce and control security and availability, and to prevent failures. Microsoft product engineering groups that used hosting services had a similar adoption process for their application and service needs, which were based on ITIL/MOF.

This model worked well for traditional IT infrastructure, but things began to change when cloud computing and Microsoft Azure began to influence the IT landscape.

Evolution of the hybrid cloud

As IT infrastructure and services began to move to the cloud, the nature of the cloud and how we treat it changed. We’ve now been hosting IT services in Microsoft Azure for a long time, and as Azure has evolved and grown, so has our engagement with Azure services and the volume of our IT services hosted in Azure.

Early Azure: IT-owned, IaaS, and lift-and-shift

In the early years, Microsoft Azure was IT only. We had full control of cloud development, implementation, and management. We could create and manage solutions in Azure, but it was a siloed service.

The infrastructure consisted primarily of IaaS virtual machines that hosted workloads in the cloud the same way that they hosted workloads in on-premises datacenters.

Efficiency gains were small and infrastructure management still used the same tools—sometimes hosted in the cloud and sometimes hosted on-premises and connected to the cloud. It was very much a lift-and-shift migration from the datacenter to the cloud, and our management processes imitated the on‑premises model in much the same way.

The datacenter remained the focus, but that was changing.

Microsoft Azure evolves: PaaS, co-ownership, and cloud-first

As Microsoft Azure matured and more of our infrastructure and services moved to the cloud, we began to move away from IT‑owned applications and services. The strengths of the Azure self-service and management features meant that a business group could handle many of the duties that we offered as an IT service provider—which meant that they could build solutions that were more agile and responsive to their needs.

Microsoft Azure platform-as-a-service (PaaS) functionality matured, and the focus moved from IaaS-based solutions to PaaS-based solutions. Azure became the default target for IT solutions; datacenter decommissioning began as more solutions moved to or were created in Azure. Monitoring and management were becoming cloud-focused as we pointed more of our System Center Operations Manager (SCOM) and System Center Configuration Manager (SCCM) instances at the cloud. Azure-native management started to mature.

Large-scale Azure: Service line–owned, IT-managed, PaaS-first

PaaS quickly became a focus for developers in our business groups, as they realized the agility and scalability they could achieve with PaaS-based solutions. Those developers shifted to PaaS for applications as we transitioned away from IaaS and virtual machine-based solutions.

With the advent of Microsoft Azure Resource Manager, which permitted a broader level of user control over Microsoft Azure services, we saw service lines begin to take ownership of their solutions, and business groups started to manage their own Azure resources.

The datacenter became an inconvenient necessity for apps that couldn’t move to Microsoft Azure. We still used SCOM and SCCM as the primary monitoring and management tools, but we had moved almost all our instances into IaaS implementations in Azure. Azure-native management became a mature product, and we started to plan and deliver a completely cloud-based management environment.

Microsoft Azure in a DevOps culture: Service line–managed, Internet-first, business-first

We’re continually nurturing a DevOps culture. DevOps has transformed the way that Microsoft Azure solutions are developed and operated. Our Azure solutions offer an end-to-end view for our business groups. They’re agile, dynamic, and data-intensive. Continuous integration and continuous deployment create a continual state of improvements and feature releases.

The Microsoft Azure solutions that our business groups use are designed to respond to their business needs. We actively seek and use Azure-native tools for control over and insight into IT environments, first in Azure and also in the datacenter where required. We’re a long, long way past managing a stack of metal. The modern workplace is here at Microsoft, and it changes every day.

Realizing digital transformation

In the modern workplace, the developers and IT decision makers in our business groups have an increasingly critical business role. Our business groups need the autonomy to make IT decisions that serve their business needs in the best way possible. With 98 percent of our IT infrastructure in Microsoft Azure, we’re increasingly looking to the agility, scale, and manageability that Azure provides. Using this scale, we solve business needs and provide the framework for a complete IT organization, from infrastructure to development to management.

Managing the modern hybrid cloud

Our modern hybrid cloud is 98 percent Microsoft Azure—and Azure is the primary platform for infrastructure and management tools. Azure is not only the default platform for IT solutions—it is our IT solution.

Just as PC sprawl occurred in the late 1990s and server sprawl followed in the 2000s, cloud sprawl is a growing reality. Implementing new cloud solutions to manage the cloud environment and the remaining on-premises infrastructure is critical for our organization. The scope of these new cloud solutions gives our engineers the flexibility to use PaaS, functions, and container models to optimize the management of cloud environments.

Embracing decentralized IT

Decentralized IT services are a big part of digital transformation. We need a management solution that offers us—and our business groups—what we need to manage our IT environments. We always want to maintain governance over security and compliance of Microsoft as a whole, but we also realize that decentralized IT services are the most suitable model for a cloud-first organization.

By decentralizing services and ownership in Microsoft Azure, we offer our business groups several benefits:

  • Greater DevOps flexibility.
  • A native cloud experience: subscription owners can use features as soon as they’re available.
  • Freedom to choose from marketplace solutions.
  • Minimal subscription limit issues.
  • Greater control over groups and permissions.
  • Greater control over Microsoft Azure provisioning and subscriptions.
  • Business group ownership of billing and capacity management.

Our goal in the management of modern hybrid cloud continues to be a solution that transforms IT tasks into self-service native cloud solutions for monitoring, management, backup, and security across our entire environment. With this solution, our business groups and service lines have reliable, standardized management tools, and we can maintain control over and visibility into security and compliance for our entire organization.

The areas where we retain oversight include:

  • General IT and operational policy implementation, as approved by the subscription owner. Areas include compliance, operations, and incident management.
  • Shared network connectivity over Microsoft Azure ExpressRoute, as needed.
  • Visibility into infrastructure inefficiencies and self-service tool development.

Our management solution must be as agile as the solutions we manage, and we provide best practices, standards, and consulting for Microsoft Azure management solutions to ensure that our business groups are getting the most out of the platform.

Supporting digital transformation with Microsoft Azure management tools

Managing the hybrid cloud in Microsoft Azure encompasses a wide range of services and activities. For our business groups to improve, they need to monitor their apps and solutions to recognize issues and opportunities. They need a patching and management solution that keeps systems up to date, manages configuration, and automates common maintenance tasks.

We must protect data with a disaster recovery platform and ensure security and compliance for business groups and the entire company. We use the management tools in Microsoft Azure to enable hybrid cloud management.

Monitoring the hybrid cloud

Monitoring is an essential task for our business groups and their service lines. They need to understand how their apps are performing (or not performing) and have insight into their environment. We’ve used SCOM for monitoring at Microsoft for more than 10 years—and a certain rhythm develops when you use a product for that long.

To ease the transition from SCOM to Microsoft Azure monitoring, we developed transition solutions that use native Azure functionality to recreate certain SCOM functions and views in Microsoft Azure Monitor.

The transition solutions consist primarily of PowerShell scripts and documentation. They give our business groups a familiar environment to work in while they become familiar with Microsoft Azure monitoring.

Our business groups can also start in a standardized environment with our built-in tested security and compliance components. This helps us maintain a centralized standard while allowing for decentralized monitoring. We maintain metrics for critical organizational services, but we leave operational monitoring to each business group.

Our Microsoft Azure monitoring is designed to:

  • Create visibility. We’re providing instant access to a foundational set of metrics, alerts, and notifications across core Azure services for all business units. Microsoft Azure monitoring also covers production and non-production environments, as well as native monitoring support across Microsoft Azure DevOps.
  • Provide insight. Business groups and service lines can view rich analytics and diagnostics across applications, as well as compute, storage, and network resources, including anomaly detection and proactive alerting.
  • Enable optimization. Monitoring results help our business groups and service lines understand how users are engaging with their applications, identify sticking points, develop cohorts, and optimize the business impact of their solutions.
  • Deliver extensibility. Our monitoring is designed for extensibility, so it can support custom event ingestion and broader analytics scenarios.

We’ve now retired our SCOM environment, leaving Microsoft Azure monitoring as the default for both cloud and on-premises monitoring. We’re now focusing on:

  • Automated installation and repair of the Microsoft Monitoring Agent using Microsoft Azure Runbooks.
  • Centralized visibility into comprehensive health and performance.
  • Fully featured transition solution development to enable complete self-service monitoring in Microsoft Azure.
  • Complete transition from SCOM to Microsoft Azure.

Patching, updating, and inventory management

As we’ve done for monitoring, we’re using transition solutions to make it easier for business groups to transition from previously used on-premises tools to Microsoft Azure.

Our patching processes depended on our preexisting solutions as we worked through the transition to Microsoft Azure. SCCM and associated agents provided the bulk of our patching, software distribution, and management process, but we’ve moved to Azure in a phased approach as our Azure subscriptions become ready to transition to Azure for management.

We’ve built transition solutions for our business groups to help them transition from the SCCM platform and other legacy tools to the Microsoft Azure update management patching service. We’re maintaining and modifying these transition solutions as Azure features replace the on-premises functionality.

From a patching and management perspective, we’re focusing on:

  • The transition of inventory management from Configuration Manager to Microsoft Azure, including discovery, tracking, and management of IT assets.
  • Transition of our update processes to Microsoft Azure Update Management for business groups.
  • Enabling self-service patch management. We’re developing an orchestrated deployment of operating system and application updates with Microsoft Azure, including centralized compliance reporting.
  • Creating and updating solutions to support the transition of the above areas, including Resource Manager templates, PowerShell scripts, documentation, and Microsoft Azure Desired State Configuration.

The design for patching and management, as with monitoring, is to provide a Microsoft Azure-based self-service solution for our business groups that gives them control over their patching and management environment while giving us the ability to centrally monitor for compliance and security purposes.

Ensuring recoverable data

With Microsoft Azure as the primary repository for business data, it’s extremely important to have an Azure backup solution with which our business groups and service lines can safeguard, retain, and recover their data.

Our data recovery solutions address the following major areas of concern:

  • Recover business data from attacks by malicious software or malicious activity.
  • Recover from accidental deletion or data corruption.
  • Secure critical business data.
  • Maintain compliance standards.
  • Provide historical data recovery requirements for legal purposes.

Our Microsoft Azure data footprint is immense. We currently host 1.5 petabytes of raw data in Azure and use almost nine petabytes of storage to back up that data.

We’re using Microsoft Azure Backup as a self-service solution. It gives business groups more control over how they perform their backups and gives them responsibility for backing up their business data—because each business group knows its data better than anyone else.

We’re also using Microsoft Azure Backup for virtual machine-level backup, and we’re backing up some on-premises data to Microsoft Azure using Microsoft Azure Recovery Services vaults. We’ve created a packaged solution for backup management in Azure that consists of scripts and documentation—our business groups can use it to migrate to Azure Backup quickly and efficiently.

As with other areas of enterprise management, we’re evaluating new features for Microsoft Azure Backup that will offer more backup capabilities to our business groups.

Embedding security and compliance

Decentralization gets the greatest scrutiny when it comes to security and compliance. We’re responsible for security and legal compliance for the organization, so our security controls are the most centralized of all the cloud management solutions we implement. However, centralization does not directly affect day-to-day solution management for our business groups and their service lines.

We use a broad set of security and compliance practices and tools that are generally applied across all Microsoft Azure subscriptions. The following imperatives govern the general application of security and compliance measures:

  • Microsoft Azure Policy. Using Azure Policy, we establish guardrails in subscriptions that keep our service engineers within governance boundaries automatically. Policy can help control a myriad of settings by default, including limiting the network configurations to safe patterns, controlling the regions and types of Microsoft Azure resources available for use, and ensuring data is stored with encryption enabled.
  • Automation gives us a chance to keep pace with the constantly changing cloud environment. DevOps is heavily centered on end-to-end automation, and we need to complement DevOps automation with automated security. Automated security saves significant time and cost for apps that are frequently updated, and we can quickly and consistently configure and deploy security.
  • Empower engineering teams. In an environment where change is constant, we want to empower our engineering teams to make meaningful, consistent changes without waiting for a central security team to approve an app. Our engineers need the ability to integrate security into the DevOps workflow so that being secure doesn’t require extra measures.
  • Maintain continuous assurance. When development and deployment are continuous, everything that goes with them needs to follow suit—including security assurance. The old requirements for sign-offs or compliance checks create tension in the modern engineering environment. We want to define a security state and track drift from that state to maintain a consistent level of security assurance across the entire environment. This helps ensure that builds and deployments that are secure when they’re delivered stay secure from one release iteration to the next and beyond.
  • Set up operational hygiene. We need a clear view of our DevOps environment to ensure operational hygiene. In addition to understanding operational risks in the cloud, DevOps operational hygiene in the cloud requires a different perspective. We need to create the ability to see the security state across DevOps stages and establish capabilities to receive security alerts and reminders for important periodic activities.
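The idea of defining a security state and tracking drift from it can be sketched as a comparison between a desired baseline and the settings observed on a resource. The setting names, values, and the `find_drift` helper below are hypothetical; they don’t represent Azure Policy definitions or any specific Azure API.

```python
def find_drift(desired: dict, observed: dict) -> dict:
    """Return settings whose observed value differs from the desired security state.

    Each entry maps a setting name to an (expected, observed) pair. A setting that is
    missing from `observed` is reported with an observed value of None.
    """
    drift = {}
    for setting, expected in desired.items():
        actual = observed.get(setting)
        if actual != expected:
            drift[setting] = (expected, actual)
    return drift


# Hypothetical baseline for a storage resource.
desired_state = {
    "encryption_enabled": True,
    "public_network_access": False,
    "region": "westeurope",
}

# Hypothetical settings observed after a later release.
observed_state = {
    "encryption_enabled": True,
    "public_network_access": True,
    "region": "westeurope",
}

drift = find_drift(desired_state, observed_state)
for setting, (expected, actual) in drift.items():
    print(f"{setting}: expected {expected}, observed {actual}")
```

Running a check like this on every release, rather than at a one-time sign-off, is what makes the assurance continuous.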

Key Takeaways

At MDEE, our goal is a completely cloud-based, self-service management solution that gives our business groups concise control over their environments using Microsoft Azure tools and features. We’ll continue to offer updated Azure-based solutions, transitioning away from on-premises, System Center–based management.

As we continue to transition business groups to cloud-based monitoring, we’re growing our feature set and making our Microsoft Azure-based management even better. We envision a near future where our management systems will be completely cloud based, decentralized, and automated, and where our organization continues to build its business in Azure.


The post Microsoft moves IT infrastructure management to the cloud with Microsoft Azure appeared first on Inside Track Blog.

How Microsoft is modernizing its internal network using automation http://approjects.co.za/?big=insidetrack/blog/how-microsoft-is-modernizing-its-internal-network-using-automation/ Wed, 11 Dec 2019 23:20:08 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=5033 After Microsoft moved its workload of 60,000 on-premises servers to Microsoft Azure, employees could set up systems and virtual machines (VMs) with a push of a few buttons. Although network hardware servers have changed over time, the way that network engineers work isn’t nearly as modern. “With computers, we have modernized our processes to follow […]

The post How Microsoft is modernizing its internal network using automation appeared first on Inside Track Blog.

After Microsoft moved its workload of 60,000 on-premises servers to Microsoft Azure, employees could set up systems and virtual machines (VMs) with a push of a few buttons.

Although network hardware servers have changed over time, the way that network engineers work isn’t nearly as modern.

“With computers, we have modernized our processes to follow DevOps processes,” says Bart Dworak, a software engineering manager on the Network Automation Delivery Team in Microsoft Digital. “For the most part, those processes did not exist with networking.”

Two years ago, Dworak says, network engineers still created and ran command-line-based scripts and created configuration change reports.

“We would sign into network devices and submit changes using the command line,” Dworak says. “In other, more modern systems, the cloud provides desired-state configurations. We should be able to do the same thing with networks.”

It became clear that Microsoft needed modern technology for configuring and managing the network, especially as the number of managed network devices increased on Microsoft’s corporate network. This increase occurred because of higher network utilization by users, applications, and devices as well as more complex configurations.

“When I started at Microsoft in 2015, our network supported 13,000 managed devices,” Dworak says. “Now, we surpassed 17,000. We’re adding more devices because our users want more bandwidth as they move to the cloud so they can do more things on the network.”


Dworak and the Network Automation Delivery Team saw an opportunity to fill a gap in the company’s legacy network-management toolkit. They decided to apply the concept of infrastructure as code to the domain of networking.

“Network as code provides a means to automate network device configuration and transform our culture,” says Steve Kern, a Microsoft Digital senior program manager and leader of the Network Automation Delivery Team.

The members of the Network Automation Delivery Team knew that implementing the concept of network as code would take time, but they had a clear vision.

“If you’ve worked in a networking organization, change can seem like your enemy,” Kern says. “We wanted to make sure changes were controlled and we had a routine, peer-reviewed rhythm of business that accounted for the changes that were pushed out to devices.”

The team has applied the concept of network as code to automate processes like changing the credentials on more than 17,000 devices at Microsoft, which now occurs in days rather than weeks. The team is also looking into regular telemetry data streaming, which would inform asset and configuration management.

“We want network devices to stream data to us, rather than us collecting data from them,” Dworak says. “That way, we can gain a better understanding of our network with a higher granularity than what is available today.”

The Network Automation Delivery Team has been working on the automation process since 2017. To do this, the team members built a Git repository and started with simple automation to gain momentum. Then, they identified other opportunities to apply the concept of GitOps—a set of practices for deployment, management, and monitoring—to deliver network services to Microsoft employees.

Implementing network as code has led to an estimated savings of 15 years of labor and vendor spending on deployments and network device changes. As network technology shifts, so does the role of network engineers.

“We’re freeing up network engineers so they can build better, faster, and more reliable networks,” Kern says. “Our aspiration is that network engineers will become network developers who write the code. Many of them are doing that already.”

Additionally, the team is automating how it troubleshoots and responds to outages. If the company’s network event system detects that a wireless access point (AP) is down, it will automatically conduct diagnostics and attempt to address the AP network outage.

“The building AP is restored to service in less time than it would take to wake up a network engineer in the middle of the night, sign in, and troubleshoot and remediate the problem,” Kern says.
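The automated response described above follows a simple loop: diagnose, apply a known fix, and check whether service is back. The sketch below shows that loop under stated assumptions. It is not the team's event system; `handle_ap_down`, the `port_disabled` finding, the device name, and the in-memory `faults` table are all hypothetical stand-ins for real diagnostics.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RemediationResult:
    device: str
    restored: bool
    steps_run: list[str] = field(default_factory=list)


def handle_ap_down(
    device: str,
    diagnose: Callable[[str], list[str]],
    fixes: dict[str, Callable[[str], None]],
    is_healthy: Callable[[str], bool],
) -> RemediationResult:
    """Run diagnostics, apply each known fix in turn, and stop once the device is healthy."""
    result = RemediationResult(device=device, restored=False)
    for finding in diagnose(device):
        fix = fixes.get(finding)
        if fix is None:
            continue  # no automated fix for this finding; leave it for a human
        fix(device)
        result.steps_run.append(finding)
        if is_healthy(device):
            result.restored = True
            break
    return result


# Simulated fault table standing in for real device diagnostics (hypothetical).
faults = {"ap-bldg1-01": {"port_disabled"}}


def diagnose(device: str) -> list[str]:
    return sorted(faults[device])


def enable_port(device: str) -> None:
    faults[device].discard("port_disabled")


def is_healthy(device: str) -> bool:
    return not faults[device]


outcome = handle_ap_down("ap-bldg1-01", diagnose, {"port_disabled": enable_port}, is_healthy)
print(outcome)
```

When `restored` comes back `False`, the event would still need to be escalated to an engineer; the sketch only covers the part of the workflow that can run unattended.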

Network as code also applies a DevOps mentality to the network domain, using software development and business operations practices to iterate quickly.

“We wanted to bring DevOps principles from the industry and ensure that development and operations teams were one and the same,” Kern says. “If you build something, you own it.”

In the future, the network team hopes to create interfaces for each piece of network gear and have application developers interact with the API during the build process. This would enable the team to run consistent deployments and configurations by restoring a network device entirely from a source-code repository.
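Restoring a device from a repository amounts to comparing the configuration stored in source control with what the device is actually running, then working out what has to change. The sketch below shows that comparison in its simplest form. It is an assumption-laden illustration: `plan_restore` and the sample configuration lines are hypothetical, and real device configurations are hierarchical and order-sensitive, which this flat line-by-line treatment ignores.

```python
def plan_restore(desired: str, running: str) -> dict[str, list[str]]:
    """Compare the repository's desired config with the running config.

    Returns the lines to add and the lines to remove so that the device
    matches the repository. Config lines are treated as a flat, unordered
    set, which is a deliberate simplification.
    """
    desired_lines = [line.strip() for line in desired.splitlines() if line.strip()]
    running_lines = [line.strip() for line in running.splitlines() if line.strip()]
    desired_set = set(desired_lines)
    running_set = set(running_lines)
    return {
        "add": [line for line in desired_lines if line not in running_set],
        "remove": [line for line in running_lines if line not in desired_set],
    }


# Hypothetical configs: the repository copy versus what the device reports.
desired_config = """
hostname sw-bldg1-01
ntp server 10.0.0.1
"""
running_config = """
hostname sw-bldg1-01
ntp server 10.9.9.9
"""

plan = plan_restore(desired_config, running_config)
print(plan)
```

Because the repository is the source of truth, an empty plan means the device already matches its reviewed configuration, and a non-empty plan is itself a record of what drifted.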

Dworak believes that network as code will enable transformation to occur across the company.

“Digital transformation is like remodeling a house. You can remodel your kitchen, living room, and other parts of your house, but first you have to have a solid foundation,” he says. “Your network is part of the foundation—transforming networking will allow others to transform faster.”
