Improving security by protecting elevated-privilege accounts at Microsoft


This story was first published in 2019. We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

An ever-evolving digital landscape is forcing organizations to adapt and expand to stay ahead of innovative and complex security risks. Increasingly sophisticated and targeted threats, including phishing campaigns and malware attacks, attempt to harvest credentials or exploit hardware vulnerabilities that allow movement to other parts of the network, where they can do more damage or gain access to unprotected information.

Like many organizations, Microsoft Digital—our company’s IT organization—used to employ a traditional IT approach to securing the enterprise. We now know that effective security calls for a defense-in-depth approach that requires us to look at the whole environment—and everyone that accesses it—to implement policies and standards that better address risks.

To dramatically limit our attack surface and protect our assets, we developed and implemented our own defense-in-depth approach. This includes new company standards, telemetry, monitoring, tools, and processes to protect administrators and other elevated-privilege accounts.

In an environment where there are too many administrators, or elevated-privilege accounts, there is an increased risk of compromise. When elevated access is persistent or elevated-privilege accounts use the same credentials to access multiple resources, a compromised account can become a major breach.

This story highlights the steps we are taking at Microsoft to protect our environment and administrators, including new programs, tools, and considerations, and the challenges we faced. We will provide some details about the new “Protect the Administrators” program that is positively impacting the Microsoft ecosystem. This program takes security to the next level across the entire enterprise, ultimately changing our digital-landscape security approach.

Understanding defense-in-depth protection

Information protection depicted as a stool with three legs that represent device health, identity management, and data and telemetry.
The three-legged-stool approach to information protection.

Securing all environments within your organization is a great first step in protecting your company. But there's no silver-bullet solution that will magically counter all threats. At Microsoft, information protection rests on a defense-in-depth approach built on device health, identity management, and data and telemetry—a concept illustrated by the three-legged security stool in the graphic above. Getting security right is a balancing act. For a security solution to be effective, it must address all three aspects of risk mitigation on a base of risk management and assurance—or the stool topples over and information protection is at risk.

Risk-based approach

Though we would like to be able to fix everything at once, that simply isn’t feasible. We created a risk-based approach to help us prioritize every major initiative. We used a holistic strategy that evaluated all environments, administrative roles, and access points to help us define our most critical roles and resources within the Microsoft ecosystem. Once defined, we could identify the key initiatives that would help protect the areas that represent the highest levels of risk.

As illustrated in the graphic below, the access-level roles that pose a higher risk should have fewer accounts—helping reduce the impact to the organization and control entry.

The next sections focus primarily on protecting elevated user accounts and the “Protect the Administrators” program. We’ll also discuss key security initiatives that are relevant to other engineering organizations across Microsoft.

Implementing the Protect the Administrators program

Illustration of the risk-role pyramid we use to help prioritize security initiatives.
The risk-role pyramid.

After doing a deeper analysis of our environments, roles, and access points, we developed a multifaceted approach to protecting our administrators and other elevated-privilege accounts. Key solutions include:

  • Working to ensure that our standards and processes are current, and that the enterprise is compliant with them.
  • Creating a targeted reduction campaign to scale down the number of individuals with elevated-privilege accounts.
  • Auditing elevated-privilege accounts and role management to help ensure that only employees who need elevated access retain elevated-access privileges.
  • Creating a High Value Asset (HVA)—an isolated, high-risk environment—to host a secure infrastructure and help reduce the attack surface.
  • Providing secure devices to administrators. Secure admin workstations (SAWs) provide a “secure keyboard” in a locked-down environment that helps curb credential-theft and credential-reuse scenarios.
  • Reporting metrics and data that help us share our story with corporate leadership as well as get buy-in from administrators and other users who have elevated-privilege accounts across the company.

Defining your corporate landscape

In the past, equipment was primarily on-premises, and it was assumed to be easier to keep development, test, and production environments separate, secure, and well-isolated without a lot of crossover. Users often had access to more than one of these environments but used a persistent identity—a unique combination of username and password—to log into all three. After all, it’s easier to remember login information for a persistent identity than it is to create separate identities for each environment. But because we had strict network boundaries, this persistent identity wasn’t a source of concern.

Today, that’s not the case. The advent of the cloud has dissolved the classic network edge. The use of on-premises datacenters, cloud datacenters, and hybrid solutions is common in nearly every company. Using one persistent identity across all environments can increase the attack surface exposed to adversaries. If compromised, it can yield access to all company environments. That’s what makes identity today’s true new perimeter.

At Microsoft, we reviewed our ecosystem to analyze whether we could keep production and non-production environments separate. We used our Red Team/penetration (PEN) testers to help us validate our holistic approach to security, and they provided great guidance on how to further establish a secure ecosystem.

The graphic below illustrates the Microsoft ecosystem, past and present. We have three major types of environments in our ecosystem today: our Microsoft and Microsoft 365 tenants, Microsoft Azure subscriptions, and on-premises datacenters. We now treat them all like a production environment with no division between production and non-production (development and test) environments.

Microsoft ecosystem then and now. Three environment types now: Microsoft and Microsoft 365 tenants, Azure subscriptions, and on-premises datacenters.
Now, everything is considered a “production” environment. We treat our three major environments in the Microsoft ecosystem like production.

Refining roles to reduce attack surfaces

Prior to embarking on the “Protect the Administrators” program, we felt it was necessary to evaluate every role with elevated privileges to determine its level of access and capability within our landscape. Part of the process was to identify tooling that would also protect company security (identity, security, device, and non-persistent access).

Our goal was to provide administrators the means to perform their necessary duties in support of the technical operations of Microsoft with the necessary security tooling, processes, and access capabilities—but with the lowest level of access possible.

The top security threats that every organization faces stem from too many employees having too much persistent access. Every organization’s goal should be to dramatically limit their attack surface and reduce the amount of “traversing” (lateral movement across resources) a breach will allow, should a credential be compromised. This is done by limiting elevated-privilege accounts to employees whose roles require access and by ensuring that the access granted is commensurate with each role. This is known as “least-privileged access.” The first step in reaching this goal is understanding and redefining the roles in your company that require elevated privileges.

Defining roles

We started with basic definitions. An information-worker account does not allow elevated privileges, is connected to the corporate network, and has access to productivity tools that let the user do things like log into SharePoint, use applications like Microsoft Excel and Word, read and send email, and browse the web.

We defined an administrator as a person who is responsible for the development, build, configuration, maintenance, support, and reliable operations of applications, networks, systems, and/or environments (cloud or on-premises datacenters). In general terms, an administrator account is one of the elevated-privilege accounts that has more access than an information worker’s account.

Using role-based controls to establish elevated-privilege roles

We used a role-based access control (RBAC) model to establish which specific elevated-privilege roles were needed to perform the duties required within each line-of-business application in support of Microsoft operations. From there, we deduced a minimum number of accounts needed for each RBAC role and started the process of eliminating the excess accounts. Using the RBAC model, we went back and identified a variety of roles requiring elevated privileges in each environment.

For the Microsoft Azure environments, we used RBAC, built on Microsoft Azure Resource Manager, to manage who has access to Azure resources and to define what they can do with those resources and what areas they have access to. Using RBAC, you can segregate duties within your team and grant to users only the amount of access that they need to perform their jobs. Instead of giving everybody unrestricted permissions in our Azure subscription or resources, we allow only certain actions at a particular scope.
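
As a rough illustration of that scoping, the PowerShell sketch below (using the Az module) grants a hypothetical operator a built-in role at resource-group scope only; the account name, role, and resource group are placeholders rather than our actual configuration.

# Requires the Az PowerShell module; sign in before running.
Connect-AzAccount

# Grant a narrowly scoped built-in role instead of a broad Owner or Contributor assignment.
# The user, role, and resource group below are illustrative placeholders.
New-AzRoleAssignment `
    -SignInName "vm-operator@contoso.com" `
    -RoleDefinitionName "Virtual Machine Contributor" `
    -ResourceGroupName "rg-line-of-business-app"

# Review what the account can now do; nothing outside the resource group has been granted.
Get-AzRoleAssignment -SignInName "vm-operator@contoso.com" |
    Select-Object RoleDefinitionName, Scope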

Performing role attestation

We explored role attestation for administrators who moved laterally within the company to make sure their elevated privileges didn’t move with them into the new roles. Previously, limited checks and balances were in place to ensure that the right privileges were applied or removed when someone’s role changed. We fixed this gap through a quarterly attestation process that requires the individual, the manager, and the role owner to approve continued access to the role.
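
One simple way to support that kind of quarterly review is to export the current role assignments so that managers and role owners can attest to them. The sketch below is a minimal, hypothetical example for Azure RBAC; our internal attestation tooling covers far more than this.

# Snapshot current Azure RBAC assignments in a subscription for an attestation review.
Connect-AzAccount
Set-AzContext -Subscription "00000000-0000-0000-0000-000000000000"   # placeholder subscription ID

Get-AzRoleAssignment |
    Select-Object DisplayName, SignInName, RoleDefinitionName, Scope |
    Sort-Object RoleDefinitionName |
    Export-Csv -Path ".\role-attestation-$(Get-Date -Format yyyy-MM-dd).csv" -NoTypeInformation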

Implementing least-privileged access

We identified those roles that absolutely required elevated access, but not all elevated-privilege accounts are created equal. Limiting the attack surface visible to potential aggressors depends not only on reducing the number of elevated-privilege accounts. It also relies on only providing elevated-privilege accounts with the least-privileged access needed to get their respective jobs done.

For example, consider the idea of crown jewels kept in the royal family’s castle. There are many roles within the operations of the castle, such as the king, the queen, the cook, the cleaning staff, and the royal guard. Not everyone can or should have access everywhere. The king and queen hold the only keys to the crown jewels. The cook needs access only to the kitchen, the larder, and the dining room. The cleaning staff needs limited access everywhere, but only to clean, and the royal guard needs access to areas where the king and queen are. No one other than the king and queen, however, needs access to the crown jewels. This system of restricted access provides two benefits:

  • Only those who absolutely require access to a castle area have keys, and only to perform their assigned jobs, nothing more. If the cook tries to access the crown jewels, security alarms notify the royal guard, along with the king and queen.
  • Only two people, the king and queen, have access to the crown jewels. Should anything happen to the crown jewels, a targeted evaluation of those two people takes place and doesn’t require involvement of the cook, the cleaning staff, or the royal guard because they don’t have access.

This is the concept of least-privileged access: We only allow you access to a specific role to perform a specific activity within a specific amount of time from a secure device while logged in from a secure identity.
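
In Azure, one way to express this idea is a custom RBAC role that allows only the specific actions a duty requires and can only be assigned at a narrow scope. The sketch below is illustrative; the role name, actions, and scope are assumptions, not one of our actual role definitions.

# Clone a built-in role and trim it down to only the actions this duty needs.
$role = Get-AzRoleDefinition -Name "Virtual Machine Contributor"
$role.Id = $null
$role.Name = "VM Operator (restart only)"          # hypothetical role name
$role.Description = "Can view and restart VMs in one resource group; nothing else."
$role.Actions.Clear()
$role.Actions.Add("Microsoft.Compute/virtualMachines/read")
$role.Actions.Add("Microsoft.Compute/virtualMachines/restart/action")
$role.AssignableScopes.Clear()
$role.AssignableScopes.Add("/subscriptions/<subscription-id>/resourceGroups/rg-line-of-business-app")

New-AzRoleDefinition -Role $role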

Creating a secure high-risk environment

We can’t truly secure our devices without having a highly secure datacenter to build and house our infrastructure. We used HVA to implement a multitiered and highly secure high-risk environment (HRE) for isolated hosting. We treated our HRE as a private cloud that lives inside a secure datacenter and is isolated from dependencies on external systems, teams, and services. Our secure tools and services are built within the HRE.

Traditional corporate networks were typically walled only at the external perimeters. Once an attacker gained access, it was easier for a breach to move across systems and environments. Production servers often resided on the same network segments, or at the same levels of access, as clients, so compromising a client inherently granted access to servers and systems. If you start building new systems but are still dependent on older tools and services that run in your production environment, it’s hard to break those dependencies. Each one increases your risk of compromise.

It’s important to remember that security awareness requires ongoing hygiene. New tools, resources, portals, and functionality are constantly coming online or being updated. For example, certain web browsers sometimes release updates weekly. We must continually review and approve the new releases, and then repackage and deploy the replacement to approved locations. Many companies don’t have a thorough application-review process, which increases their attack surface due to poor hygiene (for example, multiple versions, third-party and malware-infested application challenges, unrestricted URL access, and lack of awareness).

The initial challenge we faced was discovering all the applications and tools that administrators were using so we could review, certify, package, and sign them as approved applications for use in the HRE and on SAWs. We also needed to implement a thorough application-review process, specific to the applications in the HRE.

Our HRE was built as a trust-nothing environment. It’s isolated from other less-secure systems within the company and can only be accessed from a SAW—making it harder for adversaries to move laterally through the network looking for the weakest link. We use a combination of automation, identity isolation, and traditional firewall isolation techniques to maintain boundaries between servers, services, and the customers who use them. Admin identities are distinct from standard corporate identities and subject to more restrictive credential- and lifecycle-management practices. Admin access is scoped according to the principle of least privilege, with separate admin identities for each service. This isolation limits the scope that any one account could compromise. Additionally, every setting and configuration in the HRE must be explicitly reviewed and defined. The HRE provides a highly secure foundation that allows us to build protected solutions, services, and systems for our administrators.
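
We can’t share the specifics of the HRE’s isolation, but the principle of explicitly defining every allowed path can be illustrated with a default-deny network security group in Azure. The rule names, ports, and address ranges below are purely illustrative assumptions.

# Illustrative default-deny posture: allow one management path, deny all other inbound traffic.
$allowSawMgmt = New-AzNetworkSecurityRuleConfig -Name "Allow-SAW-Management" `
    -Description "Only the SAW jump subnet may reach management endpoints" `
    -Access Allow -Protocol Tcp -Direction Inbound -Priority 100 `
    -SourceAddressPrefix "10.10.1.0/24" -SourcePortRange * `
    -DestinationAddressPrefix "10.20.0.0/16" -DestinationPortRange 443

$denyAllInbound = New-AzNetworkSecurityRuleConfig -Name "Deny-All-Inbound" `
    -Description "Anything not explicitly allowed is denied" `
    -Access Deny -Protocol * -Direction Inbound -Priority 4096 `
    -SourceAddressPrefix * -SourcePortRange * `
    -DestinationAddressPrefix * -DestinationPortRange *

New-AzNetworkSecurityGroup -Name "nsg-hre-isolated" -ResourceGroupName "rg-hre" `
    -Location "westus2" -SecurityRules $allowSawMgmt, $denyAllInbound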

Secure devices

Secure admin workstations (SAWs) are limited-use client machines that substantially reduce the risk of compromise. They are an important part of our layered, defense-in-depth approach to security. A SAW doesn’t grant rights to any actual resources—it provides a “secure keyboard” in which an administrator can connect to a secure server, which itself connects to the HRE.

A SAW is an administrative and productivity device in one, designed and built by Microsoft for one of our most critical resources—our administrators. Each administrator has a single device, a SAW, where they have a hosted virtual machine (VM) to perform their administrative duties and a corporate VM for productivity work like email, Microsoft 365 products, and web browsing.

When working, administrators must keep their secure devices with them, and they are responsible for them at all times. This requirement mandated that the secure device be portable. As a result, we developed a laptop that’s a securely controlled and provisioned workstation. It’s designed for managing valuable production systems and performing daily activities like email, document editing, and development work. The administrative partition in the SAW curbs credential-theft and credential-reuse scenarios by locking down the environment. The productivity partition is a VM with access like any other corporate device.

The SAW host is a restricted environment:

  • It allows only signed or approved applications to run.
  • The user doesn’t have local administrative privileges on the device.
  • By design, the user can browse only a restricted set of web destinations.
  • All automatic updates from external parties and third-party add-ons or plug-ins are disabled.

Again, the SAW controls are only as good as the environment that holds them, which means that the SAW isn’t possible without the HRE. Maintaining adherence to SAW and HRE controls requires an ongoing operational investment, similar to any Infrastructure as a Service (IaaS). Our engineers code-review and code-sign all applications, scripts, tools, and any other software that operates or runs on top of the SAW. The administrator user has no ability to download new scripts, coding modules, or software outside of a formal software distribution system. Anything added to the SAW gets reviewed before it’s allowed on the device.
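
As a simplified illustration of that gate, a SAW can refuse to run any script whose Authenticode signature isn’t valid. The check below is a sketch of the idea, not our actual enforcement mechanism, which relies on code-integrity policy and a formal distribution pipeline; the tool path is a hypothetical example.

# Minimal sketch: verify a script carries a valid Authenticode signature before it runs on a SAW.
$scriptPath = "C:\AdminTools\Invoke-MaintenanceTask.ps1"   # hypothetical tool path

$signature = Get-AuthenticodeSignature -FilePath $scriptPath
if ($signature.Status -ne 'Valid') {
    throw "Blocked: '$scriptPath' is not signed with a valid, trusted certificate."
}

# In practice, enforcement comes from application-control policy plus an AllSigned execution policy,
# so unsigned or tampered scripts never execute in the first place.
Write-Output "Signature OK: signed by $($signature.SignerCertificate.Subject)"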

As we onboard an internal team onto SAW, we work with them to ensure that their services and endpoints are accessible using a SAW device. We also help them integrate their processes with SAW services.

Provisioning the administrator

Once a team has adopted the new company standard of requiring administrators to use a SAW, we deploy the Microsoft Azure-based Conditional Access (CA) policy. As part of CA policy enforcement, administrators can’t use their elevated privileges without a SAW. Between the time that an administrator places an order and receives the new SAW, we provide temporary access to a SAW device so they can still get their work done.
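
The shape of such a Conditional Access policy can be sketched with the Microsoft Graph PowerShell SDK. In the example below, privileged sign-ins are blocked from any device that doesn’t match a SAW device filter; the group ID, the extensionAttribute1 tag, and the report-only state are assumptions for illustration, not our production policy.

# Requires the Microsoft Graph PowerShell SDK.
Connect-MgGraph -Scopes "Policy.ReadWrite.ConditionalAccess"

$policy = @{
    displayName = "Require SAW for privileged accounts (illustrative)"
    state       = "enabledForReportingButNotEnforced"        # report-only while piloting
    conditions  = @{
        users        = @{ includeGroups = @("<privileged-accounts-group-id>") }   # placeholder
        applications = @{ includeApplications = @("All") }
        devices      = @{
            deviceFilter = @{
                mode = "exclude"                                # SAWs are excluded from the block
                rule = 'device.extensionAttribute1 -eq "SAW"'   # hypothetical device tag
            }
        }
    }
    grantControls = @{ operator = "OR"; builtInControls = @("block") }
}

New-MgIdentityConditionalAccessPolicy -BodyParameter $policy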

We ensure security at every step within our supply chain. That includes using a dedicated manufacturing line exclusive to SAWs, ensuring chain of custody from manufacturing to end-user validation. Since SAWs are built and configured for the specific user rather than pulling from existing inventory, the process is much different from how we provision standard corporate devices. The additional security controls in the SAW supply chain add complexity and can make scaling a challenge from the global-procurement perspective.

Supporting the administrator

SAWs come with dedicated, security-aware support services from our Secure Admin Services (SAS) team. The SAS team is responsible for the HRE and the critical SAW devices—providing around-the-clock role-service support to administrators.

The SAS team owns and supports a service portal that facilitates SAW ordering and fulfillment, role management for approved users, application and URL hosting, SAW assignment, and SAW reassignment. They’re also available in a development operations (DevOps) model to assist the teams that are adopting SAWs.

As different organizations within Microsoft choose to adopt SAWs, the SAS team works to ensure they understand what they are signing up for. The team provides an overview of their support and service structure and the HRE/SAW solution architecture, as illustrated in the graphic below.

A high-level overview of the HRE/SAW solution architecture, including SAS team and DevOps support services.
An overview of an isolated HRE, a SAW, and the services that help support administrators.

Today, the SAS team provides support service to more than 40,000 administrators across the company. We have more work to do as we enforce SAW usage across all teams in the company and stretch into different roles and responsibilities.

Password vaulting

The password-vaulting service allows passwords to be securely encrypted and stored for future retrieval. This eliminates the need for administrators to remember passwords, which has often resulted in passwords being written down, shared, and compromised.

SAS Password Vaulting is composed of two internal, custom services currently offered through our SAS team:

  • A custom solution to manage domain-based service accounts and shared password lists.
  • A local administrator password solution (LAPS) to manage server-local administrator and integrated Lights-Out (iLO) device accounts.

Password management is further enhanced by the service’s capability to automatically generate and roll complex random passwords. This ensures that privileged accounts have high-strength passwords that are changed regularly and reduces the risk of credential theft.
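
Our vaulting services are internal and custom, but the underlying pattern, generate a high-entropy random password and keep it only in an encrypted vault, can be sketched with Azure Key Vault standing in for our service. The vault name, secret name, and character set are placeholders.

# Generate a 32-character password from a cryptographically secure random number generator.
$charset  = [char[]]'ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz23456789!@#$%^&*'
$bytes    = [byte[]]::new(32)
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
$password = -join ($bytes | ForEach-Object { $charset[$_ % $charset.Length] })   # slight modulo bias is fine for a sketch

# Store the value in a vault instead of writing it down; Key Vault is used here purely as an illustration.
$secret = ConvertTo-SecureString -String $password -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName "kv-admin-passwords" -Name "svc-lob-app-01" -SecretValue $secret

# Retrieval is on demand and auditable; the administrator never has to remember the value.
Get-AzKeyVaultSecret -VaultName "kv-admin-passwords" -Name "svc-lob-app-01"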

Administrative policies

We’ve put administrative policies in place for privileged-account management. They’re designed to protect the enterprise from risks associated with elevated administrative rights. Microsoft Digital reduces attack vectors with an assortment of security services, including SAS and Identity and Access Management, that enhance the security posture of the business. Especially important is the implementation of usage metrics for threat and vulnerability management. When a threat or vulnerability is detected, we work with our Cyber Defense Operations Center (CDOC) team. Using a variety of monitoring systems through data and telemetry measures, we ensure that compliance and enforcement teams are notified immediately. Their engagement is key to keeping the ecosystem secure.

Just-in-time entitlement system

Least-privileged access paired with a just-in-time (JIT) entitlement system provides the least amount of access to administrators for the shortest period of time. A JIT entitlement system allows users to elevate their entitlements for limited periods of time to complete elevated-privilege and administrative duties. The elevated privileges normally last between four and eight hours.

JIT allows removal of users’ persistent administrative access (via Active Directory Security Groups) and replaces those entitlements with the ability to elevate into roles on-demand and just-in-time. We used proper RBAC approaches with an emphasis on providing access only to what is absolutely required. We also implemented access controls to remove excess access (for example, Global Administrator or Domain Administrator privileges). An example of how JIT is part of our overarching defense-in-depth strategy is a scenario in which an administrator’s smartcard and PIN are stolen. Even with the physical card and the PIN, an attacker would have to successfully navigate a JIT workflow process before the account would have any access rights.
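
Our JIT system is built on dedicated entitlement tooling, but the core mechanic, grant a narrowly scoped role for a fixed window and then remove it, can be shown with a toy Azure RBAC sketch. The account, role, scope, and four-hour window below are placeholder assumptions, not how our production workflow is implemented.

# Toy JIT elevation: grant a scoped role now, remove it when the approved window closes.
$elevation = @{
    SignInName         = "admin-contoso@contoso.com"        # isolated admin identity (placeholder)
    RoleDefinitionName = "Virtual Machine Contributor"
    ResourceGroupName  = "rg-line-of-business-app"
}

New-AzRoleAssignment @elevation
Write-Output "Elevated at $(Get-Date); access is approved for 4 hours."

# A real entitlement system tracks and revokes the grant automatically; here we simply wait and remove it.
Start-Sleep -Seconds (4 * 60 * 60)
Remove-AzRoleAssignment @elevation
Write-Output "Elevation removed at $(Get-Date)."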

Key Takeaways

In the three years this project has been going on, we have learned that an ongoing commitment and investment are critical to providing defense-in-depth protection in an ever-evolving work environment. We have learned a few things that could help other companies as they decide to better protect their administrators and, thus, their company assets:

  • Securing all environments. We needed to evolve the way we looked at our environments. Through evolving company strategy and our Red Team/PEN testing, it has been proven numerous times that successful system attacks take advantage of weak controls or bad hygiene in a development environment to access and cause havoc in production.
  • Influencing, rather than forcing, cultural change. Microsoft employees have historically had the flexibility and freedom to do amazing things with the products and technology they had on hand. Efforts to impose any structure, rigor, or limitation on that freedom can be challenging. Taking people’s flexibility away from them, even in the name of security, can generate friction. Inherently, employees want to do the right thing when it comes to security and will adopt new and better processes and tools as long as they understand the need for them. Full support of the leadership team is critical in persuading users to change how they think about security. It was important that we developed compelling narratives for areas of change, and had the data and metrics to reinforce our messaging.
  • Scaling SAW procurement. We secure every aspect of the end-to-end supply chain for SAWs. This level of diligence does result in more oversight and overhead. While there might be some traction around the concept of providing SAWs to all employees who have elevated-access roles, it would still be very challenging for us to scale to that level of demand. From a global perspective, it is also challenging to ensure the required chain of custody to get SAWs into the hands of administrators in more remote countries and regions. To help us overcome the challenges of scale, we used a phased approach to roll out the Admin SAW policy and provision SAWs.
  • Providing a performant SAW experience for the global workforce. We aim to provide a performant experience for all users, regardless of their location. We have users around the world, in most major countries and regions. Supporting our global workforce has required us to think through and deal with some interesting issues regarding the geo-distribution of services and resources. For instance, locations like China and some places in Europe are challenging because of connectivity requirements and performance limitations. Enforcing SAW in a global company has meant dealing with these issues so that an administrator, no matter where they are located, can effectively complete necessary work.

What’s next

As we stated before, there are no silver-bullet solutions when it comes to security. As part of our defense-in-depth approach to an ever-evolving threat landscape, there will always be new initiatives to drive.

Recently, we started exploring how to separate our administrators from our developers and using a different security approach for the developer roles. In general, developers require more flexibility than administrators.

There also continue to be many other security initiatives around device health, identity and access management, data loss protection, and corporate networking. We’re also working on the continued maturity of our compliance and governance policies and procedures.

Getting started

While it has taken us years to develop, implement, and refine our multitiered, defense-in-depth approach to security, there are some solutions that you can adopt now as you begin your journey toward improving the state of your organization’s security:

  • Design and enforce hygiene. Ensure that you have the governance in place to drive compliance. This includes controls, standards, and policies for the environment, applications, identity and access management, and elevated access. It’s also critical that standards and policies are continually refined to reflect changes in environments and security threats. Implement governance and compliance to enforce least-privileged access. Monitor resources and applications for ongoing compliance and ensure that your standards remain current as roles evolve.
  • Implement least-privileged access. Using proper RBAC approaches with an emphasis on providing access only to what is absolutely required is the concept of least-privileged access. Add the necessary access controls to remove the need for Global Administrator or Domain Administrator access. Just provide everyone with the access that they truly need. Build your applications, environments, and tools to use RBAC roles, and clearly define what each role can and can’t do.
  • Remove all persistent access. All elevated access should require JIT elevation. It requires an extra step to get temporary secure access before performing elevated-privilege work. Setting persistent access to expire when it’s no longer necessary narrows your exposed attack surface.
  • Provide isolated elevated-privilege credentials. Using an isolated identity substantially reduces the possibility of compromise after a successful phishing attack. Admin accounts without an inbox have no email to phish. Keeping the information-worker credential separate from the elevated-privilege credential reduces the attack surface.

Microsoft Services can help

Customers interested in adopting a defense-in-depth approach to increase their security posture might want to consider implementing Privileged Access Workstations (PAW). PAWs are a key element of the Enhanced Security Administrative Environment (ESAE) reference architecture deployed by the cybersecurity professional services teams at Microsoft to protect customers against cybersecurity attacks.

For more information about engaging Microsoft Services to deploy PAWs or ESAE for your environment, contact your Microsoft representative or visit the Microsoft Security page.

Reaping the rewards

Over the last two years we’ve had an outside security audit expert perform a Cyber Essentials Plus certification process. In 2017, the security audit engineers couldn’t run most of their baseline tests because the SAW’s baseline configuration was so locked down. They said it was the “most secure administrative-client audit we’ve ever completed.”

In 2018, the security audit engineer said, “I had no chance; you have done everything right,” and added, “You are so far beyond what any other company in the industry is doing.”

Also, in 2018, our SAW project won a CSO50 Award, which recognizes security projects and initiatives that demonstrate outstanding business value and thought leadership. SAW was commended as an innovative practice and a core element of the network security strategy at Microsoft.

Ultimately, the certifications and awards help validate our defense-in-depth approach. We are building and deploying the correct solutions to support our ongoing commitment to securing Microsoft and our customers’ and partners’ information. It’s a pleasure to see that solution recognized as a leader in the industry.

Enhancing VPN performance at Microsoft

[Editor’s note: This content was written to highlight a particular event or moment in time. Although that moment has passed, we’re republishing it here so you can see what our thinking and experience was like at the time.]

Modern workers are increasingly mobile and require the flexibility to get work done outside of the office. Here at Microsoft headquarters in the Puget Sound area of Washington State, every weekday an average of 45,000 to 55,000 Microsoft employees use a virtual private network (VPN) connection to remotely connect to the corporate network. As part of our overall Zero Trust Strategy, we have redesigned our VPN infrastructure, something that has simplified our design and let us consolidate our access points. This has enabled us to increase capacity and reliability, while also reducing reliance on VPN by moving services and applications to the cloud.

Providing a seamless remote access experience

Remote access at Microsoft is reliant on the VPN client, our VPN infrastructure, and public cloud services. We have had several iterative designs of the VPN service inside Microsoft. Regional weather events in the past required large increases in employees working from home, heavily taxing the VPN infrastructure and requiring a completely new design. Three years ago, we built an entirely new VPN infrastructure, a hybrid design, using Microsoft Azure load balancing and Microsoft Azure Active Directory (Azure AD) identity services with gateway appliances across our global sites.

Key to our success in the remote access experience was our decision to deploy a split-tunneled configuration for the majority of employees. We have migrated nearly 100% of previously on-premises resources into Microsoft Azure and Microsoft Office 365. Our continued efforts in application modernization are reducing the traffic on our private corporate networks as cloud-native architectures allow direct internet connections. The shift to internet-accessible applications and a split-tunneled VPN design has dramatically reduced the load on VPN servers in most areas of the world.

Using VPN profiles to improve the user experience

We use Microsoft Endpoint Manager to manage our domain-joined and Microsoft Azure AD–joined computers and mobile devices that have enrolled in the service. In our configuration, VPN profiles are replicated through Microsoft Intune and applied to enrolled devices; this includes certificates that we issue through Configuration Manager for Windows 10 devices. We support Mac and Linux device VPN connectivity with a third-party client using SAML-based authentication.

We use certificate-based authentication (public key infrastructure, or PKI) and multi‑factor authentication solutions. When employees first use the Auto-On VPN connection profile, they are prompted to authenticate strongly. Our VPN infrastructure supports Windows Hello for Business and Multi-Factor Authentication. It stores a cryptographically protected certificate upon successful authentication that allows for either persistent or automatic connection.

For more information about how we use Microsoft Intune and Endpoint Manager as part of our device management strategy, see Managing Windows 10 devices with Microsoft Intune.

Configuring and installing VPN connection profiles

We created VPN profiles that contain all the information a device requires to connect to the corporate network, including the supported authentication methods and the VPN gateways that the device should connect to. We created the connection profiles for domain-joined and Microsoft Intune–managed devices using Microsoft Endpoint Manager.

For more information about creating VPN profiles, see VPN profiles in Configuration Manager and How to Create VPN Profiles in Configuration Manager.

The Microsoft Intune custom profile for Intune-managed devices uses Open Mobile Alliance Uniform Resource Identifier (OMA-URI) settings with XML data type, as illustrated below.

Creating a Profile XML and editing the OMA-URI settings to create a connection profile in System Center Configuration Manager.
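
For context, a trimmed-down ProfileXML payload of the kind delivered through an OMA-URI setting might look like the sketch below. The gateway address, route prefix, and the omission of the EAP authentication block are illustrative assumptions, not our production profile.

# Illustrative VPNv2 ProfileXML, commonly delivered through an OMA-URI such as
# ./User/Vendor/MSFT/VPNv2/<ProfileName>/ProfileXML with an XML data type.
$profileXml = @'
<VPNProfile>
  <AlwaysOn>true</AlwaysOn>
  <DeviceCompliance>
    <Enabled>true</Enabled>
  </DeviceCompliance>
  <NativeProfile>
    <Servers>vpn.contoso.com</Servers>                 <!-- placeholder gateway -->
    <NativeProtocolType>IKEv2</NativeProtocolType>
    <RoutingPolicyType>SplitTunnel</RoutingPolicyType>
    <!-- EAP/certificate authentication configuration omitted for brevity -->
  </NativeProfile>
  <Route>
    <Address>10.0.0.0</Address>                        <!-- example corporate prefix -->
    <PrefixSize>8</PrefixSize>
  </Route>
</VPNProfile>
'@

$profileXml   # pasted into the OMA-URI value (XML data type) when building the Intune custom profile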

Installing the VPN connection profile

The VPN connection profile is installed using a script on domain-joined computers running Windows 10, through a policy in Endpoint Manager.
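
A heavily simplified version of such an installation script might look like the following. The connection name, gateway address, and tunnel settings are placeholders, and the real script also imports the EAP and certificate configuration used for authentication.

# Simplified sketch of a VPN connection profile installation for a Windows 10 device.
$connectionName = "Contoso Corp VPN"       # placeholder profile name
$vpnGateway     = "vpn.contoso.com"        # placeholder gateway FQDN

if (-not (Get-VpnConnection -Name $connectionName -ErrorAction SilentlyContinue)) {
    Add-VpnConnection -Name $connectionName `
        -ServerAddress $vpnGateway `
        -TunnelType Ikev2 `
        -AuthenticationMethod MachineCertificate `
        -SplitTunneling `
        -RememberCredential `
        -PassThru
}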

For more information about how we use Microsoft Intune as part of our mobile device management strategy, see Mobile device management at Microsoft.

Conditional Access

We use an optional feature that checks device health and corporate policy compliance before allowing a device to connect. Conditional Access is supported with connection profiles, and we’ve started using this feature in our environment.

Rather than just relying on the managed device certificate for a “pass” or “fail” for VPN connection, Conditional Access places machines in a quarantined state while checking for the latest required security updates and antivirus definitions to help ensure that the system isn’t introducing risk. On every connection attempt, the system health check looks for a certificate that confirms the device is still compliant with corporate policy.

Certificate and device enrollment

We use an Azure AD certificate for single sign-on to the VPN connection profile. And we currently use Simple Certificate Enrollment Protocol (SCEP) and Network Device Enrollment Service (NDES) to deploy certificates to our mobile devices via Microsoft Endpoint Manager. The SCEP certificate we use is for wireless and VPN. NDES allows software on routers and other network devices running without domain credentials to obtain certificates based on SCEP.

NDES performs the following functions:

  1. It generates and provides one-time enrollment passwords to administrators.
  2. It submits enrollment requests to the certificate authority (CA).
  3. It retrieves enrolled certificates from the CA and forwards them to the network device.

For more information about deploying NDES, including best practices, see Securing and Hardening Network Device Enrollment Service for Microsoft Intune and System Center Configuration Manager.

VPN client connection flow

The diagram below illustrates the VPN client-side connection flow.

A graphic representation of the client connection workflow. Sections shown are client components, Azure components, and site components.
The client-side VPN connection flow.

When a device-compliance–enabled VPN connection profile is triggered (either manually or automatically):

  1. The VPN client calls into the Windows 10 Azure AD Token Broker on the local device and identifies itself as a VPN client.
  2. The Azure AD Token Broker authenticates to Azure AD and provides it with information about the device trying to connect. A device check is performed by Azure AD to determine whether the device complies with our VPN policies.
  3. If the device is compliant, Azure AD requests a short-lived certificate. If the device isn’t compliant, we perform remediation steps.
  4. Azure AD pushes down a short-lived certificate to the Certificate Store via the Token Broker. The Token Broker then returns control back over to the VPN client for further connection processing.
  5. The VPN client uses the Azure AD–issued certificate to authenticate with the VPN gateway.

Remote access infrastructure

At Microsoft, we have designed and deployed a hybrid infrastructure to provide remote access for all the supported operating systems—using Azure for load balancing and identity services and specialized VPN appliances. We had several considerations when designing the platform:

  • Redundancy. The service needed to be highly resilient so that it could continue to operate if a single appliance, site, or even large region failed.
  • Capacity. As a worldwide service meant to be used by the entire company and to handle the expected growth of VPN, the solution had to be sized with enough capacity to handle 200,000 concurrent VPN sessions.
  • Homogenized site configuration. A standard hardware and configuration stamp was a necessity both for initial deployment and operational simplicity.
  • Central management and monitoring. We ensured end-to-end visibility through centralized data stores and reporting.
  • Azure AD­–based authentication. We moved away from on-premises Active Directory and used Azure AD to authenticate and authorize users.
  • Multi-device support. We had to build a service that could be used by as much of the ecosystem as possible, including Windows, OSX, Linux, and appliances.
  • Automation. Being able to programmatically administer the service was critical. It needed to work with existing automation and monitoring tools.

When we were designing the VPN topology, we considered the location of the resources that employees were accessing when they were connected to the corporate network. If most of the connections from employees at a remote site were to resources located in central datacenters, more consideration was given to bandwidth availability and connection health between that remote site and the destination. In some cases, additional network bandwidth infrastructure has been deployed as needed. The illustration below provides an overview of our remote access infrastructure.

VPN infrastructure. Diagram shows the connection from the internet to Azure traffic manager profiles, then to the VPN site.
Microsoft remote access infrastructure.
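
The Azure Traffic Manager layer shown in the diagram can be sketched roughly as follows; the profile name, DNS label, endpoint names, and health-probe path are illustrative assumptions rather than our production configuration.

# Illustrative Traffic Manager profile that steers VPN clients to the closest healthy site.
$rg = "rg-vpn-infrastructure"   # placeholder resource group

$tmProfile = New-AzTrafficManagerProfile -Name "vpn-global" -ResourceGroupName $rg `
    -TrafficRoutingMethod Performance -RelativeDnsName "contoso-vpn" -Ttl 30 `
    -MonitorProtocol HTTPS -MonitorPort 443 -MonitorPath "/health"

# Each VPN site is registered as an external endpoint with its region so clients resolve to the nearest one.
New-AzTrafficManagerEndpoint -Name "vpn-westus" -ProfileName $tmProfile.Name -ResourceGroupName $rg `
    -Type ExternalEndpoints -Target "vpn-westus.contoso.com" -EndpointLocation "westus2" -EndpointStatus Enabled

New-AzTrafficManagerEndpoint -Name "vpn-europe" -ProfileName $tmProfile.Name -ResourceGroupName $rg `
    -Type ExternalEndpoints -Target "vpn-europe.contoso.com" -EndpointLocation "westeurope" -EndpointStatus Enabled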

VPN tunnel types

Our VPN solution provides network transport over Secure Sockets Layer (SSL). The VPN appliances force Transport Layer Security (TLS) 1.2 for SSL session initiation, and the strongest possible cipher suite negotiated is used for the VPN tunnel encryption. We use several tunnel configurations depending on the locations of users and level of security needed.

Split tunneling

Split tunneling allows only the traffic destined for the Microsoft corporate network to be routed through the VPN tunnel, and all internet traffic goes directly through the internet without traversing the VPN tunnel or infrastructure. Our migration to Office 365 and Azure has dramatically reduced the need for connections to the corporate network. We rely on the security controls of applications hosted in Azure and services of Office 365 to help secure this traffic. For endpoint protection, we use Microsoft Defender Advanced Threat Protection on all clients. In our VPN connection profile, split tunneling is enabled by default and used by the majority of Microsoft employees. Learn more about Office 365 split tunnel configuration.
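
On the client, the practical effect is that only corporate prefixes are routed into the tunnel. A minimal sketch, using example prefixes that stand in for our actual corporate address space:

# Enable split tunneling on an existing profile and route only corporate prefixes through it.
$connectionName = "Contoso Corp VPN"   # placeholder; matches the profile installed earlier

Set-VpnConnection -Name $connectionName -SplitTunneling $true

# Only traffic to these (illustrative) corporate ranges uses the tunnel; everything else goes direct.
Add-VpnConnectionRoute -ConnectionName $connectionName -DestinationPrefix "10.0.0.0/8"
Add-VpnConnectionRoute -ConnectionName $connectionName -DestinationPrefix "192.168.100.0/24"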

Full tunneling

Full tunneling routes and encrypts all traffic through the VPN. There are some countries and business requirements that make full tunneling necessary. This is accomplished by running a distinct VPN configuration on the same infrastructure as the rest of the VPN service. A separate VPN profile is pushed to the clients who require it, and this profile points to the full-tunnel gateways.

Full tunnel with high security

Our IT employees and some developers access company infrastructure or extremely sensitive data. These users are given Privileged Access Workstations, which are secured, limited, and connect to a separate highly controlled infrastructure.

Applying and enforcing policies

In Microsoft Digital, the Conditional Access administrator is responsible for defining the VPN Compliance Policy for domain-joined Windows 10 desktops, including enterprise laptops and tablets, within the Microsoft Azure Portal administrative experience. This policy is then published so that the enforcement of the applied policy can be managed through Microsoft Endpoint Manager. Microsoft Endpoint Manager provides policy enforcement, as well as certificate enrollment and deployment, on behalf of the client device.

For more information about policies, see VPN and Conditional Access.

Early adopters help validate new policies

With every new Windows 10 update, we rolled out a pre-release version to a group of about 15,000 early adopters a few months before its release. Early adopters validated the new credential functionality and used remote access connection scenarios to provide valuable feedback that we could take back to the product development team. Using early adopters helped validate and improve features and functionality, influenced how we prepared for the broader deployment across Microsoft, and helped us prepare support channels for the types of issues that employees might experience.

Measuring service health

We measure many aspects of the VPN service and report on the number of unique users that connect every month, the number of daily users, and the duration of connections. We have invested heavily in telemetry and automation throughout the Microsoft network environment. Telemetry allows for data-driven decisions in making infrastructure investments and identifying potential bandwidth issues ahead of saturation.

Using Power BI to customize operational insight dashboards

Our service health reporting is centralized using Power BI dashboards to display consolidated data views of VPN performance. Data is aggregated into an SQL Azure data warehouse from VPN appliance logging, network device telemetry, and anonymized device performance data. These dashboards, shown in the next two graphics below, are tailored for the teams using them.

A map is shown with icons depicting the status of each VPN site globally. All are in a good state.
Global VPN status dashboard.

Six graphs are shown in the VPN performance reporting dashboards. They include peak internet usage, peak VPN bandwidth, and peak VPN concurrent sessions.
Microsoft Power BI reporting dashboards.

Key Takeaways

With our optimizations in VPN connection profiles and improvements in the infrastructure, we have seen significant benefits:

  • Reduced VPN requirements. By moving to cloud-based services and applications and implementing split tunneling configurations, we have dramatically reduced our reliance on VPN connections for many users at Microsoft.
  • Auto-connection for improved user experience. VPN connection profiles that are automatically configured for connection and authentication types have improved mobile productivity. They also improve the user experience by giving employees the option to stay connected to VPN—without additional interaction after signing in.
  • Increased capacity and reliability. Reducing the quantity of VPN sites and investing in dedicated VPN hardware has increased our capacity and reliability, now supporting over 500,000 simultaneous connections.
  • Service health visibility. By aggregating data sources and building a single pane of glass in Microsoft Power BI, we have visibility into every aspect of the VPN experience.

Understanding Microsoft’s digital transformation

Our Microsoft Digital Employee Experience (MDEE) team builds and operates the systems that run Microsoft, and as such, we’re leading the company’s internal digital transformation. We’re doing this by rethinking traditional IT and business operations, and by driving innovation and productivity for our 220,000-plus employees worldwide. Our digital transformation is fueled by improving our ability to empower our employees, engage our customers and partners, optimize our operations, and transform our products.

The need for digital transformation

The need for our digital transformation is evident—the global pandemic has created challenges for every organization, from employee placement to supply chain management, to continued retail operations. The investments that Microsoft has made in digital transformation have helped us respond quickly and efficiently to the frequent changes brought by the COVID-19 pandemic.

Our continued digital transformation will enable Microsoft to further its mission of empowering every person and every organization on the planet to achieve more, and it starts right here at home, with MDEE. Every new challenge presents an opportunity to assess our role in the organization and how we can put Microsoft in an even better position to take on new challenges.

Disruptions have always been a catalyst for business transformation. To stay at the forefront, we’re becoming more agile, efficient, and innovative. This means changing our systems and processes to support and quickly adapt to new products, services, business models, regulations, and anything else that comes our way.

Leading with vision and world-class execution

Leading with vision is the primary driver of our digital transformation. MDEE powers the company, and we are critical to both internal and external customers. To lead with vision, we need a clearly articulated view of where we want to take things and what we need to get there. Aligning our work to a larger vision of what we want to accomplish pushes us past day-to-day fire drills and comfortable routines to deliver something truly great for Microsoft.  Each one of our groups has a clear, targeted vision grounded in what our customers need and what we need as an organization. However, articulating the vision is not enough. An inspired and productive vision must accurately reflect what we actually do.

Vision is the foundation for the major decisions we make, not a document that we write once a year and put on a shelf. A strong connection between vision and work can be built by telling a story. The vision should create a narrative that informs our day-to-day decisions at every level. Each choice, no matter how granular, should connect to and contribute to the broader vision. In turn, the vision inspires these choices, supporting aspirations for the business and energizing our employees. Telling the story this way makes us think carefully about how a piece of work fits into the broader vision—or if it doesn’t. It also helps us define our work in a way that’s consumable by our various stakeholder audiences, which is critical if we want them to support and partner with us. If we tell the story well, our stakeholders should be able to tell others the story of how our work supports them.

[Discover how we’re reinventing Microsoft’s Employee Experience for a Hybrid World. Learn more about Microsoft’s cloud-centric architecture transformation. Find out how we’re enabling a modern support experience at Microsoft.]

Making hard choices

Being vision-led means making difficult and specific choices about where we will focus our efforts, and which work we will need to postpone or simply not do. We ruthlessly prioritize, focusing on what to stop investing in as much as what to invest in next. We set a high bar for quality, delivery, cost, and compliance. Our approach includes observing important guidelines for how we implement our vision and how that informs our operations. This includes:

  • Connecting outcomes to the vision and clearly prioritizing.
  • Placing user experiences at the center of our designs.
  • Building capability and depth within role-specific disciplines.
  • Investing in core platforms and systems to drive engineering productivity.
  • Using data and insights to continually assess and prioritize our approach, ensuring that we achieve our most important goals and that they align with our vision.

With this mindset and these guidelines for execution, we empower our employees to think strategically. We want them to continually have this question in their minds: What experience do customers have when interacting with Microsoft, and how can we make it better?

Establishing priorities that support our vision

As part of our Microsoft Digital Product Vision, we established and articulated critical priorities that framed our areas of work. We based the priorities on pain points that existed within MDEE and on best-in-class experiences across other organizations that we studied. The priorities continue to define and guide our work, and they act as an organizational tool for measuring our transformation’s progress:

Cloud-centric architecture

Cloud-centric architecture is designed to deliver a consistently high level of service reliability. Our systems in the cloud are agile, resilient, cost-effective, and scalable, so we can be proactive and innovative. Microsoft Azure is at the core of our architecture. We use Azure to automate our processes, unify our tools, and improve our engineering productivity. This includes transitioning to a DevOps model using the out-of-the-box capabilities that Azure DevOps and Azure Pipelines offer. The DevOps model enables faster deployment of new capabilities that are more secure and compliant. A modern cloud-centric architecture is foundational to our digital transformation, and we’re building integrated, reliable systems, instrumented for telemetry, to gather data and enable experimentation. Our investments include:

  • Transitioning from on-premises to cloud offerings to enable dynamic elastic compute, geo-redundancy, unified data strategy (Azure Data Lake), and flexible software-defined infrastructures.
  • Moving to cloud-centered IT operations, with provisioning, patching, monitoring, and backups for our cloud and on-premises environments utilizing Azure-based offerings.
  • Enabling continued company growth and improvement in our platform services while staying flat on the running cost of our services.
  • Developing deeper and richer insights into our service reliability, via standardization of monitoring solutions through Azure Application Insights, and standardization of incident-management tooling and automatic alerting. At the same time, we’re increasingly modeling our critical business processes and helping ensure end-to-end integrity through the monitoring and alerting of complex processes spanning multiple systems.
  • Providing a powerful feedback loop to our product-group partners (such as those for Azure, Microsoft Dynamics 365, and Windows) to showcase Microsoft running on Microsoft. This results in an improved enterprise-customer experience, including running one of the largest SAP instances entirely on Azure and helping ensure that Azure is SAP-ready for our customers.

Secure enterprise

Security is a never-ending, holistic pursuit that requires the same level of innovation and improvement found in every facet of the tech industry. Cloud-based architecture and ubiquitous user access require an enterprise security strategy that embraces identity as the new perimeter and encompasses our entire digital footprint. Improved security, which we’re seamlessly integrating into all parts of our digital transformation, is a component of every product we develop. Our strategy aligns around six core security pillars: device health, identity management, information protection, data and telemetry, risk management, and security assurance. Some of the specific areas in which we’re investing include:

  • Using Zero Trust as a model to help protect our infrastructure through enforced device health, strong authentication, least-privileged access, and pervasive telemetry that verifies control effectiveness.
  • Eliminating passwords through strong multi-factor authentication.
  • Thwarting phishing attacks on our users by using Microsoft Office 365 safe filters and Safe links, phishing detection, and email-delivery prevention.
  • Making our Security Operations Center even more efficient and effective through automation and the orchestration of detection and response.

Data and intelligence

Data is the most critical asset that modern organizations possess. The exponential increases in data, sophisticated algorithms, and computational power are enabling modern organizations to make rapid advances in technology and to drive business disruption. Our data’s value is directly proportional to the number of people within our organization who can find it, understand it, know they can trust it, and then connect it in new and meaningful ways for the deepest insights. We’re turning disparate company data into cohesive insights and intelligent experiences, and we’re investing in core areas including:

  • Creating a modern data foundation by aggregating clean, connected, and authoritative data that is catalogued, easily discoverable in a common location, and usable by any team to create insights and intelligent experiences.
  • Developing AI and machine learning, not to replace human experts but to augment and accelerate human decisions, using trusted intelligent models built on the wealth of available data.
  • Using analytics services to understand user journeys, processes, behavior, and insights, which roll up to executive scorecards to measure our progress against strategic goals.

Customer centricity

Employees and customers belong at the center of our focus and need to feel that they’re doing business with “One Microsoft” across all products and channels. Our ability to digitally transform hinges on a strong foundation of customer data. Achieving a holistic understanding allows us to provide customers with relevant and tailored offers and highly customized customer service by responding to their needs proactively. The complete technology solutions in the offers give customers the best value and a consistent experience. To achieve a security-enhanced and 360-degree understanding of our customers, online identity tenants need to be linked with sales accounts, purchase accounts and agreements, billing accounts, and third-party organizational-reference data. Our investments include:

  • Developing customer health-analytics and recommendation engines, using a clean directory and historical customer actions and interactions, to better understand and predict our customers’ needs and how we can add value with our offerings.
  • Publishing a shared, authoritative, and clean directory of organizational data and providing the tools and processes to maintain its accuracy and completeness.
  • Augmenting the organizational data that Microsoft holds by identifying and managing the relationships for any organization, enabling a more holistic understanding of who the customer is and how we can better serve them.

Productive enterprise

Microsoft employees are at the heart of our mission to enable and support our customers and partners to achieve more. We empower our employees to be their most creative and productive in how they work and collaborate across physical and digital environments. We use Microsoft products and services underpinned with Microsoft 365, AI, and machine learning to deliver connected, accessible, interactive, and individualized experiences for our employees. Our specific investments include:

  • Supporting a broad selection of devices, providing a quick and easy setup, and ensuring the devices are always up to date. We provide secure and seamless access to work-related apps, sites, services, documents, and data.
  • Developing enterprise search and task-automation capabilities that use Microsoft Search and integrated digital assistants. We’re providing our employees with a coherent and reliable enterprise-search experience and delivering automated micro-task capability to further enhance productivity.
  • Enabling team productivity by using Microsoft Teams and Office 365 as the backbone, fostering increased engagement, and accelerating decision making across devices and locations.
  • Creating a modern workplace where our employees have integrated digital and physical experiences for finding meeting spaces, indoor wayfinding, transportation, parking, and other workplace services.
  • Providing a customizable web and mobile employee experience focused on what’s important to the individual, delivering personalized access to workplace services, and making it easier to quickly complete common tasks.

Turning vision into a practical reality

Our priorities describe what we do, but how we’ll do it is just as important. We’ve made significant changes to the way we work to enable transformation. These changes allow us to take more ownership of our work, run more efficiently and effectively, and build in a way that’s durable over time. With a model for transformation, we can move away from decisions and directions based on team budget availability and move toward the delivery of clear and prioritized business outcomes. We measure our collective success by directly applying this model to our business and not by pure delivery of features. We prioritize as an organization based on where our vision directs us rather than at the local budget level. The practical goal of our vision-led product mindset is to discover the most effective and efficient solutions that will have the greatest impact on the transformational focus areas that make our vision a reality.

[Learn how we’re creating the digital workplace at Microsoft. Discover how we’re transforming modern engineering here at Microsoft. Check out how we’re redefining the digitally assisted workday at Microsoft. Learn how we’re transforming enterprise collaboration at Microsoft.]

MDEE digital-transformation methodology
Microsoft’s digital transformation methodology.

Transformed operating model

With an operating model for transformation, we can move away from decisions and directions based on team budgets and move toward the delivery of clear and prioritized business outcomes. Through this model, we’re empowering our business groups and employees by giving them autonomy and decision-making capabilities. Each business group maintains its own vision and has the freedom to prioritize its work based on that vision. However, this work still needs to align with the overarching MDEE vision and is assessed twice a year during a central review. This ensures that work is correctly prioritized and funded across the entire organization. Examples of our transformed operating model include:

  • Centralizing funding and prioritization: We’ve moved away from a decentralized, department-focused funding model and toward a centralized model where MDEE owns the budget. In the past, our business groups, such as Finance and Marketing, drove funding and projects. Now, we can use our priorities to fund work based on our vision.
  • Insourcing core systems and engineering: We’re managing the systems most critical to our organization’s success with trained, full-time employees. Historically, we outsourced much of this work. However, we’re bringing it back under the control of our employees and retaining intellectual property. We want our people behind the design, development, and operation of our most-important internal products.
  • Focusing metrics on business outcomes: Our metrics reflect the business outcomes to which we’re driving as opposed to traditional IT operating metrics. To transform successfully, alignment with our vision and contribution to the organization’s success take top priority. Therefore, how we measure success is based on business outcomes and not on arbitrary metrics.

Product-based approach to our business

To enable world-class execution of the services we build and run, we’re taking a product-based approach to our processes. We want to focus on developing solutions that contribute to our vision, and we want to use agile development methods and product-focused management in our development. Taking a product-based approach to our business means:

  • Creating a vision and business-driven agenda: We ensure that anything in which we invest resources aligns to our vision. We’re asking our internal teams to always have the best interest of Microsoft in mind. If it doesn’t align with our vision, it should be questioned—regardless of who’s doing the questioning. We want to produce the best products for our internal and external customers.
  • Focusing on skill development and a DevOps structure: A DevOps structure extends the management lifecycle for developers beyond version release. With the DevOps approach, the people on our team in MDEE who build solutions are responsible for the operation, fixes, troubleshooting, and ownership of each line of code they write. A DevOps approach and agile methodology focus our employees on a solution's success both during its development and after it's in use. This leads to a more fluid evolution of product features and a focus on functionality rather than on feature addition.
  • Shifting to product management: We manage products rather than projects. Product management keeps our teams focused on the success of the product rather than the completion of a project. Our product managers are involved in the entire process, from managing relationships with stakeholders to understanding the technical foundations of their products. Product management builds on the DevOps structure to help ensure that teams who develop a solution feel invested in the ongoing success of that solution and not just in the release of the latest version.

Modern engineering and design practices across all processes

Modern engineering focuses on providing a common set of tools and automation that delivers code and new functionality to our employees by enabling continuous integration and delivery practices. We prioritize the most effective outcomes for the business, delivering against a ranked backlog. We add telemetry to monitor customer usage patterns, which provides insights on the health of our services and customer experiences. We want to remove functional silos in our organization and increase the ways in which our infrastructure, apps, and services connect and integrate. Behind all this, we have a unified set of standards that protect and enable our employees. We engineer for the future by:

  • Establishing a coherent design system: We’re creating a consistent, coherent, and seamless experience for our employees and customers across all our products and solutions. This means establishing priorities and standards for design and the user experience and creating an internal catalog of shared principles and guidelines to keep our entire organization in sync. Historically, we’ve developed in silos, which led to varying user experiences and a cacophony of different tools. Now, we’re reviewing work in aggregate and scrutinizing experiences to drive user productivity.
  • Creating integrated and connected services: Our move to the cloud increases the overall agility of the development process and accelerates value delivery to the company. We’ve achieved this by re-envisioning our portfolio into a microservice architecture that promotes code reuse and enables cross-service dependencies through APIs. This further enables the delivery of a seamless and integrated experience that brings data and tools together, providing users with intuitive experiences and new insights.
  • Building privacy, security, and accessibility standards into our workflow: We integrate tools that support our engineers in building improved privacy, security, and accessibility into our solutions. Without these standards and automated policies, we’d have to rework and clean up as situations change. This is more costly and impacts our velocity of releases to users. Creating standards that we apply organization-wide, and from the beginning, creates an environment of trust in our engineering practices. Our innovations in this area ensure that our solutions also benefit our customers as these solutions are integrated into our commercial products.

Using a customer-zero feedback cycle

In MDEE, we have a unique opportunity to help our customers through their own transformations by sharing our best practices and lessons learned. As early adopters of Microsoft solutions, we provide feedback to our product-development teams and we co-develop solutions with them, which ultimately improves the products that we, and our customers, use to transform. Many of our product enhancements begin as internal solutions to business problems at Microsoft, evolve through this feedback cycle, and are then incorporated into the final product. A key part of being customer zero is that we provide advice, guidance, and reference materials to customers based on our transformation blueprint and early adopter experience.

Key Takeaways
Almost every company in the world, including Microsoft, finds itself at a point unlike any other since the industrial revolution. The old IT model hinders the ability to remain relevant in an ever-changing marketplace, and companies must transform to maintain their competitive positioning. At Microsoft, we’ve rallied around transformation and are well underway. We’ve set ambitious goals, and we’re reshaping what we value and how we work. At our core, we’re vision-led and adopting the expectation for world-class execution. The combination of external and internal change presents a significant challenge but, more importantly, it offers a substantial opportunity for us to become more agile and respond more quickly. As a result, we’re in a better position to empower our employees, engage our customers and partners, optimize our operations, and transform our products.

Transformation does not have a finish line—it’s a journey. As we progress through our transformation, we’ll make mistakes and adjust our strategy accordingly, but we’ll also continue to move forward. We will share our transformation journey with our customers with the hope that our experiences can inspire, advise, and assist them through their own transformations.

Related links

We'd like to hear from you!

Share your feedback with us—take our survey and let us know what kind of content is most useful to you.

The post Understanding Microsoft’s digital transformation appeared first on Inside Track Blog.

Providing modern data transfer and storage service at Microsoft with Microsoft Azure http://approjects.co.za/?big=insidetrack/blog/microsoft-uses-azure-to-provide-a-modern-data-transfer-and-storage-service/ Thu, 13 Jul 2023 14:54:07 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=8732 Companies all over the world have launched their cloud adoption journey. While some are just starting, others are further along the path and are now researching the best options for moving their largest, most complex workflows to the cloud. It can take time for companies to address legacy tools and systems that have on-premises infrastructure […]

The post Providing modern data transfer and storage service at Microsoft with Microsoft Azure appeared first on Inside Track Blog.

Microsoft Digital technical stories

Companies all over the world have launched their cloud adoption journey. While some are just starting, others are further along the path and are now researching the best options for moving their largest, most complex workflows to the cloud. It can take time for companies to address legacy tools and systems that have on-premises infrastructure dependencies.

Our Microsoft Digital Employee Experience (MDEE) team has been running our company as mostly cloud-only since 2018, and continues to design cloud-only solutions to help fulfill our Internet First and Microsoft Zero Trust goals.

In MDEE, we designed a Modern Data Transfer Service (MDTS), an enterprise-scale solution that allows the transfer of large files to and from partners outside the firewall and removes the need for an extranet.

MDTS makes cloud adoption easier for teams inside Microsoft and encourages the use of Microsoft Azure for all of their data transfer and storage scenarios. As a result, engineering teams can focus on building software and shipping products instead of dealing with the management overhead of Microsoft Azure subscriptions and becoming subject matter experts on infrastructure.

[Unpack simplifying Microsoft’s royalty ecosystem with connected data service. | Check out how Microsoft employees are leveraging the cloud for file storage with OneDrive Folder Backup. | Read more on simplifying compliance evidence management with Microsoft Azure confidential ledger.]

Leveraging our knowledge and experience

As part of Microsoft’s cloud adoption journey, we have been continuously looking for opportunities to help other organizations move data and remaining legacy workflows to the cloud. With more than 220,000 employees and over 150 partners with whom data is shared, not every team had a clear path for converting their transfer and storage patterns into successful cloud scenarios.

We have a high level of Microsoft Azure service knowledge and expertise when it comes to storage and data transfer. We also have a long history with legacy on-premises storage designs and hybrid third-party cloud designs. Over the past decade, we engineered several data transfer and storage services to facilitate the needs of Microsoft engineering teams. Those services traditionally leveraged either on-premises designs or hybrid designs with some cloud storage. In 2019, we began to seriously look at replacing our hybrid model, which included a mix of on-premises resources, third-party software, and Microsoft Azure services, with one modern service that would completely satisfy our customer scenarios using only Azure. New capabilities in Azure made this possible, and the timing was right.

MDTS uses out-of-the-box Microsoft Azure storage configurations and capabilities to help us address legacy on-premises storage patterns and support Microsoft core commitments to fully adopt Azure in a way that satisfies security requirements. Managed by a dedicated team of service engineers, program managers, and software developers, MDTS offers performance and security, and it is available to any engineering team at Microsoft that needs to move its data storage and transfer to the cloud.

Designing a Modern Data Transfer and Storage Service

The design goal for MDTS was to create a single storage service, offered entirely in Microsoft Azure, that would be flexible enough to meet the needs of most engineering teams at Microsoft. The service needed to be sustainable as a long-term solution, continue to support ongoing Internet First and Zero Trust network security designs, and have the capability to adapt to evolving technology and security requirements.

Identifying use cases

First, we needed to identify the top use cases we wanted to solve and evaluate which combination of Microsoft Azure services would help us meet our requirements. The primary use cases we identified for our design included:

  • Sharing and/or distribution of complex payloads: We not only had to provide storage for corporate sharing needs, but also share those same materials externally. The variety of file sizes and different payload characteristics can be challenging because they don’t always fit a standard profile for files (for example, Office documents).
  • Cloud storage adoption (shifting from on-premises to cloud): We wanted to ensure that engineering teams across Microsoft that needed a path to the cloud would have a roadmap. This need could arise because of expiring on-premises infrastructure, corporate direction, or other modernization initiatives like ours.
  • Consolidation of multiple storage solutions into one service, to reduce security risks and administrative overhead: Having to place data and content in multiple datastores for specific sharing or performance needs is cumbersome and can introduce additional risk. Because there wasn’t yet a single service that could meet all their sharing needs and performance requirements, employees and teams at Microsoft were using a variety of locations and services to store and share data.

Security, performance, and user experience design requirements

After identifying the use cases for MDTS, we focused on our primary design requirements. They fell into three high-level categories: security, performance, and user experience.

Security

The data transfer and storage design needed to follow our Internet First and Zero Trust network design principles. Accomplishing parity with Zero Trust meant leveraging best practices for encryption, standard ports, and authentication. At Microsoft, we already have standard design patterns that define how these pieces should be delivered.

  • Encryption: Data is encrypted both in transit and at rest.
  • Authentication: Microsoft Azure Active Directory supports corporate synced domain accounts, external business-to-business accounts, and both corporate and external security groups. Leveraging Azure Active Directory allows teams to remove dependencies on corporate domain controllers for authentication.
  • Authorization: Microsoft Azure Data Lake Gen2 storage provides fine-grained access to containers and subfolders. This is possible because of many new capabilities, most notably the support for OAuth, hierarchical namespace, and POSIX permissions. These capabilities are necessities of a Zero Trust network security design.
  • No non-standard ports: Opening non-standard ports can present a security risk. Using only HTTPS and TCP 443 as the mechanisms for transport and communication avoids the need to open non-standard ports. This includes having transport software that maximizes the ingress/egress capabilities of the storage platform. Microsoft Azure Storage Explorer, AzCopy, and Microsoft Azure Data Factory meet the no non-standard ports requirement. A minimal sketch of connecting under these constraints follows this list.
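
To make these requirements concrete, the following is a minimal sketch (not the MDTS implementation itself) of connecting to an Azure Data Lake Storage Gen2 account with Azure AD authentication over HTTPS only; the account, container, and folder names are placeholders.

```python
# Minimal sketch: Azure AD authentication to an ADLS Gen2 account over HTTPS (TCP 443) only.
# The account, container, and folder names below are placeholders, not MDTS resources.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://contosostorage.dfs.core.windows.net"

# DefaultAzureCredential resolves to a managed identity, developer sign-in, or similar,
# so the client never handles storage account keys.
credential = DefaultAzureCredential()
service = DataLakeServiceClient(account_url=ACCOUNT_URL, credential=credential)

# Authorization comes from Azure AD plus the POSIX-style ACLs on the path;
# listing a path the caller isn't authorized for fails with HTTP 403.
file_system = service.get_file_system_client(file_system="engineering")
for path in file_system.get_paths(path="project1"):
    print(path.name)
```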

Performance

Payloads can range from one very large file to millions of small files, and every combination in between. Scenarios across the payload spectrum have their own computing and storage performance considerations and challenges. Microsoft Azure has optimized software solutions for achieving the best possible storage ingress and egress. MDTS helps ensure that customers know what optimized solutions are available to them, provides configuration best practices, and shares the learnings with Azure Engineering to enable robust enterprise-scale scenarios.

  • Data transfer speeds: Having software capable of maximizing the ingress/egress capabilities of the storage platform is preferable for engineering-type workloads. It’s common for these workloads to have complex payloads, payloads with several large files (10-500 GB) or millions of small files.
  • Ingress and egress: Support for ingress upwards of 10 Gbps and egress upwards of 50 Gbps, along with client and server software that can consume as much bandwidth as the client and the storage platform allow.

 

Data size / bandwidth | 50 Mbps | 100 Mbps | 500 Mbps | 1 Gbps | 5 Gbps | 10 Gbps
1 GB | 2.7 minutes | 1.4 minutes | 0.3 minutes | 0.1 minutes | 0.03 minutes | 0.01 minutes
10 GB | 27.3 minutes | 13.7 minutes | 2.7 minutes | 1.3 minutes | 0.3 minutes | 0.1 minutes
100 GB | 4.6 hours | 2.3 hours | 0.5 hours | 0.2 hours | 0.05 hours | 0.02 hours
1 TB | 46.6 hours | 23.3 hours | 4.7 hours | 2.3 hours | 0.5 hours | 0.2 hours
10 TB | 19.4 days | 9.7 days | 1.9 days | 0.9 days | 0.2 days | 0.1 days

Copy duration calculations based on data size and the bandwidth limit for the environment.
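
The figures above follow from simple arithmetic; the short sketch below reproduces them under the idealized assumption of full, sustained bandwidth with no protocol overhead.

```python
def copy_duration_seconds(data_gb: float, bandwidth_mbps: float) -> float:
    """Idealized transfer time: decimal units, sustained bandwidth, no protocol overhead."""
    bits = data_gb * 8 * 1_000**3              # GB -> bits
    return bits / (bandwidth_mbps * 1_000**2)  # Mbps -> bits per second

# Example: 10 TB (10,000 GB) over a 1 Gbps (1,000 Mbps) link.
seconds = copy_duration_seconds(10_000, 1_000)
print(f"{seconds / 86_400:.1f} days")          # ~0.9 days, in line with the table row above
```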

User experience

Users and systems need a way to perform manual and automated storage actions with graphical, command line, or API-initiated experiences.

  • Graphical user experience: Microsoft Azure Storage Explorer provides Storage Admins the ability to graphically manage storage. It also has features for storage consumers who don’t have permissions for administrative actions and simply need to perform common storage actions like uploading and downloading.
  • Command line experience: AzCopy provides developers with an easy way to automate common storage actions through CLI or scheduled tasks.
  • Automated experiences: Both Microsoft Azure Data Factory and AzCopy provide the ability for applications to use Azure Data Lake Gen2 storage as its primary storage source and destination.

Identifying personas

Because a diverse set of personas uses storage for different purposes, we needed to design storage experiences that satisfy a range of business needs. Through the process of development, we identified these custom persona experiences relevant to both storage and data transfer:

  • Storage Admins: The Storage Admins are Microsoft Azure subscription owners. Within the Azure subscription they create, manage, and maintain all aspects of MDTS: Storage Accounts, Data Factories, Storage Actions Service, and Self-Service Portal. Storage Admins also resolve requests and incidents that are not handled via Self-Service.
  • Data Owners: The Data Owner personas are those requesting storage who have the authority to create shares and authorize storage. Data Owners also perform the initial steps of creating automated distributions of data to and from private sites. Data Owners are essentially the decision makers of the storage following handoff of a storage account from Storage Admins.
  • Storage Consumers: At Microsoft, storage consumers represent a broad set of disciplines, from engineers and developers to project managers and marketing professionals. Storage Consumers can use Microsoft Azure Storage Explorer to perform storage actions to and from authorized storage paths (aka Shares). Within the MDTS Self Service Portal, a storage consumer can be given authorization to create distributions. A distribution can automate the transfer of data from a source to one or multiple destinations.

Implementing and enhancing the solution architecture

After considering multiple Microsoft Azure storage types and complementary Azure services, the MDTS team chose the following Microsoft Azure services and software as the foundation for offering a storage and data transfer service to Microsoft engineering groups.

  • Microsoft Azure Active Directory: Meets the requirements for authentication and access.
  • Microsoft Azure Data Lake Gen2: Meets security and performance requirements by providing encryption, OAuth, hierarchical namespace, fine-grained authorization to Azure Active Directory entities, and 10+ GB per second ingress and egress.
  • Microsoft Azure Storage Explorer: Meets security, performance, and user experience requirements by providing a graphical experience to perform storage administrative tasks and storage consumer tasks without needing a storage account key or role-based access control (RBAC) on an Azure resource. Azure Storage Explorer also has AzCopy embedded to satisfy performance for complex payloads.
  • AzCopy: Provides a robust and highly performant command line interface.
  • Microsoft Azure Data Factory: Meets the requirements for orchestrating and automating data copies between private networks and Azure Data Lake Gen2 storage paths. Azure Data Factory copy activities are as performant as AzCopy and satisfy security requirements.

Enabling storage and orchestration

As illustrated below, the first MDTS design consisted entirely of Microsoft Azure services, with no additional investment from us other than people to manage the Microsoft Azure subscription and perform routine requests. MDTS was offered as a commodity service to engineering teams at Microsoft in January 2020. Within a few months we saw a reduction in third-party software and on-premises file server storage, which provided significant savings. This migration also contributed progress toward the company-wide objectives of Internet First and Zero Trust design patterns.

The first design of MDTS provides storage and orchestration using out of the box Microsoft Azure services.
The first design of MDTS provides storage and orchestration using out of the box Microsoft Azure services.

We initially onboarded 35 engineering teams which included 10,000 Microsoft Azure Storage Explorer users (internal and external accounts), and 600 TB per month of Microsoft Azure storage uploads and downloads. By offering the MDTS service, we saved engineering teams from having to run Azure subscriptions themselves and needing to learn the technical details of implementing a modern cloud storage solution.

Creating access control models

As a team, we quickly discovered that having specific, repeatable implementation strategies was essential when configuring public-facing Microsoft Azure storage. Our initial time investment was in standardizing an access control process that would reduce complexity and ensure a correct security posture before handing off storage to customers. To do this, we constructed onboarding processes for identifying the type of share, and we standardized the implementation steps for each type.

We implemented standard access control models for two types of shares: container shares and sub-shares.

Container share access control model

The container share access control model is used for scenarios where the data owner prefers users to have access to a broad set of data. As illustrated in the graphic below, container shares supply access to the root, or parent, of a folder hierarchy. The container is the parent. Any member of the security group will gain access to the top level. When creating a container share, we also make it possible to convert to a sub-share access control model if desired.

 

Microsoft Azure Storage Explorer grants access to the root of a folder hierarchy using the container share access control model.
Microsoft Azure Storage Explorer grants access to the root, or parent, of a folder hierarchy using the container share access control model. Both engineering and marketing are containers. Each has a specific Microsoft Azure Active Directory Security group. A top-level Microsoft Azure AD Security group is also added to minimize effort for users who should get access to all containers added to the storage account.

This model fits scenarios where group members get Read, Write, and Execute permissions to an entire container. The authorization allows users to upload, download, and create or delete folders and files. To restrict access, change the access control. For example, to allow download only, select Read and Execute.

Sub-share access control model

The sub-share access control model is used for scenarios where the data owner prefers users have explicit access to folders only. As illustrated in the graphic below, folders are hierarchically created under the container. In cases where several folders exist, a security group access control can be implemented on a specific folder. Access is granted to the folder where the access control is applied. This prevents users from seeing or navigating folders under the container other than the folders where an explicit access control is applied. When users attempt to browse the container, authorization will fail.

 

Microsoft Azure Storage Explorer grants access to sub-folder only using the sub-share access control model.
Microsoft Azure Storage Explorer grants access to sub-folder only using the sub-share access control model. Members are added to the sub-share group, not the container group. The sub-share group is nested in the container group with execute permissions to allow for Read and Write on the sub-share.

This model fits scenarios where group members get Read, Write, and Execute permissions to a sub-folder only. The authorization allows users to upload, download, create folders and files, and delete folders and files. The access control is specific to the folder “project1.” In this model you can have multiple folders under the container, but only provide authorization to a specific folder.

The sub-share process is only applied if a sub-share is needed.

  • Any folder needing explicit authorization is considered a sub-share.
  • We apply a sub-share security group access control with Read, Write, and Execute on the folder.
  • We nest the sub-share security group in the parent share security group that is used for Execute only. This gives members who don’t have access to the container enough authorization to Read, Write, and Execute on the specific sub-share folder without having Read or Write permissions to any other folders in the container. A sketch of setting up a sub-share follows this list.
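
The sketch below shows one way the folder-level grant for a sub-share could be applied with the Azure SDK for Python; the storage account, container, folder, and group object ID are placeholders, and the group nesting itself happens in Azure AD rather than in storage.

```python
# Hedged sketch of granting a sub-share group access to a specific folder.
# Placeholders throughout; this is not the MDTS tooling itself.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    "https://contosostorage.dfs.core.windows.net", credential=DefaultAzureCredential()
)
folder = service.get_file_system_client("engineering").get_directory_client("project1")

SUB_SHARE_GROUP_OID = "<object ID of the sub-share security group>"

# Grant Read, Write, and Execute on the folder and everything beneath it, plus matching
# "default:" entries so new child folders and files inherit the same permissions.
folder.update_access_control_recursive(
    acl=f"group:{SUB_SHARE_GROUP_OID}:rwx,default:group:{SUB_SHARE_GROUP_OID}:rwx"
)
```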

Applying access controls for each type of share (container and or sub-share)

The parent share process is standard for each storage account.

  • Each storage account has a unique security group. This security group will have access control applied for any containers. This allows data owners to add members and effectively give access to all containers (current and future) by simply changing the membership of one group.
  • Each container will have a unique security group for Read, Write, and Execute. This security group is used to isolate authorization to a single container.
  • Each container will have a unique group for execute. This security group is needed in the event sub-shares are created. Sub-shares are folder-specific shares in the hierarchical namespace.
  • We always use the default access control option. This is a feature that automatically applies the parent permissions to all new child folders (sub-folders).

The first design enabled us to offer MDTS while our engineers defined, designed, and developed an improved experience for all the personas. It quickly became evident that Storage Admins needed the ability to see an aggregate view of all storage actions in near real-time to successfully operate the service. It was important for our administrators to easily discover the most active accounts and which user, service principal, or managed service identity was making storage requests or performing storage actions. In July 2020, we added the Aggregate Storage Actions service.

Adding aggregate storage actions

For our second MDTS design, we augmented the out of the box Microsoft Azure Storage capabilities used in our first design with the capabilities of Microsoft Azure Monitor, Event Hubs, Stream Analytics, Function Apps, and Microsoft Azure Data Explorer to provide aggregate storage actions. Once the Aggregate Storage Actions capability was deployed and configured within MDTS, storage admins were able to aggregate the storage actions of all their storage accounts and see them in a single pane view.

 

The second design of MDTS introduces aggregate storage actions.
The second design of MDTS introduces aggregate storage actions.

The Microsoft Azure Storage diagnostic settings in the Microsoft Azure portal make it possible for us to configure specific settings for blob actions. Combining this feature with other Azure services and some custom data manipulation gives MDTS the ability to see which users are performing storage actions, what those storage actions are, and when those actions were performed. The data visualizations are near real-time and aggregated across all the storage accounts.

Storage accounts are configured to route logs from Microsoft Azure Monitor to Event Hub. We currently have 45+ storage accounts that generate around five million logs each day. Data filtering, manipulation, and grouping are performed by Stream Analytics. Function Apps are responsible for fetching UPNs using the Graph API and then pushing logs to Microsoft Azure Data Explorer; a simplified sketch of this enrichment step appears after the list below. Microsoft Power BI and our modern self-service portal query Microsoft Azure Data Explorer and provide the visualizations, including dashboards with drill-down functionality. The data available in our dashboard includes the following information aggregated across all customers (currently 35 storage accounts).

  • Aggregate view of most active accounts based on log activity.
  • Aggregate total of GB uploaded and downloaded per storage account.
  • Top users who uploaded showing the user principal name (both external and internal).
  • Top users who downloaded showing the user principal name (both external and internal).
  • Top accounts by data uploaded.
  • Top accounts by data downloaded.
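
As an illustration of the Function App stage described above, here is a simplified, hypothetical sketch of enriching a storage log record with a user principal name from Microsoft Graph. The trigger bindings, log field names, and the final push to Microsoft Azure Data Explorer are omitted or assumed, and the real pipeline also relies on Stream Analytics for filtering and grouping.

```python
# Illustrative sketch only: an Azure Functions (Python) handler that enriches storage log
# records with a userPrincipalName from Microsoft Graph. Bindings live in function.json
# (not shown), field names are simplified, and the push to Azure Data Explorer is omitted.
import json
from typing import List

import azure.functions as func
import requests
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()


def resolve_upn(object_id: str) -> str:
    """Look up the userPrincipalName for an Azure AD object ID via Microsoft Graph."""
    token = credential.get_token("https://graph.microsoft.com/.default").token
    resp = requests.get(
        f"https://graph.microsoft.com/v1.0/users/{object_id}?$select=userPrincipalName",
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    return resp.json().get("userPrincipalName", object_id) if resp.ok else object_id


def main(events: List[func.EventHubEvent]) -> None:
    for event in events:
        record = json.loads(event.get_body())
        # "objectId" is a simplified stand-in for the caller identity field in the log.
        record["userPrincipalName"] = resolve_upn(record.get("objectId", ""))
        # The production service pushes the enriched record to Azure Data Explorer here.
        print(record["userPrincipalName"], record.get("operationName"))
```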

The only setting required to onboard new storage accounts is to configure them to route logs to the Event Hub. Because we can have an aggregate store of all the storage account activities, we are able to offer MDTS customers an account view into their storage account specific data.

Following the release of Aggregate Storage Actions, the MDTS team, along with feedback from customers, identified another area of investment: the need for storage customers to “self-service” and view account-specific insights without having role-based access to the subscription or storage accounts.

Providing a self-service experience

To enhance the experience of the other personas, MDTS is now focused on creating a Microsoft Azure web portal where customers can self-service different storage and transfer capabilities without needing any Microsoft Azure role-based access control (RBAC) permissions on the underlying subscription that hosts the MDTS service.

When designing MDTS self-service capabilities we focused on meeting these primary goals:

  • Make it possible for Microsoft Azure Subscription owners (Storage Admins) to provide the platform and services while not needing to be in the middle of making changes to storage and transfer services.
  • The ability to create custom persona experiences so customers can achieve their storage and transfer goals through a single portal experience in a secure and intuitive way. Some of the new enterprise scale capabilities include:
    • Onboarding.
    • Creating storage shares.
    • Authorization changes.
    • Distributions: automating the distribution of data from one source to one or multiple destinations.
    • Providing insights into storage actions (based on the data from the Aggregate Storage Actions capability enabled in our second MDTS release).
    • Reporting basic consumption data, like the number of users, groups, and shares on a particular account.
    • Reporting the cost of the account.
  • As Azure services and customer scenarios change, the portal can also change.
  • If customers want to “self-host” (essentially take our investments and do it themselves), we will easily be able to accommodate.

Our next design of MDTS introduces a self-service portal.
Our next design of MDTS introduces a self-service portal.

Storage consumer user experiences

After storage is created and configured, data owners can then share steps for storage consumers to start using storage. Upload and download are the most common storage actions, and Microsoft Azure provides software and services needed to perform both actions for manual and automated scenarios.

Microsoft Azure Storage Explorer is recommended for manual scenarios where users can connect and perform high-speed uploads and downloads manually. Both Microsoft Azure Data Factory and AzCopy can be used in scenarios where automation is needed. AzCopy is heavily preferred in scenarios where synchronization is required. Microsoft Azure Data Factory doesn’t provide synchronization but does provide robust data copy and move capabilities. Azure Data Factory is also a managed service and is better suited to enterprise scenarios where flexible triggering options, uptime, autoscale, monitoring, and metrics are required.

Using Microsoft Azure Storage Explorer for manual storage actions

Developers and Storage Admins are accustomed to using Microsoft Azure Storage Explorer for both storage administration and routine storage actions (for example, uploading and downloading). Non-storage admins, otherwise known as Storage Consumers, can also use Microsoft Azure Storage Explorer to connect and perform storage actions without needing any role-based access control or access keys on the storage account. Once storage is authorized, members of the authorized groups can follow a few routine steps to attach the storage, authenticate with their work email, and use the options their authorization allows.

The processes for sign-in and adding a resource via Microsoft Azure Active Directory are found in the Manage Accounts and Open Connect Dialogue options of Microsoft Azure Storage Explorer.

After signing in and selecting the option to add the resource via Microsoft Azure Active Directory, you can supply the storage URL and connect. Once connected, it only requires a few clicks to upload and download data.

 

Microsoft Azure Storage Explorer Local and Attached module.
Microsoft Azure Storage Explorer Local and Attached module. After following the add resource via Microsoft Azure AD process, the Azure AD group itshowcase-engineering is authorized to Read, Write, and Edit (rwe) and members of the group can perform storage actions.
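
For storage consumers who prefer code over the Storage Explorer interface, a rough programmatic equivalent of the same flow might look like the sketch below; the storage URL, container, and file paths are placeholders supplied by the data owner, and sign-in uses the same work account.

```python
# Rough equivalent of the attach-and-transfer flow above, using an interactive
# work-account sign-in. All names and paths are placeholders.
from azure.identity import InteractiveBrowserCredential
from azure.storage.filedatalake import DataLakeServiceClient

SHARE_URL = "https://contosostorage.dfs.core.windows.net"  # supplied by the Data Owner

service = DataLakeServiceClient(SHARE_URL, credential=InteractiveBrowserCredential())
share = service.get_file_system_client("engineering")

# Upload a file into the authorized folder.
with open("build.zip", "rb") as data:
    share.get_file_client("project1/build.zip").upload_data(data, overwrite=True)

# Download it again.
with open("build-copy.zip", "wb") as out:
    out.write(share.get_file_client("project1/build.zip").download_file().readall())
```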

To learn more about using Microsoft Azure Storage Explorer, see Get started with Storage Explorer. There are additional links in the More Information section at the end of this document.

Note: Microsoft Azure Storage Explorer uses AzCopy. Having AzCopy as the transport allows storage consumers to benefit from high-speed transfers. If desired, AzCopy can be used as a stand-alone command line application.

Using AzCopy for manual or automated storage actions

AzCopy is a command line interface used to perform storage actions on authorized paths. AzCopy is used in Microsoft Azure Storage Explorer but can also be used as a standalone executable to automate storage actions. It’s a multi-stream, TCP-based transport capable of optimizing throughput based on the bandwidth available. MDTS customers use AzCopy in scenarios that require synchronization or cases where Microsoft Azure Storage Explorer or Microsoft Azure Data Factory copy activity doesn’t meet the requirements for data transfer. For more information about using AzCopy, please see the More Information section at the end of this document.

AzCopy is a great match for standalone and synchronization scenarios. It also has options that are useful when seeking to automate or build applications. Because AzCopy is a single executable running on either a single client or server system, it isn’t always ideal for enterprise scenarios. Microsoft Azure Data Factory is a more robust Microsoft Azure service that meets most enterprise needs.
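
For a sense of what an automated synchronization job could look like, here is a hedged sketch that wraps AzCopy from a scheduled script; the paths and container URL are placeholders, and it assumes the session has already been authorized (for example, with azcopy login).

```python
# Hedged sketch of a scheduled sync job that wraps AzCopy. Paths are placeholders and the
# session is assumed to be authorized already (for example, via `azcopy login`).
import subprocess

SOURCE = r"D:\builds\project1"
DESTINATION = "https://contosostorage.dfs.core.windows.net/engineering/project1"

# `azcopy sync` transfers only new or changed files; --recursive walks sub-folders.
subprocess.run(
    ["azcopy", "sync", SOURCE, DESTINATION, "--recursive"],
    check=True,
)
```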

Using Microsoft Azure Data Factory for automated copy activity

Some of the teams that use MDTS require the ability to orchestrate and operationalize storage uploads and downloads. Before MDTS, we would have either built a custom service or licensed a third-party solution, which can be expensive and/or time consuming.

Microsoft Azure Data Factory, a cloud-based ETL and data integration service, allows us to create data-driven workflows for orchestrating data movement. Including Azure Data Factory in our storage hosting service model provided customers with a way to automate data copy activities. MDTS’s most common data movement scenarios are distributing builds from a single source to multiple destinations (3-5 destinations are common).

Another requirement for MDTS was to leverage private data stores as a source or destination. Microsoft Azure Data Factory provides the capability to use a private system, also known as a self-hosted integration runtime. When configured, this system can be used in copy activities that communicate with on-premises file systems. The on-premises file system can then be used as a source and/or destination datastore.

In situations where on-premises file system data needs to be stored in Microsoft Azure or shared with external partners, Microsoft Azure Data Factory provides the ability to orchestrate pipelines that perform one or multiple copy activities in sequence. These activities result in end-to-end data movement from one on-premises file system to Microsoft Azure Storage, and then to another private system if desired.

The graphic below provides an example of a pipeline orchestrated to copy builds from a single source to several private destinations.

 

Detailed example of a Microsoft Azure Data Factory pipeline, including the build system source.
Microsoft Azure Data Factory pipeline example. Private site 1 is the build system source. Build system will build, load the source file system, then trigger the Microsoft Azure Data Factory pipeline. Build is then uploaded, then Private sites 2, 3, 4 will download. Function apps are used for sending email notifications to site owners and additional validation.
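
The “trigger” step in this flow can be as small as a call from the build system to start a pipeline run; the sketch below uses the Azure Data Factory management SDK for Python, with the subscription, resource group, factory, pipeline name, and parameters all illustrative rather than the actual MDTS configuration.

```python
# Hypothetical sketch of a build system triggering an Azure Data Factory pipeline run.
# Every name and parameter below is illustrative.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"

adf = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

run = adf.pipelines.create_run(
    resource_group_name="mdts-rg",
    factory_name="mdts-data-factory",
    pipeline_name="DistributeBuild",
    parameters={"buildId": "2021.06.18.1"},
)
print(f"Started pipeline run {run.run_id}")
```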

For more information on Azure Data Factory, please see Introduction to Microsoft Azure Data Factory. There are additional links in the More Information section at the end of this document.

If you are thinking about using Microsoft Azure to develop a modern data transfer and storage solution for your organization, here are some of the best practices we gathered while developing MDTS.

Close the technical gap for storage consumers with a white-glove approach to onboarding

Be prepared to spend time with customers who are initially overwhelmed with using Azure Storage Explorer or AzCopy. At Microsoft, storage consumers represent a broad set of disciplines—from engineers and developers to project managers and marketing professionals. Azure Storage Explorer provides an excellent experience for engineers and developers but can be a little challenging for less technical roles.

Have a standard access control model

Use Microsoft Azure Active Directory security groups and group nesting to manage authorization. Microsoft Azure Data Lake Gen2 storage has a limit on the number of access control entries you can apply; to avoid reaching this limit, and to simplify administration, we recommend using Microsoft Azure Active Directory security groups. We apply the access control to the security group only, and in some cases we nest other security groups within the access control group. We nest Member Security Groups within Access Control Security Groups to manage access. These group types don’t exist in Microsoft Azure Active Directory; they exist within our MDTS service as a way to differentiate the purpose of a group, and we make that differentiation easy to see through the group’s name.

  • Access Control Security Groups: We use this group type for applying Access Control on ADLS Gen2 storage containers and/or folders.
  • Member Security Groups: We use these to satisfy cases where access to containers and/or folders will constantly change for members.

When there are large numbers of members, nesting prevents the need to add members individually to the Access Control Security Groups. When access is no longer needed, we can remove the Member Group(s) from the Access Control Security Group and no further action is needed on storage objects.

Along with using Microsoft Azure Active Directory security groups, make sure to have a documented process for applying access controls. Be consistent and have a way of tracking where access controls are applied.

Use descriptive display names for your Microsoft Azure AD security groups

Because Microsoft Azure AD doesn’t currently organize groups by owners, we recommend using naming conventions that capture the group’s purpose and type to allow for easier searches. A small helper that applies this convention follows the examples below.

  • Example 1: mdts-ac-storageacct1-rwe. This group name uses our service standard naming convention for Access Control group type on Storage Account 1, with access control Read, Write, and Execute. mdts = Service, ac = Access Control Type, storageacct1 = ADLS Gen2 Storage Account Name, rwe = permission of the access control.
  • Example 2: mdts-mg-storageacct1-project1. This group name uses our service standard naming convention for Member Group type on Storage Account 1. This group does not have an explicit access control on storage, but it is nested in mdts-ac-storageacct1-rwe where any member of this group has the Read, Write, and Execute access to storage account1 because it’s nested in mdts-ac-storageacct1-rwe.
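
A trivial helper can keep these names consistent; the sketch below simply encodes the convention described above and is purely illustrative.

```python
# Purely illustrative helper that applies the naming convention described above.
def group_name(group_type: str, storage_account: str, suffix: str) -> str:
    """group_type is 'ac' (Access Control) or 'mg' (Member Group)."""
    if group_type not in ("ac", "mg"):
        raise ValueError("group_type must be 'ac' or 'mg'")
    return f"mdts-{group_type}-{storage_account}-{suffix}"

print(group_name("ac", "storageacct1", "rwe"))       # mdts-ac-storageacct1-rwe
print(group_name("mg", "storageacct1", "project1"))  # mdts-mg-storageacct1-project1
```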

Remember to propagate any changes to access controls

Microsoft Azure Data Lake Gen2 storage, by default, doesn’t automatically propagate any access control changes. As such, when removing, adding, or changing an access control, you need to follow an additional step to propagate the access control list. This option is available in Microsoft Azure Storage Explorer.
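
When the change is made from code rather than Storage Explorer, the recursive ACL operations in the Azure SDK for Python perform that propagation step; the sketch below is a hedged example with placeholder names, shown here removing a group's entry from a folder tree.

```python
# Hedged sketch: propagating an access control change (here, a removal) down a folder tree.
# The account, container, folder, and group object ID are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    "https://contosostorage.dfs.core.windows.net", credential=DefaultAzureCredential()
)
folder = service.get_file_system_client("engineering").get_directory_client("project1")

# Removal entries list the entry without permissions; the recursive call applies the
# change to the folder and everything beneath it, which is the propagation step above.
result = folder.remove_access_control_recursive(
    acl="group:<group-object-id>,default:group:<group-object-id>"
)
print(result.counters)
```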

Storage Consumers can attempt Administrative options

Storage Consumers use Microsoft Azure Storage Explorer and are authenticated with their Microsoft Azure Active Directory user profile. Since Azure Storage Explorer is primarily developed for Storage Admin and Developer personas, all administrative actions are visible. It is common for storage consumers to attempt administrative actions, like managing access or deleting a container. Those actions will fail because Storage Consumers are granted access only via access control lists (ACLs), and there isn’t a way to grant administrative actions via ACLs. If administrative actions are needed, users must become Storage Admins, who have access via Azure role-based access control (RBAC).

Microsoft Azure Storage Explorer and AzCopy are throughput-intensive

As stated above, AzCopy is leveraged by Microsoft Azure Storage Explorer for transport actions. When using Azure Storage Explorer or AzCopy, it’s important to understand that transfer performance is their specialty. Because of this, some clients and/or networks may benefit from throttling AzCopy’s performance. In circumstances where you don’t want AzCopy to consume too much network bandwidth, there are configurations available. In Microsoft Azure Storage Explorer, use the Settings option and select the Transfers section to configure Network Concurrency and/or File Concurrency. In the Network Concurrency section, Adjust Dynamically is a default option. For AzCopy, there are flags and environment variables available to optimize performance.
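
For standalone AzCopy runs, the throttling knobs mentioned above can be set from a wrapper script; the following sketch uses the AZCOPY_CONCURRENCY_VALUE environment variable and the --cap-mbps flag with arbitrary example values and placeholder paths.

```python
# Hedged example of throttling a standalone AzCopy run: AZCOPY_CONCURRENCY_VALUE caps the
# number of concurrent connections and --cap-mbps caps throughput. Values are examples only.
import os
import subprocess

env = dict(os.environ, AZCOPY_CONCURRENCY_VALUE="16")

subprocess.run(
    [
        "azcopy", "copy",
        r"D:\builds\project1",
        "https://contosostorage.dfs.core.windows.net/engineering/project1",
        "--recursive",
        "--cap-mbps", "500",   # keep the transfer under roughly 500 Mbps
    ],
    env=env,
    check=True,
)
```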

For more information, visit Configure, optimize, and troubleshoot AzCopy.

Microsoft Azure Storage Explorer sign-in with MSAL

Microsoft Authentication Library (MSAL), currently in preview, provides enhanced single sign-on, multi-factor authentication, and conditional access support. In some situations, users won’t authenticate unless MSAL is selected. To enable MSAL, select the Settings option from Microsoft Azure Storage Explorer’s navigation pane. Then, in the Application section, select the option to enable Microsoft Authentication Library.

B2B invites are needed for external accounts (guest user access)

When there is a Microsoft business need to work with external partners, leveraging guest user access in Microsoft Azure Active Directory is necessary. Once the B2B invite process is followed, external accounts can be authorized by managing group membership. For more information, read What is B2B collaboration in Azure Active Directory?

Key Takeaways

We used Microsoft Azure products and services to create an end-to-end modern data transfer and storage service that can be used by any group at Microsoft that desires cloud data storage. The release of Microsoft Azure Data Lake Gen 2, Microsoft Azure Data Factory, and the improvements in the latest release of Azure Storage Explorer made it possible for us to offer MDTS as a fully native Microsoft Azure service.

One of the many strengths of using Microsoft Azure is the ability to use only what we needed, as we needed it. For MDTS, we started by simply creating storage accounts, requesting Microsoft Azure Active Directory Security Groups, applying an access control to storage URLs, and releasing the storage to customers for use. We then invested in adding storage actions and developed self-service capabilities that make MDTS a true enterprise-scale solution for data transfer and storage in the cloud.

We are actively encouraging the adoption of our MDTS storage design by all Microsoft engineering teams that still rely on legacy storage hosted on the Microsoft corporate network. We are also encouraging any Microsoft Azure consumers to consider this design when evaluating options for storage and file sharing scenarios. Our design has proven to be scalable, performant, and compliant with the Microsoft Zero Trust security initiative, handling extreme payloads with high throughput and no constraints on the size or number of files.

By eliminating our dependency on third-party software, we have been able to eliminate third-party licensing, consulting, and hosting costs for many on-premises storage systems.

Are you ready to learn more? Sign up for your own Microsoft Azure subscription and get started today.

To receive the latest updates on Azure storage products and features to meet your cloud investment needs, visit Microsoft Azure updates.

Related links

 

The post Providing modern data transfer and storage service at Microsoft with Microsoft Azure appeared first on Inside Track Blog.

Digital transformation spotlight: Learning from deploying Microsoft Viva, data and AI across Microsoft http://approjects.co.za/?big=insidetrack/blog/digital-transformation-spotlight-learning-from-deploying-microsoft-viva-data-and-ai-across-microsoft/ Wed, 08 Feb 2023 22:30:24 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9578 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=sEI3kFWPvSQ. Microsoft’s internal IT leaders share […]

The post Digital transformation spotlight: Learning from deploying Microsoft Viva, data and AI across Microsoft appeared first on Inside Track Blog.

We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.
For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=sEI3kFWPvSQ.

Microsoft’s internal IT leaders share their learnings from deploying Microsoft Viva and talk about trends in data and AI across Microsoft.

Microsoft Digital video

Welcome to the first episode of “Spotlight on Digital Transformation,” a new video-based series that shines the spotlight on trends in digital transformation globally. In this episode, Inside Track leader Keith Boyd interviews Dan Scarbrough and Alan Stone, who lead Microsoft Digital’s Regional Experience teams. The discussion includes reflections on 2022, insights about Microsoft Viva, trends in data and AI, and predictions for 2023.

This new recurring series will feature some of our world-class experts in Microsoft Digital as they share ideas, insights, and trends that are impacting IT practitioners and the business of Information Technology globally.

Related links

The post Digital transformation spotlight: Learning from deploying Microsoft Viva, data and AI across Microsoft appeared first on Inside Track Blog.

How Microsoft employees are leveraging the cloud for file storage with OneDrive Folder Backup http://approjects.co.za/?big=insidetrack/blog/how-microsoft-employees-are-leveraging-the-cloud-for-file-storage-with-onedrive-folder-backup/ Wed, 29 Jun 2022 16:00:45 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=8211 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. Any device, no matter the operating system, is susceptible to a ransomware attack or a […]

The post How Microsoft employees are leveraging the cloud for file storage with OneDrive Folder Backup appeared first on Inside Track Blog.

Microsoft Digital technical stories

We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

Any device, no matter the operating system, is susceptible to a ransomware attack or a device crash.

Microsoft OneDrive Folder Backup (known as Known Folder Move) is a policy deployed by Microsoft that automatically syncs the contents of a user’s critical folders (Documents, Desktop, and Pictures) to the cloud to protect them in the event of device crashes and ransomware attacks. Files are safe in the cloud, easy to share and collaborate on, and accessible across different devices.

“The goal of this project was to empower every OneDrive user in Microsoft to protect their critical files and sync their “known” folders to the cloud—this gives them seamless access from any of their devices from anywhere without changing the way they work,” says Priya Chebiyam, a senior product manager who leads Microsoft’s internal use of OneDrive for the Microsoft Digital team—the organization that powers, protects, and transforms the company.

Putting data security first

Carini and Chebiyam smile for the camera in a photo taken in an office in a Microsoft building.
Priya Chebiyam (left) and Gaia Carini were instrumental in piloting, testing, and deploying OneDrive Folder Backup (Known Folder Move) across Microsoft. Chebiyam is a senior product manager for Microsoft Digital and Carini is a principal group product manager for the OneDrive product group.

The Known Folder Move project was piloted at the end of 2019, starting with a small group of employees.

A significant step in the pilot was to decide on the deployment approach—would it be silent or prompt-based? With a silent approach, the policy would be automatically initiated for users who would then be notified when their backup was complete. With a prompt-based system, users would be notified at the start of the process and choose whether to opt in or opt out.

While a silent approach is widespread across the industry, Microsoft opted at first to give employees a choice during the program pilot. As the team rolled out the pilot program, a surge in cyberattacks altered the plan.

“We found during the pilot program that opt-in security measures raise levels of vulnerability,” says Chebiyam. “Adoption of security measures was slow in the opt-in pilot. There was also an increased risk of low employee participation.”

“Pivoting to a silent deployment reduces risks,” continues Chebiyam. “So, faced with rising levels of cyberattacks, the choice was clear.”

The Microsoft team swiftly countered rising cyberattacks by switching to silent deployment and rewriting Microsoft’s corporate security policy to require that all work documents and files reside in a corporate-approved storage system; OneDrive is that system.
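
For organizations that want to reproduce this silent approach, the OneDrive sync client exposes documented policy values—KFMSilentOptIn and KFMSilentOptInWithNotification—that silently redirect known folders into a tenant’s OneDrive. The Python sketch below is a minimal, hypothetical illustration of setting those values in the registry on a single Windows device; the tenant ID is a placeholder, and in practice the policy is typically delivered at scale through Group Policy or Microsoft Intune rather than a script.

# Minimal sketch: silently enable OneDrive Known Folder Move (Folder Backup) via
# the documented registry policy values. Requires administrative rights.
# The tenant ID below is a placeholder.
import winreg

TENANT_ID = "00000000-0000-0000-0000-000000000000"  # replace with your Azure AD tenant ID
KEY_PATH = r"SOFTWARE\Policies\Microsoft\OneDrive"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
    # Silently move Documents, Desktop, and Pictures into OneDrive for this tenant.
    winreg.SetValueEx(key, "KFMSilentOptIn", 0, winreg.REG_SZ, TENANT_ID)
    # 1 = notify the user after the move completes; 0 = no notification.
    winreg.SetValueEx(key, "KFMSilentOptInWithNotification", 0, winreg.REG_DWORD, 1)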

With this shift in tactics, the team has been progressively rolling out a new plan that emphasizes security and disaster recovery across the company.

“Security is ingrained in the fabric of our culture,” says James Speller, a client deployment engineer on the project with Microsoft Digital. “The idea is to make data security as easy and non-disruptive as possible without compromising on safety.”

Learning from the results of the Known Folder Move pilot, the company took a different path at LinkedIn from the start, choosing the silent deployment approach.

It’s ideal to keep security measures as non-disruptive to employees as possible, and striking the right balance between security and efficiency has been at the top of our minds during this project.

—Priya Chebiyam, senior product manager

“At LinkedIn, doing it that way was right for their culture and the way they run their business,” Chebiyam says. “We focused on accelerating the adoption of security measures.”

Additionally, the cross-company Known Folder Move team relied heavily on employee feedback to create a better solution and user experience. They took their time to get this rollout right, as this policy affects employee productivity.

“We had to take a step back and consider how the rollout will affect productivity,” says Chebiyam. “It’s ideal to keep security measures as non-disruptive to employees as possible, and striking the right balance between security and efficiency has been at the top of our minds during this project.”

The team used Viva Engage (formerly known as Microsoft Yammer) and OneDrive in-app surveys to collect feedback that would be sent directly to the help desk. Feedback was communicated to the product team, continuously improving the product to provide a better user experience.

After enough feedback was gathered and implemented, the rollout came to the entire Microsoft user base—approximately 290,000 targeted employees and vendors. This user base was divided based on role and geography, and the team started rolling it out to about 5,000 users per batch.

Because files are automatically synced to OneDrive, users don’t have to worry about what happens to their computer, giving them peace of mind that their files are safe.

—Gaia Carini, principal group product manager

New employees and vendors are given this feature by default.

“The rapid growth of KFM-enabled OneDrives will significantly help admins handle data investigation issues more efficiently, with a quicker turnaround during critical emergencies. As a tenant admin, this KFM capability helps me to apply improved security controls on our Corp content residing in user OneDrives across the company,” says Abhishek Sharma, a senior service engineer with the team.

Change management

To get employees on board with using the cloud, messaging focused on the benefits of using OneDrive: the amount of storage provided (all OneDrive accounts in Microsoft come with 5 TB of free cloud storage); the ability to access files if your computer is lost, broken, or in a refresh cycle; more secure sharing; easier access; improved collaboration; and real-time versioning.

“Because files are automatically synced to OneDrive, users don’t have to worry about what happens to their computer, giving them peace of mind that their files are safe,” says Gaia Carini, a principal group product manager on the experience and devices team. “You don’t have to worry about where your data is or where your content lives.”

While Eva Etchells, a senior content publishing manager on the Microsoft Digital team, worked on messaging internally to employees, our OneDrive product marketing team shaped the narrative around OneDrive Folder Backup outside of Microsoft, communicating the benefits to external stakeholders.

The narrative formed around figuring out how to automatically back up all users’ content without disrupting the way they work. Like Etchells’s messaging, the OneDrive product team focused on device crashes, stolen PCs, ransomware attacks, and so on to drive change management and adoption of the product.

Out of sight, out of mind

With OneDrive Folder Backup, users don’t have to think about the safety and security of their documents or worry about it affecting their productivity. It’s invisible, seamless, and always in sync. Millions of files and hundreds of terabytes of data have been uploaded to OneDrive, and it continues to grow each month.

“OneDrive has provided a valuable benefit to me for a long time,” says Susan Sims, a senior program manager on the Microsoft Digital team and a fan of the service.

Sims managed global file services years ago that hosted shared content. According to Sims, those file servers were attacked nearly every month—attacks that led to manual lockdowns to make sure the company didn’t lose business-critical content. Microsoft OneDrive Folder Backup has eliminated the risk and concern around losing content to device crashes as well as attacks.

“OneDrive is crucial for recovery from ransomware attacks,” says Vivek Vinod Sharma, a senior security architect on the Microsoft Digital Security and Resilience team who served as the security point of contact for the project. “As a best practice for fast-tracking people to get back to a productive state if affected by an attack, we want more business data to reside in OneDrive.”

Moving forward, the team aims to enable OneDrive Folder Backup through silent deployment for all Windows users.

“OneDrive Folder Backup brings the power of the cloud to the desktop on Windows and macOS,” Carini says. “It’s a critical part of the strategy and important for customers to enable in their organizations.”

Key Takeaways
  • Backing up files to the cloud is one of the most secure ways to store critical content to prevent file loss from ransomware attacks.
  • For faster and more effective change management across the organization, focus on the features and benefits employees will gain by adopting the policy to make them more likely to opt in.
  • For a global rollout, communication is vital to ensure everything runs smoothly, especially when working across four or five different teams and geographies. Defining roles for each person and group is crucial.
  • When you begin moving your employees to OneDrive in the cloud, make sure their needs are at the center of everything you do. Get them as involved in the process as possible and act on as much of their feedback as you can to create a better user experience for everyone.
  • Acknowledge your organization’s policies and processes regarding security and compliance and use that as guidance when rolling out an approach to the entire user base.
  • Employees should be informed regarding what data is being collected and how that data is being used as part of the company security measures.
  • Consider the risks of workers not participating in security backup options. Enforcing security uniformly as a company-wide policy minimizes potential damage to company assets from ransomware attacks.
Related links
We'd like to hear from you!

The post How Microsoft employees are leveraging the cloud for file storage with OneDrive Folder Backup appeared first on Inside Track Blog.

]]>
8211
Implementing Microsoft Azure cost optimization internally at Microsoft http://approjects.co.za/?big=insidetrack/blog/implementing-microsoft-azure-cost-optimization-internally-at-microsoft/ Tue, 07 Jun 2022 17:35:40 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9389 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. Our Microsoft Digital team is aggressively pursuing Microsoft Azure cost optimization as part of our […]

The post Implementing Microsoft Azure cost optimization internally at Microsoft appeared first on Inside Track Blog.

]]>
Microsoft Digital technical stories

We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

Our Microsoft Digital team is aggressively pursuing Microsoft Azure cost optimization as part of our continuing effort to improve the efficiency and effectiveness of our enterprise Azure environment here at Microsoft and for our customers.

By adopting data-driven cost-optimization techniques, investing in central governance, and driving modernization efforts throughout our Microsoft Azure environment, we’ve made our environment—one of the largest enterprise environments hosted in Azure—a cost-efficient blueprint that customers can look to for lessons on how to lower their Azure costs.

We began our digital transformation journey in 2014 with the bold decision to migrate our on-premises infrastructure to Microsoft Azure so we could capture the benefits of a cloud-based platform—agility, elasticity, and scalability. Since then, our teams have progressively migrated and transformed our IT footprint to the largest cloud-based infrastructure in the world—we host more than 95 percent of our IT resources in Microsoft Azure.

The Microsoft Azure platform has expanded over the years with the addition of hundreds of services, dozens of regions, and innumerable improvements and new features. In tandem, we’ve increased our investment in Azure as our core destination for business solutions at Microsoft. As our Azure footprint has grown, so has the environment’s complexity, requiring us to optimize and control our Azure expenditures.

Optimizing Microsoft Azure cost internally at Microsoft

Our Microsoft Azure footprint follows the resource usage of a typical large-scale enterprise. In the past few years, our cost-optimization efforts have been more targeted as we attempted to minimize the rising total cost of ownership in Azure due to several factors, including increased migrations from on-premises and business growth. This focus on optimization instigated an investment in tools and data insights for cost optimization in Azure.

The built-in tools and data that Microsoft Azure provides form the core of our cost-optimization toolset. We derive all our cost-optimization tools and insights from data in Microsoft Azure Advisor, Microsoft Azure Cost Management and Billing, and Microsoft Azure Monitor. We’ve also implemented design optimizations based on modern Azure resource offerings. We extract recommendations from Azure Advisor across the different Azure service categories and push those recommendations into our IT service management system, where the services’ owners can track and manage the implementation of recommendations for their services.
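
To give a rough idea of what that hand-off can look like, the hedged Python sketch below uses the Azure SDK’s Advisor client to list cost-category recommendations and pass them to a work-tracking system. The create_work_item function is a hypothetical stand-in for whatever IT service management or Azure DevOps integration an organization uses, and the subscription ID is a placeholder.

# Sketch: pull Azure Advisor cost recommendations and file them as work items.
# Requires the azure-identity and azure-mgmt-advisor packages; create_work_item is hypothetical.
from azure.identity import DefaultAzureCredential
from azure.mgmt.advisor import AdvisorManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder

def create_work_item(title: str, details: str) -> None:
    # Stand-in for a call into an IT service management or Azure DevOps API.
    print(f"Would file work item: {title} | {details}")

client = AdvisorManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
for rec in client.recommendations.list(filter="Category eq 'Cost'"):
    summary = rec.short_description.solution if rec.short_description else "Advisor recommendation"
    create_work_item(
        title=f"[{rec.impact}] {summary}",
        details=f"Impacted resource: {rec.impacted_field} / {rec.impacted_value}",
    )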

Understanding holistic optimization

As the first and largest adopter of Microsoft Azure, we’ve developed best practices for engineering and maintenance in Azure that support not only cost optimization but also a comprehensive approach to capturing the benefits of cloud computing in Azure. We developed and refined the Microsoft Well-Architected Framework as a set of guiding tenets for Azure workload modernization and a standard for modern engineering in Azure. Cost optimization is one of five components in the Well-Architected Framework that work together to support an efficient and effective Azure footprint. The other pillars include reliability, security, operational excellence, and performance efficiency. Cost optimization in Azure isn’t only about reducing spending. In Azure’s pay-for-what-you-use model, using only the resources we need when we need them, in the most efficient way possible, is the critical first step toward optimization.

Optimization through modernization

Reducing our dependency on legacy application architecture and technology was an important part of our first efforts in cost optimization. We migrated many of our workloads from on-premises to Microsoft Azure by using a lift-and-shift method: imaging servers or virtual machines exactly as they existed in the datacenter and migrating those images into virtual machines hosted in Azure. Moving forward, we’ve focused on transitioning those infrastructure as a service (IaaS)-based workloads to platform as a service (PaaS) components in Azure to modernize the infrastructure on which our solutions run.

Focus areas for optimization

We’ve maintained several focus areas for optimization. Ensuring the correct sizing for IaaS virtual machines was critical early in our Microsoft Azure adoption journey, when those machines accounted for a sizable portion of our Azure resources. We currently operate at a ratio of 80 percent PaaS to 20 percent IaaS, and to achieve this ratio we’ve migrated workloads from IaaS to PaaS wherever feasible. This means transitioning away from workloads hosted within virtual machines and moving toward more modular services such as Microsoft Azure App Service, Microsoft Azure Functions, Microsoft Azure Kubernetes Service, Microsoft Azure SQL, and Microsoft Azure Cosmos DB. PaaS services like these offer better native optimization capabilities in Microsoft Azure than virtual machines, such as automatic scaling and broader service integration. As the number of PaaS services has increased, automating scalability and elasticity across PaaS services has been a large part of our cost-optimization process. Data storage and distribution have been another primary focus area as we modify scaling, size, and data-retention configuration for Microsoft Azure Storage, Azure SQL, Azure Cosmos DB, Microsoft Azure Data Lake, and other Azure storage-based services.

Implementing practical cost optimization

While Microsoft Azure Advisor provides most recommendations at the individual service level—Microsoft Azure Virtual Machines, for example—implementing these recommendations often takes place at the application or solution level. Application owners implement, manage, and monitor recommendations to ensure continued operation, account for dependencies, and keep the responsibility for business operations within the appropriate business group at Microsoft.

For example, we performed a lift-and-shift migration of our on-premises virtual lab services into Microsoft Azure. The resulting Azure environment used IaaS-based Azure virtual machines configured with nested virtualization. The initial scale was manageable using the nested virtualization model. However, the Azure-based solution was more convenient for hosting workloads than the on-premises solution, so adoption began to increase exponentially, which made management of the IaaS-based solution more difficult. To address these challenges, the engineering team responsible for the virtual lab environment re-architected the nested virtual machine design to incorporate a PaaS model using microservices and Azure-native capabilities. This design made the virtual lab environment more easily scalable, efficient, and resilient. The re-architecture addressed the functional challenges of the IaaS-based solution and reduced Azure costs for the virtual lab by more than 50 percent.

In another example, an application used Microsoft Azure Functions with the Premium App Service Plan tier to account for long-running functions that wouldn’t run properly without the extended execution time enabled by the Premium tier. The engineering team converted the logic in the Function Apps to use Durable Functions, an Azure Functions extension, and more efficient function-chaining patterns. This reduced execution time to less than 10 minutes, which allowed the team to switch the Function Apps to the Consumption tier, reducing cost by 82 percent.
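
The function-chaining pattern described above looks roughly like the following Python Durable Functions orchestrator: each activity is short-lived, and the orchestrator suspends between steps, which is what allows the app to run on the Consumption tier. This is a generic sketch rather than the team’s actual code, and the activity names are hypothetical.

# Sketch: function chaining with Durable Functions so each step stays short-lived.
# Activity names (validate_order, transform_order, publish_order) are hypothetical.
import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    order = context.get_input()
    # Each yield suspends the orchestrator; only the short-lived activity consumes compute.
    validated = yield context.call_activity("validate_order", order)
    transformed = yield context.call_activity("transform_order", validated)
    result = yield context.call_activity("publish_order", transformed)
    return result

main = df.Orchestrator.create(orchestrator_function)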

Governance

To ensure effective identification and implementation of recommendations, governance in cost optimization is critical for our applications and the Microsoft Azure services that those applications use. Our governance model provides centralized control and coordination for all cost-optimization efforts. Our model consists of several important components, including:

  • Microsoft Azure Advisor recommendations and automation. Advisor cost management recommendations serve as the basis for our optimization efforts. We channel Advisor recommendations into our IT service management and Microsoft Azure DevOps environment to better track how we implement recommendations and ensure effective optimization.
  • Tailored cost insights. We’ve developed dashboards to identify the costliest applications and business groups and identify opportunities for optimization. The data that these dashboards provide helps engineering leaders observe and track important Azure cost components in their service hierarchy to ensure that optimization is effective.
  • Improved Microsoft Azure budget management. We perform our Azure budget planning by using a bottom-up approach that involves our finance and engineering teams. Open communication and transparency in planning are important, and we track forecasts for the year alongside actual spending to date to enable accurate adjustments to spending estimates and closely track our budget targets. Relevant and easily accessible spending data helps us identify trend-based anomalies to control unintentional spending that can happen when resources are scaled or allocated unnecessarily in complex environments.

Implementing a governance solution has enabled us to realize considerable savings by making a simple change to Microsoft Azure resources across our entire footprint. For example, we implemented a recommendation to convert Microsoft Azure SQL Database instances from the Standard database transaction unit (DTU) based tier to the General Purpose Serverless tier by using a simple Microsoft Azure Resource Manager template and the auto-pause capability. The configuration change reduced costs by 97 percent.
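
For readers who want to see the shape of such a change, the hedged Python sketch below applies a comparable configuration—General Purpose serverless with auto-pause—to a single database through the azure-mgmt-sql management client. The resource names are placeholders, and our own change was applied at scale through an Azure Resource Manager template rather than a script like this.

# Sketch: move an Azure SQL database to the General Purpose serverless tier with auto-pause.
# Resource names and subscription ID are placeholders; uses the azure-mgmt-sql package.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import Database, Sku

SUBSCRIPTION_ID = "<subscription-id>"
client = SqlManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

update = Database(
    location="westus2",
    sku=Sku(name="GP_S_Gen5", tier="GeneralPurpose", family="Gen5", capacity=1),
    auto_pause_delay=60,  # pause compute after 60 idle minutes
    min_capacity=0.5,     # minimum vCores while the database is running
)
poller = client.databases.begin_create_or_update(
    "my-resource-group", "my-sql-server", "my-database", update
)
poller.result()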

Benefits of Microsoft Azure

Ongoing optimization in Microsoft Azure has enabled us to capture the value of Azure to help increase revenue and grow our business. Our yearly budget for Azure has remained almost static since 2014, when we hosted most of our IT resources in on-premises datacenters. Over that period, Microsoft has grown by more than 20 percent.

Our recent optimization efforts have resulted in significantly reduced spending across numerous Microsoft Azure services. Examples, in addition to those already mentioned, include:

  • Right-sizing Microsoft Azure virtual machines. We generated more than 300 recommendations for VM size changes to increase cost efficiency. These recommendations included switching to burstable virtual machine sizes and accounted for a 15 percent cost savings.
  • Moving virtual machines to the latest generation of virtual machine sizes. Moving from older D-series and E-series VM sizes to their current counterparts generated almost 2,500 recommendations and a cost savings of approximately 30 percent.
  • Implementing Microsoft Azure Data Explorer recommendations. More than 200 recommendations were made for Microsoft Azure Data Explorer optimization, resulting in significant savings.
  • Incorporating Cosmos DB recommendations. More than 170 Cosmos DB recommendations reduced cost by 11 percent.
  • Implementing Microsoft Azure Data Lake recommendations. More than 30 Azure Data Lake recommendations combined to reduce costs by approximately 15 percent.

Key Takeaways

Cost optimization in Microsoft Azure can be a complicated process that requires significant effort from several parts of the enterprise. The following are some of the most important lessons that we’ve taken from our cost-optimization journey:

Implement central governance with local accountability

We implemented a central audit of our Microsoft Azure cost-optimization efforts to help improve our Azure budget-management processes. This audit enabled us to identify gaps in our methods and make the necessary engineering changes to address those gaps. Our centralized governance model includes weekly and monthly leadership team reviews of our optimization efforts. These meetings allow us to align our efforts with business priorities and assess the impact across the organization. The service owner still owns and is accountable for their optimization effort.

Use a data-driven approach

Using optimization-relevant metrics and monitoring from Microsoft Azure Monitor is critical to fully understanding the necessity and impact of optimization across services and business groups. Accurate and current data is the basis for making timely optimization decisions that provide the largest cost savings possible and prevent unnecessary spending.

Be proactive

Real-time data and effective cost optimization enable proactive cost-management practices. Cost-management recommendations provide no financial benefit until they’re implemented. Getting from recommendation to implementation as quickly as possible while maintaining governance over the process is the key to maximizing cost-optimization benefits.

Adopt modern engineering practices

Cost optimization is one of the five components of the Microsoft Azure Well-Architected Framework, and each pillar functions best when supported by proper implementation of the other four. Adopting modern engineering practices that support reliability, security, operational excellence, and performance efficiency will help to enable better cost optimization in Microsoft Azure. This includes using modern virtual machine sizes where virtual machines are needed and architecting for Azure PaaS components such as Microsoft Azure Functions, Microsoft Azure SQL, and Microsoft Azure Kubernetes Service when virtual machines aren’t required. Staying aware of new Azure services and changes to existing functionality will also help you recognize cost-optimization opportunities as soon as possible.

Looking forward to more optimization

As we continue our journey, we’re focusing on refining our efforts and identifying new opportunities for further cost optimization in Microsoft Azure. The continued modernization of our applications and solutions is central to reducing cost across our Azure footprint. We’re working toward ensuring that we’re using the optimal Azure services for our solutions and building automated scalability into every element of our Azure environment. Using serverless and containerized workloads is an ongoing effort as we reduce our investment in the IaaS components that currently support some of our legacy technologies.

We’re also improving our methods for decentralizing optimization recommendations to enable our engineers and application owners to make the best choices for their environments while still adhering to central governance and standards. This includes automating the detection of anomalous behavior in Microsoft Azure billing by using service-wide telemetry and logging, data-driven alerts, root-cause identification, and prescriptive guidance for optimization.
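
Trend-based anomaly detection on billing data doesn’t have to be elaborate to be useful. The illustrative Python sketch below—not our production tooling—flags days whose spend deviates sharply from a trailing average of daily cost figures, such as those exported from Azure Cost Management.

# Sketch: flag daily cost anomalies against a trailing average.
# The daily_costs values are illustrative; real data would come from a cost export.
from statistics import mean, stdev

def find_cost_anomalies(daily_costs, window=14, threshold=3.0):
    """Return (day_index, cost) pairs where cost exceeds mean + threshold * stdev
    of the preceding `window` days."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and daily_costs[i] > mu + threshold * sigma:
            anomalies.append((i, daily_costs[i]))
    return anomalies

daily_costs = [1020, 995, 1010, 1040, 990, 1005, 1015, 1000, 1025, 980,
               1010, 1030, 995, 1005, 2350]  # sudden spike on the last day
print(find_cost_anomalies(daily_costs))  # [(14, 2350)]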

Microsoft Azure optimization is a continuous cycle. As we further refine our optimization efforts, we learn from what we’ve done in the past to improve what we’ll do in the future. Our footprint will continue to grow in the years ahead, and our cost-optimization efforts will expand accordingly to ensure that our business is capturing every benefit that the Azure platform provides.

Related links

We'd like to hear from you!

Want more information? Email us and include a link to this story and we’ll get back to you.

Please share your feedback with us—take our survey and let us know what kind of content is most useful to you.

The post Implementing Microsoft Azure cost optimization internally at Microsoft appeared first on Inside Track Blog.

]]>
9389
Shining a light on how Microsoft manages Shadow IT http://approjects.co.za/?big=insidetrack/blog/shining-a-light-on-how-microsoft-manages-shadow-it/ Mon, 06 Jun 2022 16:01:33 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9381 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. Shadow IT is the set of applications, services, and infrastructure that are developed and managed […]

The post Shining a light on how Microsoft manages Shadow IT appeared first on Inside Track Blog.

]]>
Microsoft Digital technical stories

We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

Shadow IT is the set of applications, services, and infrastructure that are developed and managed outside of defined company standards. These line-of-business-built solutions (aka Shadow IT) have always existed at Microsoft and are a common industry problem.

Over the years, corporate function teams—including business development, legal, finance, human resources, marketing and sales, support, and consulting—have looked to alternative engineering solutions for many different reasons. Some examples include a lack of IT engineering capacity or prioritization of the business need, historically decentralized budgets, a lack of trust between IT and shadow teams, the need for specialized domain solutions, and the availability of modern tools that enable no-code/low-code solutions to be stood up by citizen developers.

Many of these reasons make strong business sense, provided the work can be done securely. However, because Shadow IT solutions are often built outside of the guardrails of the company’s engineering systems, they pose a potential compliance risk to the enterprise, specifically in the areas of security, privacy, data governance, and accessibility.

At Microsoft, we needed to first understand whether applications built by shadow teams met our security compliance standards. In 2019, we conducted a security assessment on a small random sample of applications built by shadow teams. Every app in the sample failed to meet at least two of the three key security requirement areas, and one app failed all of them. This presented a huge and unnecessary risk to the whole company.

Ensuring we address our biggest security vulnerabilities has been our first priority internally at Microsoft in our Shadow IT journey, as the risk in today’s environment is huge. The average data breach in the United States costs $4.2 million (2021 IBM), and cybercrime costs the world $6-7 trillion annually (2020 Annual Cybercrime Report).

Vision

Rather than centralizing all applications into IT, our goal is to reduce or eliminate Microsoft risk by enabling teams to self-manage their assets and ensure that they adhere to the compliance standards set forth by Microsoft. Teams must not only get clean, but also stay clean.

Compliance standards

Microsoft compliance standards are typically defined as four areas of focus, which are all supported by our set of Engineering Fundamentals:

 

Compliance scope includes security, privacy, data governance, accessibility, and engineering fundamentals.
Microsoft compliance standards.

Security: To ensure that the confidentiality, integrity, and availability of the data and systems of an organization is maintained.

Privacy: To ensure control over the collection, use, and distribution of information.

Data Governance: To ensure that the organizational roles and responsibilities by which information is retrieved are captured and maintained appropriately.

Accessibility: To ensure that our products or services are usable by everyone.

Of note, Engineering Fundamentals is seen as an enabler of many compliance areas. Solid engineering fundamentals equip teams with the data, processes, and tools to build solutions that are compliant by design. Retrofitting compliance requirements after a solution has been designed creates additional risk and more work for Microsoft. Additionally, engineering fundamentals enable compliance at scale.

Engineering maturity

Given the size and scope of this program, we approached the journey as if we were running a marathon, not a sprint. We kicked off this program in 2020 and have been operating on a multi-year time horizon. Our work has involved and impacted many people, processes, and technology across the enterprise.

Initially, it was important for us to recognize that not all teams were at the same level of maturity. As such, we used the following model to ensure a consistent set of criteria for measuring engineering maturity, which allowed us to engage with teams at the right level and provide the resources they needed to advance.

 

Moving through the Shadow IT journey, from Level 0 (Unsanctioned) through Level 4 (Optimized).
Moving through the Shadow IT journey starts with lower levels of maturity that focus on centralizing tools and platforms, moves to driving culture change, then to full automation and continuous compliance.

Over time, shadow teams matured their engineering fundamentals and their ability to adhere to compliance requirements. Most teams started their journey with manual efforts and have made progress over time, but they are not yet fully mature. We’re continuing to work toward scaling our efforts, especially as the work gets more complex.

Customized support

Likewise, each division had specific needs for the amount and kind of engagement we provided them, depending on the size, scope, and nature of the team. At Microsoft, we customized the approach based on the nature of the team to successfully move the shadow teams forward in their journey.

 

Pattern 1: Small teams with small asset footprints
  • Characteristics: Teams with a smaller asset (services and Azure subscriptions) footprint. These teams can be more agile in how they organize their efforts around the program.
  • Approach: Push teams forward in their modern engineering and security journey. Use learnings from Pattern 1 teams to inform Patterns 2 and 3.

Pattern 2: Medium to large asset footprints
  • Characteristics: Teams with a medium-to-large asset footprint. These teams may be more agile in pockets but will need to look at automation and policy in some cases, and will require the organization to solve for the program collectively.
  • Approach: Identify points of contact per organization. Push forward with smaller, more technical teams. Ensure more thorough plans and support models are in place for less technical teams.

Pattern 3: Large to very-large asset footprints
  • Characteristics: Teams with a large-to-very-large asset footprint. Given the size, complexity, and geographic dispersion of assets, these teams will require automation and much more rigorous planning to move forward.
  • Approach: Go slow to go fast: take the time required to define plans and the engagement model. Take advantage of established processes, channels, and communication models to mobilize the organization.

While we recognize the difference in approach required for each pattern, the intent of the program remains the same, although the timing and approach to the work may differ. Eventually, we plan for this program to become a standard operating principle that is absorbed within normal business functions, instead of being managed as a separate program.

Program approach

We prioritized addressing cloud-based solutions because most Shadow applications existed in the cloud, and the digital environment allowed us to scale the program. We developed a three-step approach to guide our work: visibility, controls, and enforcement.

  • Visibility: Understanding all the assets, devices, identities, cloud tenants and subscriptions, and applications allowed us to create an inventory with clear ownership. To help with visibility in our cloud assets, we built a scanner that inventories Microsoft Azure assets and reads their configurations. Once we identified the assets, we were able to clean up by ensuring each asset was aligned to an appropriate division and eliminating assets that were empty or unused. This helped reduce our scope before moving on to the next phase. The Microsoft Azure Tenant Security Solutions scanner is available on GitHub.
  • Controls: We used information from our scanner to compare the remaining assets’ configurations to our defined controls and create reports for all configurations that were out of compliance (a minimal sketch of this comparison appears after this list).
  • Enforcement: We used our inventory and controls reports to start enforcing security and engineering compliance. In many cases, we were able to prevent misconfigurations from the start. When that wasn’t possible, we worked to auto-remediate the non-compliant items to quickly resolve existing issues at scale. To date, we’ve been able to auto-remediate about half of the Microsoft Azure controls we enforce. When auto-remediation wasn’t possible, we employed manual remediation. To manage all this activity, we use a central notification tool that tracks action items and notifies owners of pending deliverables. The tool also allows us to create executive-level reporting to bring awareness of our security risk across all levels of the company.
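
Conceptually, the controls step is a comparison: given an inventory of asset configurations, evaluate each against a set of expected values and report what falls out of compliance. The Python sketch below illustrates that comparison with hypothetical controls and assets; the actual implementation relies on the Azure Tenant Security Solutions scanner referenced above.

# Sketch: compare scanned asset configurations against defined security controls.
# The controls, assets, and field names below are hypothetical examples.
CONTROLS = {
    "storage_https_only": lambda cfg: cfg.get("supportsHttpsTrafficOnly") is True,
    "sql_public_access_disabled": lambda cfg: cfg.get("publicNetworkAccess") == "Disabled",
}

def evaluate(assets):
    """Yield (asset_id, control_name) for every failed control."""
    for asset in assets:
        for name, check in CONTROLS.items():
            if name in asset["applicable_controls"] and not check(asset["config"]):
                yield asset["id"], name

assets = [
    {"id": "storage-account-01",
     "applicable_controls": ["storage_https_only"],
     "config": {"supportsHttpsTrafficOnly": False}},
    {"id": "sql-server-01",
     "applicable_controls": ["sql_public_access_disabled"],
     "config": {"publicNetworkAccess": "Enabled"}},
]

for asset_id, control in evaluate(assets):
    print(f"Out of compliance: {asset_id} failed {control}")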

Lessons learned

Over the past two years, we’ve made a lot of progress, but we’ve also encountered many roadblocks. One important discovery is that in specific cases there may be valid business reasons why an engineering asset can’t comply with a security control; we continue to partner with those teams on individual parameters to ensure that both business and security priorities are met. We also know that this work is never “complete” because security is never-ending; we will continue to update our compliance requirements and approaches as the threat landscape and our technology evolve.

Looking back, there are a few key elements of our Shadow program that enabled our success so far:

Build a team: We funded a central Shadow IT team within the security organization, led by a program manager who is fully dedicated to this program. We also obtained program support from the security, IT, and finance departments, and worked together to ensure there were enough IT resources dedicated to this effort to assist with inventory, drive engineering tooling adoption, and provide engineering guidance to the shadow teams. Finally, it was critical to build accountability across the business divisions by appointing one “Directly Responsible Individual” (also known as a DRI) within each participating team, who was accountable for helping their team work toward compliance and served as our primary contact for engaging executive support from that team.

Drive culture change: While the leaders within the space are important, we quickly realized that we needed to reach the individuals who own and run the Shadow solutions across the company. They needed to understand the importance of security and how to ensure security as a part of their day-to-day actions. We began educating our employees by sharing real security events and highlighting the impacts of these events to emphasize the importance of the actions people take.

We have also adopted an “embrace the red” culture for the metrics on our scorecards. We shifted our mindsets to understand that “red,” or noncompliant, metrics help guide our priorities and work. Once we addressed specific security gaps, those metrics turned green, and we immediately replaced the “good” metrics with new “red” metrics so that we can continually see progress and address new gaps.

We also provided training, support, and best practice guidance to the shadow teams, including:

  • Gathering compliance activities into requirements in quarterly asks
  • Providing guidance on funding and skills needs in the first year
  • Catering to the lowest knowledge state in wikis and trainings

Be data driven: Managing our reporting process was critical in our ability to drive progress and show the importance of this work. In the early stages, we frequently reviewed our status with executives across the company, and took advantage of our executive sponsor to facilitate these conversations, which helped build momentum. We learned quickly that it was important for us to engage the middle management layer in addition to executives. Our DRIs typically sat two to three layers below the executives, so we needed to ensure there was support for the DRIs between them and the executive.

We also learned over time how to interpret our reporting. We started out reporting on compliance, which worked well until a team had an exception against a control. The exceptions would show up green on our reports. However, an exception is an acceptance of risk, not a sign of compliance. So, we made a plan to start reducing exceptions and began reporting on risk instead of compliance. Reporting on risk aligns well with our Zero Trust reporting, so this was a natural way to drive alignment and create clarity across the company.

What’s next

Our Shadow journey is far from over. We will continue expanding our technology controls and governance to ensure all new solutions and cloud tenants meet compliance standards, and work toward securing the developer pipeline. As for the future of the program, we will reduce custom support processes and enable all teams to adopt our standard enterprise-wide security practices, like the enterprise scorecard and the risk committee. Once teams have met the agreed-upon threshold, the security work will transition from a program into the normal operations of the business.

Key Takeaways

Addressing Shadow IT risk at any company can feel overwhelming at first. Here are a few things that we learned along the way that can help you get started:

Build a team

  • Designate a Shadow IT security program manager
  • Obtain a Directly Responsible Individual (DRI) and executive sponsor from all targeted divisions
  • Engage your CIO and finance partner for sponsorship

Define the scope

  • Scan cloud inventory and configurations within your organization
  • Define cloud security controls

Support

  • Expand engineering and security capabilities to support additional services
  • Develop a communication plan for driving compliance
  • Implement a reporting process to identify focus areas and show progress

Related links

The post Shining a light on how Microsoft manages Shadow IT appeared first on Inside Track Blog.

]]>
9381
Modernizing enterprise integration services at Microsoft with Microsoft Azure http://approjects.co.za/?big=insidetrack/blog/modernizing-enterprise-integration-services-at-microsoft-with-microsoft-azure/ Mon, 11 Apr 2022 16:00:41 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9398 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. Our Platform Engineering team in Microsoft Digital Employee Experience (MDEE) wanted to improve the capabilities, […]

The post Modernizing enterprise integration services at Microsoft with Microsoft Azure appeared first on Inside Track Blog.

]]>
Microsoft Digital technical stories

We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

Our Platform Engineering team in Microsoft Digital Employee Experience (MDEE) wanted to improve the capabilities, performance, and resiliency of our on-premises integration platform. To do this, the team used Microsoft Azure Integration Services to build a cloud-based integration platform as a service (iPaaS) solution that increased data-transaction throughput and integration capabilities for our enterprise data footprint and improved platform reliability.

Business-to-business (B2B) and app-to-app (A2A) integration are imperatives in modern software solutions. Integration services use middleware technology that helps secure communication between integration points and data exchange between diverse enterprises and business applications. At Microsoft, our business demands integration across multiple independent software systems with diverse message formats such as EDIFACT, X12, XML, JSON, and flat file. Modern integration requires many modes of connectivity and data exchange, and includes the ability to connect:

  • Two or more internal applications.
  • Internal applications to one or more business partners.
  • Internal applications to software as a service (SaaS) applications.

Building on a foundation of enterprise integration

For decades, we as a company have worked to integrate our business data internally and in business-to-business scenarios with partners, vendors, and suppliers. BizTalk Server has been a standard for integration services for us and our partners, providing a foundation for dependable, easy-to-configure data integration.

Our ongoing digital transformation is driving cloud adoption to move business resources out of datacenters. As data storage and application development has evolved, cloud-native solutions based on SaaS and PaaS models have predominated among enterprise applications in most industries. To meet the growing need to supply increased scalability, reduce maintenance overhead for infrastructures, and decrease total cost of ownership, our Platform Engineering team has increasingly moved toward cloud-based solutions for enterprise integration.

Transforming integration with Microsoft Azure

Our Platform Engineering team began investigating Microsoft Azure Integration Services as a potential solution for scalable, cloud-based enterprise integration. Integration Services combines several Microsoft Azure services, including Logic Apps, API Management, Service Bus, Event Grid, and Azure Functions. These services provide a complete platform that companies can use to integrate business applications and data sources. Our team began working with Integration Services to gauge feasibility, test integration scenarios, and plan for enterprise-scale integration capabilities on the platform.

Collaborating to improve Microsoft Azure Integration Services

Throughout the development process, our Platform Engineering team worked closely with the Integration Services product group to enhance and build connectors. This collaboration allowed us to suggest improvements to existing Integration Services functionality. This effort prompted the creation of two new Logic Apps connectors—SAP with Secure Network Communication (SNC) and Simple Mail Transport Protocol (SMTP)—and enhancements to two existing Logic Apps connectors (EDIFACT and X12).

Examining our Azure Integration Services architecture

We in MDEE use all of the Microsoft Azure Integration Services components in our architecture to support end-to-end integration. Each component supplies an important part of the larger solution, including:

  • API Management for APIs, policies, rate limiting, and authentication.
  • Logic Apps for business workflows, orchestration, message decoding and encoding, schema validations, transformations, and integration accounts to store B2B partner profiles, agreements, schemas, and certificates.
  • Microsoft Azure Event Grid for event-driven integration to publish and subscribe to business events.
  • Microsoft Azure Functions for writing custom logic tasks, including metadata and config lookup, data lookup, duplicate check, replace namespace, and replace segments (a minimal sketch of one such task follows this list).
  • Microsoft Azure Data Factory for processing low volume, large payload messages, ETL processes, and data transformation.
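
To make the idea of a “custom logic task” concrete, the hedged Python sketch below shows the kind of small, single-purpose HTTP-triggered Azure Function a Logic Apps workflow might call—in this case, a namespace replacement over an XML payload. The function layout and namespaces are hypothetical and not the team’s actual code.

# Sketch: an HTTP-triggered Azure Function that replaces an XML namespace,
# one of the small "custom logic tasks" a Logic Apps workflow can call.
# The namespaces below are hypothetical.
import azure.functions as func

OLD_NS = "http://contoso.example/orders/v1"
NEW_NS = "http://contoso.example/orders/v2"

def main(req: func.HttpRequest) -> func.HttpResponse:
    payload = req.get_body().decode("utf-8")
    return func.HttpResponse(payload.replace(OLD_NS, NEW_NS), mimetype="application/xml")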

We used Microsoft Azure Front Door as the entry point for all inbound traffic and helped secure endpoints by using Microsoft Azure Web Application Firewall configured with assignment permissions for allowed IP addresses. Additionally, API Management enabled us to abstract the authentication layer from the processing pipeline to help increase security and simplify processing of incoming data.

We deployed the entire solution to an integration service environment, which supplied a fully isolated and dedicated integration environment and other benefits, including autoscaling, increased throughput limits, larger storage retention, improved availability, and a predictable cost model.

The following figure illustrates our solution’s architecture using Microsoft Azure Integration Services.

Azure Integration Services architecture diagram, showing the experience layer, messaging layer, and operations layer.
Microsoft Azure Integration Services architecture for Microsoft Digital Employee Experience.

The solution architecture adheres to several important design principles and goals, including:

  • Pattern-based workflows that enable dynamic decisions using partner information.
  • Self-contained extensible workflows that can be modified and improved without affecting existing components.
  • A gateway component to store and forward messages.
  • Publish and subscribe services for data pipeline output.
  • Complete B2B and A2A pipeline processing with 100 transactions per second throughput and message handling up to 100 megabytes (MB) per message.

Designing dataflow pipelines

Our dataflow pipelines perform processing for most of our business-data transformation and movement tasks. We designed the B2B and A2A processing pipelines using Logic Apps and Microsoft Azure Functions, processing documents in their native format and delivering them to line of business (LOB) or enterprise resource planning (ERP) systems such as Finance, HR, Volume Licensing, Supply Chain, and SAP.

  • B2B pipeline. Electronic data interchange (EDI) documents such as purchase orders are brought in using AS2, processed using X12 standards, transformed, decoded and encoded using Logic Apps and Azure Functions, and then sent to the LOB app using the Logic Apps HTTP adapter.
  • A2A pipeline. Documents such as XML/JSON come in using one of the built-in adapters including SAP, File, SQL, SSH File Transport Protocol (SFTP), or HTTP. The documents are debatched, transformed, decoded, and encoded using Logic Apps and Azure Functions, and then sent to the line-of-business system using the appropriate Logic Apps adapter.

Our integration solution used these pipelines in practical business scenarios across many lines of business at Microsoft, such as volume licensing. A hardware manufacturer that includes Windows or Microsoft Office in its laptops submits an order for Windows or Office licenses to Microsoft’s ordering system, which sends the order details to our integration suite. The suite validates the messages, transforms them to IDoc format, and routes the IDoc to SAP using a data gateway for taxation and invoice generation. SAP generates an order acknowledgement in IDoc format and then passes it to the integration suite, which transforms the IDoc message into a format that the Microsoft ordering system will recognize.

Here’s another example from Microsoft Finance. An employee incurs an expense using a corporate credit card and the issuing financial institution sends a transaction report to the integration solution, which validates the message and performs currency conversion before sending it to Microsoft’s expense-management system for further approvals. After it’s approved in the expense-management system, the remittance transaction flows through the integration suite back to the banking system for payment settlement.

Capturing end-to-end messaging telemetry

We designed our solution to monitor message flow across the pipeline. Every transaction injects data into the telemetry pipeline using Microsoft Azure Event Hubs. The pipeline synthesizes and correlates that data to identify end-to-end processing status and recognize runtime failures. We built a custom tracking service that monitors and tracks important metrics for end-to-end workflows by using visual indicators on a dashboard. Accurate and readily available telemetry creates a more robust and reliable integration environment and improves the customer experience across pipelines.
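
The per-transaction telemetry injection can be pictured as a small helper that publishes a correlation event to Event Hubs at each processing stage. The sketch below uses the azure-eventhub Python package with placeholder connection details; the event schema shown is a hypothetical simplification of what the tracking service consumes.

# Sketch: publish a per-transaction tracking event to Azure Event Hubs.
# Connection string, hub name, and event schema are placeholders.
import json
from datetime import datetime, timezone

from azure.eventhub import EventData, EventHubProducerClient

producer = EventHubProducerClient.from_connection_string(
    conn_str="<event-hubs-connection-string>", eventhub_name="pipeline-telemetry"
)

def track(transaction_id: str, stage: str, status: str) -> None:
    event = {
        "transactionId": transaction_id,
        "stage": stage,    # e.g., "decode", "transform", "deliver"
        "status": status,  # e.g., "succeeded", "failed"
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(event)))
    producer.send_batch(batch)

track("PO-12345", "transform", "succeeded")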

Key Takeaways

We’ve realized several benefits across our integration environment, including:

  • Increased scalability. Our integration solution processes millions of monthly transactions, including 10 million B2B, 2.5 million A2A, and 74 million hybrid cloud transactions.
  • Improved quality of service. We used cross-region deployment with active-active configuration and thorough fault handling to help achieve 99.9 percent availability and reliability.
  • Reduced total cost of ownership. We’ve reduced monthly costs in Microsoft Azure by more than 40 percent with this iPaaS solution.
  • Increased customer engagements. We’re working toward increasing Microsoft Azure Integration Services adoption by promoting this solution to our partners, vendors, and suppliers.

Microsoft Azure Integration Services has created an improved and more efficient integration environment for Microsoft. The increased scalability, reliability, and cost-effectiveness of Azure Integration Services has moved our business into a better position to actively collaborate with and operate alongside our partners, suppliers, and vendors. We’re continuing to transform our integration services landscape with Azure Integration Services to keep pace with the rapidly changing modern business environment.

Related links

The post Modernizing enterprise integration services at Microsoft with Microsoft Azure appeared first on Inside Track Blog.

]]>
9398
Five key learnings from Microsoft’s Windows 11 upgrade http://approjects.co.za/?big=insidetrack/blog/five-key-learnings-from-microsofts-windows-11-upgrade/ Tue, 05 Apr 2022 21:15:09 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9638 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=1d4z5N5XCsA. Watch as Biswa Jaysingh, a […]

The post Five key learnings from Microsoft’s Windows 11 upgrade appeared first on Inside Track Blog.

]]>
We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.
For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=1d4z5N5XCsA.

Watch as Biswa Jaysingh, a principal group program manager on the Microsoft Digital Employee Experience team, shares five key learnings from releasing Windows 11 across Microsoft. Jaysingh shares how understanding your hardware environment plays a critical role in a good upgrade and explains how existing tools provided by Microsoft make it easy to prepare for a new release.

Microsoft Digital video

The most important lesson learned during this deployment, according to Jaysingh, was the use of Windows Update for Business deployment service. Jaysingh shares how the tool helped create the “smoothest deployment” in the history of Windows releases by supplying a single, simple plan for IT admins to follow.

“In our experience, with every passing day, we are noticing the value it delivers in their day-to-day lives—working remotely or working in-person,” Jaysingh says.

Try it out

Learn about the many advantages of upgrading to Windows 11.

We'd like to hear from you!

The post Five key learnings from Microsoft’s Windows 11 upgrade appeared first on Inside Track Blog.

]]>
9638
Employees are at the heart of Microsoft’s internal Windows 11 upgrade http://approjects.co.za/?big=insidetrack/blog/employees-are-at-the-heart-of-microsofts-internal-windows-11-upgrade/ Tue, 05 Apr 2022 15:28:09 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9623 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=40B99JJpaUo. Wangui McKelvey and Nathalie D’Hers […]

The post Employees are at the heart of Microsoft’s internal Windows 11 upgrade appeared first on Inside Track Blog.

]]>
We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.
For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=40B99JJpaUo.

Wangui McKelvey and Nathalie D’Hers speak about Microsoft’s internal Windows 11 upgrade. McKelvey is the general manager of Microsoft 365 and D’Hers is Microsoft’s corporate vice president of Microsoft Digital Employee Experience.

Microsoft Digital video

Watch the video to hear Wangui McKelvey and Nathalie D’Hers discuss how Windows 11 is helping Microsoft employees embrace the new hybrid workplace. McKelvey is the general manager of Microsoft 365 and D’Hers is the corporate vice president of the Microsoft Digital Employee Experience team.

“During the pandemic, we’ve seen just how important Windows 11 has become,” McKelvey says. “From remote onboarding to virtual meetings, emails, and casual coffee chats, Windows 11 has become the secure platform that’s foundational to our hybrid workplace strategy.”

Employees are the backbone of any organization, so getting the employee experience right is a foundational tenet of our company’s success.

“Whether it’s day one or year 20, every employee needs the right experience to be successful in their role,” D’Hers says. “To have success in the hybrid workplace requires strong alignment between your digital experiences, physical spaces, and organizational culture.”

Try it out

Learn about the many advantages of upgrading to Windows 11.

We'd like to hear from you!

Want more information? Email us and include a link to this story and we’ll get back to you.

The post Employees are at the heart of Microsoft’s internal Windows 11 upgrade appeared first on Inside Track Blog.

]]>
9623
Unpacking Microsoft’s speedy upgrade to Windows 11 http://approjects.co.za/?big=insidetrack/blog/unpacking-microsofts-speedy-upgrade-to-windows-11/ Tue, 05 Apr 2022 12:24:19 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=9193 We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time. Like our customers, we at Microsoft have a strong business need to address the new […]

The post Unpacking Microsoft’s speedy upgrade to Windows 11 appeared first on Inside Track Blog.

]]>
We periodically update our stories, but we can’t verify that they represent the full picture of our current situation at Microsoft. We leave them on the site so you can see what our thinking and experience was at the time.

Microsoft Digital technical stories

Like our customers, we at Microsoft have a strong business need to address the new challenges created by remote and hybrid work. The internal adoption of Windows 11 is helping our company meet those needs, while enabling our employees to work smarter and more securely, regardless of where they are.

Upgrading to Windows 11 at Microsoft

Our priority in rolling out Windows 11 internally was to provide employees uninterrupted access to a safe and productive workspace while giving them a chance to try out the new operating system.

Introducing a new operating system, especially across a distributed workforce, naturally led to questions about device downtime and app compatibility. However, with established practices and evolved solutions in hand, those historical obstacles stayed in the past. The rollout of Windows 11 at Microsoft was our most streamlined to date, delivering the latest operating system to employees in record time and with almost no friction.

What made the deployment of Windows 11 a success?

Over the past decade, our Microsoft Digital Employee Experience team, the organization that powers, protects, and transforms employee experiences, has worked closely with teams such as the Windows product group to improve how we run updates, upgrades, and deployments across Microsoft.

Whereas significant time and resources were once dedicated to testing app compatibility, building out multiple disk images, and managing a complex delivery method, processes and tools introduced during Windows 10 have streamlined upgrades and enabled the transformation to a frictionless experience.

Data from App Assure, a Microsoft service available to all customers with eligible subscriptions, showed that 99.7 percent of our apps were compatible with Windows 11, which eliminated the need for extensive testing. It also meant that employees’ Windows 10 apps would work seamlessly in Windows 11. Additionally, Microsoft Endpoint Manager and Windows Update for Business eliminated the need for more than one disk image and made it easier for employees to get Windows 11.

Our Microsoft Digital Employee Experience team relied on the same familiar tools and processes used for a Windows 10 feature update to quickly deliver the upgrade to employees.

The upgrade was divided into three parts:

Plan: Identify an execution and communication plan, then develop a timeline.

Prepare: Establish reporting systems, run tests, ready employees, and build back-end services.

Deploy: Deploy Windows 11 to eligible devices.

It all starts with a good plan

We at Microsoft Digital Employee Experience have a successful history of deploying new services, apps, and operating systems to employees. And it all starts at the same place—creating a disruption-free strategy that enables employees to embrace the latest technology as soon as possible without sacrificing productivity.

Assess the environment

Before the deployment of Windows 11 could begin, we had to take a careful inventory of all devices at Microsoft and determine which ones to target. Windows 11 has specific hardware requirements, and because some employees were running ineligible devices, not every device would be upgraded. Employees with these devices will upgrade to Windows 11 during their next device refresh.

To evaluate the device population, we used Windows Update for Business reports and Microsoft Endpoint Manager’s Endpoint analytics feature. This allowed our team to generate reports on devices that either met or failed to meet the minimum specifications. For example, certain devices, especially older desktops, lacked the Trusted Platform Module (TPM) 2.0 chip that Windows 11 requires for security.
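
For illustration only, here is a minimal sketch of the kind of eligibility check this reporting drives. The inventory record format is a hypothetical one we made up for this example; the thresholds are the published Windows 11 minimums (a 64-bit CPU with two or more cores at 1 GHz or faster, 4 GB of RAM, 64 GB of storage, UEFI with Secure Boot, and TPM 2.0). Real eligibility also depends on the supported CPU list, which tools like the PC Health Check app evaluate, and in practice we relied on Windows Update for Business reports and Endpoint analytics rather than a script like this.

```python
from dataclasses import dataclass

# Published Windows 11 minimum hardware requirements.
MIN_RAM_GB = 4
MIN_STORAGE_GB = 64
MIN_CPU_CORES = 2
MIN_CPU_GHZ = 1.0
MIN_TPM_VERSION = 2.0

@dataclass
class Device:
    # Hypothetical inventory record; real data came from Windows Update
    # for Business reports and Endpoint analytics, not this structure.
    name: str
    cpu_cores: int
    cpu_ghz: float
    cpu_64bit: bool
    ram_gb: int
    storage_gb: int
    uefi_secure_boot: bool
    tpm_version: float

def check_eligibility(d: Device) -> list[str]:
    """Return the list of Windows 11 requirements the device fails."""
    failures = []
    if not (d.cpu_64bit and d.cpu_cores >= MIN_CPU_CORES and d.cpu_ghz >= MIN_CPU_GHZ):
        failures.append("cpu")
    if d.ram_gb < MIN_RAM_GB:
        failures.append("ram")
    if d.storage_gb < MIN_STORAGE_GB:
        failures.append("storage")
    if not d.uefi_secure_boot:
        failures.append("uefi/secure boot")
    if d.tpm_version < MIN_TPM_VERSION:
        failures.append("tpm 2.0")
    return failures

inventory = [
    Device("desktop-042", 4, 3.2, True, 16, 512, True, 1.2),   # older TPM 1.2 chip
    Device("laptop-117", 8, 2.8, True, 32, 1024, True, 2.0),
]

for device in inventory:
    failed = check_eligibility(device)
    print(device.name, "eligible" if not failed else f"ineligible: {', '.join(failed)}")
```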

In the end, 190,000 devices were deemed eligible based on hardware and role requirements. Over the course of five weeks, our Microsoft Digital Employee Experience team deployed Windows 11 to 99 percent of qualifying devices.

Address ineligible devices and exclusions

After evaluating the broad population of devices, our team developed a plan for devices that would not receive a Windows 11 upgrade. Since Windows 10 and Windows 11 can be seamlessly managed side-by-side within the same management system, we only had to designate the devices that would not receive the upgrade. Using Windows Update for Business reports to inform deployment policies, we applied controls on ineligible devices, automatically skipping them during deployment. These measures not only made it easy to know why a device didn’t upgrade, but also assured a disruption-free experience for both employees and those on our team responsible for managing the upgrade.

These controls also allowed the company to bypass deployment on any device that had been incorrectly targeted for an upgrade.

Ineligible devices. Windows 10 and Windows 11 can be managed side-by-side and will be supported concurrently at Microsoft until all devices are upgraded or retired. As devices are refreshed, more and more of our employees will gain access to Windows 11.

Devices that should not receive the upgrade. Other devices, like servers and test labs—where we validate new products on previous operating systems—were issued controls and excluded from receiving Windows 11.

Establish a deployment timeline

Once upgradeable devices were identified, our team was able to create a clear timeline. From this schedule, our communications team developed an outreach plan, support teams readied the helpdesk, and the deployment team developed critical reporting mechanisms to track progress.

For the deployment itself, our team used a ring-based approach to segment the deployment into several waves. This allowed us to gradually release Windows 11 across the company, reducing the risk of disruption.
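
To make the ring-based approach concrete, the sketch below filters out excluded devices and splits the remainder into progressively larger waves with staggered start dates. The device names, exclusion list, ring sizes, and dates are all illustrative assumptions; our actual waves were defined in Windows Update for Business deployment service, not in a script.

```python
from datetime import date, timedelta

# Hypothetical inputs; real targeting data came from Windows Update for
# Business reports and the opt-out/exclusion process described later.
eligible_devices = [f"device-{n:03d}" for n in range(1, 101)]
excluded_devices = {"device-007", "device-033"}  # servers, test labs, approved opt-outs

# Ring sizes as a fraction of the targeted population: a small pilot first,
# then broader waves. The fractions and cadence here are assumptions.
ring_fractions = [0.01, 0.09, 0.30, 0.60]
days_between_rings = 7
start_date = date(2021, 10, 18)  # assumed kickoff date for illustration

targeted = [d for d in eligible_devices if d not in excluded_devices]

rings, cursor = [], 0
for i, fraction in enumerate(ring_fractions):
    size = round(len(targeted) * fraction)
    if i == len(ring_fractions) - 1:      # the last ring takes whatever remains
        size = len(targeted) - cursor
    rings.append({
        "ring": i + 1,
        "starts": start_date + timedelta(days=i * days_between_rings),
        "devices": targeted[cursor:cursor + size],
    })
    cursor += size

for ring in rings:
    print(f"Ring {ring['ring']}: {len(ring['devices'])} devices, offers begin {ring['starts']}")
```

Grouping the population this way is what lets a pilot ring surface problems while the blast radius is still small, before the later, larger waves go out.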

Graphic showing Microsoft's internal Windows 11 upgrade milestones on a timeline.
Microsoft’s internal upgrade to Windows 11 hinged on effective end-to-end communication.

Create a rollback plan

Windows 11 has built-in support for rolling back to Windows 10 within a default window of 10 days after installation. If needed, our Microsoft Digital Employee Experience team could have revised this period via Group Policy or a script deployed through Microsoft Intune. Post-upgrade, there wasn’t much demand for rollbacks, but the strategic release cadence the team used, paired with the rollback capability, gave us an easy way to quickly revert any device that needed to go back to Windows 10 for a business reason.

Preparing for success

Prior to starting the Windows 11 upgrade, we asked employees to complete pre-work needed for a successful upgrade. Because the upgrade was so smooth, only light readiness communications were needed. Instead, we focused on ensuring that employees were aware and excited about the benefits of Windows 11 and that they were ready to share their feedback on what it was like to use it.

Reach everyone

To maximize the impact of our communications, our team readied content that was digestible for every employee, regardless of role, in an onboarding kit. Employees needed clear and concise messaging that would resonate, so that they could understand what Windows 11 would mean for them.

Our team in Microsoft Digital Employee Experience targeted a variety of established channels, including Yammer, FAQs on Microsoft SharePoint, email, Microsoft Teams, Microsoft’s internal homepage, and digital signage to promote Windows 11.

To generate interest, our materials focused on:

  • The new look and features of Windows 11, designed for hybrid work and built on Zero Trust
  • Flexible and easy upgrade options, including the ability to schedule upgrades at a time that worked best for the employee
  • The speed at which employees could be up and running with Windows 11—as quickly as 20 minutes
  • New terms related to Windows 11 and where employees could go to learn more

An entire page on our company’s internal helpdesk site was dedicated to links related to the upgrade, including Microsoft Learn, where users could find a comprehensive library on new features.

Executive announcements from company leadership also conveyed the benefit of moving to Windows 11 and the ease with which it could be done.

Set expectations

Our team directed employees who wanted to see whether their device met Windows 11’s hardware requirements to the PC Health Check app. At an enterprise level, the team relied on Windows Update for Business reports to assess the device population.

We also used this opportunity to reinforce messaging to Windows 10 users—both operating systems would continue to operate side-by-side until all devices were refreshed. This helped ease concerns for employees who had to wait for an upgrade.

Ready support

Getting the deployment right wasn’t just about sending messages outward. Our team needed to receive and respond to employee questions before, during, and after the Windows 11 rollout.

Our support teams were given an opportunity to delve into Windows 11 prior to the deployment, which, based on experience from previous upgrades, gave them time to categorize potential issues and group them by severity. This familiarity not only helped them give employees informed answers, but also served as another feedback-gathering mechanism.

Open for feedback

We run Microsoft on Microsoft technology and we encourage our employees to join the Windows Insider Program, where users are free to provide feedback directly to developers and product teams.

That’s why communications didn’t just focus on what was new with Windows 11, but on how feedback could be shared. If employees had comments, they submitted them through the Feedback Hub app, where other employees could upvote them, giving visibility to our engineers in Microsoft Digital Employee Experience and the Windows product group.

Pre-work for deployment readiness

In addition to readying employees, we had to make sure all the back-end services were in place prior to the deployment. This included building several processes, setting up analytics, and testing.

Establish analytics reports

Evolving beyond previous upgrades, the deployment of Windows 11 was the most data-driven release we have ever done. Looking more closely at diagnostic data and building better adoption reporting gave our team clear visibility throughout the deployment.

Using Microsoft Power BI, our team could share insights regarding the company’s environment. This better prepared everyone on the team and allowed us to monitor progress during deployment.

Our team captured the following metrics:

  • Device population
  • Devices by country
  • Devices by region
  • Eligibility
  • Adoption

In addition to visibility into project status, access to this data empowered our team to engage employees whose eligible devices did not receive the upgrade.
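
As a simple illustration of the kind of roll-up that fed those reports, the sketch below counts devices by country and region and tallies eligibility and adoption over a hypothetical inventory export (the record format is an assumption). Our real reporting was built on diagnostic data and surfaced through Microsoft Power BI rather than a script like this.

```python
from collections import Counter

# Hypothetical flattened inventory/telemetry export.
devices = [
    {"name": "laptop-117", "country": "US", "region": "Americas", "eligible": True,  "os": "Windows 11"},
    {"name": "desktop-042", "country": "DE", "region": "EMEA",     "eligible": False, "os": "Windows 10"},
    {"name": "laptop-204", "country": "IN", "region": "Asia",      "eligible": True,  "os": "Windows 10"},
]

population = len(devices)
by_country = Counter(d["country"] for d in devices)
by_region = Counter(d["region"] for d in devices)
eligible = sum(d["eligible"] for d in devices)
adopted = sum(d["os"] == "Windows 11" for d in devices)

print(f"Population: {population}")
print(f"By country: {dict(by_country)}")
print(f"By region:  {dict(by_region)}")
print(f"Eligible:   {eligible} ({eligible / population:.0%})")
print(f"Adopted:    {adopted} ({adopted / population:.0%})")

# Eligible devices still on Windows 10 become the follow-up list used to
# engage employees whose eligible devices did not receive the upgrade.
follow_up = [d["name"] for d in devices if d["eligible"] and d["os"] != "Windows 11"]
print(f"Follow up with: {follow_up}")
```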

Build an opt-out process

To accommodate users whose eligible devices might need to be excluded from the deployment, our team created a robust workback plan that included a request and approval process, a tracking system, and a set timeline for how long devices would be excluded from the upgrade.

Our Microsoft Digital Employee Experience team released communications specifying the timeframe for employees to opt out, including process steps. Employees who needed to remove their devices from the upgrade submitted their alias, machine name, and reason for exclusion. From there, our team evaluated their requests. Only users with a business reason were allowed to opt out. For example, Internet Explorer 11 requires Windows 10, so employees who need that browser for testing purposes were allowed to remove their devices from the deployment.

Once we had approved devices for exclusion, a block was put in place to remove them from the deployment. Data gathered during the opt-out process enabled us to follow up with these employees, upgrading them to Windows 11 at a more appropriate time.
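
To picture the shape of that opt-out workflow, here is a small sketch. The request fields mirror what employees submitted (alias, machine name, reason), but the list of accepted business reasons and the length of the exclusion window are illustrative assumptions, not our actual policy, and the real approvals were judgment calls made by the team.

```python
from datetime import date, timedelta

# Illustrative reasons that would count as a business justification.
APPROVED_REASONS = {"internet explorer 11 testing", "downlevel os validation"}
EXCLUSION_DAYS = 90  # assumed length of the exclusion window

def evaluate_opt_out(request: dict, today: date) -> dict:
    """Approve or reject an opt-out request and, if approved, set its expiry."""
    approved = request["reason"].lower() in APPROVED_REASONS
    return {
        "alias": request["alias"],
        "machine": request["machine"],
        "approved": approved,
        # Approved devices get a temporary block from the deployment and a
        # follow-up date so they can be upgraded at a more appropriate time.
        "excluded_until": today + timedelta(days=EXCLUSION_DAYS) if approved else None,
    }

requests_submitted = [
    {"alias": "chantelle", "machine": "desktop-042", "reason": "Internet Explorer 11 testing"},
    {"alias": "miguel",    "machine": "laptop-117",  "reason": "Prefer the old Start menu"},
]

for r in requests_submitted:
    print(evaluate_opt_out(r, date.today()))
```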

Create a security model

At Microsoft, security is always top of mind for us. A careful risk assessment, including testing out a series of threat scenarios, was performed before Windows 11 was deployed across the company.

Our Microsoft Digital Employee Experience team built several specific Windows 11 security policies in a test environment and benchmarked them against policies built for Windows 10.

After testing the policies and scenarios to see if they would have any impact on employees, we found that devices with Windows 11 would meet Microsoft’s rigorous security thresholds without creating any disruptions. Just as importantly, users would experience the same behaviors in Windows 11 as they might expect from Windows 10.
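
One way to picture that benchmarking step is to diff the Windows 11 test policies against the Windows 10 baseline and flag anything that drifted. The sketch below does exactly that over two hypothetical policy dictionaries; the setting names and values are made up for illustration, and our real comparison covered full policy sets in a test environment.

```python
# Hypothetical policy baselines; setting names and values are illustrative only.
windows10_baseline = {
    "BitLockerRequired": True,
    "FirewallEnabled": True,
    "MinimumPasswordLength": 14,
    "SmartScreenEnabled": True,
}
windows11_test = {
    "BitLockerRequired": True,
    "FirewallEnabled": True,
    "MinimumPasswordLength": 14,
    "SmartScreenEnabled": True,
    "VirtualizationBasedSecurity": True,  # example of a setting added for Windows 11
}

only_in_11 = {k: v for k, v in windows11_test.items() if k not in windows10_baseline}
only_in_10 = {k: v for k, v in windows10_baseline.items() if k not in windows11_test}
changed = {k: (windows10_baseline[k], windows11_test[k])
           for k in windows10_baseline.keys() & windows11_test.keys()
           if windows10_baseline[k] != windows11_test[k]}

print("New in Windows 11 policies:", only_in_11)
print("Missing from Windows 11 policies:", only_in_10)
print("Changed values:", changed)
```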

The deployment

A decade ago, our efforts to deploy feature updates could be challenging, as we needed to account for different builds, languages, policies, and more. This required careful management of distribution points and VPNs prior to beginning deployment efforts in earnest.

When Windows 10 was released in 2015, our team used two deployment strategies: one for on-premises managed devices and one for cloud-managed devices.

Today, the situation is much simpler.

Launched during the Windows 10 era, Windows Update for Business established some of the trusted practices that make product releases and feature updates a great experience for us here at Microsoft. Windows Update for Business deployment service introduces new efficiencies for our team, consolidating two deployment strategies into one.

For the deployment of Windows 11, our team had an advantage—Windows Update for Business deployment service.

Windows Update for Business deployment service enabled our Microsoft Digital Employee Experience team to grab device IDs from across the environment and use them to automate the deployment. Windows Update for Business deployment service handled all the back-end processing and scheduling for us; all we needed to do was determine the start and end dates.
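
For readers who want to see roughly what driving the service programmatically looks like, here is a heavily simplified sketch using the Microsoft Graph windowsUpdates beta API. The endpoint paths, payload shapes, version string, and placeholder device IDs are assumptions rather than a record of what our team ran; treat it as a starting point and confirm the exact schema, permissions, and rollout settings in the current Graph documentation.

```python
import requests  # assumes an Azure AD app with permission to call the windowsUpdates API

# Assumed base path for the Windows Update for Business deployment service in Graph (beta).
GRAPH = "https://graph.microsoft.com/beta/admin/windows/updates"
headers = {"Authorization": "Bearer <token>", "Content-Type": "application/json"}

# 1. Enroll the Azure AD device IDs gathered from the environment as updatable assets.
device_ids = ["<azure-ad-device-id-1>", "<azure-ad-device-id-2>"]
requests.post(
    f"{GRAPH}/updatableAssets/enrollAssets",
    headers=headers,
    json={
        "updateCategory": "feature",
        "assets": [
            {"@odata.type": "#microsoft.graph.windowsUpdates.azureADDevice", "id": d}
            for d in device_ids
        ],
    },
)

# 2. Create a feature update deployment; the service handles the back-end
#    processing and scheduling once a start date (and rollout settings) are set.
#    The settings object is simplified here; see the Graph docs for the full schema.
requests.post(
    f"{GRAPH}/deployments",
    headers=headers,
    json={
        "content": {
            "@odata.type": "#microsoft.graph.windowsUpdates.featureUpdateReference",
            "version": "Windows 11",  # placeholder; use the version string the API expects
        },
        "settings": {"schedule": {"startDateTime": "2021-10-18T08:00:00Z"}},
    },
)
```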

Our team easily managed exclusions and opt-outs with Windows Update for Business deployment service, and when a device needed to go back to Windows 10, the service made it easier to remove it from the deployment and roll it back.

Importantly, Windows Update for Business deployment service provides a single deployment strategy for us moving forward. Deployment has been simplified, and the data loaded into Windows Update for Business deployment service for this upgrade will help speed up future releases.

Policies for success

We had to decide which policies we wanted to work with for the best outcome. This included how many alerts an employee would receive before their device was upgraded to Windows 11.

Windows Update for Business deployment service reduced the long list of policies that our team needed to manage during deployment. This accelerated deployment without compromising security.

From pilot to global deployment

By structuring the deployment timeline to hit a small group of employees before incrementally moving on to a larger population, our Microsoft Digital Employee Experience team ensured Windows Update for Business deployment service ran as expected and that all required controls and permissions were set.

As our team used the Windows Update for Business deployment service to plot out upgrade waves, Windows 11 downloaded in the background and employees received pop-up alerts when their device was ready. An employee could restart at any time and would boot into Windows 11 after a few automated steps completed the installation. Employees could also schedule Windows 11 to upgrade overnight or over the weekend.

Onboarding OEMs

Working closely with Microsoft Surface and other Original Equipment Manufacturer (OEM) partners, the companies that supply Microsoft with new devices, our team was able to ensure that employees received Windows 11 pre-loaded on their new PCs. This approach guaranteed that new devices complied with the hardware requirements of the new operating system.

A new device, straight out of the box, only needs to be powered on and connected to the internet before Windows Autopilot authenticates and configures everything for the user. Once initial setup is complete, Windows Autopilot ensures that new devices are equipped with Windows 11 and all the correct policies and settings.

For a transcript, please view the video on YouTube: https://www.youtube.com/watch?v=1d4z5N5XCsA.

Biswa Jaysingh shares five key learnings from releasing Windows 11 across Microsoft. Jaysingh is a principal group program manager on the Microsoft Digital Employee Experience team.

Entering the next stage of Windows at Microsoft

The deployment of Windows 11 at Microsoft validates our team’s approach to product releases and upgrades. With no measured uptick in support tickets, the deployment of Windows 11 has been a frictionless experience for employees, and the wide adoption of new features confirms the value of the effort. The speed at which the team completed the deployment—190,000 devices in five weeks—represents the fastest deployment of a new operating system in company history.

We credit the success of this deployment to good planning, tools, strong communication, and the positive upgrade experience Windows 11 provides.

Windows Update for Business deployment service proved to be a big step in the evolution of how employees get the latest version of Windows. The service’s ease of use meant the team had a higher degree of control, flexibility, and confidence.

The tighter hardware-to-software ecosystem that comes with Windows 11 means our employees and all users of the operating system benefit from richer experiences. This, along with integration with Microsoft Teams, is just one example of what users are seeing now that they’re empowered by Windows 11.

Key Takeaways
  • Understand the hardware eligibility requirements for Windows 11.
  • The better you understand your environment, the easier it will be to create a timeline and a communication plan, and ultimately to track the deployment.
  • Messaging shared by leaders in the organization is key, especially for driving adoption.
  • Run a pilot with a handful of devices before deploying company wide. This will allow you to check policies for consistent experiences. Then move on to a ring-based deployment to carefully manage everything.
  • There’s no need to create multiple deployment plans with Windows Update for Business deployment service; it can automate the experience, streamlining the entire workflow. Instead of waiting until everyone is ready, consider running Windows 10 and Windows 11 side-by-side. Prepare today by deploying to those who are ready now.
Try it out

Learn about the many advantages of upgrading to Windows 11.

We'd like to hear from you!

Want more information? Email us and include a link to this story and we’ll get back to you.

The post Unpacking Microsoft’s speedy upgrade to Windows 11 appeared first on Inside Track Blog.

]]>
9193