Azure Archives - Inside Track Blog

Enhancing VPN performance at Microsoft

[Editor’s note: This content was written to highlight a particular event or moment in time. Although that moment has passed, we’re republishing it here so you can see what our thinking and experience was like at the time.]

Modern workers are increasingly mobile and require the flexibility to get work done outside of the office. Here at Microsoft headquarters in the Puget Sound area of Washington State, an average of 45,000 to 55,000 Microsoft employees use a virtual private network (VPN) connection every weekday to remotely connect to the corporate network. As part of our overall Zero Trust strategy, we have redesigned our VPN infrastructure, which has simplified our design and let us consolidate our access points. This has enabled us to increase capacity and reliability, while also reducing reliance on VPN by moving services and applications to the cloud.

Providing a seamless remote access experience

Remote access at Microsoft relies on the VPN client, our VPN infrastructure, and public cloud services. We have had several iterative designs of the VPN service inside Microsoft. Regional weather events in the past required large increases in employees working from home, heavily taxing the VPN infrastructure and requiring a completely new design. Three years ago, we built an entirely new VPN infrastructure, a hybrid design that uses Microsoft Azure for load balancing, Microsoft Azure Active Directory (Azure AD) identity services, and gateway appliances across our global sites.

Key to our success in the remote access experience was our decision to deploy a split-tunneled configuration for the majority of employees. We have migrated nearly 100 percent of previously on-premises resources into Microsoft Azure and Microsoft Office 365. Our continued efforts in application modernization are reducing the traffic on our private corporate networks as cloud-native architectures allow direct internet connections. The shift to internet-accessible applications and a split-tunneled VPN design has dramatically reduced the load on VPN servers in most areas of the world.

Using VPN profiles to improve the user experience

We use Microsoft Endpoint Manager to manage our domain-joined and Microsoft Azure AD–joined computers and mobile devices that have enrolled in the service. In our configuration, VPN profiles are replicated through Microsoft Intune and applied to enrolled devices; for Windows 10 devices, these profiles include certificates that we issue through Configuration Manager. We support Mac and Linux device VPN connectivity with a third-party client using SAML-based authentication.

We use certificate-based authentication (public key infrastructure, or PKI) and multi‑factor authentication solutions. When employees first use the Auto-On VPN connection profile, they are prompted to authenticate strongly. Our VPN infrastructure supports Windows Hello for Business and Multi-Factor Authentication. It stores a cryptographically protected certificate upon successful authentication that allows for either persistent or automatic connection.

For more information about how we use Microsoft Intune and Endpoint Manager as part of our device management strategy, see Managing Windows 10 devices with Microsoft Intune.

Configuring and installing VPN connection profiles

We created VPN profiles that contain all the information a device requires to connect to the corporate network, including the supported authentication methods and the VPN gateways that the device should connect to. We created the connection profiles for domain-joined and Microsoft Intune–managed devices using Microsoft Endpoint Manager.

For more information about creating VPN profiles, see VPN profiles in Configuration Manager and How to Create VPN Profiles in Configuration Manager.

The Microsoft Intune custom profile for Intune-managed devices uses Open Mobile Alliance Uniform Resource Identifier (OMA-URI) settings with XML data type, as illustrated below.

Creating a Profile XML and editing the OMA-URI settings to create a connection profile in System Center Configuration Manager.
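
To make the shape of that custom profile concrete, here is a minimal sketch of the kind of OMA-URI payload involved, expressed as a Microsoft Graph device configuration created from Python. The profile name, server addresses, and tenant setup are hypothetical, and we created our actual profiles through the Endpoint Manager console rather than with a script like this; check the Microsoft Graph documentation for the exact payload shape before using it.

```python
# Illustrative sketch only: names and servers are placeholders, not our real configuration.
import requests
from azure.identity import DefaultAzureCredential

# Acquire a Graph token (the identity needs DeviceManagementConfiguration.ReadWrite.All).
token = DefaultAzureCredential().get_token("https://graph.microsoft.com/.default").token

# Minimal VPNv2 CSP ProfileXML with Always On and split tunneling (illustrative values).
profile_xml = """
<VPNProfile>
  <AlwaysOn>true</AlwaysOn>
  <TrustedNetworkDetection>corp.contoso.com</TrustedNetworkDetection>
  <NativeProfile>
    <Servers>vpn.contoso.com</Servers>
    <NativeProtocolType>Automatic</NativeProtocolType>
    <RoutingPolicyType>SplitTunnel</RoutingPolicyType>
    <Authentication>
      <MachineMethod>Certificate</MachineMethod>
    </Authentication>
  </NativeProfile>
</VPNProfile>
""".strip()

payload = {
    "@odata.type": "#microsoft.graph.windows10CustomConfiguration",
    "displayName": "Contoso VPN profile (OMA-URI)",
    "omaSettings": [{
        "@odata.type": "#microsoft.graph.omaSettingString",
        "displayName": "VPNv2 ProfileXML",
        # VPNv2 configuration service provider path; the node name becomes the profile name.
        "omaUri": "./Device/Vendor/MSFT/VPNv2/ContosoVPN/ProfileXML",
        "value": profile_xml,
    }],
}

resp = requests.post(
    "https://graph.microsoft.com/v1.0/deviceManagement/deviceConfigurations",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print("Created configuration:", resp.json().get("id"))
```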

Installing the VPN connection profile

The VPN connection profile is installed using a script on domain-joined computers running Windows 10, through a policy in Endpoint Manager.

For more information about how we use Microsoft Intune as part of our mobile device management strategy, see Mobile device management at Microsoft.

Conditional Access

We use an optional feature that checks device health and corporate policy compliance before allowing a device to connect. Conditional Access is supported with connection profiles, and we’ve started using this feature in our environment.

Rather than just relying on the managed device certificate for a “pass” or “fail” for VPN connection, Conditional Access places machines in a quarantined state while checking for the latest required security updates and antivirus definitions to help ensure that the system isn’t introducing risk. On every connection attempt, the system health check looks for a certificate confirming that the device is still compliant with corporate policy.

Certificate and device enrollment

We use an Azure AD certificate for single sign-on to the VPN connection profile. We currently use Simple Certificate Enrollment Protocol (SCEP) and Network Device Enrollment Service (NDES) to deploy certificates to our mobile devices via Microsoft Endpoint Manager. The SCEP certificate we deploy is used for both wireless and VPN authentication. NDES allows software running on routers and other network devices, which don’t have domain credentials, to obtain certificates based on SCEP.

NDES performs the following functions:

  1. It generates and provides one-time enrollment passwords to administrators.
  2. It submits enrollment requests to the certificate authority (CA).
  3. It retrieves enrolled certificates from the CA and forwards them to the network device.

For more information about deploying NDES, including best practices, see Securing and Hardening Network Device Enrollment Service for Microsoft Intune and System Center Configuration Manager.

VPN client connection flow

The diagram below illustrates the VPN client-side connection flow.

The client-side VPN connection flow, showing client components, Azure components, and site components.

When a device-compliance–enabled VPN connection profile is triggered (either manually or automatically), the following steps occur (a conceptual sketch follows this list):

  1. The VPN client calls into the Windows 10 Azure AD Token Broker on the local device and identifies itself as a VPN client.
  2. The Azure AD Token Broker authenticates to Azure AD and provides it with information about the device trying to connect. A device check is performed by Azure AD to determine whether the device complies with our VPN policies.
  3. If the device is compliant, Azure AD requests a short-lived certificate. If the device isn’t compliant, we perform remediation steps.
  4. Azure AD pushes down a short-lived certificate to the Certificate Store via the Token Broker. The Token Broker then returns control back over to the VPN client for further connection processing.
  5. The VPN client uses the Azure AD–issued certificate to authenticate with the VPN gateway.
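
A conceptual model of that flow is sketched below. This is not the actual Windows VPN client implementation; all type and function names are hypothetical and the compliance and certificate calls are placeholders.

```python
# Conceptual sketch of the device-compliance VPN connection flow described above.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ShortLivedCertificate:
    subject: str
    expires: datetime

def check_device_compliance(device_id: str) -> bool:
    # Placeholder for the Azure AD / Intune compliance evaluation (step 2).
    return True

def issue_short_lived_certificate(device_id: str) -> ShortLivedCertificate:
    # Placeholder for the short-lived certificate Azure AD pushes to the
    # certificate store via the Token Broker (steps 3 and 4).
    return ShortLivedCertificate(
        subject=f"CN={device_id}",
        expires=datetime.utcnow() + timedelta(hours=1),
    )

def connect_vpn(device_id: str) -> str:
    # Step 1: the VPN client identifies itself to the local Token Broker.
    if not check_device_compliance(device_id):
        return "blocked: device not compliant; remediation required"
    cert = issue_short_lived_certificate(device_id)
    # Step 5: authenticate to the VPN gateway with the Azure AD-issued certificate.
    return f"connected using {cert.subject} (valid until {cert.expires:%H:%M} UTC)"

print(connect_vpn("LAPTOP-12345"))
```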

Remote access infrastructure

At Microsoft, we have designed and deployed a hybrid infrastructure to provide remote access for all the supported operating systems—using Azure for load balancing and identity services and specialized VPN appliances. We had several considerations when designing the platform:

  • Redundancy. The service needed to be highly resilient so that it could continue to operate if a single appliance, site, or even large region failed.
  • Capacity. As a worldwide service meant to be used by the entire company and to handle the expected growth of VPN, the solution had to be sized with enough capacity to handle 200,000 concurrent VPN sessions.
  • Homogenized site configuration. A standard hardware and configuration stamp was a necessity both for initial deployment and operational simplicity.
  • Central management and monitoring. We ensured end-to-end visibility through centralized data stores and reporting.
  • Azure AD–based authentication. We moved away from on-premises Active Directory and used Azure AD to authenticate and authorize users.
  • Multi-device support. We had to build a service that could be used by as much of the ecosystem as possible, including Windows, OSX, Linux, and appliances.
  • Automation. Being able to programmatically administer the service was critical. It needed to work with existing automation and monitoring tools.

When we were designing the VPN topology, we considered the location of the resources that employees were accessing when they were connected to the corporate network. If most of the connections from employees at a remote site were to resources located in central datacenters, more consideration was given to bandwidth availability and connection health between that remote site and the destination. In some cases, additional network bandwidth infrastructure has been deployed as needed. The illustration below provides an overview of our remote access infrastructure.

Microsoft remote access infrastructure: connections flow from the internet to Azure Traffic Manager profiles, and then to the VPN sites.
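
The diagram above shows Azure Traffic Manager profiles directing clients to VPN sites. As a rough sketch of that pattern, here is how a performance-routed Traffic Manager profile with external VPN endpoints might be created with the Azure SDK for Python; the subscription, resource group, and endpoint values are placeholders rather than our actual configuration.

```python
# Sketch: a Traffic Manager profile that sends VPN clients to the nearest healthy site.
from azure.identity import DefaultAzureCredential
from azure.mgmt.trafficmanager import TrafficManagerManagementClient

client = TrafficManagerManagementClient(DefaultAzureCredential(), "<subscription-id>")

profile = client.profiles.create_or_update(
    "vpn-rg",
    "contoso-vpn",
    {
        "location": "global",
        "traffic_routing_method": "Performance",  # route users to the closest endpoint
        "dns_config": {"relative_name": "contoso-vpn", "ttl": 60},
        "monitor_config": {"protocol": "HTTPS", "port": 443, "path": "/"},
        "endpoints": [
            {
                "name": "vpn-west-us",
                "type": "Microsoft.Network/trafficManagerProfiles/externalEndpoints",
                "target": "vpn-westus.contoso.com",
                "endpoint_location": "West US 2",
            },
            {
                "name": "vpn-europe",
                "type": "Microsoft.Network/trafficManagerProfiles/externalEndpoints",
                "target": "vpn-europe.contoso.com",
                "endpoint_location": "West Europe",
            },
        ],
    },
)
print("Traffic Manager FQDN:", profile.dns_config.fqdn)
```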

VPN tunnel types

Our VPN solution provides network transport over Secure Sockets Layer (SSL). The VPN appliances force Transport Layer Security (TLS) 1.2 for SSL session initiation, and the strongest possible cipher suite negotiated is used for the VPN tunnel encryption. We use several tunnel configurations depending on the locations of users and level of security needed.

Split tunneling

Split tunneling allows only the traffic destined for the Microsoft corporate network to be routed through the VPN tunnel; all internet traffic goes directly through the internet without traversing the VPN tunnel or infrastructure. Our migration to Office 365 and Azure has dramatically reduced the need for connections to the corporate network. We rely on the security controls of applications hosted in Azure and the services of Office 365 to help secure this traffic. For endpoint protection, we use Microsoft Defender Advanced Threat Protection on all clients. In our VPN connection profile, split tunneling is enabled by default and used by the majority of Microsoft employees. Learn more about Office 365 split tunnel configuration.
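
One practical piece of a split-tunnel design is keeping the exclusion routes current. The sketch below is an illustration rather than our internal tooling: it pulls the published Office 365 IP ranges in the Optimize category from the Office 365 IP Address and URL web service so they can be fed into the VPN profile's exclusion list.

```python
# Sketch: build split-tunnel exclusions from the Office 365 IP Address and URL web service.
import uuid
import requests

url = f"https://endpoints.office.com/endpoints/worldwide?clientrequestid={uuid.uuid4()}"
endpoint_sets = requests.get(url, timeout=30).json()

# Collect the IP ranges in the "Optimize" category (the latency-sensitive
# Office 365 traffic that benefits most from bypassing the VPN tunnel).
optimize_ranges = sorted(
    {ip for es in endpoint_sets if es.get("category") == "Optimize" for ip in es.get("ips", [])}
)

for cidr in optimize_ranges:
    print(cidr)  # feed these into the VPN profile's exclusion routes
```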

Full tunneling

Full tunneling routes and encrypts all traffic through the VPN. There are some countries and business requirements that make full tunneling necessary. This is accomplished by running a distinct VPN configuration on the same infrastructure as the rest of the VPN service. A separate VPN profile is pushed to the clients who require it, and this profile points to the full-tunnel gateways.

Full tunnel with high security

Our IT employees and some developers access company infrastructure or extremely sensitive data. These users are given Privileged Access Workstations, which are secured, limited, and connect to a separate highly controlled infrastructure.

Applying and enforcing policies

In Microsoft Digital, the Conditional Access administrator is responsible for defining the VPN Compliance Policy for domain-joined Windows 10 desktops, including enterprise laptops and tablets, within the Microsoft Azure Portal administrative experience. This policy is then published so that the enforcement of the applied policy can be managed through Microsoft Endpoint Manager. Microsoft Endpoint Manager provides policy enforcement, as well as certificate enrollment and deployment, on behalf of the client device.

For more information about policies, see VPN and Conditional Access.

Early adopters help validate new policies

With every new Windows 10 update, we rolled out a pre-release version to a group of about 15,000 early adopters a few months before its release. Early adopters validated the new credential functionality and used remote access connection scenarios to provide valuable feedback that we could take back to the product development team. Using early adopters helped validate and improve features and functionality, influenced how we prepared for the broader deployment across Microsoft, and helped us prepare support channels for the types of issues that employees might experience.

Measuring service health

We measure many aspects of the VPN service and report on the number of unique users that connect every month, the number of daily users, and the duration of connections. We have invested heavily in telemetry and automation throughout the Microsoft network environment. Telemetry allows for data-driven decisions in making infrastructure investments and identifying potential bandwidth issues ahead of saturation.

Using Power BI to customize operational insight dashboards

Our service health reporting is centralized using Power BI dashboards to display consolidated data views of VPN performance. Data is aggregated into an Azure SQL data warehouse from VPN appliance logging, network device telemetry, and anonymized device performance data. These dashboards, shown in the next two graphics below, are tailored for the teams using them.

Global VPN status dashboard: a world map showing the health of each VPN site.
Microsoft Power BI reporting dashboards, including peak internet usage, peak VPN bandwidth, and peak VPN concurrent sessions.
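
As an illustration of the kind of aggregation behind these dashboards, the sketch below computes peak concurrent VPN sessions per site and day from raw session logs; the file layout and column names are hypothetical.

```python
# Sketch: peak concurrent VPN sessions per site and day from per-session logs.
import pandas as pd

# Expected columns: site, connect_time, disconnect_time (one row per session).
sessions = pd.read_csv("vpn_sessions.csv", parse_dates=["connect_time", "disconnect_time"])

# Turn each session into +1 at connect and -1 at disconnect, then take a
# running sum per site to get concurrency over time.
events = pd.concat([
    sessions[["site", "connect_time"]].rename(columns={"connect_time": "time"}).assign(delta=1),
    sessions[["site", "disconnect_time"]].rename(columns={"disconnect_time": "time"}).assign(delta=-1),
])
events = events.sort_values("time")
events["concurrent"] = events.groupby("site")["delta"].cumsum()

# Daily peak per site, ready to load into the reporting data warehouse.
peak_per_day = (
    events.set_index("time")
    .groupby("site")["concurrent"]
    .resample("D")
    .max()
    .rename("peak_concurrent_sessions")
    .reset_index()
)
print(peak_per_day.head())
```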

Key Takeaways

With our optimizations in VPN connection profiles and improvements in the infrastructure, we have seen significant benefits:

  • Reduced VPN requirements. By moving to cloud-based services and applications and implementing split tunneling configurations, we have dramatically reduced our reliance on VPN connections for many users at Microsoft.
  • Auto-connection for improved user experience. VPN connection profiles that are automatically configured for connection and authentication types have improved mobile productivity. They also improve the user experience by providing employees the option to stay connected to VPN—without additional interaction after signing in.
  • Increased capacity and reliability. Reducing the quantity of VPN sites and investing in dedicated VPN hardware has increased our capacity and reliability, now supporting over 500,000 simultaneous connections.
  • Service health visibility. By aggregating data sources and building a single pane of glass in Microsoft Power BI, we have visibility into every aspect of the VPN experience.

Revamped Microsoft business intelligence platform boosts data handling and builds trust

[Editor’s note: This content was written to highlight a particular event or moment in time. Although that moment has passed, we’re republishing it here so you can see what our thinking and experience was like at the time.]

Imagine an important meeting where you spend most of your time discussing the accuracy of metrics and reports. Too often, that was the reality for many Microsoft teams before we launched Microsoft Sales Experience (MSX) Insights, a Microsoft business intelligence platform.

Now MSX Insights provides a single source of truth to more than 40,000 users, including salespeople, their managers, leaders, and multiple operations and finance teams across Microsoft. Based on Microsoft Fabric, a suite of Microsoft Azure technologies that includes OneLake, Data Factory, Synapse, Azure Analysis Services, and Power BI, MSX Insights is a project of the Microsoft Commerce and Ecosystem team, which powers, transforms, and protects our organization.

Michael Toomey, senior director of Business Operations and Programs for Microsoft Worldwide Sales Engineering, led the creation of the innovative internal Microsoft business intelligence platform known as MSX Insights. (Photo by Michael Toomey)

“Once we began using Azure and then Power BI, the technology limitations that had been holding back our data unification were eliminated,” says one of the project sponsors, Michael Toomey, senior director of business operations and programs for Microsoft Worldwide Sales Engineering. “Was it finally possible to get to a single view of our commercial business that everyone could understand?”

That’s exactly what happened when Microsoft Customer and Partner Solutions (MCAPS), Microsoft Finance and Data Experiences, and Customer Experience Data Engineering (CX Data) collaborated to create today’s comprehensive Microsoft business intelligence platform: MSX Insights.

[See how we automated our legacy revenue processing systems with optical character recognition (OCR) technology. Find out how we reinvented sales processing and financial reporting with Azure.]

Reports and metrics that didn’t add up

At Microsoft, it’s critical that decision makers have access to data they can trust across the sales pipeline, contracts, revenue, and consumption. But when they pulled reports from separate systems or used data updated at different times, the numbers and results often didn’t match.

“We used to have a lot of complaints,” says RJ Smith, principal group engineering manager with Microsoft Commercial Business. “I talked to people in Paris, Munich, Sydney, and they said that those reports wouldn’t load or that they didn’t show the right data.”

Praveen Vittalrao Ambekar, a principal group program manager for CX Data, also analyzed the MSX customer experience to find out what was driving the support volume.

We needed a 360-degree view of our customer to correctly evaluate key metrics. We wanted end-to-end visibility.

– Michael Toomey, senior director of business operations and programs, Microsoft Worldwide Sales Engineering

“Multiple data platforms powered these reports,” Ambekar says. “The insights weren’t aligned across sellers, managers, and leaders, and that was causing a lot of churn for the team. The groups were looking at the data from different angles.”

Those weren’t the only problems. The scope of the available reports didn’t fulfill the needs of senior executives.

“We needed a 360-degree view of our customer to correctly evaluate key metrics,” Toomey says. “We wanted end-to-end visibility.”

Although we empower our teams extensively to develop their own reporting platforms, we were seeing broad duplications of effort and cost. “It was a highly federated budget model,” Toomey says.

It was also risky. As people developed one-off solutions using copies of datasets, it became harder to secure the information and enforce compliance with standard data-handling practices.

“The more replication you have, the less likelihood that everyone’s compliant with the rules,” Ambekar says.

The insular systems also impacted engagement and satisfaction levels for our partners and customers.

“Close coordination across sales teams, partners, marketing, and operations is critical for our customers to get a connected experience,” Toomey says. “It’s impossible to achieve that if we have multiple datasets with mismatched data on opportunities, consumption, licenses, revenue, and other kinds of information.”

Multiple waves of data handling improvements

After we began using a standardized system, teams could migrate from competing products and use the same software regardless of department. The solution has stood the test of time.

“Power BI as a product is almost 12 years old and so is our platform,” Toomey says. “It has successfully adapted through the transitions of Microsoft’s core business model and the priorities of multiple engineering leaders—and our commercial business as a whole. We always need a central place to go and get insights.”

Then we launched Microsoft Azure cloud computing services, making it easier for users in different departments to access the same source of data.

“We started to take the approach of giving people what they needed based on their roles,” Toomey says.

That might be a seller who wants to see their scorecard broken down by account, a manager who needs an aggregate of the entire team’s pipeline, or a leader looking for patterns and trends over time. Microsoft Azure was a major enabler for this new direction, and Microsoft Power BI was the team’s choice of a front end for the evolving business intelligence platform.

“We had a quarterly business connection, an event where we bring all the executives together, area by area, segment by segment,” Toomey says. “Several of us got together and worked for six weeks to automate the data handling. We moved everything into Power BI and ran visuals there.”
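
Automation like that typically ends by refreshing the Power BI dataset behind the visuals. A minimal sketch of triggering a refresh through the Power BI REST API is below; the workspace and dataset IDs are placeholders and the token acquisition is an assumption, not part of the actual solution described here.

```python
# Sketch: queue a Power BI dataset refresh from an automation job.
import requests
from azure.identity import DefaultAzureCredential

# Scope for the Power BI REST API; the identity must have access to the workspace.
token = DefaultAzureCredential().get_token(
    "https://analysis.windows.net/powerbi/api/.default"
).token

workspace_id = "<workspace-guid>"  # placeholder
dataset_id = "<dataset-guid>"      # placeholder

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/datasets/{dataset_id}/refreshes",
    headers={"Authorization": f"Bearer {token}"},
    json={"notifyOption": "MailOnFailure"},
    timeout=30,
)
resp.raise_for_status()
print("Refresh queued, HTTP status:", resp.status_code)  # 202 when accepted
```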

That proof of concept was a success, so the next step was to make a cultural shift to get to an aligned environment. To that end, the team built a community around Microsoft Power BI practitioners in the field. This BI round table community gets together a few times a month to share best practices, what they’re doing locally, and what has the potential to scale up.

“We tried to connect with the people building the tools and explain that this was a better way for them to be successful,” Toomey says.

The team also focused on increasing the tool’s speed for users around the world.

“It’s not a problem anymore,” Smith says. “In fact, performance metrics have improved 50 percent. We spent a lot of time on performance to make sure the JavaScript implementation in the browser works well.”

In November 2020, representatives from the teams who were most involved in creating MSX Insights came together to address one remaining issue: getting the data right.

“It was a result of partnership and alignment between the three different teams—MCAPS, Finance and Data Experiences, and Microsoft Commerce and Ecosystems,” says Diego Ulloa, a data strategy lead with Microsoft Worldwide Enablement and Operations who works on MSX Insights. “Together we consolidated data, set business rules, and designed the architecture.”

Power BI as a self-service tool has enabled more consumption of the insights. You don’t need layers of people to pull it into Excel.

– Praveen Vittalrao Ambekar, principal group program manager, Microsoft Partner and Sales Experience Business Insights

“We had to make hard decisions,” Ambekar says. “We had to align to one or the other’s hierarchy.”

The team divided the rules and definitions up by functional area. In the first six months after the launch of MSX Insights, we eliminated 80 percent of the user complaints associated with data hygiene and report accuracy. We’re also improving its ease of use over time.

“Power BI as a self-service tool has enabled more consumption of the insights,” Ambekar says. “You don’t need layers of people to pull it into Excel.”

Collaboration continues to improve data quality and integrity

The group that built MSX Insights isn’t done yet. There’s a robust roadmap planned for the coming months and years.

More trends and customer reports are on the agenda. MSX Insights and the Partner Sales Experience are still evolving, and several other teams are now contributing to these platforms.

“We’ll continue to evolve the user experience,” Ambekar says. “We want the user interface to really match how and where people are working, embedding insights more directly into the experience.”

The collaboration across teams is helping avoid the duplication of efforts.

“Because we hold each other accountable when we go through and talk about the designs, we’re doing it once and doing it well,” Smith says. “It’s better by virtue of us working on it together.”

The path ahead for MSX Insights includes continuous rollouts of additional services and functionality using the latest capabilities in Microsoft Fabric and Power BI. We’re also looking at ways to integrate AI throughout the experience in an effort to enable faster decision making.

“By introducing these abilities to partners, we’re allowing more teams to create a comprehensive set of reports,” Ulloa says. “We’re taking our narrow vision and extending it to a One Microsoft model.”

Migrating from Microsoft Monitoring Agent to Azure Arc and Azure Update Manager at Microsoft

As organizations grow and transform their IT infrastructures, maintaining consistency in patch management across various environments and cloud architectures has become a priority here at Microsoft and at companies elsewhere.

A recent shift from Microsoft Monitoring Agent (MMA) to Microsoft Azure Arc and Microsoft Azure Update Manager (AUM) offers us and others a unified solution for both on-premises and cloud resources. This transition is improving our patch orchestration while offering our IT leaders more robust control of our diverse systems internally here in Microsoft Digital, the company’s IT organization.  

Moving to Azure Arc

Transitioning from Microsoft Monitoring Agent to Azure Arc ensures streamlined updates across diverse systems, say Cory Granata (left) and Humberto Arias. Granata is a senior site reliability engineer on the Microsoft Digital Security and Compliance team and Arias is a senior product manager in Microsoft Digital.

Shifting from MMA to AUM with Microsoft Azure Arc integration requires using Azure Arc as a bridge, enabling management of both on-premises and cloud-based resources from a single place.

Historically, the MMA allowed for “dual homing,” where IT teams could connect machines to multiple Microsoft Azure subscriptions with ease. This flexibility streamlined patch management and reporting across different environments.

This feature is particularly useful for us and other large organizations with multiple Azure environments, says Cory Granata, a senior site reliability engineer on the Microsoft Digital Security and Compliance team in Microsoft Digital. However, the newer Azure Arc-based AUM only allows machines to report into one subscription and resource group at a time.

This limitation required some coaching for teams accustomed to MMA’s dual-homing capabilities.

“It wasn’t really an issue or a challenge—just coaching and getting other teams in the mindset that this is how the product was developed,” Granata says.

Azure Arc’s streamlined approach offers an efficient path for IT teams like ours looking to centralize patch management, especially for diverse infrastructures that include cloud and on-premises assets.

Centralizing patch orchestration

One of the standout advantages of Azure Update Manager with Azure Arc is its ability to support patch orchestration across a wide range of environments.

“You have the ability to patch on-premises, off-premises, Azure IaaS, and other resources,” Granata says. “This flexibility extends beyond Azure to cover machines hosted on other platforms, and on-premises Hyper-V servers.”

For organizations with complex infrastructures like ours, this unified approach simplifies operations, reducing the need for multiple tools and platforms to handle updates. Whether managing physical servers in data centers, virtual machines across different cloud providers, or edge computing devices, Azure Arc ensures that patch management is consistent and reliable.
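
Patch operations can also be driven programmatically. As a rough sketch, the Azure SDK for Python exposes on-demand patch assessment and installation for Azure VMs, and Azure Arc-enabled servers expose equivalent operations, so one orchestration job can cover both. The resource names below are placeholders, and this is not our internal tooling.

```python
# Sketch: on-demand patch assessment and installation for an Azure VM.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

compute = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Assess which updates are missing.
assessment = compute.virtual_machines.begin_assess_patches("lab-rg", "lab-vm-01").result()
print("Critical and security patches missing:",
      assessment.critical_and_security_patch_count)

# Install critical and security updates inside a bounded maintenance window.
install = compute.virtual_machines.begin_install_patches(
    "lab-rg",
    "lab-vm-01",
    {
        "maximum_duration": "PT2H",
        "reboot_setting": "IfRequired",
        "windows_parameters": {"classifications_to_include": ["Critical", "Security"]},
    },
).result()
print("Installation status:", install.status)
```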

These changes have been very helpful internally here at Microsoft.

“The AUM is our one-stop solution for patching all these different inventories of devices, regardless of where they reside—on-premises, in the cloud, or in hybrid environments,” says Humberto Arias, a senior product manager in Microsoft Digital.

This multi-cloud and edge computing capability offers IT leaders here and elsewhere the flexibility to scale their patch management efforts without being tied to a specific platform.

Migration challenges

While the transition to Azure Arc and AUM has brought us significant benefits, there have been some challenges, particularly around managing expectations for dual-homing capabilities.

The key thing we had to work through was that Azure Arc could only connect to one Azure subscription and resource group at a time. This required additional training for us—we needed to shift our mindset and adopt new workflows. However, after our people understood this limitation, the migration process was smooth.

“Fortunately, it only phones into one subscription and one resource group,” Granata says. “So, wherever it phones in is where all of your patch orchestration logs and everything must go as well, and it can’t connect into another subscription. This centralized approach simplifies reporting and patch management, but it did require some initial adjustments for teams accustomed to multi-subscription environments.”

Through coaching and training, our teams were able to adapt, and the long-term benefits of a more streamlined system quickly became apparent.

Azure Arc and AUM benefits

Following our migration, our teams began to realize the true benefits of using Azure Arc and AUM for their patch orchestration needs.

“The neat thing about using AUM with patch management and patch orchestration is the centralized control it provides,” Granata says.

For IT teams managing both internal IT assets and lab environments, the ability to oversee patching across a diverse range of systems from one location was a game-changer.

Additionally, the new system provided enhanced reporting and visibility.

While MMA offered flexibility in terms of connecting to multiple subscriptions, Azure Arc’s centralized model makes it easier to manage logs, reports, and patch statuses from a single dashboard.

“We’ve really enjoyed the increased visibility and ease of use that this has given us,” Arias says. “This is particularly valuable for large organizations like ours with distributed environments, where maintaining visibility across multiple systems can be a challenge.”
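
One way to get that single-pane view programmatically is Azure Resource Graph, which can query Azure VMs and Arc-enabled servers together; Azure Update Manager also surfaces patch data through Resource Graph (check its documentation for the exact tables). The subscription ID below is a placeholder.

```python
# Sketch: one inventory query across Azure VMs and Arc-enabled servers.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

graph = ResourceGraphClient(DefaultAzureCredential())

query = QueryRequest(
    subscriptions=["<subscription-id>"],
    query="""
Resources
| where type in~ ('microsoft.compute/virtualmachines', 'microsoft.hybridcompute/machines')
| summarize machines = count() by type, location
| order by machines desc
""",
)

result = graph.resources(query)
for row in result.data:
    print(row)
```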

The integration with Azure Arc also extends the platform’s reach to non-Azure environments, including AWS and other cloud providers. This means that organizations running multi-cloud or hybrid cloud strategies can benefit from a unified patch management system, regardless of where their machines are hosted.

For IT leaders here and elsewhere, these improvements represent a significant step forward in our operational efficiency and security. By centralizing patch management under Azure Arc and AUM, we can ensure that our systems are up-to-date, secure, and compliant, without the need for multiple tools or platforms. We hope sharing our story helps you do the same at your company.

Key Takeaways

Here are some tips for getting started at your company:

  • Azure Arc allows for a centralized management approach, providing IT leaders with a comprehensive view of their infrastructure.
  • Azure Update Manager offers improved patch orchestration and update management, leveraging the latest Azure technologies.
  • While the transition to Azure Arc brings numerous benefits, it also necessitates adjustments, particularly for teams accustomed to dual homing with the Microsoft Monitoring Agent.
  • With some light coaching, teams can easily learn the new system’s capabilities and limitations.

Try it out

Discover more about Azure Arc from the Microsoft Azure product group, including About Azure Arc, Azure Arc for servers, and Azure’s Cloud Adoption Framework.

The post Migrating from Microsoft Monitoring Agent to Azure Arc and Azure Update Manager at Microsoft appeared first on Inside Track Blog.

]]>
16574
Microsoft uses a scream test to silence its unused servers

Do you have unused servers on your hands? Don’t be alarmed if I scream about it—it’ll be for a good reason (and not just because it’s almost Halloween)!

I talked previously about our efforts here in Microsoft Digital to inventory our internal-to-Microsoft on-premises environments to determine application relationships (mapping Microsoft’s expedition to the cloud with good cartography) as well as look at performance info for each system (the awesome ugly truth about decentralizing operations at Microsoft with a DevOps model).

With this info, it was time to begin making plans to move to the cloud. Looking at the data, our overall CPU usage for on-premises systems was far lower than we thought—averaging around six percent! We realized this was so low due to many underutilized systems. First things first, what to do with the systems that were “frozen,” or not being used, based upon the 0-2 percent CPU they were utilizing 24/7?

We created a plan to closely examine those assets toward the goal of moving as few as possible. We used our home-built configuration management database (CMDB) to check whether there was a recorded owner. In some cases, we were able to work with that owner and retire the system.

Before we turned even one server off, we had to be sure it wasn’t being used. (If a server is turned off and no one is there to see it, does it make a sound?)

Developing a scream test

Pete Apple, a cloud services engineer in Microsoft Digital, shares how Microsoft scares teams that have unused servers that need to be turned off. (Photo by Jim Adams | Inside Track)

But what if the owner information was wrong? Or what if that person had moved on? For those, we created a new process: the Scream Test. (Bwahahahahaaaa!)

What’s the Scream Test? Well, in our case it was a multistep process:

  1. Display the message “Hey, is this your server, contact us?” on the sign-in splash page for two weeks.
  2. Restart the server once each day for two weeks to see whether someone opens a ticket (in other words, screams).
  3. Shut down the server for two weeks and see whether someone opens a ticket. (Again, whether they scream.)
  4. Retire the server, retaining the storage for a period, just in case.

With this effort, we were able to retire far more unused servers—around 15 percent—than we had expected, without worrying about moving them to the cloud. Winning! We also were able to reclaim more resources on some of the Hyper-V hosts that were slated to continue running on-premises. And as a final benefit, we cleaned up our CMDB a bit!

In parallel, we initiated an effort to look at some of the systems that were infrequently used or used a very low level of CPU (less than 10 percent, or “Cold”). From that, we had two outcomes that proved critical for our successful migration to the cloud.

The first was to identify the systems in our on-premises environments that were oversized. People had purchased physical machines or sized virtual machines according to what they thought the load would be, and either that estimate was incorrect or the load diminished over time. We took this data and created a set of recommended Azure VM sizes for every on-premises system to use for migration. In other words, we downsized on the way to the cloud versus after the fact.
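
A simplified sketch of that kind of review, bucketing servers by observed CPU use and suggesting a smaller Azure VM size for oversized machines, might look like this; the thresholds, column names, and size table are illustrative, not the ones we actually used.

```python
# Sketch: classify servers by CPU use and suggest a downsized Azure VM size.
import pandas as pd

servers = pd.read_csv("cmdb_inventory.csv")  # expected columns: name, vcpus, avg_cpu_pct

def classify(avg_cpu_pct: float) -> str:
    if avg_cpu_pct <= 2:
        return "frozen"   # scream-test candidate
    if avg_cpu_pct < 10:
        return "cold"     # candidate for downsizing or scheduling
    return "active"

def recommend_size(row: pd.Series) -> str:
    # Rough downsizing rule: keep enough vCPUs to cover observed use plus headroom.
    needed_vcpus = max(1, int(row.vcpus * row.avg_cpu_pct / 100 * 4))
    sizes = {1: "Standard_B1ms", 2: "Standard_B2s", 4: "Standard_D4s_v5", 8: "Standard_D8s_v5"}
    for vcpus, size in sizes.items():
        if needed_vcpus <= vcpus:
            return size
    return "review manually"

servers["state"] = servers["avg_cpu_pct"].apply(classify)
servers["suggested_size"] = servers.apply(recommend_size, axis=1)
print(servers.groupby("state").size())
print(servers[servers.state != "frozen"][["name", "suggested_size"]].head())
```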

At the time, we did a bunch of this work manually because we were early adopters. Microsoft now has a number of great products available that help with this kind of inventory and review of your on-premises environment. To learn more, see the documentation on Azure Migrate.

Another statistic that the data revealed was the number of systems that were used for only a few days or a week out of each month. Development machines, test/QA machines, and user acceptance testing machines reserved for final verification before moving code to production were used for only short periods. The machines were on continuously in the datacenter, mind you, but they were actually being used for only short periods each month.

For these, we investigated ways to have those systems running only when required by investing in two technologies: Azure Resource Manager Templates and Azure Automation. But this is a story for the next time. Until then, happy Halloween!

How automation is transforming revenue processing at Microsoft

The Microsoft partner and customer network brings in more than $100 billion in revenue each year, most of the company’s earnings.

Keeping tabs on the millions of annual transactions is no small task—just ask Shashi Lanka Venkata and Mark Anderson, two company employees who are leading a bid to automate what historically has been a painstakingly manual revenue transaction process.

“We support close to 50 million platform actions per day,” says Venkata, a principal group engineering manager in Microsoft Digital. “For a quarter-end or a month-end, it can double. At June-end, we’re getting well more than 100 million transactions per day.”

That’s a lot, especially when there cannot be any mistakes and every transaction must be processed in 24 hours.

To wrangle that high-stakes volume, Venkata and Anderson, a director on Microsoft’s Business Operations team, teamed up to expand the capabilities of Customer Obsessed Solution Management and Incident Care (COSMIC), a Dynamics 365 application built to help automate Microsoft’s revenue transactions.

[Learn more about COSMIC including where to find the code here: Microsoft Dynamics 365 and AI automate complex business processes and transactions.]

First tested in 2017 on a small line of business, the solution expanded quickly and was handling the full $100 billion-plus workload within one year.

That said, the team didn’t try to automate everything at once—it has been automating the many steps it takes to process a financial transaction one by one.

Mark Anderson (shown here) partnered with Shashi Lanka Venkata from Microsoft Digital to revamp the way the company processes incoming revenue. Anderson is a director on Microsoft’s Business Operations team.

“We’re now about 75 percent automated,” Anderson says. “Now we’re much faster, and the quality of our data has gone way up.”

COSMIC is saving Microsoft $25 million to $30 million in revenue processing costs over the next two to three years. It also automates the rote copy-and-paste kind of work that the company’s team of 3,800 revenue processing agents used to get bogged down on, freeing them up to do higher value work.

The transformation that Anderson, Venkata, and team have been driving is part of a larger digital transformation that spans all Microsoft Digital. Its success has led to a kudos from CEO Satya Nadella, a well-received presentation to the entire Microsoft Digital organization, and lots of interest from Microsoft customers.

“It’s been a fantastic journey,” Anderson says. “It’s quite amazing how cutting edge this work is.”

Unpacking how COSMIC works

Partners transact, purchase, and engage with Microsoft in over 13 different lines of businesses, each with its own set of requirements and rules for processing revenue transactions (many of which change from country to country).

To cope with all that complexity, case management and work have historically been handled separately to make it easier for human agents to stay on top of things.

That had to change if COSMIC was going to be effective. “When we started, we knew we needed to bring them together into one experience,” Venkata says.

Doing so would make transactions more accurate and faster, but there was more to it.

“The biggest reason we wanted to bring them together is so we could get better telemetry,” he says. “Connecting all the underlying data gives us better insights, and we can use that to get the AI and machine learning we need to automate more and more of the operation.”

Giving automation its due

The first thing the team decided to automate was email submissions, one of the most common ways transactions get submitted to the company.

“We are using machine learning to read the email and to automatically put it in the right queue,” Venkata says. “The machine learning pulls the relevant information out of the email and enters it into the right places in COSMIC.”

The team also has automated sentiment analysis and language translation.
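
As an illustration of the general technique (not COSMIC's actual pipeline), Azure AI Language can detect the language of an incoming email and score its sentiment in a few lines; the endpoint, key, and sample text below are placeholders.

```python
# Sketch: language detection and sentiment analysis on an incoming email.
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"),
)

emails = ["Bonjour, nous n'avons toujours pas reçu la confirmation de notre commande."]

language = client.detect_language(emails)[0]
sentiment = client.analyze_sentiment(emails, language=language.primary_language.iso6391_name)[0]

print("Detected language:", language.primary_language.name)
print("Overall sentiment:", sentiment.sentiment)  # positive / neutral / negative
```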

What’s next?

Using a bot to start mimicking the work an agent does, like automatic data entry or answering basic questions. “This is something that is currently being tested and will soon be rolled out to all our partners using COSMIC,” he says.

How does it work?

When a partner submits a transactional package to Microsoft, an Optical Character Recognition bot scans it, opens it, checks to see if everything looks correct, and makes sure business roles are applied correctly. “If all looks good, it automatically gets routed to the next step in the process,” Venkata says.
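
As a sketch of the underlying technique rather than the COSMIC bot itself, Azure AI Document Intelligence (Form Recognizer) can extract text and key-value pairs from a submitted document so downstream rules can validate and route it; the endpoint, key, and file name are placeholders.

```python
# Sketch: extract key-value pairs from a submitted transaction document.
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient

client = DocumentAnalysisClient(
    endpoint="https://<your-doc-intelligence-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"),
)

with open("partner_package.pdf", "rb") as f:
    result = client.begin_analyze_document("prebuilt-document", f).result()

# Pull out key-value pairs (for example, PO number, country, amount) so
# downstream business rules can validate and route the transaction.
fields = {
    kv.key.content: kv.value.content
    for kv in result.key_value_pairs
    if kv.key and kv.value
}
print(fields)
```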

The Dynamics workflow engine also is taking on some of the check-and-balance steps that agents used to own, like testing to see if forms have been filled out correctly and if information extracted out of those forms is correct.

“Azure services handle whatever has to be done in triage or validation,” he says. “It can check to see if a submission has the right version of the document, or if a document is the correct one for a particular country. It validates various rules at each step.”

All of this is possible, Venkata says, because the data was automatically abstracted. “If, at any point the automation doesn’t work, the transaction gets kicked back for manual routing,” he says.

As for the agents? They are getting to shift to more valuable, strategic work.

“The system is telling them what the right next action is going to be,” Venkata says. “Before this, the agent had to remember what to do next for each step. Now the system is guiding them to the next best action—each time a step is completed, the automation kicks in and walks the agent through the next action they should take.”

Eventually the entire end-to-end process will be automated, and the agents will spend their time doing quality control checks and looking for ways to improve the experience. “We want to get to the point where we only need them to do higher level work,” he says.

Choosing Dynamics 365 and Microsoft Azure

There was lots of technology to choose from, but after a deep assessment of the options, the team chose Dynamics 365 and Microsoft Azure.

“We know many people thought Dynamics couldn’t scale to an enterprise the size of Microsoft, but that’s not the case anymore,” Venkata says. “It has worked very well for us. Based on our experience, we can definitively say it can cover Microsoft’s needs.”

The team also used Azure to build COSMIC—Azure Blob Storage for attachments, Azure Cosmos DB for data archival and retention, SQL Azure for reporting on data bases, and Microsoft Power BI for data reporting.

Anderson says it’s a major leap forward to be using COSMIC’s automation to seamlessly route customers to the right place, handing them off from experience to experience without disrupting them.

Another major improvement is how the team has gained an end-to-end view of customers (which means the company no longer must ask customers what else they’re buying from Microsoft).

“It’s been a journey,” Anderson says. “It isn’t something we’ve done overnight. At times it’s been frustrating, and at times it’s been amazing. It’s almost hard to imagine how far we’ve come.”

Boosting Microsoft’s migration to the cloud with Microsoft Azure

[Editor’s note: This content was written to highlight a particular event or moment in time. Although that moment has passed, we’re republishing it here so you can see what our thinking and experience was like at the time.]

When Microsoft set out to move its massive internal workload of 60,000 on-premises servers to the cloud and to shutter its handful of sprawling datacenters, there was just one order from company leaders looking to go all-in on Microsoft Azure.

Please start our migration to the cloud, and quickly.

As a team, we had a lot to learn. We started with a few Azure subscriptions. We were kicking the tires, figuring things out, assessing how much work we had to do.

– Pete Apple, principal service engineer, Microsoft Digital

However, it was 2014, the early days of moving large, deeply rooted enterprises like Microsoft to the cloud. And the IT pros in charge of making it happen had few tools to do it and little guidance on how to go about it.

“As a team, we had a lot to learn,” says Pete Apple, a principal service engineer in Microsoft Digital. “We started with a few Azure subscriptions. We were kicking the tires, figuring things out, assessing how much work we had to do.”

As it turns out, quite a bit of work. More on that in a moment.

Now, seven years later, the company’s migration to the cloud is 96 percent complete and the list of lessons learned is long. Six IT datacenters are no more and there are fewer than 800 on-prem servers left to migrate. And that massive workload of 60,000 servers? Using a combination of modern engineering to redesign the company’s applications and to prune unused workloads, that number has been reduced. Microsoft is now running on 7,474 virtual machines in Azure and 1,567 virtual machines on-premises.

“What we’ve learned along the way has been rolled into the product,” Apple says. “We did go through some fits and starts, but it’s very smooth now. Our bumpy experience is now helping other companies have an easier time of it (with their own migrations).”

[Learn how modern engineering fuels Microsoft’s transformation. Find out how leaders are approaching modern engineering at Microsoft.]

The beauty of a decision framework

It didn’t start that way, but migrating a workload to Azure inside Microsoft is super smooth now, Apple says. He explains that everything started working better when they began using a decision tree like the one shown here.

Microsoft Digital’s migration to the cloud decision tree

The cloud migration team used this decision tree to guide it through migrating the company’s 60,000 on-premises servers to the cloud. (Graphic by Marissa Stout | Inside Track)

First, the Microsoft Digital migration team members asked themselves, “Are we building an entirely new experience?” If the answer was “yes,” then the decision was easy. Build a modern application that takes full advantage of all the benefits of building natively in the cloud.

If you answer “no, we need to move an existing application to the cloud,” the decision tree is more complex. It requires the team to answer a couple of tough questions.

Do you want to take the Platform as a Service (PaaS) approach? Do you want to rebuild your experience from the ground up to take full benefit of the cloud? (Not everyone can afford to take the time needed or has the budget to do this.) Or do you want to take the Infrastructure as a Service (IaaS) approach? This requires lifting and shifting with a plan to rebuild in the future when it makes more sense to start fresh.

Tied to this question were two kinds of applications: those built for Microsoft by third-party vendors, and those built by Microsoft Digital or another team in Microsoft.

On the third-party side, flexibility was limited—the team would either take a PaaS approach and start fresh, or it would lift and shift to Azure IaaS.

“We had more choices with the internal applications,” Apple says, explaining that the team divvied those up between mission-critical and noncritical apps.

For the critical apps, the team first sought money and engineering time to start fresh and modernize. “That was the ideal scenario,” Apple says. If money wasn’t available, the team took an IaaS approach with a plan to modernize when feasible.

As a result, noncritical projects were lifted and shifted and left as-is until they were no longer needed. The idea was that they would be shut down once something new could be built to absorb that task, or die on the vine when they became irrelevant.

“In a lot of cases, we didn’t have the expertise to keep our noncritical apps going,” Apple says. “Many of the engineers who worked on them moved onto other teams and other projects. Our thinking was, if there is some part of the experience that became important again, we would build something new around that.”
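
Expressed as code, the decision framework above might look roughly like the following sketch; the categories and wording are a simplification of the decision tree, not an exact transcription of it.

```python
# Sketch: the migration decision framework expressed as a small function.
def choose_migration_path(
    new_experience: bool,
    third_party_app: bool,
    mission_critical: bool,
    budget_to_modernize: bool,
) -> str:
    if new_experience:
        return "Build cloud-native (PaaS) from the start"
    if third_party_app:
        return "Adopt the vendor's PaaS option if available, else lift and shift to IaaS"
    if mission_critical and budget_to_modernize:
        return "Rebuild on PaaS to take full advantage of the cloud"
    if mission_critical:
        return "Lift and shift to IaaS now; modernize when feasible"
    return "Lift and shift to IaaS and retire when no longer needed"

print(choose_migration_path(new_experience=False, third_party_app=False,
                            mission_critical=True, budget_to_modernize=False))
```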

Getting migration right

When Microsoft started its migration to the cloud, the company had a lot to learn, says Pete Apple, a principal service engineer in Microsoft Digital. That migration is nearly finished and those learnings? “They have been rolled into the product,” Apple says. (Photo by Jim Adams | Inside Track)

Apple says the Microsoft Digital migration team initially thought the migration to the cloud would be as simple as implementing one big lift-and-shift operation. It was a common mindset at the time: Take all your workloads and move them to the cloud as-is and figure out the rest later.

“That wasn’t the best way, for a number of reasons,” he says, adding that there was a myriad of interconnections and embedded systems to sort out first. “We quickly realized our migration to the cloud was going to be far more complex than we thought.”

After a lot of rushing around, the team realized it needed to step back and think more holistically.

The first step was to figure out exactly what they had on their hands—literally. Microsoft had workloads spread across more than 10 datacenters, and no one was tracking who owned all of them or what they were being used for (or if they were being used at all).

Longtime Microsoft culture dictated that you provision whatever you thought you might need and go big to make sure you covered your worst-case scenario. Once the upfront cost was covered, teams would often forget about how much it cost to keep all those servers running. With teams spinning up production, development, and test environments, the amount of untracked capacity was large and always growing.

“Sometimes, they didn’t even know what servers they were using,” Apple says. “We found people who were using test environments to run their main services.”

And figuring out who was paying for what? Good luck.

“There was a little bit of cost understanding, of what folks were thinking they had versus what they were paying for, that we had to go through,” Apple says. “Once you move to Azure, every cost is accounted for—there is complete clarity around everything that you’re paying for.”

There were some surprising discoveries.

“Why are we running an entire Exchange Server with only eight people using it? That should be on Office 365,” Apple says. “There were a lot of ‘let’s find an alternative and just retire it’ situations that we were able to work through. It was like when you open your storage facility from three years ago and suddenly realize you don’t need all the stuff you thought you needed.”

Moving to the cloud created opportunities to do many things over.

“We were able to clean up many of our long-running sins and misdemeanors,” Apple says. “We were able to fix the way firewalls were set up, lock down our ExpressRoute networks, and (we) tightened up access to our Corpnet. Moving to the cloud allowed us to tighten up our security in a big way.”

Essentially, it was a greenfield do-over opportunity.

“We didn’t do it enough, but when we did it the right way, it was very powerful,” says Heather Pfluger, a partner group manager on Microsoft Digital’s Platform Engineering Team who had a front-row seat during the migration.

The migration also led to many mistakes, which makes sense because the team was trying to both learn a new technology and change decades of ingrained thinking.

“We did dumb things,” Pfluger says. “We definitely lifted and shifted into some financial challenges, we didn’t redesign as we should have, and we didn’t optimize as we should have.”

All those were learning moments, she says. She points to how the team now uses an optimization dashboard to buy only what it needs. It’s a change that’s saving Microsoft millions of dollars.

Apple says those new understandings are making a big difference all over the company.

“We had to get people into the mindset that moving to the cloud creates new ways to do things,” he says. “We’re resetting how we run things in a lot of ways, and it’s changing how we run our businesses.”

He rattled off a long list of things the team is doing differently, including:

  • Sending events and alerts straight to DevOps teams instead of to central IT operations
  • Spinning up resources in minutes, for just the time they’re needed, instead of planning around long racking lead times or VMs that used to take a week to build out manually
  • Dynamically scaling resources up and down based on load
  • Resizing month-to-month or week-to-week to match cyclical business rhythms instead of using the old “continually running” model (see the sketch after this list)
  • Having some solution costs drop to zero or near zero when idle
  • Moving from custom Windows operating system images for builds to Azure gallery images and Azure Automation for image updates
  • Creating software-defined networking configurations in the cloud instead of physical, firewalled network configurations that required many manual steps
  • Managing on-premises environments with Azure tools
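
To make the resizing idea above concrete, here’s a minimal sketch using the Azure SDK for Python; the subscription, resource group, VM name, and target size are placeholders rather than anything from the actual migration.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import HardwareProfile, VirtualMachineUpdate

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder values
RESOURCE_GROUP = "reporting-rg"
VM_NAME = "reporting-vm"

compute = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Resize down ahead of a quiet stretch in the business cycle.
compute.virtual_machines.begin_update(
    RESOURCE_GROUP,
    VM_NAME,
    VirtualMachineUpdate(hardware_profile=HardwareProfile(vm_size="Standard_B2s")),
).result()

# Deallocate entirely when the workload is idle, so compute charges stop.
compute.virtual_machines.begin_deallocate(RESOURCE_GROUP, VM_NAME).result()
```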

There is so much more we can do now. We don’t want our internal users to find problems with our reporting. We want to find them ourselves and fix them so fast that our employee users never notice anything was wrong.

– Heather Pfluger, partner group manager, Platform Engineering Team

Pfluger’s team builds the telemetry tools Microsoft employees use every day.

“There is so much more we can do now,” she says, explaining that the goal is always to improve satisfaction. “We don’t want our internal users to find problems with our reporting. We want to find them ourselves and fix them so fast that our employee users never notice anything was wrong.”

And it’s starting to work.

“We’ve gotten to the point where our employee users discovering a problem is becoming more rare,” Pfluger says. “We’re getting better, but we still have a long way to go.”

Apple hopes everyone continues to learn, adjust, and find better ways to do things.

“All of our investments and innovations are now all occurring in the cloud,” he says. “The opportunity to do new and more powerful things is just immense. I’m looking forward to seeing where we go next.”

Related links

The post Boosting Microsoft’s migration to the cloud with Microsoft Azure appeared first on Inside Track Blog.

]]>
4649
Building a secure and efficient self-service application using Azure ACI, Azure Compute Gallery, and the Microsoft Azure SDK http://approjects.co.za/?big=insidetrack/blog/building-a-secure-and-efficient-self-service-application-using-azure-aci-azure-compute-gallery-and-the-microsoft-azure-sdk/ Thu, 12 Oct 2023 20:04:13 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=12336 Editor’s note: This is the second in an ongoing series on moving our network to the cloud internally at Microsoft.  At Microsoft, the Microsoft Digital Employee Experience (MDEE) team—our company IT organization—is using the Azure SDK, Azure Container Instances, and the Azure Compute Gallery to create a platform for deploying our virtual labs into secure, […]

The post Building a secure and efficient self-service application using Azure ACI, Azure Compute Gallery, and the Microsoft Azure SDK appeared first on Inside Track Blog.

]]>
Microsoft Digital stories
Editor’s note: This is the second in an ongoing series on moving our network to the cloud internally at Microsoft.

At Microsoft, the Microsoft Digital Employee Experience (MDEE) team—our company IT organization—is using the Azure SDK, Azure Container Instances, and the Azure Compute Gallery to create a platform for deploying our virtual labs into secure, user-defined hub-and-spoke networks in Microsoft Azure. These labs provide isolated environments where our employees can create their own on-demand, scalable virtual machine and network environments for testing and development purposes.

This collection of technologies enables our employees to create virtual lab environments across multiple Azure tenants at scale, using infrastructure as code (IaC) to quickly deploy lab templates using the Azure Compute Gallery.

Azure-based virtual lab platform components: Azure Container Instances, Azure Compute Gallery, Azure Service Bus, and Azure Functions.
Here’s an architecture diagram that shows the flow of our Microsoft Azure-based virtual lab platform.

[Read the first blog in our “Moving our network to the cloud” series.]

ACI for flexibility and scalability

Azure Container Instances (ACI) is a critical component of our provisioning process. ACI is a fully managed service offered by Azure that enables users to deploy and run containerized applications in the cloud without having to manage virtual machines or learn new tools. It offers exceptional flexibility and scalability, making it ideal for managing our virtual labs environment.

ACI enables simplified orchestration of containers, especially when compared to more complex solutions like Kubernetes. ACI offers simple configuration for isolated containers, eliminating the need for deep knowledge of the network stack and the need to create complex YAML-based configurations. This simplicity streamlines the development process, reduces complexity, and ensures that container security measures are always included.

ACI also supports a wide variety of container images, including Docker containers and containers from other sources, such as Azure Container Registry, Docker Hub, or private container registries. In our experience, it scales very well with lightweight .NET Core images.

ACI offers rapid container deployment and orchestration. Our containers are available quickly to coordinate virtual lab deployment and can be shut down promptly when their work is completed. This dynamic allocation ensures that resources are only utilized when necessary. This works well in our stateless workload scenarios and is especially useful for batch processing. It also eliminates the overhead of cluster management tasks and lets us focus on deploying containers immediately.

We configure ACI to ensure graceful region-based failover. ACI offers versatile options for region failover and makes our business continuity and disaster recovery scenarios simple to implement. We use an Azure function to initialize failover groups based on region availability, creating a seamless user experience.

We use ACI for data processing, batch jobs, and event-driven functions where the workload varies and can be executed independently from the API services. We use messaging queues like Azure Service Bus to coordinate between the APIs running in Azure Kubernetes Service (AKS) and the background processing tasks in ACI. This configuration ensures that the API services can trigger or communicate with the background processing components when necessary.
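
The post doesn’t include the team’s code, but the coordination pattern it describes can be sketched with the Azure Service Bus library for Python. The connection string, queue name, and message shape below are illustrative: the API side enqueues a job, and a worker running in ACI receives and completes it.

```python
import json

from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONNECTION_STRING = "<service-bus-connection-string>"  # placeholder
QUEUE = "lab-deployments"                              # hypothetical queue name

# API side (for example, running in AKS): enqueue a background job.
with ServiceBusClient.from_connection_string(CONNECTION_STRING) as bus:
    with bus.get_queue_sender(QUEUE) as sender:
        sender.send_messages(
            ServiceBusMessage(json.dumps({"lab_id": "123", "action": "deploy"}))
        )

# Worker side (for example, running in ACI): pick up the job and complete it.
with ServiceBusClient.from_connection_string(CONNECTION_STRING) as bus:
    with bus.get_queue_receiver(QUEUE, max_wait_time=30) as receiver:
        for message in receiver:
            job = json.loads(str(message))
            # ...provision the lab described by `job` here...
            receiver.complete_message(message)
```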

Because ACI scales horizontally and spins up instances without delay, we can keep delivering high performance to our users, even during heavy loads on our system. Our platform creates almost 40,000 ACI instances each month.

The dynamic nature of ACI ensures that the resources are only utilized when necessary, keeping costs at a minimum. Additionally, we initialize containers with the fewest vCPU and memory resources required for their specific tasks to optimize resource allocation and cost tracking.
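
As a rough illustration of what a right-sized, run-once container group looks like, here’s a minimal sketch using the Azure Container Instances management library for Python; the subscription, resource group, image, and sizing values are assumptions, not the team’s actual configuration.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerinstance import ContainerInstanceManagementClient
from azure.mgmt.containerinstance.models import (
    Container,
    ContainerGroup,
    ResourceRequests,
    ResourceRequirements,
)

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder values
RESOURCE_GROUP = "labs-rg"
GROUP_NAME = "lab-deploy-worker"

aci = ContainerInstanceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

worker = Container(
    name=GROUP_NAME,
    image="<registry>.azurecr.io/lab-worker:latest",  # hypothetical image
    resources=ResourceRequirements(
        # Request only what the task needs; adjust per workload.
        requests=ResourceRequests(cpu=1.0, memory_in_gb=1.5)
    ),
)
group = ContainerGroup(
    location="westus2",
    containers=[worker],
    os_type="Linux",
    restart_policy="Never",  # run the batch task once, then stop
)

aci.container_groups.begin_create_or_update(RESOURCE_GROUP, GROUP_NAME, group).result()
# ...later, once the background task has finished, delete the group to stop billing...
aci.container_groups.begin_delete(RESOURCE_GROUP, GROUP_NAME).result()
```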

Getting started with containers can be intimidating, but ACI makes it very simple to deploy a container. With Hyper-V isolation by default, support for burst workloads, and a wide array of powerful capabilities, we can scale to the highest performance applications.

— Justin Song, senior software engineering manager, Azure Container Instances team

This fine-grained resource allocation ensures efficient utilization and simplifies cost tracking for each lab deployment, resulting in highly available, high-performing, cost-effective operations.

ACI’s serverless infrastructure allows developers to focus on developing their applications, not managing infrastructure. ACI provides the capacity to deploy containers and apply platform updates promptly to ensure security and compliance.

“Getting started with containers can be intimidating, but ACI makes it very simple to deploy a container,” says Justin Song, a senior software engineering manager on the Azure Container Instances team at Microsoft. “With Hyper-V isolation by default, support for burst workloads, and a wide array of powerful capabilities, we can scale to the highest performance applications.”

Paranjpe and Nair smile in corporate photos that have been merged into a composite image.
Anjali Sujatha Nair (left) and Anish Paranjpe are part of the team in Microsoft Digital Employee Experience that’s built a self-service virtual lab deployment application internally at Microsoft. Nair and Paranjpe are software engineers.

Azure Compute Gallery for rapid VM provisioning

We use the Azure Compute Gallery to bring efficiency and scalability to VM provisioning for our labs.

Azure Compute Gallery enables us to manage lab virtual machine images globally, with replication across multiple Azure regions.

Managed replication helps us ensure that VM images are readily available wherever our users need them. We’re also using custom least recently used (LRU) cache logic on top of the Gallery Image SDK to reduce the costs associated with hosting images across multiple regions. This custom logic cleans up unused replications when they’re no longer needed, reducing costs while still maintaining the accessibility and reliability of our virtual labs.
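
The LRU logic itself isn’t published, but the operation it ultimately drives (trimming the replica regions of a gallery image version) can be sketched with the Azure compute management library for Python. The resource names and the source of the “recently used” set are assumptions for illustration.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder values
RESOURCE_GROUP = "labs-rg"
GALLERY = "labGallery"
IMAGE_DEFINITION = "lab-base-image"
VERSION = "1.0.0"

# In a real system this set would come from usage telemetry (the "least recently used" part).
recently_used_regions = {"westus2", "westeurope"}

compute = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

version = compute.gallery_image_versions.get(
    RESOURCE_GROUP, GALLERY, IMAGE_DEFINITION, VERSION
)

# Keep only the replica regions that have been used recently; drop the rest.
version.publishing_profile.target_regions = [
    region
    for region in version.publishing_profile.target_regions
    if region.name.replace(" ", "").lower() in recently_used_regions
]

compute.gallery_image_versions.begin_create_or_update(
    RESOURCE_GROUP, GALLERY, IMAGE_DEFINITION, VERSION, version
).result()
```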

We allow our users to deploy pre-configured lab environments called templates. We can create versioned labs using Azure Compute Gallery’s versioning capabilities, effectively capturing unique lab configurations at different development stages. This feature enables our users to save and share meticulously crafted lab setups through templates, fostering global collaboration and knowledge sharing.

They can effortlessly create snapshots of their labs, simplifying collaboration, promoting consistency, and providing control over their virtual lab experiences. Azure Compute Gallery’s versioning puts lab management in the hands of our users, offering flexible, streamlined collaboration.

Role-based access control (RBAC) provides the core access management functionality for Azure Compute Gallery images. Using RBAC and Azure Active Directory identities, access to images and image versions can be shared with, or restricted from, other users, service principals, and groups.

Azure SDK for efficient resource orchestration at scale

The Azure SDK for .NET provides the foundation for our platform’s scalability and resource management. We’re using the Azure SDK’s comprehensive set of open-source libraries, tools, and resources to simplify and expedite application and service development in Azure. The Azure SDK enables our development teams to ensure uniform features and design patterns for Azure applications and services across different programming languages and platforms.

Azure SDK packages adhere to common design guidelines—the Azure.Core package that is included in the SDK supplies a broad feature set, including HTTP request handling, authentication, retry policies, logging, diagnostics, and pagination. We’ve used the SDK to develop additional APIs that are easily integrated with other cloud-based services.

With the Azure SDK APIs, our developers have a unified interface to Azure services without needing to learn distinct APIs for each resource type. Development and resource management are streamlined across the entire Azure platform.

With a unified approach, we can use the Azure SDK to manage diverse resources across multiple Azure subscriptions and accounts.
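
As a small illustration of that unified approach (shown here with the Python flavor of the SDK rather than .NET), a single credential can fan out across every subscription an identity can see:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient, SubscriptionClient

credential = DefaultAzureCredential()  # works for managed identities and developer sign-in alike

# Enumerate every visible subscription, then list the resource groups in each one.
for sub in SubscriptionClient(credential).subscriptions.list():
    resources = ResourceManagementClient(credential, sub.subscription_id)
    for rg in resources.resource_groups.list():
        print(sub.display_name, rg.name, rg.location)
```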

Key takeaways

Here are some tips for getting started with the Azure SDK, Azure Container Instances, and the Azure Compute Gallery at your company:

  • Use ACI to simplify container orchestration with a smaller developer learning curve, especially when compared to more complex solutions like Kubernetes.
  • Configure region failover using resources across multiple Azure regions to quickly deploy containers in healthy regions when another region fails. This ensures service continuity and provides a seamless experience for users.
  • Use ACI scaling to quickly deploy instances across Azure regions, delivering high performance and availability for heavily loaded systems.
  • Configure replication in Azure Compute Gallery to provide global replication management for virtual machine images, ensuring images are readily available to users worldwide.
  • Use Azure Compute Gallery versioning capabilities to allow users to capture unique virtual machine configurations at different development stages.
  • Access important resources that can help you navigate this process with the Azure SDK. The Azure.Core package in the SDK offers a unified, standardized approach to accessing Azure functionality across various resource types.
  • Use the Azure SDK to enable seamless management and deployment of data plane resources at scale across different Azure subscriptions and accounts.

Try it out

Try out managed identities with Azure Container Instances.

Related links

We'd like to hear from you!

Please share your feedback with us—take our survey and let us know what kind of content is most useful to you.

The post Building a secure and efficient self-service application using Azure ACI, Azure Compute Gallery, and the Microsoft Azure SDK appeared first on Inside Track Blog.

]]>
12336
Move to Microsoft Azure boosts Microsoft’s inventory management system http://approjects.co.za/?big=insidetrack/blog/move-to-microsoft-azure-boosts-microsofts-inventory-management-system/ Fri, 08 Sep 2023 15:29:22 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=5816 Turn on a device within Microsoft and there’s a good chance that it’s listed in OneAsset, Microsoft’s inventory management system. Microsoft recently moved OneAsset to the cloud—a move that has dramatically improved system performance and sets the stage for future development. OneAsset is what you might call a very mission-critical app. “OneAsset is essentially the […]

The post Move to Microsoft Azure boosts Microsoft’s inventory management system appeared first on Inside Track Blog.

]]>
Microsoft Digital technical stories
Turn on a device within Microsoft and there’s a good chance that it’s listed in OneAsset, Microsoft’s inventory management system. Microsoft recently moved OneAsset to the cloud—a move that has dramatically improved system performance and sets the stage for future development.

OneAsset is what you might call a very mission-critical app.

“OneAsset is essentially the inventory system for every device that exists within Microsoft. That’s every single piece of hardware from laptops to servers—all the various pieces of network, storage equipment, even virtual devices,” says Pete Apple, principal service engineer in Microsoft Digital, the engineering organization at Microsoft that builds and manages the products, processes, and services that Microsoft runs on.

The data collected within OneAsset is used throughout Microsoft Digital. OneAsset data also interacts with 35 other systems for inventory management, compliance, system management, and other operations.

With the move to Microsoft Azure, the OneAsset team wanted to do more than simply move machines to the cloud. They also wanted to take advantage of a whole host of Microsoft Azure services that could handle increasingly complex workflows, while also scaling to meet growing demand.

The OneAsset team also recognized the complexity of the migration, Apple says. To ensure a smooth transition with minimal disruption, the team broke the migration down into several stages, starting with a lift-and-shift approach to Microsoft Azure infrastructure as a service (IaaS).

Moving all on-premises virtual machines (VMs) to Microsoft Azure IaaS provided better control of the OneAsset infrastructure, enabling the team to spin up environments and resources on demand.

[Learn how Microsoft Azure Front Door makes it easy for two apps to boost availability and security. Read how Microsoft moves its IT infrastructure management to the cloud with Microsoft Azure. Learn how Microsoft is boosting productivity and security with a move to Microsoft Azure.]

More scale, less management

After moving to the cloud, the OneAsset team turned its attention to the benefits of Microsoft Azure platform as a service (PaaS).

First, the OneAsset data tier was moved to Microsoft Azure SQL Database. This instantly freed the team from a long list of maintenance tasks, from patching to backups.

“By moving OneAsset to Azure, we reduced maintenance time by over 30 percent,” says Abhishek Khilnani, senior software engineer with Microsoft Digital.

Prakash Assudani smiles at the camera while working from his home office.
Moving to Azure PaaS has simply enabled us to do more, faster, says Prakash Assudani, a senior software engineering manager in Microsoft Digital.

Less time spent on maintenance also meant more time to focus on developing other OneAsset features.

“We’re delivering more of the features that our users are asking for, which makes for a better overall experience,” says Richa Agarwal, also a software engineer with Microsoft Digital.

Microsoft Azure SQL Database also brought other key benefits, such as active geo-replication and auto-failover groups, Microsoft Azure Traffic Manager to maintain data integrity between primary and secondary instances, and a minimum 99.99 percent uptime guarantee.

As an additional safeguard, the team rolled out each part of the PaaS migration as a preview, with the original IaaS environment made available to users as a backup. This helped support a seamless migration.

“We had almost no incidents raised during the migration,” says Prakash Assudani, senior software engineering manager on the Microsoft Digital team.

Completing the move to PaaS

The next step in the journey was to migrate the OneAsset web and user interface (UI) front end to Microsoft Azure App Service. With all parts of the app infrastructure now outside of the corporate network, VPN access was no longer required. This was great news for all OneAsset users.

“A lot of the services using OneAsset were already on Azure. When OneAsset was on premises, this meant that these services had to use VPN to connect,” Apple says. “With Azure, we got rid of that dependency on VPN.”

OneAsset also leveraged multiple security features in Microsoft Azure. For identity and authentication, the team moved to Microsoft Azure Active Directory and implemented Microsoft Azure Key Vault to store and manage passwords. Microsoft Azure Front Door—a secure entry point for apps—also ensures availability for users around the world.
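
The article doesn’t include the team’s code, but retrieving a secret from Azure Key Vault with a managed identity typically looks like the following sketch in the Azure SDK for Python; the vault URL and secret name are hypothetical.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential picks up a managed identity in Azure and a developer login locally.
credential = DefaultAzureCredential()
secrets = SecretClient(
    vault_url="https://<your-vault-name>.vault.azure.net", credential=credential
)

# The secret name is illustrative; the point is that no password lives in code or config.
db_password = secrets.get_secret("oneasset-sql-password").value
```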

With Microsoft Azure Application Insights, logs that previously had to be stored directly on the web servers and downloaded for triaging issues can now be maintained in Microsoft Azure. “This enables us to run analytics and query the logs easily, and provides us with automatic alerts, analytics dashboards, and other benefits,” Agarwal says.
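
One way to query that telemetry programmatically, assuming a workspace-based Application Insights resource, is the Azure Monitor Query library for Python; the workspace ID and query below are illustrative.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

logs = LogsQueryClient(DefaultAzureCredential())

# Count failed requests by result code over the last 24 hours.
response = logs.query_workspace(
    workspace_id="<log-analytics-workspace-id>",
    query="requests | where success == false | summarize count() by resultCode",
    timespan=timedelta(hours=24),
)

for table in response.tables:
    for row in table.rows:
        print(row)
```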

Diagram showing the new Azure PaaS based architecture for OneAsset. The first layer of the new OneAsset architecture covers security, logging, and credential management. The second layer comprises the main data components. The third layer is used to transfer data from one source to other.
Turning to Microsoft Azure PaaS enabled Microsoft to transform its OneAsset inventory management system.

Better data reporting and analysis

With data management and UI covered, the team turned to data analysis.

To better meet the needs of its users, the team developed a data analytics solution using Microsoft Azure Data Explorer, a log analytics cloud platform optimized for ad hoc big data queries. Azure Data Explorer, sometimes known as Kusto, provides users with elaborate, built-in analytics and comprehensive indexing and reporting features, and it delivers several key improvements over the old system.

Complex queries that took hours with the old system can now be completed within seconds.
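
To give a feel for what querying the new system looks like, here’s a minimal sketch using the azure-kusto-data library for Python; the cluster URI, database, table, and column names are hypothetical, since the OneAsset schema isn’t public.

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

CLUSTER = "https://<your-cluster>.kusto.windows.net"  # placeholder cluster URI

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(CLUSTER)
client = KustoClient(kcsb)

# Hypothetical table and columns: count devices seen in the last 30 days, by type.
query = """
Devices
| where LastSeen > ago(30d)
| summarize DeviceCount = count() by DeviceType
| order by DeviceCount desc
"""

response = client.execute("AssetInventory", query)
for row in response.primary_results[0]:
    print(row["DeviceType"], row["DeviceCount"])
```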

“We used to receive numerous tickets for query issues, and those tickets have been practically eliminated,” says Amit Raghuwanshi, a senior software engineer on the CPF team.

Microsoft Azure Data Explorer has also enabled the team to create a more relevant user experience.

Over time, users on the old system had created specific views of data containing specific attributes. The problem was that this exposed lots of similar data across different views, and to access those views, the OneAsset team had to grant read-only SQL access to select users and teams. With Microsoft Azure Data Explorer, direct SQL connectivity is no longer required, and a more compact solution delivers strong performance and availability, with analytics available over the internet.

“We were quickly able to go from 190 views down to just 20 views that were more relevant to a larger number of users,” says Disha Chauhan, a program manager at Microsoft.

This same improvement could have also been done with the on-premises system, but Microsoft Azure Data Explorer made it easier. “Moving to Azure PaaS has simply enabled us to do more, faster,” Assudani says.

Looking to the future

Microsoft Azure hasn’t just improved legacy features—it’s made it easier to create new ones.

Moving to Microsoft Azure allows the OneAsset team to go mobile. With the on-premises system running within the Corpnet environment, running internet-based solutions outside of VPN wasn’t possible. With Microsoft Azure, users can now access OneAsset from anywhere, even via mobile. “Starting with capabilities like this made it possible for us to build the app,” Chauhan says.

And the journey continues.

“Our goal is to ultimately bring every connected device at Microsoft into OneAsset,” Assudani says.

A quick check of the latest metrics indicates that the team is on track to achieve that goal. The number of records in OneAsset has doubled in just the last six months.

Thanks to Microsoft Azure, the OneAsset team has been able to keep pace every step of the way.

“The scalability of Azure has enabled us to manage tremendous growth while also giving us the flexibility to plan and adapt for future growth,” Assudani says.

Related links

The post Move to Microsoft Azure boosts Microsoft’s inventory management system appeared first on Inside Track Blog.

]]>
5816
Microsoft smart buildings bolstered by machine learning model, IoT http://approjects.co.za/?big=insidetrack/blog/microsoft-smart-buildings-bolstered-by-machine-learning-model-iot/ Tue, 08 Aug 2023 15:04:03 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=6378 A new machine learning model and Internet of Things (IoT) sensors and automation enables Microsoft smart buildings to keep company employees as comfortable as possible. Microsoft’s real estate operations team relies on energy smart buildings, structures with interconnected automation and sensors, to responsibly maintain a base level of comfort. Microsoft has deployed more than 50,000 sensors […]

The post Microsoft smart buildings bolstered by machine learning model, IoT appeared first on Inside Track Blog.

]]>
Microsoft Digital stories

A new machine learning model and Internet of Things (IoT) sensors and automation enables Microsoft smart buildings to keep company employees as comfortable as possible. Microsoft’s real estate operations team relies on energy smart buildings, structures with interconnected automation and sensors, to responsibly maintain a base level of comfort.

Microsoft has deployed more than 50,000 sensors in roughly 100 buildings throughout Microsoft’s Puget Sound region in Washington state. The company is using data captured from these sensors to identify issues and inefficiencies as they happen, allowing them to be fixed before employees even notice them.

“Hot and cold calls are the biggest part of our facilities management requests,” says Mark Obermayer, a senior program manager on the Real Estate & Facilities (RE&F) team, the group responsible for managing the buildings across Microsoft. “A lot of our work is making sure our employees are comfortable and productive. It makes a big difference.”

Fortunately for those responsible for responding when one of these sensors goes off, the vast majority of all the signals emitted from Microsoft smart buildings don’t necessitate a response. Puget Sound could see hundreds of thousands of signals in a single week, with fewer than 1 percent being actionable.

“A portion of a building being off by a couple of degrees might not be a big deal,” Obermayer says. “It might be that the wind is blowing from the north that day.”

What Microsoft Digital came up with was a way to not only generate work orders in a quick manner—a few clicks—but also to predict which faults are a priority.

– Mark Obermayer, senior program manager, Real Estate & Facilities

To wrangle and maximize this data, RE&F tapped Microsoft Digital, the organization that powers, protects, and transforms the company, to figure out when a response is needed.

This meant finding a better way to parse the plethora of IoT data that the sensors were producing. In short, artificial intelligence and machine learning were needed.

“In the past, someone would manually enter tickets to check out a group of faults,” Obermayer says. “What Microsoft Digital came up with was a way to not only generate work orders in a quick manner—a few clicks—but also to predict which faults are a priority.”

[Discover how Microsoft’s smart buildings showcase Azure Digital Twins. Learn about Microsoft’s new era of smart building in Singapore. Find out how Microsoft promotes environmental sustainability from the inside out.]

Making sure work orders work

As sensors in Microsoft smart buildings feed this IoT data to Iconics (a third-party solution), faults (specific violations of established rules) are identified. When a fault is recognized, a technician creates a ticket in Facility Link, the building management system Microsoft Digital built on Microsoft Dynamics 365 to manage work orders.

“Iconics and Facility Link weren’t communicating,” says Garima Gaurav, a senior program manager with Microsoft Digital, who identified several opportunities to introduce improvements across Microsoft’s heating, ventilation, and air conditioning (HVAC) systems. “Some technicians were spending the same amount of time writing tickets as working on the fix.”

In addition to being inefficient, manual processes were generating errors due to incomplete or inaccurate tickets. Incorrect work orders left jobs unfinished, leaving equipment running suboptimally and requiring additional technician visits.

To fix this, Obermayer and Gaurav reached out to Kundan Karma, a senior software engineer with Microsoft Digital.

“Technicians had to go to two places,” Karma says. “They went to Iconics, to perform the analysis, and they used Facility Link to submit the ticket. The new IoT Connector that we built brings them together.”

Karma stands on a back patio, hands tucked into his pant pockets.
Senior software engineer Kundan Karma helped build the IoT Connector and machine learning model Microsoft is using to improve the operational efficiency of its energy smart buildings. (Photo by Kundan Karma)

Built on Microsoft Azure, the IoT Connector immediately removed manual steps, reduced errors, and improved communication. Creating a ticket became a one-click process, with greater accuracy and faster processing time for technicians.

“In the IoT Connector, we take care of all the data,” Karma says. “It’s a bridge between two systems.”

Designed with auto-healing and telemetry fail-safes, the IoT Connector gives RE&F confidence that faults will be captured and reported as tickets with greater accuracy.

“If messages between the two systems fail, the IoT Connector will resubmit,” Karma says. “After a certain number of retries or if there’s a major problem, it will create a ticket for an engineer to look at.”
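
The IoT Connector’s code isn’t published, but the resubmit-then-escalate behavior Karma describes follows a familiar pattern. Here’s a generic sketch of it in Python; the helper callables are placeholders, not real APIs.

```python
import time

MAX_RETRIES = 5


def forward_fault(fault, submit_ticket, alert_engineer):
    """Resubmit a fault to the ticketing system, escalating if retries run out.

    `submit_ticket` and `alert_engineer` are placeholder callables for illustration.
    """
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return submit_ticket(fault)
        except Exception as error:  # a real system would catch specific transport errors
            if attempt == MAX_RETRIES:
                alert_engineer(fault, error)  # hand the problem to an engineer
                raise
            time.sleep(2 ** attempt)  # back off before trying again
```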

Improved communication introduced a handful of ancillary benefits—specifically, visibility.

Where a technician might previously have skipped entering information into a work order, the automated copying facilitated by the IoT Connector made creating a ticket in Facility Link a single click away.

“In cases where someone just does the fix without a work order, we don’t know what’s been done,” Obermayer says. “This left us with an incomplete history. We couldn’t see the demand for certain things.”

Now capable of tracking work orders, RE&F has a better understanding of what’s going on within specific buildings and assets. These insights are improving decision-making, especially as it relates to energy efficiency.

A firehose of IoT data

The IoT Connector shines a light on some challenges that come with scaling energy smart buildings.

“The target was 100 buildings,” Karma says. “We were so focused on integrating Iconics with Facility Link that we didn’t consider the volume of data. When we first rolled out the IoT Connector, we had to stop at 13 buildings. One building was generating approximately 2,000 faults per day.”

Extrapolated across Puget Sound’s 100 buildings, that amounted to roughly 200,000 faults in a single day. The scale of data being generated by IoT sensors could overload Microsoft’s entire Dynamics 365 system, bringing things to a standstill.

“The issue was conversions,” Gaurav says. “Only meaningful faults require an actionable response. We only want to check on real issues.”

Getting useful information out of IoT sensors is a challenge.

“There are different tolerances and different polling schedules for different pieces of equipment,” Obermayer says. “It changes from building to building.”

Microsoft Digital needed to separate the wheat from the chaff.

“If you have data generated in the thousands, it’s easy to miss important alerts,” Gaurav says.

Reducing the number of faults meant rethinking the way alerts from energy smart buildings were generated.

“What we realized is that 75 percent of the total faults were coming from one source, terminal units, and most of them were never converted to any work orders,” Gaurav says. “It was taking up most of the UI and creating too much noise. The way this data is now processed has adjusted how we’re digesting and prioritizing alerts.”

Terminal units, for example, were reordered and reprioritized to reduce the amount of noise being generated.

“We tried to group faults together,” Gaurav says. “One fault can trigger other alerts, but you don’t need multiple work orders.”

We want the model to mimic the behavior of a technician. It can go through the same decisions a human being can and reach the same conclusion.

– Kundan Karma, senior software engineer, Microsoft Digital

Instead of treating all alerts as individual issues, alerts could be grouped so several related faults resulted in a single ticket.

“Would a technician investigate that?” Karma says. “We want the model to mimic the behavior of a technician. It can go through the same decisions a human being can and reach the same conclusion.”

Teaching a machine to think like a technician

To get things started, Microsoft Digital looked at the history of faults and determined how they were converted to work orders.

Brendan Bryant, a mechanical engineer with DB Engineering, one of Microsoft’s partners, helped translate the technician’s process to the team. These inputs allowed the Microsoft Digital team to build a machine learning model that could mimic the behavior of a technician.

“We had key performance metrics from six to eight months’ worth of IoT Connector data,” Bryant says. “I helped Kundan look at HVAC telemetry and all the IoT metrics to get his team the information they needed to train the algorithm the right way.”

But before they could get there, naming conventions for assets and structures had to be standardized.

“This is one of the reasons we put in our own system,” Obermayer says. “How things would work was that a vendor would decide on an asset name when the building was constructed, then we’d change vendors or use a different vendor for a different building.”

The result was a variety of similar, yet varied, naming conventions. Facility Link meant RE&F could standardize and align all data points for energy smart buildings across campus.

“We can now look at a data point and tell you the number of air valves in Puget Sound,” Obermayer says. “Data and problem types are now the same on every system, making energy smart buildings more precise and efficient.”

Alignment of nomenclature also meant Bryant could better convey priority issues.

“There’s a lot of engineering intuition involved, especially when checking what’s false and what’s true,” Bryant says. “It’s a large amount of data provided by all of the equipment, so you have to make a judgement based on what you’re seeing.”

To help train the model to identify real issues over false alarms, Bryant and Karma moved away from real-time response and started viewing faults in aggregate.

“Something might show up on a Tuesday and be gone by Wednesday,” Bryant says. “There’s no value in creating a work order for that. But if it’s an issue for most of a week, that’s something we want to flag.”

Once aggregated, certain key performance metrics became strong predictors of a fault.

“In order to maintain high confidence that a fault needs to be addressed, we need a longer period of data,” Bryant says.

As the team continued their efforts, items that would result in a work order were flagged while all others were archived. From this, the model began to predict the faults that would result in work orders, flagging them for attention and archiving the rest.
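
A toy version of that aggregate-then-flag step might look like the following pandas sketch; the column names and the five-of-seven-days threshold are assumptions for illustration, not the team’s actual model.

```python
import pandas as pd

# One row per fault signal; the columns are hypothetical.
faults = pd.read_csv("faults.csv", parse_dates=["observed_at"])

window_start = faults["observed_at"].max() - pd.Timedelta(days=7)
last_week = faults[faults["observed_at"] > window_start]

# Count how many distinct days each (asset, fault type) pair fired during the window.
days_active = (
    last_week.assign(day=last_week["observed_at"].dt.date)
    .groupby(["asset_id", "fault_type"])["day"]
    .nunique()
)

# Flag persistent faults for a work order; everything else gets archived.
flagged = days_active[days_active >= 5]
print(flagged)
```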

“The technician can view anything flagged as ‘false’ and review it,” Karma says. “If needed, the technician can pull the fault from the archive and review it on the fly. The model learns from the mistake when it’s time to retrain.”

Thanks to machine learning and new practices, the number of faults was reduced by 80 percent to 90 percent.

“When we were onboarding, we couldn’t do all of Puget Sound’s smart buildings because the number of faults was huge,” Gaurav says. “Once we were confident that the faults generated were manageable and convertible to work orders, we were able to quickly onboard the rest of campus.”

Predicting the future for smart buildings

With the IoT Connector, Microsoft’s technicians are more efficient, disparate systems are better integrated, and modern infrastructure is in place to further sustain energy smart buildings.

“Right now, we’re only looking at HVAC, but there are so many other IoT assets throughout Microsoft,” Karma says. “A/V, security cameras—you name it. The next phase is to integrate all of these items into the IoT Connector.”

Flexibility within the IoT Connector allows it to be utilized with any asset across any region in the world.

“It becomes a scalable implementation,” Gaurav says. “We can even use it in areas that will eventually become energy smart buildings to help support those efforts.”

Karma also sees the IoT Connector, which ties into Facility Link on Microsoft Dynamics 365, as something other companies could use to improve the efficiency of their energy smart buildings.

“What we’re planning is to create the IoT Connector in a generic way so that other people can benefit from it outside of Microsoft,” Karma says. “Any other team should be able to use our learnings.”

The standardization of assets in Facility Link has helped spur other RE&F initiatives.

“Having this data is super important,” Obermayer says. “This will impact everything from procurement decisions to the management of movable assets.”

As Karma continues to refine the model, retraining hones prediction accuracy.

With each iteration, the model gets stronger.

“The big thing looking forward is helping to teach the algorithm so that we understand when it makes a decision and why,” Karma says. “Eventually the model will be able to assign work orders automatically.”

Gaurav agrees.

“The model is robust and converts a fixed number of alerts to tickets automatically. However, we also allow technicians to review the list of alerts and manually create tickets as needed,” Gaurav says.

For Obermayer, all of this is a dramatic improvement.

“We started with thousands of faults but could only address about one percent of the issues,” Obermayer says. “We got the number of faults down so that we’re actioning 10 to 20 percent, which means we’re hitting meaningful faults. Artificial intelligence and machine learning are improving the business of energy smart buildings.”

Related links

The post Microsoft smart buildings bolstered by machine learning model, IoT appeared first on Inside Track Blog.

]]>
6378
Learning to Listen: OneFinance Improves Customer Service with Microsoft LUIS Tool http://approjects.co.za/?big=insidetrack/blog/learning-to-listen-onefinance-improves-customer-service-with-microsoft-luis-tool/ Thu, 20 Apr 2023 20:41:51 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=3707 [Editor’s note: This content was written to highlight a particular event or moment in time. Although that moment has passed, we’re republishing it here so you can see what our thinking and experience was like at the time.] Bonnie Cowan, director of finance at OneFinance, a global Microsoft partner for financial transactions, had a problem: […]

The post Learning to Listen: OneFinance Improves Customer Service with Microsoft LUIS Tool appeared first on Inside Track Blog.

]]>
Microsoft Digital stories
[Editor’s note: This content was written to highlight a particular event or moment in time. Although that moment has passed, we’re republishing it here so you can see what our thinking and experience was like at the time.]

Bonnie Cowan, director of finance at OneFinance, a global Microsoft partner for financial transactions, had a problem: 1 million support tickets flooded her team’s resources each year. Handling simple queries, such as canceling an order or finding the status of an order, left operators little time to provide individualized attention to more challenging customer requests. The OneFinance team needed a new tool to break this bottleneck.

OneFinance already used Microsoft Dynamics – a line of enterprise resource planning and customer relationship management software applications – as its customer service support tool, and Cowan isn’t shy with her praise for Dynamics: “Everything about it has been awesome.” Managing Microsoft accounts payable and buy center processes is no small task, and Dynamics is essential in orchestrating workflow at OneFinance.

To address the overwhelming volume of routine support tickets, Jovalene Teo, senior global program manager at OneFinance, collaborated with the Dynamics team to find a solution. Teo and the Dynamics team decided to leverage a Microsoft machine-learning technology called Language Understanding, or LUIS. The goal was to automate the management of 30 percent of support tickets—those that were deemed not to require human staff intervention. If the initiative succeeded, 300,000 tickets per year could be handled and closed by chat bots.

[Learn more about how Finance is using AI and chatbots to simplify finance tools at Microsoft and how Microsoft is creating efficiencies in finance with Dynamics 365 and machine learning.]

LUIS is a machine learning-based service that enables users to build natural language into apps, IoT devices, and bots – such as OneFinance’s chat bots. “LUIS builds apps around customers’ natural way of getting answers,” said Azharuddin Mohammed, a Microsoft senior software engineer. “Customers don’t need to learn something new. They talk to a ‘person’—a chatbot—to get their answers.” In other words, when a user tells an app to book a flight, order a pizza, or remember to call Dad, the LUIS technology translates these commands to the app so that the action can be executed. “LUIS is the bridge that converts natural language into something a machine can understand,” Mohammed explained.
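
For readers curious what calling LUIS looks like, here’s a minimal sketch using the LUIS runtime client library for Python; the endpoint, key, app ID, utterance, and intent name are placeholders, not OneFinance’s configuration.

```python
from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient
from msrest.authentication import CognitiveServicesCredentials

runtime = LUISRuntimeClient(
    "https://<your-region>.api.cognitive.microsoft.com",  # prediction endpoint (placeholder)
    CognitiveServicesCredentials("<prediction-key>"),
)

# Ask the published "Production" slot which intent best matches the utterance.
response = runtime.prediction.get_slot_prediction(
    "<luis-app-id>", "Production", {"query": "cancel my order 12345"}
)

print(response.prediction.top_intent)  # for example, a hypothetical "CancelOrder" intent
print(response.prediction.entities)    # entities LUIS pulled from the utterance
```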

At OneFinance, the LUIS service integrated seamlessly with Dynamics tools already in place, enabling staff to divert and resolve routine support tickets while funneling more complicated tickets to human operators. The suite of services even allowed the team to work in a variety of languages.

As a global organization, the OneFinance team responds to support tickets generated in dozens of languages, so LUIS had to work with the integrated Bing Translator. The Dynamics team anticipated and accommodated this need. Mohammed pointed out that now when tickets come into OneFinance in Japanese, Bing Translator automatically rewrites the ticket in English. Most translated tickets are resolved quickly by the chat bot or operators, saving the company from paying premiums for translation services. In fact, LUIS supports 37 languages.

Recently, Microsoft’s voice-controlled virtual assistant Cortana and LUIS joined forces at OneFinance. Customers seeking support can speak their requests or call in, just as they would with a live operator. “Now users can go ahead and talk to our chatbot and get those same answers, instead of waiting for up to two days to talk to an agent,” said Mohammed.

LUIS has quickly streamlined OneFinance’s customer service. About 20 percent of users now direct their simple queries to chatbots and operators, and 97 percent of those tickets can be closed at “first touch.” “People can come in today and interact with the bot,” Cowan said. “If their question is answered, their query is closed, and we don’t have to go any further. For more complicated queries, they can choose to chat with an operator.” At this time, bots can answer 200 standard user questions, and to date, users have interacted with the chatbots 28,000 times.

LUIS’ positive impact on OneFinance operations is clear. “It is producing cost savings because we can drive efficiencies, but it is also about improving the customer experience,” said Cowan. “Because we have so many simple queries and we are trying to get to all queries, we weren’t able to provide the high touch we are looking for.” When the resolution of simple queries is automated, Cowan points out, complex queries can receive a “white glove service without an increase in costs.”

Mohammed agrees, describing the LUIS integration solution as a win-win situation for customers and businesses. “The benefits are huge. Customers are getting information faster, and they’re getting information accurately,” he said. “When machines take care of things an agent used to cover, you save money.”

Looking ahead, OneFinance plans to push LUIS to its full potential. Next in line: utilize LUIS’ active learning capabilities. “We are collecting data from chat logs with our bots and using that data to see what has been useful to our users and what hasn’t,” Teo said. “We are working on how to translate that feedback into improvement through LUIS.”

Using LUIS’ active learning capabilities, businesses can constantly update and improve the service. For example, when a user asks LUIS a question that it cannot process, the business can review the chat logs to adjust right away. “The best part is that this is easy to do. You can see in real time what users are doing and saying,” said Mohammed, “and then change the model to incorporate that input.”

Based on the quick success OneFinance has experienced by incorporating LUIS, Cowan now believes her team’s goal is in sight. With technical support from Dynamics, OneFinance should be able to close 300,000 simple queries per year—approximately 30 percent of all support tickets—using chatbots by next year. “The technology is new, and our volumes are low but our quality is high,” Cowan said. “We are just now starting the process of spreading the word across Microsoft. We are right on track.”

Related links

The post Learning to Listen: OneFinance Improves Customer Service with Microsoft LUIS Tool appeared first on Inside Track Blog.

]]>
3707
Microsoft Azure sellers gain a data edge with the Microsoft Power Platform http://approjects.co.za/?big=insidetrack/blog/microsoft-azure-sellers-gain-a-data-edge-with-the-microsoft-power-platform/ Mon, 04 Jan 2021 16:40:42 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=6058 Data is great to have, but it’s only as good as our ability to digest it. Alex Thiede, digital transformation lead for Microsoft in Western Europe and a former Microsoft Azure field seller based in Vienna, set out to talk to other Microsoft Azure sellers to discover how to help them serve their clients better. […]

The post Microsoft Azure sellers gain a data edge with the Microsoft Power Platform appeared first on Inside Track Blog.

]]>
Microsoft Digital stories
Data is great to have, but it’s only as good as our ability to digest it.

Alex Thiede, digital transformation lead for Microsoft in Western Europe and a former Microsoft Azure field seller based in Vienna, set out to talk to other Microsoft Azure sellers to discover how to help them serve their clients better.

For a multi-billion dollar business with more than 3,000 sellers, the potential for impact was huge.

– Alex Thiede, digital transformation lead

What emerged was a common pain point around exploding data. An enormous amount of customer data was being produced, but it was being siloed into different systems that never connected. Cloud Solution Architects (CSAs) and Microsoft Azure specialists would have to go into Microsoft Azure portals for customer data, Microsoft Dynamics 365 to track their customer engagements, and the Microsoft Account Planning Tool to manage account plans.

For Microsoft Azure sellers, whose mission is to help their clients be successful with their cloud experience, it was difficult to get a clear picture of how their accounts were performing. They were spending hours analyzing their data, running it through their own Microsoft Excel sheets and Microsoft Power BI reports, before finally sharing their insights with their account teams, which required even more hours spent building Microsoft PowerPoint slides.

“For a multi-billion dollar business with more than 3,000 sellers, the potential for impact was huge,” Thiede says. “So how do you bring those teams together on the IT side to have a customer-centric view?”

Thiede realized that this was a great question to answer with a Hackathon project.

Thiede assembled a team that included data scientists, field sellers, security specialists, and Microsoft Power Platform developers who were all passionate about solving the problem. They set out to build a solution using Microsoft Power Platform while demonstrating how IT and sales teams could come together in a citizen developer approach.

Within two weeks, the team had come up with the S500 Azure Standup Cloud Cockpit, a tool that brought all the data together in a configurable dashboard that put the individual sellers in the pilot seat.

For Jochen van Wylick, a cloud solutions architect, Hackathon team member and the lead CSA for strategic accounts in the Netherlands, that meant there could finally be a real tool to replace all of the manual unofficial hacking they had been doing to try to layer data in a meaningful way.

Van Wylick showed the team how they were adding additional metadata to the dozens of engagements they were tracking in their CRM to stay organized, and they incorporated that capability in an automated way.

“I like the fact that Alex implemented these ideas in the Stand Up Cockpit,” van Wylick says. “I also like the fact that it will boost my productivity.”

[Learn how Microsoft has automated its revenue processing with Power Automate. Find out how Microsoft is monitoring end-to-end enterprise health with Azure.]

The Microsoft Power Platforms and the power of citizen development

The team wanted to enter the Hackathon competition with a viable product to wow the judges. So, they used the Microsoft Power Platform to create a low-code tool that proved the feasibility of the Stand Up Cockpit while demonstrating how sales and IT teams could innovate together using a citizen developer approach.

Collaborating across six different regions on three continents in the first all-virtual Hackathon, the IT team members built the application environment while leaving the user interface up to the sellers to customize as they wished.

Stefan Kummert, a senior business program manager for Microsoft’s Field App and Data Services team, built the cockpit’s components on Microsoft Power Platform. Kummert says the challenge was creating composite models that layer Microsoft Power BI data with Microsoft Azure data analysis. While this is in fact a new Microsoft Power Platform Power Apps feature slated for release sometime in November, it wasn’t available to them at the time of the Hackathon in July.

“So, we tried to remodel this concept, more or less,” Kummert says. “We factored what’s available out of the box with some other Power Platform building blocks, and that’s what gave us all the functionality we needed.”

Sellers could now integrate their data sources into a composite data model, add custom mapping and commenting, gain insights at the child and business unit levels, and more quickly identify issues and potential for optimizations that would serve their clients. At the end of the Hackathon, they had a working prototype using real customer data.

Graphic illustrates the architecture of the Azure Standup Cockpit. Siloed data sets from different Core Platforms are synthesized into a composite data model which allows configurable views of data customized by the user. The new Azure Cockpit views provide the user with deeper understanding and insight of their client accounts.
The Azure Stand Up Cockpit used citizen development to create a composite model of disconnected data sets from Core Platforms to provide deeper understanding and insights of client accounts.

The team largely credits this agility to the citizen developer approach, which empowers non-developers to create applications using low-code platforms sanctioned by IT. “There’s often not enough time to create applications in the classic way,” Kummert says. “I think citizen dev is changing the picture significantly, giving us a fair chance to address the huge amount of change happening in the business environment.”

The team’s project won in Microsoft’s 2020 Empower Employees hackathon category. With their win, they were awarded dedicated resources and sponsorship from Microsoft Digital.

Turning the dream into reality

Fresh off their Hackathon win, the team is now working on moving the app into production and getting it into the hands of Microsoft Azure sellers.

They’ll first roll it out to 10 customers, then another 100, and if it’s successful, it will be built into the core platform and scaled out across the Microsoft Sales Experience, MSX Insights, Microsoft Organizational Master, and Microsoft Account Planning programs.

This rapid prototyping and incremental rollout is a strategy targeting increased adoption–an approach that’s appreciated by program managers like Henry Ro, who maintains sales and marketing platforms for Microsoft Digital.

Without the Hackathon, it would have been harder to bring this team together. Rather than doing this just once a year, why not have it as a regular working style? It’s about the energy, the inclusive culture, and the people coming together who have real passion.

– Alex Thiede, digital transformation lead

“Projects like the Azure Cockpit really make it easy for our team and others to validate an idea and take it to fruition,” Ro says. “We’re excited about its capabilities and how we can enable it.”

For their part, Thiede and the team are already itching for another Hackathon–or at least more projects driven by the same kind of inspiration and agility.

“Without the Hackathon, it would have been harder to bring this team together,” Thiede says. “Rather than doing this just once a year, why not have it as a regular working style? It’s about the energy, the inclusive culture, and the people coming together who have real passion.”

Related links

The post Microsoft Azure sellers gain a data edge with the Microsoft Power Platform appeared first on Inside Track Blog.

]]>
6058
Transforming 70 million calls at Microsoft with Microsoft Azure http://approjects.co.za/?big=insidetrack/blog/transforming-70-million-calls-at-microsoft-with-microsoft-azure/ Tue, 07 Jul 2020 22:15:34 +0000 http://approjects.co.za/?big=insidetrack/blog/?p=5476 Microsoft’s call center was calling out for some major updates. Originally designed over 20 years ago, the call center had grown into a complex patchwork of different systems handling over 73 million calls a year. The result was a global network made up of over 20 separate phone systems, over 1,600 different customer-facing phone numbers, […]

The post Transforming 70 million calls at Microsoft with Microsoft Azure appeared first on Inside Track Blog.

]]>
Microsoft Digital stories
Microsoft’s call center was calling out for some major updates.

Originally designed over 20 years ago, the call center had grown into a complex patchwork of different systems handling over 73 million calls a year.

The result was a global network made up of over 20 separate phone systems, over 1,600 different customer-facing phone numbers, over 30,000 support agents, and a dedicated team per region to manage all this. In addition to the technical complexities, multiple platforms also resulted in inconsistent calling experiences for customers.

“Each time we came up with a new product, we’d spin up a new phone number and a new phone system,” says Matt Hayes, a program manager with Microsoft Digital, Microsoft’s IT and Operations division.

As some of the infrastructure started reaching the end of its life and required increasing complexity to support it, the Microsoft Digital team had a decision to make: update and replace the existing infrastructure, or overhaul the entire network?

It was then that employees in Microsoft Digital began exploring a more centralized model using cloud-related technologies. The new system promised to streamline operations and deliver a better customer experience for Microsoft customers around the world.

In 2015, Microsoft embarked on a journey to replace their legacy contact center infrastructure with a next-generation, digital cloud-based offering.

[Read about the transformation of Microsoft.]

A new system takes shape

Microsoft’s move to the new cloud system allowed it to retire most traditional isolated platforms, simplify its carrier networks, and provide huge cost savings.

“We have streamlined down to just a few carriers,” Hayes says. “It’s significantly more efficient.”

That kind of efficiency has also made it easier to monitor and troubleshoot network issues, resulting in dramatically higher and more consistent call quality.

What’s equally exciting are other innovations that Microsoft Digital has added on top of their routing cloud-based platform.

By integrating tools like Microsoft Power BI, Microsoft Azure AI, and Microsoft Azure Cognitive Services, Microsoft is driving efficiencies and unlocking insights across its call center network.

“Moving to a common platform in the cloud—and being able to innovate on that platform—has given us greater visibility into just about everything, which has led to some amazing improvements,” Hayes says.

Resource management moves into high gear

Microsoft’s customer call volume is constantly changing across different products and regions. This often overloaded one call center while another went underutilized. With the older system, it was difficult, if not impossible, to reroute calls from an overloaded center to one with more capacity.

Now, Microsoft customers go into global queues, making it easy to distribute volumes between suppliers and regions. Leveraging agents across regions during high call volumes reduces wait times and removes the burden on specific carriers and regions.

“If one supplier is overloaded, call volume simply shifts to another qualified supplier,” Hayes says.  “There’s no need to reach out to an IT service provider or supplier—it’s just part of the system.”

Call center managers can now also better forecast call volumes and provide these forecasts to outsourced suppliers, which they use for staffing plans. And of course, a more agile resource plan enables a smoother customer experience.

Just as calls and customers can be routed anywhere, so too can sales agents and supervisors be located anywhere, which includes remote sites outside a call center. This has proved particularly valuable during this period of remote working.

“We had to move all of our agents from call centers to physically working from their home, which the cloud let us do,” Hayes says.

More effective monitoring

Managers also have access to more holistic data. From the outside looking in, they can see exactly how customers move around the system. They can track how their customers are doing as they navigate through the support process, how long it takes to resolve an issue, and how agents are performing.

“We can provide a summary of every phone call that’s made, how a customer actually got to the agent, which agent they talked to, and all in one view, including both a recording and real-time transcription of the call,” Hayes says. “Better yet, we can bring up that call record in just a few hours, versus several days with the old system.”

Along with monitoring agent performance better, managers can use this data to customize coaching for great customer interactions. The system also helps ensure solid contract performance from suppliers via documented historical data.

Centralized reporting

One common cloud platform has produced one source of call center truth—which, in turn, provides a cohesive view of actuals and forecasts, as well as real-time access to key performance indicators like hold and handle times, disconnects, and even call sentiment.
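
The post doesn’t say exactly how call sentiment is computed, but one way to derive that kind of signal from call transcripts is the Azure AI Language (Text Analytics) client library for Python; the endpoint, key, and sample transcript below are placeholders.

```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com",  # placeholder
    credential=AzureKeyCredential("<key>"),
)

transcripts = [
    "I waited a long time, but the agent solved my licensing issue quickly."
]

# Score each transcript as positive, neutral, negative, or mixed.
for doc in client.analyze_sentiment(transcripts):
    if not doc.is_error:
        print(doc.sentiment, doc.confidence_scores)
```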

This view also allows Microsoft to fine-tune the system more effectively over time.

For example, the team was able to drill into data detailing exactly why some callers were being transferred incorrectly. This led to changes in the system, which decreased transfer rates by over 48 percent.

Both invoicing and forecasting are more accurate with a single source of telephony data. And by automating many of the system’s auditing and compliance requirements, Microsoft has eliminated more costly and time-consuming manual processes.

With a move to the cloud, Microsoft’s call center operations are ready for the future.

“Ultimately for Microsoft, the shift of call center routing to the cloud has improved performance while creating exciting new opportunities for growth,” Hayes says.

Related links

The post Transforming 70 million calls at Microsoft with Microsoft Azure appeared first on Inside Track Blog.

]]>
5476