How auto-scaling SAP on Microsoft Azure is benefitting Microsoft


Team members who worked on the auto-scaling SAP on Microsoft Azure project include (clockwise from upper left) Niranjan Maski, Santosh Rajput, Amit Ganguli, Karan Parseja, and Sofia Ahmed.

Microsoft has implemented auto-scaling SAP on Microsoft Azure to help its SAP workloads run more efficiently.

Why?

Like many enterprises, Microsoft runs on SAP.

It uses the software to run everything from tracking servers in its supply chain to making sure the company’s 140,000 employees are paid on time. It has one of the largest SAP deployments in the world.

In fact, Microsoft manages 50 terabytes of SAP data (enough to hold nearly 7 million digital photos) on 700 Microsoft Azure virtual machines. The company’s SAP usage is doubling year over year, and it conducts 300 million operations per month.

In 2017, the Microsoft Digital team orchestrated a massive lift-and-shift of the company’s SAP business process services, moving that data trove to 700 virtual machines.

The move was intended to save Microsoft operational costs, while also improving reliability.

And it succeeded.

Still, more could be done, says Sanoop Thrivikraman Nampoothiri, a senior software engineer in Microsoft Digital.

“We thought the benefits of moving to Azure could be even greater if we just took a few more steps,” he says. “So we designed a way to more efficiently manage the resources we use to power our SAP applications.”

The team had already made progress. Costs of managing the SAP workload dropped 18 percent during the first two years of Microsoft Azure operations. That was thanks to moving away from on-premises hardware and from waterfall-based engineering practices for deploying upgrades.

Moreover, lifting-and-shifting Microsoft’s SAP workload to Azure allowed the company to easily scale its SAP application to keep up with the explosive growth in usage.

But Microsoft Digital engineers sought further savings. They saw pain points such as the rising costs of running SAP on virtual machines, the company’s one-size-fits-all approach to server configuration, and the lack of an out-of-the-box way to dynamically scale SAP workloads.


Adapting SAP for the cloud era

One of the challenges with managing SAP in the cloud is that even though SAP now has more than 220 million cloud-based users, it’s not fully optimized for today’s elastic cloud infrastructure.

“SAP (in its current form) was designed back in the 1990s,” Nampoothiri says. “It’s an older architecture, and it’s not as modern as some of our other Azure services—especially web services. At the same time, it’s one of our busiest services, managing everything from finances to supply chains. And a lot of our customers are in the same position as we are.”

One result is that engineers tend to be cautious when managing mission-critical applications such as SAP, building in plenty of capacity to ensure customers always have access.

“When you design a system like that, you always design for peak load,” Nampoothiri says. “Most of our customers do the same. But Azure has a lot of flexibility that allows you to right-size systems.”

This leaves SAP application servers in a sweet spot for automation and optimization. By combining the oversight of Microsoft Azure Monitor with the power of Microsoft Azure Automation, engineers can scale these application servers at will.

Microsoft Digital carefully monitors SAP usage and stability using Microsoft Azure Monitor and applies technologies such as predictive analytics to spot potential problems before they occur. That monitoring also measures the loads on SAP infrastructure, allowing engineers to clearly see usage patterns.

“The telemetry from Microsoft Azure Monitor helped us understand which workgroups have different loads,” says Karan Parseja, a Microsoft Digital software engineer in Hyderabad. “The next step was to build a solution that would decide which servers should run at lower load levels, and then automatically reduce the capacity for those servers. We also needed the solution to gracefully stop an application when needed.”

Enter auto-scaling, tight-sizing, and snoozing.


Microsoft Azure runs SAP more efficiently with auto-scaling

The seed for auto-scaling SAP on Microsoft Azure came from a hackathon—an annual week-long event at Microsoft where everyone teams up with colleagues to work on ideas of their choice.

“After the migration of 700-plus VMs to Azure, we were constantly looking at the opportunities for further optimization of infrastructure resources,” says Santosh Rajput, a senior software engineer in Microsoft Digital. “During a hackathon, we came up with this idea of scaling in or out of SAP application servers automatically, in real time.”

Altogether, the team took three approaches to improve how Microsoft Azure runs SAP:

Auto-scaling. The team embraced an “infrastructure on demand” approach, in part because it’s easy to scale Microsoft Azure up as needed. Team members used the SAP Quick Sizer tool to estimate precisely how much VM capacity was needed, then scaled accordingly. And they shortened the planning horizon from several years to six months, enabling more precise adjustments to demand.

Tight-sizing. Most system demand peaks are predictable—quarter-end and year-end in particular. The Microsoft Digital team redesigned its VM array running SAP to correlate system capacity with anticipated peak demands.

Snoozing. Perhaps the biggest change was to move away from the always-on status of the original SAP setup. The Microsoft Azure team used Microsoft PowerShell to give the system the ability to “sleep” during quiet periods. But if someone is working on a weekend and needs access, the virtual machines rapidly come back online to do the work.
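The snoozing rule described above can be illustrated with a short sketch. The team's own implementation used PowerShell and is not reproduced in this article, so the Python below is only an illustration under stated assumptions: the function name `should_be_awake`, the weekday working window, and the `access_requested` flag are all hypothetical.

```python
from datetime import datetime


def should_be_awake(now, access_requested=False, work_start=7, work_end=19):
    """Return True if the SAP virtual machines should be running at `now`.

    Assumed policy (not from the article): the machines run during a
    weekday working window and sleep otherwise, unless someone has
    asked for access, in which case they wake up regardless of the time.
    """
    if access_requested:
        # On-demand wake-up, for example someone working on a weekend.
        return True
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and work_start <= now.hour < work_end


if __name__ == "__main__":
    saturday_morning = datetime(2024, 1, 6, 10, 0)
    print(should_be_awake(saturday_morning))                         # False
    print(should_be_awake(saturday_morning, access_requested=True))  # True
```

A scheduler could evaluate a rule like this periodically and start or stop the virtual machines whenever the answer changes.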

The reconfigured SAP/Microsoft Azure system was also redesigned with fewer points of failure and has a substantial degree of redundancy to guard against unexpected faults. It also has dual databases, which provide automatic failover if one crashes. That also makes it easier to perform system upgrades without interfering with work demands.

A chart showing the flow of data through an SAP instance in Azure. It shows how the databases, servers, and Azure interact to respond to changes in demand for SAP.
Microsoft’s SAP infrastructure is based on servers, telemetry, SQL databases, and Microsoft Azure Logic Apps. This allows Microsoft Azure to scale SAP up or down, depending on demand.
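As a rough illustration of the scale-up or scale-down decision that telemetry can drive, here is a minimal sketch. It is not the team's implementation: the `GroupLoad` type, the 60 percent target utilization, and the one-server minimum are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class GroupLoad:
    name: str
    servers: int          # application servers currently running
    utilization_pct: int  # average utilization across them, 0 to 100


def target_server_count(load, target_pct=60, min_servers=1):
    """Servers needed to carry the current load at the target utilization."""
    total = load.servers * load.utilization_pct
    needed = -(-total // target_pct)  # integer ceiling division
    return max(min_servers, needed)


def plan_scaling(loads, target_pct=60, min_servers=1):
    """Map each group to its change in server count (negative means scale in)."""
    return {
        g.name: target_server_count(g, target_pct, min_servers) - g.servers
        for g in loads
    }


if __name__ == "__main__":
    loads = [GroupLoad("finance", 10, 15), GroupLoad("supply-chain", 4, 90)]
    print(plan_scaling(loads))  # {'finance': -7, 'supply-chain': 2}
```

In this example the lightly loaded group gives back seven servers while the busy group gains two. A real system would also need to drain user sessions before stopping a server, which the sketch leaves out.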

Still, perhaps the biggest task was deploying these improvements while keeping Microsoft’s SAP infrastructure working smoothly as the auto-scaling changes were made. Think of it as repairing a jetliner mid-flight, from the outside.

“We had to convince our stakeholders that this would really work without having an impact on the availability of the system,” Rajput says. “Any customer running SAP and Azure would have that concern as well.”


Niranjan Maski agrees. He is a senior program manager for Microsoft Digital in Hyderabad.

“When we moved, we didn’t want to take any chances,” Maski says. “We wanted to show people the best possible way to run SAP on Azure. So, we were conservative and focused on availability for peak loads. But now we’re confident that Azure can handle SAP workloads, so now we’re working on optimization.”

Empowering customers to do more with SAP

Overall, auto-scaling SAP on Microsoft Azure reduced the cost of running SAP by another 18 percent and created a more robust system in the process.

Now used internally, these improvements may be rolled out for customers using Microsoft Azure and SAP. That would be an important step in keeping Microsoft Azure abreast of or ahead of competitors, who also run SAP on their cloud services.

“With COVID-19, we’re seeing more enterprises moving their IT infrastructure to the cloud, so they have better resiliency and scalability,” Rajput says. “For a lot of our customers, their biggest workload is enterprise resource planning (ERP) performed on SAP. If we can show them that moving to the cloud saves them money, then that will drive more cloud adoption.”

Options include making this an add-on to Microsoft Azure, says Amit Ganguli, a Microsoft Digital program management director based in Hyderabad. That could also mean using Microsoft Azure Monitor, which is now in preview, or open-source code on GitHub.

For the team, making a big difference despite its small size has been a great source of satisfaction.

“I’m really proud of my team members,” Rajput says. “One of the strengths of Microsoft is it can quickly build teams that can solve big problems like this. I don’t feel like I’m just doing a job. What motivates me is that we’re having a positive impact on our customers.”
