This content has been archived, and while it was correct at time of publication, it may no longer be accurate or reflect the current situation at Microsoft.
Cybersecurity is a constant challenge at Microsoft.
To protect the connections between computer servers, resources, and their users, the company regularly uses digital certificates signed by trusted authority providers.
Every quarter, the Microsoft Digital team is responsible for some 4,000 digital certificate renewals. At two to four hours to do each one, sometimes more, this used tens of thousands of labor hours every year.
What if there were a way to avoid the manual, time-consuming tasks normally associated with certificate renewal? That’s the question that Wilson Gajarla, a senior software engineer in Microsoft Digital, faced after an incident with an important certificate on a local machine in a Microsoft lab at the company headquarters in Redmond.
When Gajarla and team noticed that the certificate was about to expire, they realized it would take more than a few hours to renew it. In this case, the lab itself was off-limits due to COVID-19 so the machine wasn’t easily reachable to perform the manual renewal process.
Fortunately, Gajarla was granted special permission to enter the building due to business urgency, and was able to renew the certificate in time. But the experience stuck with Gajarla.
Gajarla began to look for better ways to manage server security, even in cases where an administrator can’t connect remotely to a machine.
[See how Microsoft Azure Front Door makes it easy for two apps to boost availability and security. Find out what Microsoft learned from applying Zero Trust principles during COVID-19. Get inspired by Microsoft use of Microsoft Azure APIs to develop a robust external supplier catalog.]
A heavy load for server and resource administrators
Today’s servers and websites use digital certificates to provide a high level of security that’s easy to take for granted, but this practice comes with a steep cost.
“It can take several hours to renew a server certificate,” says Ashish Sayal, a principal software engineering manager in Microsoft Finance. “It’s a manual process, so there’s always the risk of error or even bringing down the server if it’s not done correctly. This impacts the level of customer trust and the organization’s reputation.”
The engineers who are responsible for certificates must anticipate upcoming expirations, renew them based on the issue date, and then upload the newly signed certificate to all the resources that are using it. Finally, they have to communicate to all the certificate consumers the need to renew the new version on their resources as well.
Sometimes authority providers require the renewal of all their issued certificates at once without advance notice. That means servers and resources have to use compromised certificates until the engineers can complete the process.
“Recently within Microsoft Digital, we had to renew more than 2,000 certificates across the teams,” Gajarla says. “Some teams had immense workloads to get this done.”
Timely certificate renewals are not just for servers used in development, they are essential to daily operations at Microsoft.
“If a certificate renewal isn’t taken care of on time, it will quickly escalate to leadership,” Sayal says. “We need certificates to be current to stay compliant and avoid fire-fighting at the last minute.”
Verify that a certificate is the best solution
The first step is making sure that a digital certificate is the right approach.
“We should check why we are using the certificate,” Gajarla says. “Is there any modern way we can integrate the services without one?”
The answer is often yes.
After Gajarla’s difficulty gaining physical access to an asset with an expiring certificate, the team began exploring ways to optimize certificate management. Along the way, they found that sometimes a machine-specific certificate isn’t needed at all.
“I recommend using Azure managed certificates for Azure Front Door, Azure Key Vault, Azure Content Delivery Network (CDN), and Azure App Services, included in every Azure subscription,” Gajarla says. “Managed certificates eliminate the need to physically touch a machine to handle routine certificate operations.”
In the past, Microsoft Digital used connection strings and shared access signature (SAS) tokens known as Secrets to access services and storage. To maintain the authorized connection string, Secrets must be renewed every three months which took hundreds of hours each year.
Now the team takes every opportunity to avoid certificates, connection strings, and SAS tokens from the beginning. Instead, Microsoft Digital uses managed identities for Microsoft Azure resources and other new Microsoft Azure services to protect servers and resources.
Another approach is to add encryption when the data is at rest or in transit using Microsoft Azure service managed keys, which reduces risk and lessens the need to own and manage digital certificates.
“Investigate whether you really need a certificate,” Sayal says. “With the modern world that Azure offers, you don’t need them as often.”
Automation and operations with Microsoft Azure
Gajarla discovered several certificate management best practices including a technique called ‘touchless certificate management’ that uses Microsoft Azure Key Vault. “This Azure service is the main driver of automation,” Gajarla says. “It connects to the certificate providers and automatically renews them.”
Now that the process is automated, I didn’t even notice that I needed to renew my certificates. It saves me the time it would have taken to deal with them and lets me focus on my current tasks.
– Dennis Han, software engineer, Microsoft Finance
“Even with custom domain endpoint, you can use the automation options provided by Key Vault,” Gajarla says. “This integrates well with third-party certificate providers. It’s the most efficient way to create, roll over, and renew certificates.”
Touchless certificates are not associated with an individual machine. “It’s not a reactive, but a proactive approach,” Sayal says.
Developer Dennis Han, a software engineer in Microsoft Finance, is already benefiting from the new Microsoft Digital practices. “Now that the process is automated, I didn’t even notice that I needed to renew my certificates,” Han says. “It saves me the time it would have taken to deal with them and lets me focus on my current tasks.”
Han also finds that the new system is better for security in addition to ease of certificate maintenance. “It enables business functionality rather than requiring me to spend time managing infrastructure,” Han says.
Likewise, Gajarla has seen a time savings for engineering across the board. Teams are still rolling out the new practices. “Some teams have 100% adoption while a few have legacy systems and are still in the process,” Gajarla says.
The benefits are already clear.
“Now each service has its own identity, not tied to a machine,” Gajarla says. “Senior leadership doesn’t have to monitor what certificates are expiring or follow up with their teams anymore.”
Sayal has noticed a difference as well.
“Now everything is done proactively,” Sayal says. “Now everyone can focus on their best work.”
See how Microsoft Azure Front Door makes it easy for two apps to boost availability and security.
Find out what Microsoft learned from applying Zero Trust principles during COVID-19.
Get inspired by Microsoft use of Microsoft Azure APIs to develop a robust external supplier catalog.