Why Microsoft uses a playbook to guard against ransomware



When Microsoft’s Digital Security and Resilience (DSR) division set out to defend the company against human-operated ransomware, it faced several formidable challenges. In this form of ransomware, highly organized and sophisticated cybercriminals target major businesses, healthcare organizations, universities, and governments for their visibility and potential payout. Human-operated ransomware’s targeted strategy requires a holistic and comprehensive response, which comes in the form of the Ransomware Elimination Program (REP), our centralized and collaborative cross-company effort.


As we discussed in our previous ransomware post, REP was purpose-built atop the philosophy of Zero Trust to give Microsoft a way to centralize defense, recovery, and resilience against ever-changing cyberthreats. Core to the program is the ransomware playbook, our internal guide to ensure teams across the company take the right action to respond, recover, and remediate in the event of an attack. Adherence to the playbook limits the opportunity for attacks and minimizes the potential reward that criminals seek.

“Attackers are more focused and targeted, they’re on a mission,” says Henry Duncan, a senior security program manager on REP, part of DSR, the team responsible for protecting our enterprise so that we can deliver and operate secure products and services for our customers. “It’s not a phishing email that spreads out to a bunch of random addresses and hopes someone clicks. That only nets you random targets. Human-operated ransomware aims for an enterprise and tries for big returns.”

The longer threat actors are active in an environment and can move around, the greater the risk to the target. Each passing moment presents an opportunity to acquire more access to data through compromised accounts, or tamper with security and backup systems—and that means a higher likelihood of data being compromised and a larger ransom demand. Time is of the essence.

[Read blog one in our ransomware series: Sharing how Microsoft protects against ransomware. | Read blog three in our ransomware series: Building an anti-ransomware program at Microsoft focused on an Optimal Ransomware Resiliency State. | Learn more about human-operated ransomware. | Discover how Microsoft’s Zero Trust effort keeps the company secure.]

Writing the book on ransomware

When conceptualizing what it wanted the playbook to achieve, the REP team knew it needed to facilitate excellence in operational response readiness, have the flexibility and scope to address cyberattacks of any scale, and align response processes across the company.

“We needed the playbook to articulate and visualize what everyone’s role in a process is,” Duncan says. “It’s not just a security thing; we have to get other teams involved, like legal, finance, and enterprise business continuity.”

Engaging with stakeholders from those organizations allowed the REP team to better understand the different methods used across the company to triage, contain, and escalate events. These conversations and interviews were a vital learning opportunity and, when combined with industry and internal best practices, illuminated gaps and weaknesses and generated ideas to bridge them. Collaborative cross-team dialogue shaped the framework the team used to develop key processes, including those used to recover critical services.

With this information synthesized, the REP team began structuring the ransomware playbook around addressing these four key questions:

  • How prepared are we for a cyber event?
  • What controls are in place to detect and identify malicious activity in our environment?
  • What is the appropriate response from various teams to contain and recover from threats?
  • How should a post-incident and root-cause analysis be performed?

The resulting document provides a unified and holistic response to cyberthreats for the company to use.

Walking the walk

“For a playbook to work, you need to test,” Duncan says. “It’s easy to think you’ve captured everything on the page, but we need to see what happens in practice.”

Performing simulations for a variety of scenarios demonstrated what might happen if an attack were to occur at Microsoft.


Security professionals and stakeholders were put to the test. Detection and prevention systems were put through the wringer. Backup and restore functions were reviewed to confirm that the resiliency and recovery precautions needed to deny cybercriminals their leverage were in place.

Not only did these live drills verify steps within the ransomware playbook, they also allowed the REP team to gather additional feedback, including ways to better categorize and triage ransomware.

“It’s hard to measure the significance and when to escalate events; are we talking about a handful of machines or a large critical system?” Duncan says. “Now we have processes to have a consistent plan for triaging and triggering events.”
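As a rough illustration of what a consistent triage rule could look like, here is a minimal Python sketch. The `Event` fields, the `Severity` levels, and the `HANDFUL_OF_MACHINES` threshold are all hypothetical and are not taken from Microsoft's playbook. They only show how writing the rule down once lets every team reach the same answer for the same event.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Event:
    affected_machines: int
    touches_critical_system: bool


# Hypothetical threshold, chosen only for illustration.
HANDFUL_OF_MACHINES = 5


def triage(event: Event) -> Severity:
    """Map an event to a severity using fixed, documented rules."""
    # A critical system outranks machine count in this sketch.
    if event.touches_critical_system:
        return Severity.HIGH
    if event.affected_machines > HANDFUL_OF_MACHINES:
        return Severity.MEDIUM
    return Severity.LOW


def should_escalate(event: Event) -> bool:
    """Trigger incident response for anything above LOW."""
    return triage(event) is not Severity.LOW
```

In this sketch, an event on three ordinary machines stays at `Severity.LOW` and is not escalated, while the same three machines in a critical system are rated `Severity.HIGH`. A real playbook would define its own categories and thresholds with its stakeholders.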

Because ransomware continues to change, so must Microsoft’s response. The playbook is a living document, updated with regular reviews of testing and stakeholder engagement, enabling it to stay current with the quickly changing tactics of threat actors.

The benefits of playing it by the book

While the primary function of the ransomware playbook is to ensure Security Operation Centers (SOCs) and engineering teams across Microsoft have a documented process for responding to and recovering from ransomware, the playbook’s design has additional built-in benefits.

Henry Duncan is a senior security program manager in Microsoft’s Digital Security and Resilience division.

For instance, its detail clearly outlines who is responsible for what, creates visibility at the appropriate time, and clarifies escalation. The right process owners get the right information at the right time.

“You need visibility into how an event surfaces,” Duncan says. “Now we have a predictable mechanism to trigger incident response. Those definitions bring leadership into appropriate major events.”

In practice, Duncan and the REP team found the playbook to be a useful tool for continuous improvement. Regularly run internal tabletop exercises help DSR and the REP team measure Microsoft’s ability to effectively respond to specific types of attacks. Simulations and tests provide vital opportunities to expose issues, refine internal processes, and close the gap in eliminating ransomware. In using the playbook, Microsoft isn’t just more prepared against ransomware, but against security attacks in general.

This also happens to make the ransomware playbook a valuable training tool. Its adoption across the company is essential to a successful and holistic response to an attack. With training, knowledge of roles and responsibilities, combined with muscle memory of the right actions to take, ensures those involved are ready when put on the spot.

“We’ve also found that teams love the playbook as an onboarding tool,” Duncan says. “Anyone who joins Microsoft can know what the expectations are and loop that into their training. They’ll know how they fit into the ransomware equation.”

There’s a plan in place

Having the Ransomware Elimination Program along with the playbook gives teams across the company more visibility into the importance of defending against ransomware. Microsoft now has a platform to share knowledge across organizations and centralize efforts to reduce the opportunity and reward for cybercriminals.


“We can champion how people protect the environment while also involving them to improve response procedures,” Duncan says. “REP is the frontline of what an optimal ransomware resilience state should look like. That’s going to happen by working with different teams throughout Microsoft to research and understand the greatest risks.”

With a playbook at hand, there’s more confidence than ever that Microsoft’s people are prepared to detect and respond appropriately to malicious activity. The structure provided by REP and its playbook empowers Microsoft to capture important insights about its own resiliency, helping to drive future improvements. That’s critical, especially as ransomware continues to evolve.

“Human-operated ransomware is a full-time job for cybercriminals,” Duncan says. “None of us are perfect but being aware, having the right technology in place, and putting a plan in place reduces the likelihood and impact of an attack on the environment.”

While the ransomware playbook is internal to Microsoft, the REP team is investigating the best way to share its learnings so others can build their own.

Key Takeaways

  • The ransomware playbook serves as a single source of truth for detecting, responding to, and recovering from ransomware. It helps identify the strategy and preparation approach for resiliency.
  • Leverage your existing resources; you don’t have to start from scratch when developing a ransomware playbook.
  • Invite stakeholders to participate in the development of your ransomware playbook. It will create a more comprehensive and inclusive document, and will improve adoption.
  • Clarity of documentation is essential. Be sure to define expectations, roles, and responsibilities. Create diagrams and process flows whenever possible.
