Simplifying Microsoft’s royalty ecosystem with a connected data service

Engineers at Microsoft have developed a new insights system, powered by a connected data service, for unifying complex, multi-dimensional royalties data.

In any given month, Microsoft manages about 1.5 billion royalty transactions related to an estimated nine million products that span more than 3,000 partners, whose relationships are governed by over 5,000 active contracts.

That’s a lot of transacting.

Through smart data assembly, natural language processing, and a custom Microsoft Teams bot, the engineers on Microsoft’s Royalties Team have created Royalties Assurance as a Service (RaaS), the company’s new internal royalty transaction platform.

“It’s one thing to build an extensible and scalable system processing vast quantities of data spread across multiple dimensions and business rules to meet the accuracy, timeliness, and compliance needs of a multi-billion dollar business,” says Jagannathan Venkatesan, principal group engineering manager in Microsoft’s Global Payments and Cash organization. “It is an entirely different challenge to be able to reduce that complexity through building a fully connected data system to bring it to a single canvas that is easy to interact with. With RaaS, our Royalties team was able to do just that with the help of RaaS APIs that enable system-to-system integration—including Microsoft Teams integration—and human augmented exploratory analytics.”

Ehsan Mehrabi, a senior finance manager on the Royalties team, is among those using the transformed royalty transaction insights system. (Photo by Ehsan Mehrabi)

Enterprise royalties, complex connections

Like many companies, Microsoft manages complex royalties relationships with other organizations. For example, the Microsoft Store sells Xbox games that use intellectual property from third-party vendors, and partners sell services through Microsoft that require royalties based on consumption. In each of these cases, once an incoming transaction occurs and Microsoft has obtained the funds, a complex orchestration of calculations must take place to ensure each party receives the proper payments.

“Royalties payouts are a critical aspect of Microsoft business, enabling our global digital content partners to realize the value of the products they have onboarded onto the Microsoft ecosystem,” Venkatesan says. “Our royalties system is dynamic and complicated—it processes multiple millions of transactions per day, dealing with hundreds of thousands of products, processing and calculating earned royalties according to the specific contracts in an accurate, timely, and compliant manner.”

Gaining clarity throughout that entire system of relationships is essential for accuracy in accounting and payouts, and it is an integral part of generating organization-wide insights.

“When we generate a statement for a partner for a particular period, our system must be able to enable the business to walk back from the statement to products to transactions along with bringing appropriate contact and partner data including historical information,” Venkatesan says.

To achieve that level of clarity and trust in the system, the royalties team needed to aggregate the datasets underlying all of Microsoft’s royalties relationships and transactions, then make the results available in one easily accessible place.

“The challenge is the complexity around calculating payouts and retrieving that information,” says Ehsan Mehrabi, a senior finance manager on the Royalties team. “We need to make sure everything is correct before payments go out the door or transactions get their accounting treatment.”

The engineering team took up the challenge.

Transforming Microsoft’s royalties ecosystem with a connected data system

Unifying the 300 to 400 million financial data points that flow in and out of the company each month was an enormous undertaking for the royalties engineering team.

The work involved three main challenges.

The first and most complicated task was defining and canonicalizing the millions of data points associated with the royalties business. The engineering team needed to use automation to identify distinct entities that could be assembled and visualized as a graph of connected data points.

Defining words such as “contract” and “product” seems like a simple job, but it presents challenges when it comes to automating data in complex business relationships. The data definitions needed to reflect the royalties team’s business needs and be simultaneously consumable by data processing systems.

Ram Janam Singh (left to right), Sundeep Ratnagiri, Abhijit Mandal, and Jagannathan Venkatesan worked with other engineers on Microsoft’s Royalties Team to develop a royalty transaction insights system to aggregate data holistically through a Microsoft Teams bot, simplifying complex information and providing organization-wide insights. (Photo by Rajmohan Venkatesan)

Sundeep Ratnagiri, engineering manager for Microsoft Royalties, outlines how the team defines these terms, explaining what the word “contract” means when it comes to managing Microsoft’s royalties system.

“For a businessperson, a contract is a legally binding document that defines business terms,” Ratnagiri says. “For an engineer, it is a set of parameters codified in the system to function the way the legal document is written. Similarly, a product is an asset that is transacted upon, with rich attributes that can be referenced in a contract.”

From the start, engineers partnered with their peers across the royalties business and engineering landscape, including the accounting, business, and partner engineering teams. They spoke to a wide array of stakeholders to ensure they could assemble the system’s 300 to 400 million data point connections per month in ways that would support everyday usage. The result was a single, connected data output with analytical capabilities, such as aggregation, powered by the team’s different processing and calculation systems.

The second major task was to represent the different data sets in a connected graph exposed through a single API set, enabling team members to navigate from any point of the royalties system to anywhere else. The engineers used Apache Spark for the data modeling pipeline, then modeled the output as a graph of connected entities using Microsoft Azure Cosmos DB. The result was a trustworthy, independently validated source for all canonical data that was ready for access and interpretation.
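To make the idea of navigating "from any point to anywhere else" concrete, here is a minimal in-memory sketch in Python. It is not the team's implementation: the vertex IDs and every edge label except contract_to_product (which appears in the query examples in this article) are hypothetical, and a plain breadth-first search stands in for a graph database traversal.

```python
from collections import defaultdict, deque

# Hypothetical sample data. Only the 'contract_to_product' label comes from
# the article; the other labels and all vertex ids are invented.
EDGES = [
    ("contract:1000010", "contract_to_product", "product:P1"),
    ("contract:1000010", "contract_to_payee", "payee:100010"),
    ("product:P1", "product_to_transaction", "transaction:T1"),
    ("product:P1", "product_to_transaction", "transaction:T2"),
    ("statement:S1", "statement_to_product", "product:P1"),
]


def build_adjacency(edges):
    """Index every edge in both directions so any entity can reach any other."""
    adjacency = defaultdict(list)
    for source, label, target in edges:
        adjacency[source].append((label, target))
        adjacency[target].append((label, source))
    return adjacency


def find_path(adjacency, start, goal):
    """Breadth-first search; returns the shortest list of vertex ids, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _label, neighbor in adjacency[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


ADJACENCY = build_adjacency(EDGES)
```

With this sample data, find_path(ADJACENCY, "statement:S1", "transaction:T2") walks back from a statement through a product to a transaction, which mirrors the kind of statement-to-transaction trace the business needs.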

“The natural connective tissue across all these platforms exists,” Ratnagiri says. “Some are straightforward connections. Others are inferred connections. When we link them up, it opens a plethora of analytics.”

The data wouldn’t be helpful to anyone if it wasn’t available for queries, so the team’s third task was enabling access through an API layer. Business users wanted the system to return expressive, incremental information when they submitted queries, so the engineers included natural language support in the API.

Like any search tool, the API’s query terms needed to seem natural enough to be intuitive to users but sufficiently rich to accomplish the full range of possible queries. So, the engineering team interviewed stakeholders to define the most relevant search activities and build a series of canonical queries. Each of these queries sets the API off on a traversal through the entire Microsoft Azure Cosmos DB graph to locate and assemble the relevant data for the user.

To maximize accessibility, the team built access to the API layer into a Microsoft Teams bot. Together, these pieces make up the end-to-end data solution the team calls RaaS. Despite the system’s complexity, the outcomes are all about simplicity and empowerment.

Additionally, the API layer enforces security and confidentiality perimeters depending on who is using the system and what permissions they have.
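As a rough illustration of a permission check at the API layer, the Python sketch below rejects a request before any query runs if the caller's role does not cover every entity type being asked for. The role names, the permission map, and the authorize helper are all hypothetical; the article does not describe how RaaS actually models permissions.

```python
# Hypothetical permission map: role names and entity labels are illustrative only.
ROLE_PERMISSIONS = {
    "finance_manager": {"payee", "contract", "product", "statement"},
    "engineer": {"product", "contract"},
}


class AccessDenied(Exception):
    """Raised when a caller asks for an entity type outside their permissions."""


def authorize(role, requested_labels):
    """Return the requested labels if the role may read all of them; raise otherwise."""
    allowed = ROLE_PERMISSIONS.get(role, set())  # unknown roles get no access
    denied = sorted(set(requested_labels) - allowed)
    if denied:
        raise AccessDenied(f"role {role!r} may not read: {', '.join(denied)}")
    return list(requested_labels)
```

Checking permissions against the entity labels in a parsed request, rather than against the generated query text, keeps the rule independent of how the query is later phrased.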

Query execution

A user simply navigates to RaaS within Microsoft Teams and submits a natural language query like “payee 100010 drilldown” or “contract <abccdd> assurance.” This query passes through several different stages of execution in the pipeline before results are assembled and shown to the user on the Teams bot UI canvas.

These stages in order are:

  1. Entity resolution:
    The natural language query is parsed to extract entities, sub-entities, and values. This is done using an Azure cognitive service, the Language Understanding Intelligent Service (LUIS). Related entities are extracted as relationships and used in graph traversals. For the query “payee 100010 drilldown,” the entity and entity value extracted are “Payee”:“100010”.
  2. Intent formation:
    Intents are formed from the LUIS layer as well. Along with parsed entities, the user-intended action is added to form the intent object.
  3. Dynamic Gremlin query generation:
    The intent object is passed through a query generation layer. The layer converts an intent object to a Gremlin query that can be executed against an Azure Cosmos DB graph instance. These are examples of dynamic Gremlin queries:

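    Example one: The query below filters payee nodes by payee ID, limits the result to the first 1,000 matches, and returns the selected properties of each match grouped by property key.

    g.V().hasLabel('payee').has('payeeid','100010').range(0,1000).as('ct')
    .select('ct')
    .local(properties('column1','column2','column3','column4').group().by(key()).by(value())).dedup()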

    Example two: The query below applies a contract ID filter on a contract node and traverses from the contract node over to product across connected edges, selecting the products associated with the contract.

    g.V().hasLabel('contract').has('contractid','1000010').as('contract')
    .outE('contract_to_product').inV().as('product')
    .select('product')
    .local(properties('column1', 'column2', 'column3', 'column4',
    'column5').group().by(key()).by(value())).dedup()
  4. Gremlin query execution:
    The final stage in the query layer executes the dynamic Gremlin query and converts the response to a JObject of the relevant entities being selected.
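The first three stages above can be sketched end to end in a few lines of Python. This is a simplified illustration, not the RaaS code: a regular expression stands in for LUIS, the Intent class and both helper functions are hypothetical, and treating "drilldown" as the action that produces the example-one query shape is an assumption. The payeeid and contractid property names come from the examples above.

```python
import re
from dataclasses import dataclass

# Entity label -> id property, as used in the example queries above.
ID_PROPERTY = {"payee": "payeeid", "contract": "contractid"}
KNOWN_ACTIONS = {"drilldown", "assurance"}


@dataclass(frozen=True)
class Intent:
    """Hypothetical intent object: resolved entity, its value, and the action."""
    entity: str
    value: str
    action: str


def resolve_intent(query: str) -> Intent:
    """Toy stand-in for LUIS: accepts only '<entity> <value> <action>'."""
    match = re.fullmatch(r"\s*(\w+)\s+(\w+)\s+(\w+)\s*", query)
    if not match:
        raise ValueError(f"cannot parse query: {query!r}")
    entity, value, action = match.groups()
    entity, action = entity.lower(), action.lower()
    if entity not in ID_PROPERTY or action not in KNOWN_ACTIONS:
        raise ValueError(f"unsupported entity or action in: {query!r}")
    return Intent(entity, value, action)


def to_gremlin(intent: Intent, columns, limit: int = 1000) -> str:
    """Build a Gremlin string shaped like example one (assumed 'drilldown' only)."""
    if intent.action != "drilldown":
        raise NotImplementedError(f"no template for action {intent.action!r}")
    props = ",".join(f"'{column}'" for column in columns)
    return (
        f"g.V().hasLabel('{intent.entity}')"
        f".has('{ID_PROPERTY[intent.entity]}','{intent.value}')"
        f".range(0,{limit}).as('ct')"
        ".select('ct')"
        f".local(properties({props}).group().by(key()).by(value())).dedup()"
    )
```

Calling to_gremlin(resolve_intent("payee 100010 drilldown"), ["column1", "column2"]) yields a query with the same structure as example one. The parser only accepts word characters for the value, which keeps quotes out of the generated string; a production system would need a more careful way to pass user input into queries.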

The intent of sharing what a query looks like is to give customers an example of how they could tackle something similar in related efforts.

“When customers start to look at connecting multiple data sets, it is important to spend an appropriate amount of time early in the project on entity modeling and relationship curation across these entities,” says Abhijit Mandal, a senior software engineer working on the platform. “On the storage side, it is particularly important to pick the right partition key on the Cosmos DB side. This can have a significant impact on the latencies of queries in terms of defining in-edges and out-edges.”
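One way to reason about a partition key before committing to it is to count how many edges would connect vertices in different partitions, since each such hop can cost a cross-partition lookup. The Python sketch below does this over a tiny, invented dataset; the vertex properties, candidate keys, and helper names are hypothetical, and real latency also depends on factors this sketch ignores.

```python
from collections import Counter

# Invented sample vertices; each carries properties that could serve as a partition key.
VERTICES = {
    "c1": {"label": "contract", "contractid": "1000010", "payeeid": "100010"},
    "p1": {"label": "product", "contractid": "1000010", "payeeid": "100010"},
    "p2": {"label": "product", "contractid": "1000010", "payeeid": "100010"},
    "c2": {"label": "contract", "contractid": "1000011", "payeeid": "100010"},
    "p3": {"label": "product", "contractid": "1000011", "payeeid": "100010"},
}
# (source vertex, target vertex) pairs for contract-to-product edges.
EDGES = [("c1", "p1"), ("c1", "p2"), ("c2", "p3")]


def cross_partition_edges(vertices, edges, partition_property):
    """Count edges whose two endpoints fall in different partitions."""
    return sum(
        1
        for source, target in edges
        if vertices[source][partition_property] != vertices[target][partition_property]
    )


def partition_sizes(vertices, partition_property):
    """Count vertices per partition to spot oversized ('hot') partitions."""
    return dict(Counter(v[partition_property] for v in vertices.values()))
```

On this sample, partitioning by label makes all three edges cross partitions, while partitioning by contractid keeps every contract next to its products. Partitioning by payeeid also avoids cross-partition edges but puts all five vertices in one partition, which illustrates the trade-off between locality and even distribution.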

The RaaS system today serves queries with sub-second latency over a graph of 32 million entities connected through 110 million relationships. It’s been a long, important journey to launch RaaS, one that brought together disconnected tools that Microsoft uses to manage the agreements and relationships that define the company’s underlying royalties.

Aggregation and insights powered by connected data

Previously, users pulled data directly from several different sources, assembled it into meaningful formats, and validated the information through several layers of manual cross-checking. This was onerous for the engineering team: they had to understand each request, craft appropriate queries and mechanisms to harvest the data, and collate and aggregate the results so they would be available to the business for further handling.

Sometimes, that process had a multi-day cycle time.

Chris Roozen, a senior project manager on the Royalties team, says the biggest benefit of RaaS is how it gives the team better data insights. (Photo by Chris Roozen)

“With our RaaS system, retrieval and presentation of relevant information is automatic and driven by the end user with no time lost on the engineering and business sides, with the additional advantage of eliminating human error,” Venkatesan says.

That means it’s more difficult for errors to be entered into the system and that access is improved, which boosts accuracy and improves user satisfaction.

“I’ve always thought of RaaS as a data-quality tool,” says Chris Roozen, senior project manager on the Royalties team. “This knowledge is now baked into a system. We’re getting a reliable answer through a unified process because it’s been structured properly.”

The easier it is to get a clear picture of individual data pools, the simpler it is to look at the big picture and gain business-wide insights.

Opening up the connected data landscape

For now, RaaS is a relatively new capability on the Royalties team. As the internal experts on the query tool, engineers are RaaS’ primary frontline users, handling queries for the rest of the royalties business to help validate their data. In the future, they hope to simplify the search process with intelligent and predictive searches so it’s more user-friendly for non-engineers. In that scenario, anyone on the team will be able to submit queries and navigate the aggregated data independently.

Because team members will source their information through RaaS queries, fewer people will need access to the original data sourcing utilities. Limiting access to those tools helps decrease compliance risk within a large organization like Microsoft.

Similarly, as natural turnover occurs on the team, administrators won’t have to juggle access and training for multiple complicated data tools. Instead, RaaS will help all team members spend their time where it’s most valuable: validating data and building business insights.

There’s even the possibility of an outward-facing portal that customers, partners, and vendors can access to benefit from the ease and transparency that RaaS provides for Microsoft. But for now, RaaS is already demonstrating its value by saving time, eliminating error, and providing holistic insights.

Key takeaways

  • Collaborate from the start: Partner with your business, finance, and accounting teams to make sure you’re asking the right questions.
  • Keep business intelligence front-of-mind: Know the questions your team wants to ask and the kind of answers they expect, then build toward that.
  • Know your data and processes: Know the details of your data, including its source, its meaning, and the manual and system processes it powers, so you can fully capture the extent and strength of the connected data.
  • Don’t rush to the API implementation: Make sure you’re seeking out the right information first and spend extra time on data modeling and graph design.
  • Make sure you get your user scenarios right: This is cutting-edge work that’s tough for users to understand, so make sure you’re coaching teams on usage.
  • Make your APIs very expressive: People are good at digital searches, so adapt your natural language processing to reflect everyday search habits.

Related links

Learn how Microsoft’s finance team uses anomaly detection and automation to transform royalty statements processing.

Find out how Microsoft designed a modern data catalog to enable business insights.

Explore how AI and chatbots simplify finance tools at Microsoft.

Learn how Microsoft is turning data into intelligent experiences.

Powering digital transformation at Microsoft with Modern Data Foundations.

Driving Microsoft’s transformation with AI.
