# Microsoft reinvents sales processing and financial reporting with Azure
This content has been archived, and while it was correct at time of publication, it may no longer be accurate or reflect the current situation at Microsoft.
Moving the company's revenue reporting platform to Microsoft Azure is giving Microsoft Digital the opportunity to redesign the platform's infrastructure and functionality. With the major components in Azure, we've already seen how Spark Streaming and Azure Data Factory have made dramatic improvements in the platform's performance and scalability. As our journey to host this solution in Azure continues, we're finding new ways to improve it with Azure capabilities.
Microsoft Digital has moved the company's revenue reporting platform, MS Sales, from on-premises datacenters to Microsoft Azure. This is more than just a move to the cloud: it's an opportunity to reimagine and redesign the way the MS Sales infrastructure functions. To prepare for the migration, we examined several options for hosting MS Sales in Azure and came away with a clear direction for transition design, planning, and deployment.
MS Sales manages Microsoft product and service revenue data. Transaction data is conformed, aggregated, and enriched by MS Sales to provide accurate revenue reporting and analysis. MS Sales provides a consistent view of Microsoft businesses and product revenue, and it enables better, faster strategic decisions. Users can query purchase, sales, budget, and forecast data and drill down into transaction details.
The MS Sales environment spans the full revenue pipeline, from transaction ingestion through processing to reporting and distribution.
MS Sales publishes data that's aligned with the Microsoft financial calendar. The publishing processes include daily, weekly, and, most critical, fiscal month-end (FME) data for restatement, forecasting, and budgeting. Restatement attributes past revenue to current business structures and product lines. The system needed more processing capacity to keep pace with the expanding number of revenue records and details.
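To make restatement concrete, here's a minimal Python sketch of the idea: historical transactions are re-mapped to whatever business taxonomy is current before they're reported. The mapping table and record fields are hypothetical stand-ins, not the actual MS Sales taxonomy.

```python
# Minimal illustration of restatement: historical transactions are
# re-attributed to the *current* business taxonomy before reporting.
# The mapping and record layout here are hypothetical.

# Maps retired product lines to the product lines that own them today.
CURRENT_PRODUCT_LINE = {
    "Legacy Server": "Azure Infrastructure",
    "Office Desktop": "Microsoft 365",
}

def restate(transaction: dict) -> dict:
    """Return a copy of a historical transaction attributed to the
    current product line, leaving already-current records unchanged."""
    restated = dict(transaction)
    line = transaction["product_line"]
    restated["product_line"] = CURRENT_PRODUCT_LINE.get(line, line)
    return restated

history = [
    {"fiscal_year": 2018, "product_line": "Legacy Server", "revenue": 1200.0},
    {"fiscal_year": 2019, "product_line": "Office Desktop", "revenue": 800.0},
]
print([restate(t) for t in history])
```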
We had been experiencing several challenges with the on-premises MS Sales environment: limited scalability and agility, a complex ecosystem, cumbersome data-processing models, and increasing costs. The transition of Microsoft's business model toward services has driven exponential growth in transaction volume, and with the legacy design we would eventually have been unable to process data fast enough to meet our service-level agreements (SLAs). The goal in migrating MS Sales to Azure was to address these challenges and position MS Sales for success well into the future.
The MS Sales system was built 20 years ago to report Microsoft revenue. It's the company's standard revenue reporting system, and it is pivotal to strategic business, management, and financial decisions. Timely, accurate sales data is crucial to assessing Microsoft performance and maintaining a competitive position.
The original MS Sales solution was hosted in on-premises datacenters and was built largely on SQL Server.
The original MS Sales architecture ingested data, processed and transformed large volumes of it, created star-schema data marts, and distributed those marts for querying and consumption by other systems. Querying was primarily via Microsoft Reporting Analytics (MSRA), an Excel add-in that generates and executes SQL queries based on the user's definitions. The architecture supported five major functions: data ingestion, processing and transformation, data mart creation, distribution, and querying.
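As an illustration of the kind of query generation MSRA performs, the sketch below composes a star-schema aggregation from a user's measure, grouping, and filter selections. The fact and dimension names (FactRevenue, DimProduct, DimDate) are invented for the example; the add-in's real query shapes aren't documented here.

```python
# Sketch of how an MSRA-like add-in might translate a user's selections
# into a star-schema query. Table and column names are hypothetical.

def build_query(measures, group_by, filters):
    select = ", ".join(group_by + [f"SUM({m}) AS {m}" for m in measures])
    where = " AND ".join(f"{col} = ?" for col in filters)
    return (
        f"SELECT {select} "
        f"FROM FactRevenue f "
        f"JOIN DimProduct p ON f.ProductKey = p.ProductKey "
        f"JOIN DimDate d ON f.DateKey = d.DateKey "
        f"WHERE {where} "
        f"GROUP BY {', '.join(group_by)}"
    )

sql = build_query(
    measures=["Revenue"],
    group_by=["p.ProductLine", "d.FiscalMonth"],
    filters={"d.FiscalYear": 2020},
)
# Filter values are passed separately as parameters, avoiding SQL injection.
print(sql)
```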
Figure 1 illustrates the MS Sales on-premises infrastructure.
## Redesigning MS Sales in the cloud

Our cloud-based MS Sales solution is built around several key goals, chief among them resolving the scalability, agility, complexity, and cost challenges of the legacy environment.

### MS Sales architecture in Azure

Azure allowed us to rethink data distribution and consumption in MS Sales and redefine what the data flow looks like. We are using many Azure-native and big data solutions for data processing, so we can generate more granular processing and reporting components. This greater level of granularity and native support for data manipulation leads to more parallel processes and quicker data delivery. The data flow components in the cloud span ingestion, pipeline processing, and distribution, described in the sections that follow.

### Incremental delivery

When we began the engineering program, we developed a programmatic approach to accomplish the redesign and migration. We started with a thin slice of data that included several data sources, with the goal of proving end-to-end functionality. However, this approach proved highly resource intensive: it would have required 18 months to implement the entire system and risked missing SLAs as data volumes continued to grow.

We changed direction and decided to focus on the processing engine and distribution for the entire dataset. This approach allowed us to complete the project in 12 months, addressing the highest-priority issues with the old system. To accomplish this, we reduced scope and left the ingestion and warehousing components operating in the legacy on-premises system, as depicted in Figure 2.

To ensure data parity and a completely reliable and manageable service, we operated the new Azure-based process in parallel with the legacy SQL processing in production for eight months.

Figure 2 illustrates the interim hybrid architecture, with ingestion and warehousing remaining on-premises.

#### Current data movement

For our interim hybrid solution, data is batch loaded into Azure from the on-premises components each day. We use Azure Data Factory to lift data from the on-premises systems, and we added event triggers to our on-premises SQL Server 2016 instances that transfer the data into Blob Storage as Parquet files. We separated hot and cold data before implementing this function, so we transfer only current, unprocessed transactions. Despite that streamlining, this approach carries a speed penalty: the transfer takes around two hours.
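The sketch below shows what landing one batch of hot data might look like using the azure-storage-blob and pandas packages: serialize the unprocessed transactions to Parquet and upload them to Blob Storage. In production this movement is orchestrated by Azure Data Factory and SQL event triggers; the container, blob path, and connection string here are placeholders.

```python
# Sketch of the hot-data transfer: export unprocessed transactions to a
# Parquet file and land it in Azure Blob Storage. Names are placeholders.
import io

import pandas as pd  # requires pyarrow for to_parquet
from azure.storage.blob import BlobServiceClient

def upload_transactions(df: pd.DataFrame, conn_str: str) -> None:
    buffer = io.BytesIO()
    df.to_parquet(buffer, index=False)   # serialize to Parquet in memory
    buffer.seek(0)

    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container="mssales-staging",
                                   blob="current/transactions.parquet")
    blob.upload_blob(buffer, overwrite=True)
```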
#### Data ingestion future state

As we progress, we will make Azure Event Hubs the primary method of data ingestion for MS Sales. Event Hubs lets us process transactions from multiple sources and scale to handle the transaction input. As transactions arrive, we will use Data Validation Services to ensure that the data coming into MS Sales corresponds with what we expect from our data providers: incoming data is compared to templates we receive from the providers and is sent on through Event Hubs only if it's valid.
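A minimal sketch of that planned flow, using the azure-eventhub SDK: records that match the provider template are batched and published to an event hub. The hub name, connection string, and template check are placeholders for the real Data Validation Services logic.

```python
# Sketch of the planned ingestion path: validate incoming transactions
# against a provider template, then publish them to Azure Event Hubs.
import json

from azure.eventhub import EventHubProducerClient, EventData

REQUIRED_FIELDS = {"transaction_id", "amount", "currency", "product_id"}

def is_valid(record: dict) -> bool:
    # Stand-in for Data Validation Services: accept only records that
    # carry the fields we expect from the data provider.
    return REQUIRED_FIELDS.issubset(record)

def publish(records, conn_str: str) -> None:
    producer = EventHubProducerClient.from_connection_string(
        conn_str, eventhub_name="mssales-transactions")
    with producer:
        # A production sender would also handle batch-size overflow.
        batch = producer.create_batch()
        for record in filter(is_valid, records):
            batch.add(EventData(json.dumps(record)))
        producer.send_batch(batch)
```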
#### Pipeline processing

We use Apache Spark to drive our big data processing tasks in MS Sales, and most of our performance gains have come from converting our data pipeline to Spark.

##### Apache Spark processing

Processing scheduling is designed around business processes and optimized to reduce processing cost. Apache Spark processes the entire MS Sales dataset each week: five terabytes of historical data plus current and future projections. The restatement process requires reprocessing past revenue to map it to current organizations and product lines. As batch data is transferred from the on-premises warehouse, we initiate eight Spark clusters of 80 nodes each and run 10 serial processing steps, which takes approximately 10 hours of Spark processing time.

Each day, the current month, the previous month, and all future projections are processed, which requires two clusters of 40 nodes and consumes approximately five hours of Spark processing time.
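For a sense of what one of those serial steps looks like, here's a hedged PySpark sketch of a restatement-style step: read a Parquet dataset, re-attribute each transaction to the current product hierarchy with a broadcast join, and write the result back. The storage paths and column names are illustrative, not the production schema.

```python
# Sketch of a single step in the weekly Spark run: re-attribute historical
# transactions to the current product hierarchy. Paths and column names
# are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("mssales-restatement").getOrCreate()

transactions = spark.read.parquet("/staging/history/")
hierarchy = spark.read.parquet("/staging/hierarchy/")

restated = (
    transactions.alias("t")
    # Broadcast the (small) hierarchy map to every executor.
    .join(F.broadcast(hierarchy.alias("h")), "product_id", "left")
    .withColumn("product_line",
                F.coalesce(F.col("h.current_product_line"),
                           F.col("t.product_line")))
)

restated.write.mode("overwrite").parquet("/staging/restated/")
```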
One of the primary success criteria for the new system was absolute parity between the on-premises processing and the Spark processing. This required granular data comparisons at each of the 10 processing steps, as well as additional engineering to create synthetic steps that didn't exist in the on-premises factory, which allowed us to isolate issues to specific areas of the processing phase. Testing also required storage throttling, because the large volume of data transfer exceeded I/O capacity for some services. That was a beneficial lesson, because it is driving the prioritization of delta processing in the new solution.
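A step-level parity check can be as simple as a two-way set difference between the legacy output and the Spark output, as in this illustrative PySpark fragment (paths are placeholders):

```python
# Sketch of a step-level parity check between the legacy output and the
# Spark output: rows present on one side but not the other indicate a
# mismatch at that processing step.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mssales-parity").getOrCreate()

legacy = spark.read.parquet("/staging/legacy/step_04/")
new = spark.read.parquet("/staging/spark/step_04/")

only_in_legacy = legacy.exceptAll(new).count()
only_in_new = new.exceptAll(legacy).count()
assert only_in_legacy == 0 and only_in_new == 0, "step 4 outputs diverge"
```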
##### Processing future state

Delta processing will enable the transition to streaming processing and closer-to-real-time transaction attribution. We estimate that it will cut daily processing by 50 percent by eliminating the reprocessing of static data and the associated I/O issues and costs. It will also introduce variable timing into the system, depending on the level of change and the processing required.
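A sketch of the delta idea, with illustrative paths and columns: find the fiscal months touched by the incoming batch and reprocess only those partitions.

```python
# Sketch of delta processing: identify which fiscal months received new
# or changed transactions and reprocess only those partitions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mssales-delta").getOrCreate()

incoming = spark.read.parquet("/staging/incoming/")
history = spark.read.parquet("/staging/history/")

# Months touched by this batch; everything else is static and is skipped,
# along with its processing and I/O cost.
dirty_months = [row["fiscal_month"]
                for row in incoming.select("fiscal_month").distinct().collect()]

delta = history.where(history["fiscal_month"].isin(dirty_months))
# ...run the normal processing steps against `delta` only...
```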
##### Current state pipeline output

After processing is complete, we use the Spark connector to push each cluster's output to a virtual machine running SQL Server. These databases are aggregated and then replicated to a scaled set of 10 to 20 query servers used for distribution. The set is sized on client load and dynamically scaled as needed, which provides redundancy as well as load balancing to improve performance. Our reporting tool, MSRA, is configured to access the appropriate database instance for each user. The transfer to the distribution environment adds a penalty of four to five hours, but the overall cycle is still an hour faster than the existing on-premises system, and the significantly improved processing speed keeps us aligned with our SLAs.
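One way to express that push is Spark's built-in JDBC writer, sketched below with placeholder server, database, and credentials; the team's actual connector may differ.

```python
# Sketch of pushing processed output to a SQL Server distribution node
# with Spark's built-in JDBC writer (the SQL Server JDBC driver must be
# on the cluster's classpath). Names and credentials are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mssales-distribution").getOrCreate()
output = spark.read.parquet("/staging/restated/")

(output.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://query-vm-01:1433;databaseName=MSSales")
    .option("dbtable", "dbo.RevenueMart")
    .option("user", "loader")
    .option("password", "********")
    .mode("overwrite")
    .save())
```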
##### Future state pipeline output

The strategy for future output is to simplify and refine datasets for common use cases and to provide an interactive, in-memory querying capability. This will give roughly 70 percent of users a streamlined, fast interaction with the system, and it will work alongside the current model to support the large queries the business relies on today.
#### Future business rules

Business rules play a critical role in MS Sales functionality: they define how data is represented in MS Sales. We took the opportunity to evaluate and optimize business-rules management within MS Sales and adopted a set of best practices for authoring and maintaining rules.

Incorporating these best practices into the rules we build with Drools and JBoss makes it easier to check for rule conflicts and negative rule effects and to confirm data relationships within the ruleset taxonomy. Our rules are authored in Drools and published to our business-rules Git repository, which is managed through Azure DevOps until the publishing phase; Spark then processes the changes.
##### Using a declarative business-rules approach

Our business rules are created using a declarative approach, which makes the rules-implementation process more flexible and makes it easier to create or redefine rules. Our natural-language rules are accessible to all of our stakeholders, which makes them easy to review, approve, or change, and it keeps our rules in a standard format that uses business terminology rather than database object names and obscure variables.
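Our production rules are authored in Drools, but the declarative style can be illustrated in a few lines of Python: each rule pairs a readable business condition with an outcome, and a small engine decides how to apply them. The rule names and fields below are invented examples.

```python
# Illustration of the declarative style only; the real rules live in
# Drools. Each rule states a business condition and an outcome, and the
# engine, not the rule author, decides how to apply them.
RULES = [
    {
        "name": "Attribute Germany web-direct sales to EMEA",
        "when": lambda t: t.get("country") == "Germany"
                          and t.get("channel") == "Web Direct",
        "then": {"sales_region": "EMEA"},
    },
    {
        "name": "Classify cloud consumption as services revenue",
        "when": lambda t: t.get("offering") == "Cloud Consumption",
        "then": {"revenue_type": "Services"},
    },
]

def apply_rules(transaction: dict) -> dict:
    enriched = dict(transaction)
    for rule in RULES:
        if rule["when"](enriched):
            enriched.update(rule["then"])
    return enriched

print(apply_rules({"country": "Germany", "channel": "Web Direct",
                   "offering": "Cloud Consumption"}))
```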
### Testing and release

Our testing and release process for MS Sales has three primary phases that ensure all functionality works as expected before full release.
#### Benefits and best practices

Although MS Sales is still migrating, we've already realized several benefits on the new platform. Many Microsoft business groups had adapted their business processes to the schedule and workflow of the original MS Sales version. With the new version, the faster processing time lets them reexamine those processes and redefine them to fit business demands rather than technical limitations. We've also seen faster data delivery, greater scalability through parallel processing, and better control over processing costs.
There is still plenty of innovation happening in the MS Sales environment: the team is working to implement streaming data ingestion, modernize distribution, and decouple business-rule functionality.