an organization-wide effort at Microsoft to move disparate, siloed data to data lakes. These data lakes make data easy to access and use for machine learning, which in turn can be used to build smarter experiences for Microsoft customers, employees, and partners.

While most SAP data is available in the finance data lake, cash flow-specific data, such as exchange rates from the Treasury System and master data from DataMall, is housed in the cash flow data lake, which was built for this purpose. Because of the work done as part of the data lake initiative, most of the major datasets the Foundational Finance Services team needed for this project were already housed in the finance data lake and could be processed within it, making data processing far more efficient.
Arriving at the descriptive analytics platform that became the foundation of this project would have been theoretically possible without centralizing data in data lakes, but it would have been unrealistic: it would have required a small army of financial controllers to stitch together a massive number of spreadsheets. Because of the organization-wide data lake strategy, however, Microsoft had already laid that foundation. Foundational Finance Services connected the finance data lake and the cash flow data lake to create their descriptive analytics platform.
Phase 2: Developing insights to improve cash management through diagnostic analytics

In the second phase of the project, the Foundational Finance Services team focused on diagnostic analytics that would isolate inefficiencies, uncover trends, and provide actionable insights to financial controllers. In collaboration with the stakeholders who would be the primary beneficiaries of cash flow analytics automation, the team determined during the pilot phase that efficiency gains would yield as much as $25 million in immediate cost savings.
The team built a platform that could ingest raw data and model it in a format that Microsoft Power BI and Microsoft Excel could consume. By March 2019, they had automated portions of cash flow statements but could only generate them quarterly. Six months after starting the project, the team was generating daily analytics, and controllers were extracting daily insights.
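As an illustration only, the shaping step might look something like the following PySpark sketch. The paths, table names, columns, and aggregation rules here are assumptions for the sake of the example, not the team's published logic.

```python
# Hypothetical PySpark sketch: shape raw supplier payment data into a
# model-ready table for downstream BI consumption. All names are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cashflow-shaping").getOrCreate()

# Read raw payment records from the data lake (illustrative path).
raw = spark.read.parquet("/lake/raw/supplier_payments")

# Normalize and aggregate into the shape a reporting model expects.
shaped = (
    raw.withColumn("payment_date", F.to_date("payment_timestamp"))
       .groupBy("payment_date", "payment_category", "currency")
       .agg(
           F.sum("amount").alias("total_amount"),
           F.count("*").alias("payment_count"),
       )
)

# Persist for consumption by the reporting layer.
shaped.write.mode("overwrite").parquet("/lake/curated/daily_payments")
```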
Prior to these efforts, non-standard supplier payments (those that did not comply with the terms of the policy) comprised 27 percent of all spend in that space. Once the cash flow analytics platform began standardizing and automatically flagging non-standard payments, those payments comprised only 10 percent of spend. This nearly threefold reduction meant that financial controllers examining those payments could tighten their focus, resulting in deeper analysis.
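A minimal sketch of how such flagging might work, assuming a hypothetical reference table of agreed supplier terms (the actual flagging rules are internal to the team):

```python
# Hypothetical PySpark sketch: flag payments that deviate from agreed
# supplier terms and measure their share of total spend. Names are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("payment-flagging").getOrCreate()

payments = spark.read.parquet("/lake/curated/daily_payments_detail")
terms = spark.read.parquet("/lake/reference/supplier_terms")

flagged = (
    payments.join(terms, on="supplier_id", how="left")
            # A payment is non-standard if its terms differ from the agreement
            # or no agreement is on file.
            .withColumn(
                "non_standard",
                (F.col("actual_payment_terms") != F.col("agreed_payment_terms"))
                | F.col("agreed_payment_terms").isNull(),
            )
)

# Share of total spend flagged as non-standard.
summary = flagged.agg(
    (F.sum(F.when(F.col("non_standard"), F.col("amount")))
     / F.sum("amount")).alias("non_standard_share")
)
summary.show()
```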
Both the descriptive analytics platform (Phase 1) and the diagnostic analytics platform (Phase 2) are in production.

Figure 2. Standardizing supplier payments through the cash flow platform reduced non-standard payments from 27 percent of all payments to 10 percent.

Phase 3: Measuring and improving cash flow forecast accuracy through predictive analytics

When the team was satisfied with the volume and quality of the data being ingested, and the feedback from financial controllers was sufficiently promising, they turned their attention to Phase 3: predictive analytics. This phase consists of six distinct processes.
Data ingestion: Using Microsoft Azure Databricks job scheduling and parallel processing, all structured and semi-structured data is pushed to the cash flow data lake.

Data curation: A quality threshold is applied and de-duplication checks are performed. If incoming data doesn't meet the stated quality threshold, it isn't merged. (A minimal sketch of this step appears below.)

Data processing: All business logic, filters, joins, and aggregations are performed using PySpark in Microsoft Azure Databricks. The final dataset is output as a .tsv file for consumption by Azure Analysis Services (AAS).

Data modeling: Based on the facts and dimensions created, the team built a star schema model in AAS. The tabular model allows for efficient slicing and dicing of data, and the in-memory caching layer yields high performance and a satisfying end-user experience in Microsoft Power BI and Microsoft Excel.

Machine learning: Using Microsoft Azure AutoML, Python, and Microsoft Azure Machine Learning, the platform derives insights from the data provided to generate a quarterly forecast of supplier spend (broken down by payment category) and to predict invoice clearing dates.

Quality control and reporting: LiveSite dashboards provide robust telemetry of the end-to-end data flow, from provider to reporting.

By summer 2019, the team had built an end-to-end view of cash flow. End users could view this cash flow data as a dashboard in Microsoft Power BI and drill into 13 payment categories. To date, the cash flow forecasts are not yet in production, though they are production-ready.

Figure 3. Cash flow forecast accuracy has hovered at +/- two percent in early testing.

Benefits

Though the project will eventually expand to encompass a broader portion of cash flow services at Microsoft, the Foundational Finance Services team focused their initial efforts on cash outflows such as payments to suppliers. Each Microsoft supplier enters into a mutually agreed-upon set of terms when doing business with Microsoft. Those terms are stored in SAP, Microsoft's enterprise resource planning system of record, and include payment terms such as the frequency of payments made to the supplier. However, because cash flow analytics were generated quarterly, and because generating them was such a laborious process, the details of the supplier terms were rarely factored in when making payments.
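To make the data curation step described earlier concrete, here is a minimal, hypothetical sketch. The threshold value, business keys, and table paths are assumptions for illustration, not the team's actual implementation:

```python
# Hypothetical PySpark sketch of the curation step: de-duplicate incoming
# records and enforce a quality threshold before merging. Names and the
# threshold value are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cashflow-curation").getOrCreate()

QUALITY_THRESHOLD = 0.98  # assumed: minimum share of complete records

incoming = spark.read.parquet("/lake/staging/incoming_batch")

# Drop exact duplicates on an assumed business key.
deduped = incoming.dropDuplicates(["invoice_id", "supplier_id"])

# Measure completeness of required fields as a simple quality score.
total = deduped.count()
complete = deduped.dropna(subset=["invoice_id", "supplier_id", "amount"]).count()
quality_score = complete / total if total else 0.0

if quality_score >= QUALITY_THRESHOLD:
    # Merge (here: append) only when the batch meets the quality bar.
    deduped.write.mode("append").parquet("/lake/curated/payments")
else:
    raise ValueError(
        f"Batch rejected: quality score {quality_score:.3f} "
        f"is below threshold {QUALITY_THRESHOLD}"
    )
```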
By automating the cash flow analytics and making fresh data available daily rather than quarterly, supplier terms can be baked into the payments process. The data available in SAP becomes one of many signals used to schedule payments in a more structured and efficient way. That gives finance professionals more control over day-to-day operations, which in turn gives them a greater role in advancing team and departmental goals. Previously, acquiring supplier payment data required manually gathering it from as many as 23 individuals; automation eliminates that inefficiency.
In total, the project resulted in $25 million worth of efficiency gains. The team has also identified many other opportunities: early research suggests that policy and process updates will save Microsoft an estimated $130 million per year.
More importantly, financial professionals have gained daily, actionable insights they can use to directly impact business outcomes. The insights provided are still being monitored for accuracy and remain in beta, but early results are promising, with an accuracy range of +/- two percent.
Best practices

Though the project hit relatively few bumps in the road, it did face challenges typical of projects that seek to marry raw data with its business context, as well as obstacles common to any machine learning project. Keeping these challenges in mind at the outset, and planning accordingly, can mitigate their impact.
Devote sufficient resources to understanding the data's business context. The engineers working with the data, especially in the early phases of a project, are not its original owners and thus have little familiarity with its context. Lack of context, however, can have a dramatic impact on the accuracy of insights and machine learning models. The Microsoft Digital and Engineering Fundamental teams made several datasets available to the Foundational Finance Services team. It makes sense to involve data owners early in the project, and to make the project a true collaboration with non-technical stakeholders.

Within this project, and more broadly at Microsoft, the team encouraged the original owners of published data (data housed in widely accessible data lakes) to provide more context, including source data where applicable, before publishing.
For the Foundational Finance Services team, this project has also meant working closely with SAP architects and line-of-business users. Those partnerships helped engineers gain familiarity not only with the dataset being examined, but also with the ecosystem in which the data exists and how the data relates, even tangentially, to other datasets in that ecosystem.
Supplier payments are a single dataset, for example, but that single dataset has 13 owners. Verifying the accuracy of that data with all 13 owners has proven critical to accurate insights and forecasting.
Develop a code testing plan. Early in the process, Foundational Finance Services used Microsoft Azure Databricks Notebooks to build out data processing pipelines with business logic, rather than using them exclusively to perform data analysis. That made it possible to deliver business value quickly with continuous integration/continuous delivery, but it made testing challenging. Eventually, the team developed a Python application to test code, migrating to a hybrid of Microsoft Azure Databricks Notebooks and the Python application. Once the team moved the code into the Python application, they were able to execute unit and regression tests.

Though it took some time to build a Python application late in the project, writing everything in Microsoft Azure Databricks Notebooks made sense in the early stages, since it gave the team the freedom to deploy rapidly and often. Once they were satisfied with the business value they'd delivered, devoting resources to building the Python application was an easier sell.
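As a sketch of what such testing can look like once business logic moves out of notebook cells and into importable Python functions, consider the following. The function and its rule are hypothetical, not the team's actual logic:

```python
# Hypothetical example: once pipeline business logic lives in an importable
# Python function rather than a notebook cell, it can be unit tested.
# The rule below (flagging payments above an agreed cap) is illustrative.
def flag_non_standard(amount: float, agreed_cap: float) -> bool:
    """Return True if a payment exceeds the supplier's agreed cap."""
    return amount > agreed_cap


# pytest-style tests can then run in CI on every commit.
def test_payment_within_cap_is_standard():
    assert flag_non_standard(amount=900.0, agreed_cap=1000.0) is False


def test_payment_over_cap_is_flagged():
    assert flag_non_standard(amount=1500.0, agreed_cap=1000.0) is True
```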
Follow the Agile methodology. Had Foundational Finance Services not followed the Agile methodology, the platform would have taken considerably longer to build. For example, rather than waiting for a complete analysis to determine which supplier payment reports were needed, the team tracked the necessary data to its source and connected to it directly. With direct access to the data, they could analyze it and then work with the data's owners to verify its accuracy. If they found a discrepancy, they could work with data publishers to make the necessary corrections before investing time in building reports for end users.

Prioritize data sources using cash flow analytics. While tracking down data sources early in the project, the engineering team studied existing cash flow analytics. They ranked data sources in descending order of transaction volume in those analytics, allowing them to easily prioritize the early work of seeking out data sources.

What's next

The insights uncovered in relation to this project, totaling $130 million in efficiency gains, are being translated into policy and process updates. The Foundational Finance Services team is currently working with line-of-business partners to operationalize those changes.
In the meantime, the forecasting model is being monitored for accuracy and continually refined. While insights uncovered through diagnostic analytics have provided the largest returns to date, the team expects predictive forecasts to be equally impactful. Daily projections are being compared to daily cash flow actuals to gauge the accuracy of the forecasts. In the most recent observations, the margin of error has fallen to approximately one percent when projecting across all payment categories. When filtering for a particular payment category, however, the margin of error increases because the model has not yet been trained to account for seasonality. That seasonality work is ongoing.
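A minimal sketch of how daily projections might be compared against actuals, using invented sample data (the team's actual monitoring runs through LiveSite dashboards):

```python
# Hypothetical sketch: compare daily cash flow projections against actuals
# and report the mean absolute percentage error. The sample data is invented.
def mean_abs_pct_error(projected: list[float], actual: list[float]) -> float:
    """Average of |projected - actual| / actual across all days."""
    errors = [abs(p - a) / a for p, a in zip(projected, actual) if a != 0]
    return sum(errors) / len(errors)


# Example: a week of invented daily totals (in millions of dollars).
projected = [10.2, 9.8, 11.1, 10.5, 9.9, 10.0, 10.7]
actual = [10.0, 10.0, 11.0, 10.4, 10.1, 9.9, 10.8]

mape = mean_abs_pct_error(projected, actual)
print(f"Margin of error: {mape:.1%}")  # roughly 1% across all categories
```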
Conclusion

The wider data lake strategy at Microsoft yields immediate benefits and lays the groundwork for future innovation. The cash flow analytics platform exemplifies the approach and the possibilities of that strategy. Making financial data widely available and accessible enabled the Foundational Finance Services team to create a dynamic, multi-dimensional, daily snapshot of cash flow, accessible via interactive dashboards. The result is that financial controllers at Microsoft have unprecedented control over the financial health of the teams and departments they oversee, and have become strategic partners in the quest for operational efficiency.