{"id":8667,"date":"2023-05-03T15:05:47","date_gmt":"2023-05-03T22:05:47","guid":{"rendered":"https:\/\/www.microsoft.com\/insidetrack\/blog\/?p=8667"},"modified":"2024-01-08T14:17:03","modified_gmt":"2024-01-08T22:17:03","slug":"driving-effective-data-governance-for-improved-quality-and-analytics","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/insidetrack\/blog\/driving-effective-data-governance-for-improved-quality-and-analytics\/","title":{"rendered":"Creating a modern data governance strategy to accelerate digital transformation at Microsoft"},"content":{"rendered":"
\n
\n
<\/div>\n

This content has been archived, and while it was correct at time of publication, it may no longer be accurate or reflect the current situation at Microsoft.<\/p>\n<\/div>\n<\/div>\n

Data is the new currency of digital transformation. Whether it\u2019s providing new insights, improving decision making, or driving better business outcomes, enthusiasm for unlocking the power of data has never been greater. Internally at Microsoft, our data governance practices are essential in helping ensure that our data is optimized for any use, enabling deeper insights across our organizational and functional boundaries.<\/p>\n

In the simplest terms, data governance is about managing data as a strategic asset. It involves ensuring that there are controls in place around data, its content, structure, use, and safety. To provide effective data governance, we need to know what data exists, whether the data is of good quality, whether the data is usable, who\u2019s accessing it, who\u2019s using it, what they are using it for, and whether the use cases are secure, compliant, and governed.<\/p>\n

As modern business embraces advanced analytics, artificial intelligence, and machine learning, the amount, velocity, and variety of data are increasing. With all that data comes a wealth of new possibilities and a new set of challenges. Our ability to optimize the management and governance of ever-greater amounts of data is essential.<\/p>\n

Different data types require different controls to ensure that systems handle, store, and use the data correctly. The traditional top-down method Microsoft Digital Employee Experience (MDEE) was using for data governance wasn\u2019t scalable. It left us little time to do more than reactively address data issues as they occurred. We needed a scalable approach that could use automated controls, engineered into the process, to address the root causes of data issues during every stage of the data lifecycle.<\/p>\n

Our approach to data governance<\/h2>\n

Rather than viewing data governance as a blocking function, or a gatekeeper in the enterprise, MDEE saw data governance modernization as a way to democratize data responsibly. Widely accessible, trusted, and connected enterprise data makes intelligent experiences possible, and powers the wider digital transformation at Microsoft.<\/p>\n

We are transforming how we provide data governance by introducing scalable, automated controls for data architecture, lifecycle health, and appropriate use of data. As illustrated below, modern data governance is the foundational pillar upon which Microsoft has built its overall Enterprise Data Strategy.<\/p>\n

Data governance is the foundational pillar of the Microsoft Enterprise Data Strategy.<\/figcaption><\/figure>\n

We created our overall Enterprise Data Strategy in response to an increasing demand for the right intelligence to power experiences at every touchpoint inside and outside Microsoft. At the same time, the increased demand amplified the pressure to better govern the data and manage regulatory requirements across an ever-expanding data landscape. Trying to address data issues as they arose, one at a time, was expensive and inefficient. Without a centralized, scalable, and automated way to address the root causes of these data issues, our analytics capabilities would continue to decline, as would our user satisfaction rating for Microsoft\u2019s data-centric apps.<\/p>\n

We developed a more modern data governance strategy with five goals in mind:<\/p>\n

    \n
  1. Reduce data duplication and sprawl by building a single Enterprise Data Lake (EDL) for high-quality, secure, and trusted data.<\/li>\n
  2. Connect data from disparate silos in a way that creates opportunities to use that data in ways not possible in a siloed approach.<\/li>\n
  3. Power responsible data democratization across Microsoft.<\/li>\n
  4. Drive efficiency gains in the processes Microsoft employs to gather, manage, access, and use data.<\/li>\n
  5. Meet or exceed compliance and regulatory requirements without compromising Microsoft\u2019s ability to create exceptional products.<\/li>\n<\/ol>\n

    Our approach to modern data governance has two key components. First, we define clear data standards and build them into our application development process. This move helps us automate and proactively manage data governance issues and data policy compliance. Second, we use the EDL platform<\/a> to centralize and systematically scan and monitor the data.<\/p>\n

    The two-pronged approach that MDEE uses to modernize data governance.<\/figcaption><\/figure>\n

    Creating a clear set of data standards built into the engineering process<\/h3>\n

    Much of our early effort focused on creating the formalized data standards that we wanted to build into the engineering process. It was natural for us to look to our core strength, engineering, when addressing business problems. We then drive every formalized data standard into our modern engineering process. Having clear data standards and providing compliance measurements against those standards is key to our change management approach for data governance.<\/p>\n

    Microsoft Azure DevOps helps auto-generate and manage the data governance backlog<\/h4>\n

    After authoring data standards, we used Microsoft Azure DevOps (ADO)\/Microsoft Visual Studio to automate the ways our systems generate, assign, and track data governance work. For example, when an engineering project reaches a certain milestone, we have the application owner complete a data governance assessment. That assessment results in automatically generated work items in the project\u2019s backlog.<\/p>\n
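    The flow from assessment to backlog can be pictured with a minimal Python sketch. It does not call Azure DevOps; the standard names, the remediation titles, and the generate_backlog helper are hypothetical and only illustrate turning unmet standards into work items.

```python
from dataclasses import dataclass

# Hypothetical standards and remediation titles, for illustration only.
REMEDIATIONS = {
    'retention_policy_defined': 'Define a data retention policy',
    'data_classified': 'Classify data by sensitivity',
    'owner_assigned': 'Assign an accountable data owner',
}


@dataclass(frozen=True)
class WorkItem:
    project: str
    standard: str
    title: str


def generate_backlog(project, assessment):
    '''Return one work item per standard the assessment marks as unmet.'''
    items = []
    for standard, met in assessment.items():
        if not met:
            # Fall back to a generic title for standards without a mapping.
            title = REMEDIATIONS.get(standard, 'Review standard: ' + standard)
            items.append(WorkItem(project, standard, title))
    return items


# Assumed answers from an application owner at a project milestone.
assessment = {
    'retention_policy_defined': False,
    'data_classified': True,
    'owner_assigned': False,
}
backlog = generate_backlog('sample-app', assessment)
for item in backlog:
    print(item.project, '|', item.title)
```

    In a real pipeline, the returned items would be handed to whatever work-tracking system the team uses instead of being printed.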

    Measuring our compliance against the data standards<\/h4>\n

    To measure the progress of our data governance efforts, we are defining the\u00a0metrics that matter<\/a>\u00a0to create Microsoft Power BI-based scorecards that explicitly show data standards alignment. For each standard, the central data governance office will actively monitor assessment exceptions, so that application owners can complete their required data governance work.<\/p>\n
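    As a rough illustration of the data behind such a scorecard, the Python sketch below computes a compliance percentage and an exception list for each standard. The application names, standard names, and the scorecard function are assumptions made for this example, not the actual Power BI model.

```python
from collections import defaultdict

# Hypothetical assessment results: (application, standard, compliant).
results = [
    ('app-a', 'data_classified', True),
    ('app-b', 'data_classified', False),
    ('app-a', 'owner_assigned', True),
    ('app-b', 'owner_assigned', True),
]


def scorecard(rows):
    '''Return {standard: (percent_compliant, [non-compliant apps])}.'''
    totals = defaultdict(int)
    passed = defaultdict(int)
    exceptions = defaultdict(list)
    for app, standard, compliant in rows:
        totals[standard] += 1
        if compliant:
            passed[standard] += 1
        else:
            exceptions[standard].append(app)
    return {
        s: (100.0 * passed[s] / totals[s], exceptions[s])
        for s in totals
    }


card = scorecard(results)
for standard, (pct, apps) in card.items():
    print(standard, pct, apps)
```

    The exception list per standard is the part a central governance office would follow up on with application owners.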

    Centralizing data in the Enterprise Data Lake<\/h3>\n

    As part of Microsoft\u2019s Enterprise Data Strategy, we have been making key investments in the\u00a0modern data foundations<\/a>\u00a0that enable modern data governance to ensure the responsible democratization of data. Centralizing data assets is key to reducing the number of redundant and outdated copies, understanding who has access, and understanding how they are using the assets. Data governance optimizes our infrastructure resources and uses services and automation to proactively scan data for potential issues, rather than reacting to issues as they occur.<\/p>\n

    We have begun moving data from disparate sources across Microsoft into our Enterprise Data Lake (EDL). The EDL is built on Azure Data Lake Storage and leverages Azure Data Services. The EDL not only consolidates the data,\u00a0<\/strong>it also creates a centralized source of truth where enterprise data can be collected, shaped into trusted forms, secured, made accessible, and managed by applicable governance controls. Moving everything to a single EDL enables scalable, systematic data scanning without having to individually scan thousands of databases across the enterprise.<\/p>\n

    Scalable and automated engineering solutions help proactively manage data governance<\/h4>\n

    Microsoft integrates automated and scalable services into the EDL. These services help proactively automate data management, data quality management, data security, data access management, and compliance. This integration means various teams that are onboarding to the EDL don\u2019t have to invest in engineering solutions to benefit from the built-in services and automation\u2014they are applied consistently across all data.<\/p>\n

    Scanning for data issues in the Enterprise Data Lake<\/h4>\n

    Regular scanning in the EDL finds data issues so they can be fixed and then prevented at the systems of record and systems of engagement. We are building out proactive solutions by engineering checks and guardrails directly into our processes. These moves help prevent data governance issues by design. The EDL\u2019s capabilities and services include built-in scanning for data security, access management, compliance, and a host of other defined data controls. Not only does the data foundations team get notifications of compliance violations, the data publishers receive them as well.<\/p>\n
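    The scanning and notification pattern described above can be sketched in a few lines of Python. This is not the EDL scanner; the control names, the metadata fields, and the recipients helper are hypothetical and only show the idea of running defined controls over dataset metadata and notifying both the data foundations team and the publisher.

```python
# Hypothetical controls: each maps a name to a check over dataset metadata.
CONTROLS = {
    'has_owner': lambda d: bool(d.get('owner')),
    'is_classified': lambda d: d.get('classification') is not None,
    'access_reviewed': lambda d: d.get('access_reviewed', False),
}


def scan(datasets):
    '''Return a list of (dataset name, failed control) violations.'''
    violations = []
    for d in datasets:
        for name, check in CONTROLS.items():
            if not check(d):
                violations.append((d['name'], name))
    return violations


def recipients(violation, datasets_by_name, foundations_team):
    '''Both the data foundations team and the publisher are notified.'''
    publisher = datasets_by_name[violation[0]]['publisher']
    return [foundations_team, publisher]


# Assumed metadata records for two published datasets.
datasets = [
    {'name': 'sales', 'owner': 'alice', 'classification': 'confidential',
     'access_reviewed': True, 'publisher': 'sales-team'},
    {'name': 'logs', 'owner': '', 'classification': None,
     'access_reviewed': True, 'publisher': 'ops-team'},
]
found = scan(datasets)
by_name = {d['name']: d for d in datasets}
for v in found:
    print(v, recipients(v, by_name, 'data-foundations'))
```

    Because every dataset passes through the same set of controls, adding a new control to the dictionary applies it to all data at once, which mirrors the benefit of scanning one central lake rather than many separate databases.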

    The Enterprise Data Catalog improves discoverability<\/h4>\n

    To provide effective data governance we need a full view of all data assets. We need to know where the assets exist, who is accessing them, and how users are interacting with the data. This visibility is needed for managing fragmentation, sprawl, and redundant or outdated copies of data assets that can exist across multiple platforms.<\/p>\n

    The\u00a0Enterprise Data Catalog<\/a>\u00a0helps drive data governance. It does so by building controls into the catalog\u2019s data-discovery process. These controls ensure that only people with the appropriate need and authority can access sensitive data stored in the EDL. This promotes compliance with government regulations through processes, patterns, and tools for data management and governance of data assets. The EDL metadata service sends metadata published to the EDL to the catalog for discovery. The service also registers broader data sources\u2014transactional data systems, retention policies, and master data, for example\u2014in the catalog.<\/p>\n

    Modern governance with assessment-based models and evidence-based results<\/h3>\n

    At Microsoft, we find evidence-based flagging is the most compelling way to motivate data producers and data owners to address the underlying gaps that cause data issues. Thus, \u201cevidence at scale\u201d is the fundamental reason we\u2019ve modernized our data governance program around the two-pronged approach of embedded data standards coupled with a scannable EDL platform. Using this new approach, we can detect data issues before they spread, and we can engage multiple organizations at once to drive data compliance. We\u2019re able to use scanners to show engineers where data compliance gaps exist before data products get published into production. And most importantly, we can sustain this model because it\u2019s simply part of the everyday rhythm of the business.<\/p>\n

    Things to consider when planning your own data governance strategy<\/h2>\n

    Though it\u2019s early in our journey toward modern data governance, we do have a few best practices to share. Primarily, we recommend that you address your data governance strategy holistically.\u00a0<\/strong>As illustrated below, we designed our approach so that standards embedded into the engineering process and data centralization on the modern data foundation work together to ensure end-to-end modern data governance.<\/p>\n