{"id":5085,"date":"2020-01-30T11:23:59","date_gmt":"2020-01-30T19:23:59","guid":{"rendered":"https:\/\/www.microsoft.com\/insidetrack\/blog\/?p=5085"},"modified":"2023-06-20T15:18:52","modified_gmt":"2023-06-20T22:18:52","slug":"how-microsoft-connects-high-quality-discoverable-data","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/","title":{"rendered":"How Microsoft connects high-quality, discoverable data"},"content":{"rendered":"
\n
\n
<\/div>\n

This content has been archived, and while it was correct at time of publication, it may no longer be accurate or reflect the current situation at Microsoft.<\/p>\n<\/div>\n<\/div>\n

At Microsoft, employees were already aware of the power of using data to create experiences that people love.<\/p>\n

But that awareness wasn\u2019t enough to bring the data to life\u2014they needed better tools.<\/p>\n

That\u2019s why Michael Lucas set out to transform the company\u2019s internal data catalog. He wanted to make it easier for employees to find data for their work, as well as upload assets for others to use. Lucas also wanted to provide tools for assessing data quality, which is a measurement of the data\u2019s accuracy, consistency, and reliability.<\/p>\n
Michael Lucas is leading Microsoft\u2019s effort to redesign the company\u2019s internal data catalog. He is a principal program manager for the Data Team in Microsoft Digital.<\/figcaption><\/figure>\n

\u201cA modern data catalog is a catalyst for intelligent experiences and insights,\u201d says Lucas, a principal program manager for the Data Team in Microsoft Digital. \u201cYou can\u2019t do that if you don\u2019t have a foundation of high-quality, discoverable data.\u201d<\/p>\n

The company\u2019s retooled modern data catalog has been a boon for employees like Marcela Alvarez Rodriguez, a software engineer on the Microsoft Cloud Business Intelligence and Analytics Team. She spends most of her time developing data analytics platforms for enterprise infrastructure scenarios. She\u2019s also responsible for uploading her team\u2019s SQL and data lake assets to the company\u2019s internal data catalog, but this is only a small part of her work.<\/p>\n

\u201cHaving a data catalog with a streamlined asset uploading process frees up my time to focus on developing data-driven solutions that increase engineering efficiency,\u201d Rodriguez says. \u201cIt also helps our\u00a0customers know which data we own so they can develop their own solutions.\u201d<\/p>\n

When Lucas started to redesign the company\u2019s existing internal catalog, he discovered that it wasn\u2019t designed to help people find data. Instead, it was optimized for a team of software developers who used it to maintain data warehouses.<\/p>\n

\u201cIf we wanted to create a catalog that worked for developers and data analysts, we knew we had to start from scratch,\u201d Lucas says.<\/p>\n

Lucas knew that a successful catalog would answer questions for all end users, from developers to data analysts. To do this, he and his Modern Catalog Team decided to develop a modern data catalog that could be used for end-to-end scenarios.<\/p>\n

[Learn how Microsoft developed its modern data catalog<\/a>.]<\/em><\/p>\n

Focusing on the user<\/strong><\/p>\n

To identify the key requirements for the data catalog, Lucas interviewed users across the company. Their jobs ranged from traditional data roles such as developers, data scientists, data analysts, and data stewards to common business roles like program managers and business managers. These interviews enabled Lucas\u2019s team to understand pain points, create prototypes, and conduct usability tests with data publishers and catalog users. The goal was a truly usable data catalog that would accelerate data discovery for business insights.<\/p>\n

The Modern Catalog Team quickly found that data consumers spent most of their time tracking down the appropriate group owner of the data they needed, requesting and receiving authorization from the data owner, and then cleaning the data before being able to use it.<\/p>\n

\u201cIt was time-consuming for users of our data and challenging for us because we had to answer so many different emails,\u201d Rodriguez says.<\/p>\n

It was also difficult for users like Kathy Brustad, a senior data and applied scientist in Worldwide Learning, to assess data quality.<\/p>\n

\u201cIt was hard to identify the source of truth for certain data assets,\u201d Brustad says. \u201cIt was also hard to know how data had been transformed when it flowed from one source to another and finally ended up being used.\u201d<\/p>\n

After gathering pain points, Lucas\u2019s team worked with the Data Analytics Working Group, which is composed of key principal-level representatives of Microsoft Digital who help shape data policies, to create a prioritized list of high-level requirements for the catalog. This led to the design of a modern data catalog that enables employees to intuitively browse for available assets and share their team\u2019s data using a single site.<\/p>\n

Brustad wanted to be able to identify the source of truth for data assets. Using the redesigned data catalog, she can assess data quality and its transformation over time by referencing the quality score, sample list of data, and lineage showing how the data connects to other datasets.<\/p>\n

\u201cThere are many different places you can find this data, because it\u2019s still getting replicated throughout the company,\u201d she says. \u201cKnowing that there are quality standards in Microsoft\u2019s data catalog gives me a higher level of confidence.\u201d<\/p>\n

Prioritizing governance and user feedback<\/strong><\/p>\n

The catalog also integrates governance into the data registration process. If data publishers already have their assets in Azure Data Lake and follow best practices for governance, their assets can automatically be scanned into the catalog.<\/p>\n

\u201cWe\u2019re able to turn around our analytics a lot faster because we can establish an automatic connection to the source of the data,\u201d Brustad says of the catalog, which she uses to find data to measure the impact of seller training programs and understand changes in seller behavior. \u201cIt gives us a 20 to 25 percent gain on the turnaround time.\u201d<\/p>\n

The catalog\u2019s connection to Azure Data Lake facilitates the asset upload process for Rodriguez, because her team\u2019s assets are already in Azure SQL Server and Azure Data Lake. The data catalog also improves the experience for consumers of her team\u2019s data.<\/p>\n

\u201cIt was appealing to have a centralized data catalog that helps customers know what data we have,\u201d Rodriguez says. \u201cNow, we can invite them to go to the data catalog and check out our team\u2019s assets.\u201d The catalog offers visibility not just into the technical information of the assets, but also into key governance metadata, such as compliance adherence and data quality measurements.<\/p>\n

The Modern Catalog Team is committed to continuously learning from employees by collecting feedback through telemetry, email, and a feedback button in the modern catalog. This came in handy when Rodriguez couldn\u2019t add her team\u2019s distribution list as an asset owner. She reached out to the team via email, and they provided her an immediate workaround and added the feature to their backlog. Rodriguez has proposed additional features for future iterations of the data catalog, such as a guide for naming and tagging assets and supporting data quality.<\/p>\n

The team also collects telemetry data to identify errors in data access and pairs those findings with user interviews to understand what users intended to do. These ongoing conversations inform future iterations of the catalog.<\/p>\n

\u201cIt\u2019s an intuitive platform, and the team is always available for feedback,\u201d Rodriguez says. \u201cThis makes the process easier.\u201d<\/p>\n

Whether it\u2019s used to share data or understand behavior, the modern data catalog is an invaluable tool for employees.<\/p>\n

\u201cAs someone who works in the data science field, I\u2019m comfortable with going to the data catalog to procure data because I know that the data has been vetted,\u201d Brustad says.<\/p>\n

Learn how Microsoft developed its modern data catalog<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"

This content has been archived, and while it was correct at time of publication, it may no longer be accurate or reflect the current situation at Microsoft. At Microsoft, employees were already aware of the power of using data to create experiences that people love. But that awareness wasn\u2019t enough to bring the data to […]<\/p>\n","protected":false},"author":146,"featured_media":5089,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"_hide_featured_on_single":false,"_show_featured_caption_on_single":true,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[1],"tags":[],"coauthors":[674],"class_list":["post-5085","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","m-blog-post"],"jetpack_publicize_connections":[],"yoast_head":"\nHow Microsoft connects high-quality, discoverable data - Inside Track Blog<\/title>\n<meta name=\"description\" content=\"To connect data across business areas and processes, Microsoft Digital created a modern data catalog where employees can share and discover data assets.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How Microsoft 
connects high-quality, discoverable data - Inside Track Blog\" \/>\n<meta property=\"og:description\" content=\"To connect data across business areas and processes, Microsoft Digital created a modern data catalog where employees can share and discover data assets.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/\" \/>\n<meta property=\"og:site_name\" content=\"Inside Track Blog\" \/>\n<meta property=\"article:published_time\" content=\"2020-01-30T19:23:59+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-06-20T22:18:52+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2100\" \/>\n\t<meta property=\"og:image:height\" content=\"1181\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Inside Track \u2013 retired stories\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Inside Track \u2013 retired stories\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/\",\"url\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/\",\"name\":\"How Microsoft connects high-quality, discoverable data - Inside Track Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg\",\"datePublished\":\"2020-01-30T19:23:59+00:00\",\"dateModified\":\"2023-06-20T22:18:52+00:00\",\"author\":{\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/#\/schema\/person\/59e5f7b07dae629412c990cc1a63b575\"},\"description\":\"To connect data across business areas and processes, Microsoft Digital created a modern data catalog where employees can share and discover data 
assets.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#primaryimage\",\"url\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg\",\"contentUrl\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg\",\"width\":2100,\"height\":1181,\"caption\":\"Vijay Panjeti and Kathy Brustad use data science to increase people\u2019s interest in Microsoft technology. Both are senior data and applied scientists in Worldwide Learning. (Photo by Aleenah Ansari | Inside Track)\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How Microsoft connects high-quality, discoverable data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/#website\",\"url\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/\",\"name\":\"Inside Track Blog\",\"description\":\"How Microsoft does 
IT\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/#\/schema\/person\/59e5f7b07dae629412c990cc1a63b575\",\"name\":\"Inside Track \u2013 retired stories\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/#\/schema\/person\/image\/ee0de87c339052d5d84852473bd7f213\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/24a8c329ab32afd1bc23fd1658d1acc2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/24a8c329ab32afd1bc23fd1658d1acc2?s=96&d=mm&r=g\",\"caption\":\"Inside Track \u2013 retired stories\"},\"description\":\"The content on this page was crafted to highlight a specific moment in time or the solutions that have led us to where we are today. It offers valuable insights into our journey and the progress made over the years. Check out the Inside Track blog page for our up-to-date stories around Microsoft.\",\"url\":\"https:\/\/www.microsoft.com\/insidetrack\/blog\/author\/insidetrackarchive\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"How Microsoft connects high-quality, discoverable data - Inside Track Blog","description":"To connect data across business areas and processes, Microsoft Digital created a modern data catalog where employees can share and discover data assets.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/","og_locale":"en_US","og_type":"article","og_title":"How Microsoft connects high-quality, discoverable data - Inside Track Blog","og_description":"To connect data across business areas and processes, Microsoft Digital created a modern data catalog where employees can share and discover data assets.","og_url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/","og_site_name":"Inside Track Blog","article_published_time":"2020-01-30T19:23:59+00:00","article_modified_time":"2023-06-20T22:18:52+00:00","og_image":[{"width":2100,"height":1181,"url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg","type":"image\/jpeg"}],"author":"Inside Track \u2013 retired stories","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Inside Track \u2013 retired stories","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/","url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/","name":"How Microsoft connects high-quality, discoverable data - Inside Track Blog","isPartOf":{"@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#primaryimage"},"image":{"@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#primaryimage"},"thumbnailUrl":"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg","datePublished":"2020-01-30T19:23:59+00:00","dateModified":"2023-06-20T22:18:52+00:00","author":{"@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/#\/schema\/person\/59e5f7b07dae629412c990cc1a63b575"},"description":"To connect data across business areas and processes, Microsoft Digital created a modern data catalog where employees can share and discover data assets.","breadcrumb":{"@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#primaryimage","url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg","contentUrl":"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg","width":2100,"height":1181,"caption":"Vijay Panjeti and Kathy 
Brustad use data science to increase people\u2019s interest in Microsoft technology. Both are senior data and applied scientists in Worldwide Learning. (Photo by Aleenah Ansari | Inside Track)"},{"@type":"BreadcrumbList","@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/how-microsoft-connects-high-quality-discoverable-data\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.microsoft.com\/insidetrack\/blog\/"},{"@type":"ListItem","position":2,"name":"How Microsoft connects high-quality, discoverable data"}]},{"@type":"WebSite","@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/#website","url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/","name":"Inside Track Blog","description":"How Microsoft does IT","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.microsoft.com\/insidetrack\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/#\/schema\/person\/59e5f7b07dae629412c990cc1a63b575","name":"Inside Track \u2013 retired stories","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.microsoft.com\/insidetrack\/blog\/#\/schema\/person\/image\/ee0de87c339052d5d84852473bd7f213","url":"https:\/\/secure.gravatar.com\/avatar\/24a8c329ab32afd1bc23fd1658d1acc2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/24a8c329ab32afd1bc23fd1658d1acc2?s=96&d=mm&r=g","caption":"Inside Track \u2013 retired stories"},"description":"The content on this page was crafted to highlight a specific moment in time or the solutions that have led us to where we are today. It offers valuable insights into our journey and the progress made over the years. 
Check out the Inside Track blog page for our up-to-date stories around Microsoft.","url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/author\/insidetrackarchive\/"}]}},"jetpack_featured_media_url":"https:\/\/www.microsoft.com\/insidetrack\/blog\/uploads\/prod\/2020\/01\/8979_hero.jpg","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9hcZA-1k1","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/posts\/5085"}],"collection":[{"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/users\/146"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/comments?post=5085"}],"version-history":[{"count":8,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/posts\/5085\/revisions"}],"predecessor-version":[{"id":11538,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/posts\/5085\/revisions\/11538"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/media\/5089"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/media?parent=5085"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/categories?post=5085"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/tags?post=5085"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.microsoft.com\/insidetrack\/blog\/wp-json\/wp\/v2\/coauthors?post=5085"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}