Much of the value of data comes from uncovering and communicating answers and new perspectives to stakeholders. To provide a narrative and better context for the data, we visualize it in Power BI.

The machine learning engine

To uncover answers and new perspectives from verbatim survey responses, we use various data processing activities and machine learning algorithms, which come together in the machine learning engine. Figure 3 shows the flow, which starts with input from data that's ingested into the data platform and ends with output in the form of insights and visualizations.

Figure 3. The machine learning engine

The machine learning process that we use to analyze verbatim feedback typically follows these steps:
1. We start with text preprocessing. This makes it easier to extract meaningful information for a given topic and helps us do more precise sentiment analysis. As part of preprocessing, we divide the text into sentences, while also simplifying and standardizing (normalizing) different elements in the text.
2. After preprocessing, we extract key phrases, analyze sentiment, and look for deep semantic similarity. These activities can be done in parallel. Data scientists provide sentiment analysis and semantic analysis at the sentence level, and topic modeling at the level of the entire verbatim survey response.
3. As the verbatim feedback is analyzed, it's mapped to one of the five experiences. The business stakeholders who define the experiences also identify the associated tools. For example, collaborating maps to OneNote and Skype for Business (although Skype for Business also aligns with the meeting experience). This information guides the machine learning in understanding whether the feedback is tied to a specific experience.

Importance of data quality and text preprocessing activities

When we build any automated procedure for text analysis, we need to make sure the text data is of good quality, because the quality of the input data directly affects the machine learning results. Most data quality checks are done during data ingestion, so they're outside the scope of data science. But verbatim data ingested for analytics can contain values that aren't valid, like question marks, "no comment," "none," "na," and others. We need to recognize this kind of data and disqualify it from analysis.
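For illustration, a lightweight filter like the following could flag and disqualify such low-information values before analysis. This is a minimal sketch, not our production check, and the pattern list is hypothetical:

```python
import re

# Hypothetical list of low-information values to disqualify (illustrative only).
INVALID_PATTERNS = re.compile(r"^\s*(\?+|no comment|none|n/?a|nothing|-+)\s*$", re.IGNORECASE)

def is_valid_verbatim(text: str, min_chars: int = 3) -> bool:
    """Return True if a verbatim response carries enough signal to analyze."""
    if text is None:
        return False
    stripped = text.strip()
    return len(stripped) >= min_chars and not INVALID_PATTERNS.match(stripped)

responses = ["???", "no comment", "Skype for Business keeps dropping my calls."]
print([r for r in responses if is_valid_verbatim(r)])
# ['Skype for Business keeps dropping my calls.']
```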
After the data quality check, the verbatim data is prepared for the machine learning algorithms. This is the preprocessing stage, and it's done in the machine learning engine. Text preprocessing involves:
- Tokenize and validate sentences.
- Normalize acronyms and abbreviations.
- Lemmatize words, tag parts of speech, and extract n-grams.

Tokenize and validate sentences

WHAT: Sentence tokenization ensures that each sentence from verbatim data goes through sentiment analysis. It separates sentences and attaches a unique sentence ID to each. Tokenized sentences are evaluated for quality to make sure that they carry valuable information for further analysis.

WHY: Sentiment analysis can't separate sentences, so if a verbatim response contains two sentences, one positive and one negative, it can't provide a single sentiment score and polarity (whether an opinion is positive, negative, or neutral). If the first sentence is very positive and the second is equally negative, analyzing both together yields a neutral polarity and sentiment score, which is incorrect. If the sentences are analyzed separately, the polarity score is more accurate.

HOW: There are many open-source, rule-based algorithms that tokenize English text into sentences. The rule-based algorithms can't guarantee 100 percent accuracy, but they do a good job of recognizing end-of-sentence punctuation and separating sentences.

Normalize acronyms and abbreviations

WHAT: Normalization maps acronyms and abbreviations to a corresponding word list. For example, many users refer to Skype for Business as SFB or S4B. To ensure that abbreviations are recognized by text-analysis algorithms, each tokenized sentence is normalized.

WHY: Normalization can improve the accuracy of many text-analysis algorithms. For example, suppose a group of verbatim responses has 10 records for Skype for Business, 10 records for SFB, and 15 records for Surface. If terms aren't normalized, the text-mining algorithms would consider Surface the most frequently mentioned product. Normalization would instead recognize 20 records for Skype for Business.

HOW: Currently, text is normalized by using a glossary of words and abbreviations. Acronyms and abbreviations are identified manually to make the normalization as accurate as possible.

Lemmatize words, tag parts of speech, and extract n-grams

WHAT: The last step of preprocessing prepares input for key phrase extraction and hashtagging. Lemmatization returns the base or dictionary form (the lemma) of a word. It changes plural words to singular, changes past-tense verbs to present tense, and so on. For example, the lemma of "kids" is "kid," of "children" is "child," and of "held" is "hold." Part-of-speech tagging identifies nouns, verbs, adjectives, and adverbs, which helps extract information from sentences.

In simple terms, an n-gram is a contiguous sequence of n items from the text, such as words or characters. Extracted n-grams are used as input for hashtagging.

WHY: Lemmatization ensures that words are converted to their base, or dictionary, forms. Like normalization, it can improve the accuracy of text-mining algorithms that rely on word counts. Parts of speech and n-grams are used as input to the machine learning engine.

HOW: There are many open-source libraries that lemmatize, tag parts of speech, and extract n-grams.
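NLTK is one such library that covers all three preprocessing steps. The following is a minimal sketch; the glossary, sample text, and helper names are illustrative, not our production code:

```python
import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer
from nltk.util import ngrams

# First run: download the NLTK resources for punkt, the POS tagger, and wordnet.

# Hypothetical glossary; real normalization uses a manually curated list.
GLOSSARY = {"sfb": "Skype for Business", "s4b": "Skype for Business"}

def normalize(sentence: str) -> str:
    """Replace known acronyms and abbreviations with their canonical forms."""
    return " ".join(GLOSSARY.get(token.lower(), token) for token in sentence.split())

def wn_pos(treebank_tag: str) -> str:
    """Map a Treebank POS tag to the WordNet POS the lemmatizer expects."""
    if treebank_tag.startswith("V"):
        return wordnet.VERB
    if treebank_tag.startswith("J"):
        return wordnet.ADJ
    if treebank_tag.startswith("R"):
        return wordnet.ADV
    return wordnet.NOUN

lemmatizer = WordNetLemmatizer()
text = "SFB dropped my calls twice. Meetings were held without audio."

for sent_id, sentence in enumerate(nltk.sent_tokenize(text)):  # sentence tokenization
    tokens = nltk.word_tokenize(normalize(sentence))           # normalize abbreviations first
    tagged = nltk.pos_tag(tokens)                              # part-of-speech tagging
    lemmas = [lemmatizer.lemmatize(tok.lower(), wn_pos(tag)) for tok, tag in tagged]
    bigrams = list(ngrams(lemmas, 2))                          # n-grams feed hashtagging
    print(sent_id, lemmas, bigrams[:2])
```

Note how "held" lemmatizes to "hold" and "dropped" to "drop" once the POS tag identifies them as verbs, which is why tagging and lemmatization go together.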
The machine learning algorithms we use, why, and how

Given all the available algorithms, why did we choose key phrase extraction, sentiment analysis, and deep semantic similarity for this solution? Based on the type of data that we get, which includes unstructured feedback from survey responses, each algorithm helps us make sense of the text in a different way:
- Algorithm 1: Key phrase extraction
- Algorithm 2: Sentiment analysis
- Algorithm 3: Deep semantic similarity

Algorithm 1: Key phrase extraction

WHAT: Key phrase extraction pulls structured information out of unstructured text. The structured information consists of important topical words and phrases from verbatim survey responses. Key phrases concisely describe verbatim survey content and are useful for categorizing, clustering, indexing, searching, and summarizing. Key phrases are then scored and ranked.

As part of key phrase extraction, we add labels or metadata tags (hashtags, familiar from social networks and blog services) that make it easier for people to find specific themes or content. Hashtagging is the process of creating a meaningful label, or combination of labels, that best represents a verbatim survey response.
WHY: The key phrases, hashtags, and associated scores and ranks are used to interpret the survey language. For example, they help us find positive and negative phrases, and trending or popular topics.

HOW: Key phrase extraction has two steps. First, a set of words and phrases that convey the content of a document is identified. Second, these candidates are scored and ranked, and the best ones are selected as a document's key phrases. We use key phrase extraction application programming interfaces (APIs), both from Microsoft and open source.

Hashtagging is also a two-step process. First, we normalize the input survey responses and remove non-essential words. Second, the labels are scored based on how often they appear in other survey responses.
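Our production extraction relies on those APIs, but a small open-source sketch shows the two-step shape of the approach. Here, scikit-learn's TF-IDF scores unigram and bigram candidates, and the top-scoring phrases become hashtags; the sample responses are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Illustrative corpus of verbatim responses (not real survey data).
responses = [
    "Skype for Business keeps dropping calls during meetings.",
    "Search in SharePoint makes finding documents slow.",
    "Meetings start late because joining a call takes too long.",
]

# Step 1: candidate phrases = unigrams and bigrams, minus English stop words.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
tfidf = vectorizer.fit_transform(responses)
features = vectorizer.get_feature_names_out()

# Step 2: score candidates; the top-scoring phrases become key phrases/hashtags.
for doc_id in range(len(responses)):
    scores = tfidf[doc_id].toarray().ravel()
    top = scores.argsort()[::-1][:3]
    print(doc_id, ["#" + features[i].replace(" ", "") for i in top])
```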
Algorithm 2: Sentiment analysis

WHAT: Sentiment analysis is an automated method of determining whether text, like verbatim feedback from surveys, is positive, neutral, or negative, and to what degree. Sentiment analysis and text analytics reveal people's opinions about Microsoft products and services.

WHY: Sentiment analysis helps us detect what users like and dislike about products and services. It's not enough to know the main topics that people are concerned about; we need to know how strongly they feel, whether positively or negatively. Sentiment analysis uncovers these feelings and helps us group survey responses into corresponding polarities (positive, negative, or neutral) for deeper text analysis.

HOW: Sentiment analysis is usually done with classification modeling. Training data with positive, neutral, and negative labels is fed into different classification algorithms. After model training and selection, we run the final model on new verbatim data. The trained model determines the polarity and sentiment score of each piece of writing. We use the Text Analytics API in Cognitive Services for sentiment polarity and scoring.
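The Text Analytics API encapsulates this for us. As a generic illustration of the classification approach described above, here is a minimal scikit-learn sketch with a tiny, made-up training set (a real model would be trained on thousands of labeled sentences):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set with positive, neutral, and negative labels.
train_texts = [
    "I love how easy it is to share my screen.",
    "The meeting audio quality is terrible.",
    "The app works.",
    "Finding files is painful and slow.",
    "Great collaboration features in OneNote.",
    "It opens.",
]
train_labels = ["positive", "negative", "neutral", "negative", "positive", "neutral"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

# Score each tokenized sentence separately so mixed feedback isn't averaged out.
for sentence in ["Screen sharing is great.", "Joining a call never works."]:
    polarity = model.predict([sentence])[0]
    confidence = model.predict_proba([sentence]).max()
    print(sentence, polarity, round(float(confidence), 2))
```

Scoring per sentence, rather than per response, is what makes the earlier sentence-tokenization step pay off.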
Algorithm 3: Deep semantic similarity

WHAT: Semantic similarity is a metric defined over a set of documents or terms, where the similarity score is based on similarity of meaning. It identifies whether verbatim sentences share common characteristics. For example, one person might use the word "issue" where another uses "problem," but the net result is the same negative sentiment.

WHY: Deep semantic similarity is key for mapping survey responses to the five overarching experiences. It helps us understand how one experience influences satisfaction with the other experiences. It's more accurate than clustering and categorizing because it's based on meaning, unlike methods based on similarity of format, number of letters, or words in common.

Before a survey goes out, the business stakeholders tag each survey question with the experience it relates to, for example, the meeting experience. However, the answer a person gives could relate more to the finding experience than to the meeting experience. In other words, survey respondents might express high or low satisfaction about a different experience than the one a question was initially tagged with. Having this information helps pinpoint which experiences to enhance, and deep semantic similarity lets us uncover and share these kinds of discoveries with stakeholders.

HOW: Semantic similarity is measured by the semantic distance between two terms, that is, how similar in meaning they are. First, all terms are converted into their base vocabulary form (for example, "searching" to "search"), and then the distance between their semantic scores is measured.
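As one way to sketch this with open-source tools (not necessarily the model we use internally), sentence embeddings plus cosine similarity can map a response to the closest experience. The experience descriptions below are hypothetical:

```python
from sentence_transformers import SentenceTransformer, util

# Any pretrained sentence-embedding model works for the sketch; this is a common default.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical experience descriptions, as tagged by business stakeholders.
experiences = {
    "meeting": "scheduling, joining, and running meetings",
    "finding": "searching for and finding information, people, and documents",
    "collaborating": "co-authoring and sharing content with coworkers",
}

response = "I can never locate the slide deck someone shared before the call."
resp_vec = model.encode(response, convert_to_tensor=True)

# Map the response to the experience whose description is semantically closest,
# even though the response shares almost no words with it.
for name, description in experiences.items():
    desc_vec = model.encode(description, convert_to_tensor=True)
    print(name, round(util.cos_sim(resp_vec, desc_vec).item(), 3))
```

This is how a response to a question tagged "meeting" can still surface as feedback about the finding experience.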
Advanced visualization techniques

We choose the visualization techniques that best represent the results and give the most insight. For this solution, we use Power BI dashboards and open-source visualization tools. For example, we use Power BI dashboards to show whether the sentiment tied to survey questions and responses about a user experience was positive, neutral, or negative.

Ability to operationalize and repeat

Much of the business value of data science lies in:
- The ability to make insights widely available and to repeat an end-to-end machine learning process as close as possible to the time when the unstructured data is produced (after each survey, for example).
- The ability to connect insights. For example, to predict what user feedback will be, we can combine the survey insights we get from verbatim analysis with the insights we get from Helpdesk tickets.

Having a robust data platform and a team of data engineers in place is key to incorporating an end-to-end, repeatable machine learning process into our operations so that we get the maximum business value.
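To make the repeatable, end-to-end idea concrete, here is a purely illustrative sketch of one pipeline run. The stub functions stand in for the components described earlier; the real pipeline runs on Azure Data Factory and Azure Data Lake:

```python
from datetime import datetime, timezone

# Hypothetical stubs standing in for the components sketched earlier in this article.
def preprocess(response):
    return [s.strip() for s in response.split(".") if s.strip()]

def score_sentiment(sentence):
    return "negative" if "slow" in sentence.lower() else "neutral"

def map_to_experience(sentence):
    return "finding" if "search" in sentence.lower() else "unmapped"

def run_pipeline(survey_batch):
    """One repeatable end-to-end run, triggered after each survey closes."""
    run_stamp = datetime.now(timezone.utc).isoformat()
    results = []
    for response in survey_batch:
        for sentence in preprocess(response):  # data quality check + preprocessing
            results.append({
                "run": run_stamp,
                "sentence": sentence,
                "sentiment": score_sentiment(sentence),
                "experience": map_to_experience(sentence),
            })
    # In production, these rows land in the data platform and surface in Power BI.
    return results

print(run_pipeline(["The search is slow. Meetings are fine."]))
```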
Best practices

Along the way, we've learned some valuable lessons that help our data collection and analysis run smoothly:
- Have a methodical, clearly defined strategy. A core part of our strategy is understanding the business context and the questions we need to answer. To chart our course, we always align with the five high-level experiences.
- Ensure variety. A good team needs diverse skill sets; different perspectives and expertise yield deeper, richer insights. Typical skills include machine learning, Data Lake, R and Python, and U-SQL.
- Fail fast, learn fast. We're always iterating based on what has and hasn't worked well.
- Reduce scope. We don't want to overwhelm teams and executives with thousands of insights at one time. We focus on smaller, actionable, valuable pieces, like sentiment analysis.
- Use Power BI for self-service reporting. Power BI makes data visualization and reporting easy and widely available. Anyone can generate a report without having to wait on other people.

Summary

To enhance the user experience across Microsoft products and services, we gather, process, and analyze data to learn where we can improve. We study product and service instrumentation, and we gauge user sentiment by looking at data from user surveys distributed to employees and from Microsoft Helpdesk support data. We analyze text by using key phrase extraction, sentiment analysis, and deep semantic similarity algorithms. These methods and processes help us answer stakeholder questions, rank user satisfaction, and discover how an experience can be enhanced.