{"id":822961,"date":"2022-03-01T15:23:15","date_gmt":"2022-03-01T23:23:15","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=822961"},"modified":"2022-03-15T10:23:06","modified_gmt":"2022-03-15T17:23:06","slug":"explainability","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/explainability\/","title":{"rendered":"Explainability"},"content":{"rendered":"
AI models are becoming a normal part of many business operations, driven by advances in AI technologies and the democratization of AI. While AI is increasingly important in decision making, it can be challenging to understand what influences the outcomes of AI models. Critical details like the information used as input, the influence of missing data, and the use of unintended or sensitive input variables can all have an impact on a model's output. To use AI responsibly and to trust it enough to make decisions, we must have tools and processes in place to understand how a model reaches its conclusions.
Microsoft Dynamics 365 Customer Insights goes beyond just a predicted outcome and provides additional information that helps you better understand the model and its predictions. Using the latest AI technologies, Customer Insights surfaces the main factors that drive our predictions. In this blog post, we will talk about how Customer Insights' out-of-the-box AI models enable enterprises to better understand and trust AI models, as well as what actions can be taken based on the added model interpretability.
Figure 1: Explainability information on the results page of the Customer Lifetime Value out-of-box model, designed to help you interpret model results.

What is model interpretability and why is it important?

AI models are sometimes described as black boxes that consume information and output a prediction, where the inner workings are unknown. This raises serious questions about our reliance on AI technology. Can the model's prediction be trusted? Does the prediction make sense? AI model interpretability has emerged over the last few years as an area of research with the goal of providing insights into how AI models reach decisions.

AI models leverage information from the enterprise (data about customers, transactions, historical data, etc.) as inputs. We call these inputs features. Features are used by the model to determine the output. One way to achieve model interpretability is explainable AI, or model explainability: a set of techniques that describe which features influence a prediction. We'll talk about two approaches: local explainability, which describes how the model arrived at a single prediction (say, a single customer's churn score), and global explainability, which describes which features are most useful for making all predictions. Before we describe how a model produces explainability output and how you should interpret it, we need to describe how we construct features from input data.

AI Feature Design with Interpretability in mind

AI models are trained using features, which are transformations of raw input data that make it easier for the model to use. These transformations are a standard part of the model development process.

For instance, the input data may be a list of transactions with dollar amounts, but a feature might be the number of transactions in the last thirty days or the average transaction value. (Many features summarize more than one input row.) Before features are created, raw input data needs to be prepared and "cleaned". In a future post, we'll take a deep dive into data preparation and the role that model explainability plays in it.
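To make that idea concrete, here is a minimal sketch of this kind of feature construction. The table and column names (customer_id, transaction_date, amount) are hypothetical and purely illustrative; this is not the actual Customer Insights pipeline or schema.

```python
import pandas as pd

# Hypothetical raw input: one row per transaction.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "transaction_date": pd.to_datetime([
        "2022-02-25", "2022-02-10", "2022-01-05",
        "2022-02-20", "2021-12-01", "2022-02-28",
    ]),
    "amount": [4.50, 12.00, 8.25, 150.00, 90.00, 30.00],
})

as_of = pd.Timestamp("2022-03-01")
last_30_days = transactions[transactions["transaction_date"] >= as_of - pd.Timedelta(days=30)]

# Each feature summarizes many raw input rows into one value per customer.
features = pd.DataFrame({
    "transactions_last_30_days": last_30_days.groupby("customer_id").size(),
    "avg_transaction_value": transactions.groupby("customer_id")["amount"].mean(),
}).fillna({"transactions_last_30_days": 0})

print(features)
```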
To provide a more concrete example of what a feature is and how it might matter to the model's prediction, take two features that might help predict customer churn: frequency of transactions and number of product types bought. In a coffee shop, frequency of transactions is likely a great predictor of continued patronage: the regulars who walk by every morning will likely continue to do so. But those regulars may always get the same thing: I always get a 12 oz black Americano and never get a mochaccino or a sandwich. That means the number of product types I buy isn't a good predictor of my churn: I buy the same product, but I get it every morning.

Conversely, the bank down the road may observe that I rarely visit the branch to transact. However, I've got a mortgage, two bank accounts, and a credit card with that bank. The bank's churn predictions might rely on the number of products and services bought rather than the frequency of buying a new product. Both models start with the same set of facts (frequency of transactions and number of product types) and predict the same thing (churn) but have learned to use different features to make accurate predictions. Model authors created a pair of features that might be useful, but the model ultimately decides how, or whether, to use those features based on the context.

Feature design also requires understandable names for the features. If a user doesn't know what a feature means, it's hard for them to make sense of the fact that the model thinks it's important! During feature construction, AI engineers work with product managers and content writers to create human-readable names for every feature. For example, a feature representing the average number of transactions for a customer in the last quarter could look something like 'avg_trans_last_3_months' in the data science experimentation environment. If we were to present features like this to business users, it could be difficult for them to understand exactly what that means.
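One lightweight way to picture that naming step is a lookup table maintained alongside the feature definitions. The names and mapping below are hypothetical, not the actual Customer Insights feature catalog.

```python
# Illustrative mapping from internal feature names to human-readable display names.
FEATURE_DISPLAY_NAMES = {
    "avg_trans_last_3_months": "Average number of transactions per month (last quarter)",
    "transactions_last_30_days": "Number of transactions in the last 30 days",
    "avg_transaction_value": "Average transaction value",
}

def display_name(feature: str) -> str:
    """Fall back to the raw feature name if no curated display name exists."""
    return FEATURE_DISPLAY_NAMES.get(feature, feature)

print(display_name("avg_trans_last_3_months"))
```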
Explainability via Game Theory

A main goal in model explainability is to understand the impact of including a feature in a model. For instance, one could train a model with all the features except one, then train a model with all the features. The difference in accuracy between the two models' predictions is a measure of the importance of the feature that was left out. If the model with the feature is much more accurate than the model without it, then the feature was very important.
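Here is a minimal sketch of that leave-one-out intuition on synthetic data (illustrative only; as described below, Customer Insights does not compute importance this way, it uses Shapley values instead):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for customer features and a churn label.
X, y = make_classification(n_samples=2000, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_using(columns):
    """Train and score a model that only sees the given feature columns."""
    model = RandomForestClassifier(random_state=0).fit(X_train[:, columns], y_train)
    return accuracy_score(y_test, model.predict(X_test[:, columns]))

all_columns = list(range(X.shape[1]))
baseline = accuracy_using(all_columns)

# A feature's importance is the accuracy lost when the model is trained without it.
for col in all_columns:
    drop = baseline - accuracy_using([c for c in all_columns if c != col])
    print(f"feature {col}: accuracy drop when left out = {drop:+.3f}")
```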
Figure 2: The basic idea behind computing explainability is to understand each feature's contribution to the model's performance by comparing the performance of the whole model to its performance without that feature. In practice, we use Shapley values to identify each feature's contribution, including interactions, in one training cycle.

There are nuances related to feature interaction (e.g., including both city name and zip code may be redundant: removing one won't impact model performance, but removing both would), but the basic idea remains the same: how much does including a feature contribute to model performance?

With hundreds of features, it's too expensive to retrain the model leaving each feature out one by one. Instead, we use a concept called Shapley values to identify feature contributions from a single training cycle. Shapley values are a technique from game theory, where the goal is to understand the gains and costs of several actors working in a coalition. In machine learning, the "actors" are features, and the Shapley value algorithm can estimate each feature's contribution even when it interacts with other features.

If you are looking for (much!) more detail about Shapley analysis, a good place to start is this GitHub repository: GitHub - slundberg/shap: A game theoretic approach to explain the output of any machine learning model.
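As an illustration, here is a minimal sketch of computing Shapley-value explanations with that shap package, on a synthetic regression model standing in for something like a lifetime-value predictor. The data and model are made up for the example; this is not the Customer Insights implementation.

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for customer features and a lifetime-value target.
X, y = make_regression(n_samples=1000, n_features=8, n_informative=4, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

# TreeExplainer estimates each feature's Shapley contribution from a single trained model.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Local explainability: how each feature pushed one customer's prediction up or down.
print(shap_values[0])

# Global explainability: average magnitude of each feature's contribution across all customers.
shap.summary_plot(shap_values, X, plot_type="bar")
```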