{"id":656325,"date":"2020-06-02T09:40:48","date_gmt":"2020-06-02T16:40:48","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=656325"},"modified":"2025-03-31T18:33:10","modified_gmt":"2025-04-01T01:33:10","slug":"econml","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/econml\/","title":{"rendered":"EconML"},"content":{"rendered":"
\n\t
\n\t\t
\n\t\t\t\"EconML\t\t<\/div>\n\t\t\n\t\t
\n\t\t\t\n\t\t\t
\n\t\t\t\t\n\t\t\t\t
\n\t\t\t\t\t\n\t\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n

EconML<\/h1>\n\n\n\n

Estimate causal effects with ML<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n

\n
EconML on GitHub<\/a><\/div>\n\n\n\n
Documentation on EconML<\/a><\/div>\n<\/div>\n\n\n\n

Overview<\/h2>\n\n\n\n

EconML<\/strong> is a Python package that applies the power of machine learning techniques to estimate individualized causal responses from observational or experimental data. The suite of estimation methods provided in EconML represents the latest advances in causal machine learning. By incorporating individual machine learning steps into interpretable causal models, these methods improve the reliability of what-if predictions and make causal analysis quicker and easier for a broad set of users.<\/p>\n\n\n\n

EconML is open-source software developed by the ALICE team<\/a> at Microsoft Research.<\/p>\n\n\n\n

<\/div>\n\n\n\n
\n
\n
\"Flexible<\/figure>\n\n\n\n
<\/div>\n\n\n\n

Flexible<\/b>\nAllows for flexible model forms that do not impose strong assumptions, including models of heterogeneous responses to treatment.\n\n<\/p>\n<\/div>\n\n\n\n

\n
\"Unified<\/figure>\n\n\n\n
<\/div>\n\n\n\n

Unified<\/b>\nA broad set of methods representing the latest advances in the econometrics and machine learning literature within a unified API.\n\n<\/p>\n<\/div>\n\n\n\n

\n
\"Familiar<\/figure>\n\n\n\n
<\/div>\n\n\n\n

Familiar Interface<\/b>\nBuilt on standard Python packages for machine learning and data analysis.\n\n<\/p>\n<\/div>\n<\/div>\n\n\n\n

<\/div>\n\n\n\n

Use cases<\/h2>\n\n\n\n

This toolkit is designed to measure the causal effect of some treatment variable(s) T on an outcome variable Y, controlling for a set of features X. Use cases include<\/a>:<\/p>\n\n\n\n

\n
\n
\"illustration<\/figure>\n\n\n\n
<\/div>\n\n\n\n

Recommendation A\/B testing<\/h3>\n\n\n\n

Interpret experiments with imperfect compliance<\/em><\/p>\n\n\n\n

Question:<\/strong>\u00a0A travel website would like to know whether joining a membership program causes users to spend more time engaging with the website.\u00a0<\/p>\n\n\n\n

Problem:<\/strong>\u00a0They can\u2019t simply compare members and non-members in existing data, because customers who chose to become members are likely already more engaged than other users. Nor can they run a direct A\/B test, because they can\u2019t force users to sign up for membership.\u00a0<\/p>\n\n\n\n

Solution:<\/strong>\u00a0The company had run an earlier experiment to test the value of a new, faster sign-up process. EconML\u2019s\u00a0DRIV estimator (opens in new tab)<\/span><\/a>\u00a0uses this experimental nudge towards membership as an instrument that generates random variation in the likelihood of membership. The DRIV model adjusts for the fact that not every customer who was offered the easier sign-up became a member and returns the effect of membership rather than the effect of receiving the quick sign-up.<\/p>\n\n\n\n
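To make the instrument logic concrete, here is a minimal sketch in plain Python of the classic Wald ratio. It is a much simpler relative of DRIV, not the EconML implementation or API. The data is simulated, and every name and number in it (simulate, wald_estimate, the join probabilities, the true effect of 2.0) is an assumption made for illustration. The ratio divides the effect of the nudge on engagement by the effect of the nudge on membership, which converts an effect of the offer into an effect of membership.

```python
import random

TRUE_EFFECT = 2.0  # hypothetical effect of membership on engagement


def simulate(n=100000, seed=0):
    # Simulated experiment with imperfect compliance; all numbers are made up.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        engagement = rng.gauss(0.0, 1.0)   # unobserved confounder
        z = rng.random() < 0.5             # randomized nudge (the instrument)
        # Engaged users join more often; the nudge raises the join rate.
        p_join = 0.2 + 0.3 * z + (0.2 if engagement > 0 else 0.0)
        t = rng.random() < p_join          # membership (the treatment)
        y = TRUE_EFFECT * t + 3.0 * engagement + rng.gauss(0.0, 1.0)
        rows.append((int(z), int(t), y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    # Members versus non-members: confounded by prior engagement.
    members = mean(y for _, t, y in rows if t == 1)
    others = mean(y for _, t, y in rows if t == 0)
    return members - others


def wald_estimate(rows):
    # Effect of the nudge on the outcome, divided by its effect on membership.
    itt_y = mean(y for z, _, y in rows if z == 1) - mean(y for z, _, y in rows if z == 0)
    itt_t = mean(t for z, t, _ in rows if z == 1) - mean(t for z, t, _ in rows if z == 0)
    return itt_y / itt_t


rows = simulate()
print('naive difference:', round(naive_difference(rows), 2))
print('wald estimate:   ', round(wald_estimate(rows), 2))
```

In this simulation the naive comparison overstates the effect because engaged users self-select into membership, while the ratio uses only the variation created by the randomized nudge. With a constant effect, as assumed here, the ratio targets that effect; in general it recovers the effect for users whose membership decision was changed by the nudge.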

\n
Trip Advisor Case Study<\/a><\/div>\n\n\n\n
Jupyter Notebook<\/a><\/div>\n<\/div>\n<\/div>\n\n\n\n
\n
\"illustration<\/figure>\n\n\n\n
<\/div>\n\n\n\n

Customer segmentation<\/h3>\n\n\n\n

Estimate individualized responses to incentives<\/em><\/p>\n\n\n\n

Question:<\/strong> A media subscription service would like to offer targeted discounts through a personalized pricing plan. <\/p>\n\n\n\n

Problem:<\/strong> They observe many features of their customers but are not sure which customers will respond most to a lower price. <\/p>\n\n\n\n

Solution:<\/strong>\u00a0EconML\u2019s\u00a0DML estimator (opens in new tab)<\/span><\/a>\u00a0uses price variations in existing data, along with a rich set of user features, to estimate heterogeneous price sensitivities that vary with multiple customer features. The\u00a0tree interpreter (opens in new tab)<\/span><\/a>\u00a0provides a presentation-ready summary of the key features that explain the biggest differences in responsiveness to a discount.<\/p>\n\n\n\n
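The core idea behind this kind of estimate can be sketched in plain Python as a residual-on-residual (partialling-out) regression. This is a simplified illustration and not the EconML API. It replaces the machine learning nuisance models with cell averages, and it skips the cross-fitting that DML normally uses. The data, the segment labels, the income confounder, and the true slopes are all hypothetical.

```python
import random
from collections import defaultdict

TRUE_SLOPE = {'casual': -3.0, 'loyal': -1.0}  # hypothetical price sensitivities


def simulate(n=60000, seed=0):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(['casual', 'loyal'])   # feature driving heterogeneity
        income = rng.choice([0, 1, 2])              # confounder: moves price and demand
        price = 1.0 + 0.5 * income + rng.gauss(0.0, 0.3)
        demand = 10.0 + 2.0 * income + TRUE_SLOPE[segment] * price + rng.gauss(0.0, 1.0)
        rows.append((segment, income, price, demand))
    return rows


def cell_means(rows, index):
    # Nuisance model: average of column `index` within each (segment, income) cell.
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        cell = sums[(row[0], row[1])]
        cell[0] += row[index]
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def slopes_by_segment(rows):
    # Regress demand residuals on price residuals, separately per segment.
    price_hat = cell_means(rows, 2)
    demand_hat = cell_means(rows, 3)
    num = defaultdict(float)
    den = defaultdict(float)
    for segment, income, price, demand in rows:
        rt = price - price_hat[(segment, income)]
        ry = demand - demand_hat[(segment, income)]
        num[segment] += rt * ry
        den[segment] += rt * rt
    return {segment: num[segment] / den[segment] for segment in num}


def naive_slope(rows, segment):
    # Ordinary least-squares slope of demand on price, ignoring income.
    pairs = [(p, d) for s, _, p, d in rows if s == segment]
    p_bar = sum(p for p, _ in pairs) / len(pairs)
    d_bar = sum(d for _, d in pairs) / len(pairs)
    cov = sum((p - p_bar) * (d - d_bar) for p, d in pairs)
    var = sum((p - p_bar) ** 2 for p, _ in pairs)
    return cov / var


rows = simulate()
print('residual-on-residual:', slopes_by_segment(rows))
print('naive, loyal segment:', round(naive_slope(rows, 'loyal'), 2))
```

Subtracting the cell averages removes the part of price and demand explained by income, so the remaining price variation behaves like an experiment within each cell. The naive slope mixes in the income effect, which in this simulation is strong enough to flip its sign for the loyal segment.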

\n
Jupyter Notebook<\/a><\/div>\n<\/div>\n<\/div>\n\n\n\n
\n
\"illustration<\/figure>\n\n\n\n
<\/div>\n\n\n\n

Multi-investment attribution<\/h3>\n\n\n\n

Distinguish the effects of multiple outreach efforts<\/em><\/p>\n\n\n\n

Question:<\/strong>\u00a0A startup would like to know the most effective approach for recruiting new customers: price discounts, technical support to ease adoption, or a combination of the two.\u00a0<\/p>\n\n\n\n

Problem:<\/strong>\u00a0The risk of losing customers makes experiments across outreach efforts too expensive. So far, customers have been offered incentives strategically; for example, larger businesses are more likely to get technical support.\u00a0<\/p>\n\n\n\n

Solution:<\/strong>\u00a0EconML\u2019s\u00a0Doubly Robust Learner (opens in new tab)<\/span><\/a>\u00a0model jointly estimates the effects of multiple discrete treatments. The model uses flexible functions of observed customer features to filter out confounding correlations in existing data and deliver the causal effect of each effort on revenue.<\/p>\n\n\n\n
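As a rough illustration of the doubly robust idea with several discrete treatments, the sketch below computes an augmented inverse-propensity-weighted (AIPW) estimate in plain Python. It is not the EconML estimator or its API. It assumes a single observed confounder (business size) and uses simple cell averages as the outcome and propensity models, where EconML would use flexible machine learning models. The treatment names, assignment weights, and revenue lifts are invented for the simulation.

```python
import random
from collections import defaultdict

TREATMENTS = ['none', 'discount', 'support', 'both']
TRUE_LIFT = {'none': 0.0, 'discount': 5.0, 'support': 8.0, 'both': 10.0}  # made up


def simulate(n=80000, seed=0):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        large = rng.random() < 0.4                 # observed confounder: business size
        # Larger businesses were more likely to be offered technical support.
        weights = [1, 1, 3, 3] if large else [3, 3, 1, 1]
        treatment = rng.choices(TREATMENTS, weights=weights)[0]
        revenue = 20.0 + 15.0 * large + TRUE_LIFT[treatment] + rng.gauss(0.0, 2.0)
        rows.append((int(large), treatment, revenue))
    return rows


def doubly_robust_effects(rows):
    n_cell = defaultdict(int)     # customers per size cell
    n_arm = defaultdict(int)      # customers per (size, treatment)
    y_arm = defaultdict(float)    # revenue total per (size, treatment)
    for size, treatment, revenue in rows:
        n_cell[size] += 1
        n_arm[(size, treatment)] += 1
        y_arm[(size, treatment)] += revenue

    def outcome(size, t):         # outcome model: mean revenue in the cell
        return y_arm[(size, t)] / n_arm[(size, t)]

    def propensity(size, t):      # propensity model: share of the cell given t
        return n_arm[(size, t)] / n_cell[size]

    potential = {}
    for t in TREATMENTS:
        total = 0.0
        for size, treatment, revenue in rows:
            score = outcome(size, t)
            if treatment == t:
                # Inverse-propensity-weighted correction of the outcome model.
                score += (revenue - outcome(size, t)) / propensity(size, t)
            total += score
        potential[t] = total / len(rows)
    return {t: potential[t] - potential['none'] for t in TREATMENTS if t != 'none'}


def naive_effect(rows, t):
    # Raw difference in mean revenue, ignoring business size.
    treated = [y for _, a, y in rows if a == t]
    control = [y for _, a, y in rows if a == 'none']
    return sum(treated) / len(treated) - sum(control) / len(control)


rows = simulate()
print('doubly robust:', doubly_robust_effects(rows))
print('naive support effect:', round(naive_effect(rows, 'support'), 2))
```

Each customer contributes a model-based prediction for every treatment plus a reweighted residual for the treatment actually received. The estimate stays consistent if either the outcome model or the propensity model is correct, which is the property the name refers to. In the simulation, the naive comparison credits technical support with revenue that actually comes from business size.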

\n
Technical Paper<\/a><\/div>\n\n\n\n
Jupyter Notebook<\/a><\/div>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n
\n
EconML on GitHub<\/a><\/div>\n\n\n\n
Documentation on EconML<\/a><\/div>\n<\/div>\n\n\n\n\n\n
\n
\n
\"orange<\/figure>

<\/p>\n

Quick Installation<\/h2>\n

Install the latest release from PyPI (opens in new tab)<\/span><\/a>: pip install econml<\/code><\/p>\n<\/div>\n

 <\/div>\n\n\n\n

For developers<\/h2>\n\n\n\n

You can get started by cloning this repository. We use setuptools (opens in new tab)<\/span><\/a> for building and distributing our package. We rely on some recent features of setuptools, so make sure to upgrade to a recent version with pip install setuptools --upgrade<\/code>. Then from your local copy of the repository you can run python setup.py develop<\/code> to get started.<\/p>\n\n\n\n

Running the tests<\/h2>\n\n\n\n

This project uses pytest (opens in new tab)<\/span><\/a> for testing. To run tests locally after installing the package, you can use python setup.py pytest<\/code>.<\/p>\n\n\n\n

Generating the documentation<\/h2>\n\n\n\n

This project\u2019s documentation is generated via Sphinx (opens in new tab)<\/span><\/a>. Note that we use graphviz (opens in new tab)<\/span><\/a>\u2018s dot<\/code> application to produce some of the images in our documentation, so you should make sure that dot<\/code> is installed and in your path.<\/p>\n\n\n\n

To generate a local copy of the documentation from a clone of this repository, just run python setup.py build_sphinx -W -E -a<\/code>, which will build the documentation and place it under the build\/sphinx\/html<\/code> path.<\/p>\n\n\n\n

The reStructuredText files that make up the documentation are stored in the docs directory (opens in new tab)<\/span><\/a>; module documentation is automatically generated by the Sphinx build process.<\/p>\n\n\n\n

\n

Environment<\/h2>\n

The econml package works on macOS, Windows, and Linux, and supports Python versions 3.5-3.7. The econml package relies on numpy, scipy, and scikit-learn for most of its underlying numerical computation and machine learning routines, and uses keras for the components built on deep neural networks. If you don\u2019t already have these dependencies installed locally, then installing the econml package from PyPI via pip will also install them.<\/p>\n<\/div>\n\n\n\n

\n
EconML on GitHub<\/a><\/div>\n\n\n\n
Documentation on EconML<\/a><\/div>\n<\/div>\n\n\n\n\n\n

If you are new to causal inference<\/h2>\n\n\n\n

Causal analysis is used to answer what-if questions. Unlike forecasting, which answers the question of what will happen next if everyone keeps behaving as they have in the past, causal inference answers the question of what would happen next if someone changes behavior, such as pursuing a new pricing strategy for a product or a new treatment for a patient. If you are new to causal inference, our guide will introduce you to the biggest challenges in answering causal questions, how EconML addresses those challenges, and the standard causal terminology we use throughout our SDK and documentation.<\/p>\n\n\n\n