{"id":701041,"date":"2021-02-06T16:45:33","date_gmt":"2021-02-07T00:45:33","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=701041"},"modified":"2021-08-30T09:24:46","modified_gmt":"2021-08-30T16:24:46","slug":"flaml-a-fast-and-lightweight-automl-library","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/flaml-a-fast-and-lightweight-automl-library\/","title":{"rendered":"FLAML: A Fast and Lightweight AutoML Library"},"content":{"rendered":"

<p>FLAML is a Python package that automatically finds accurate machine learning models at low computational cost. It frees data scientists from hyperparameter tuning and model selection, and it lets developers build self-tuning software that adjusts itself as new training data arrives. It is fast, economical, and easy to use.<\/p>\n

<h3>Problem<\/h3>\n

<p>As more and more businesses build millions of ML-embedded applications, manually choosing the right training algorithm and tuning the hyperparameters for every task and every dataset adds up to a large cost. The massive consumption of computational resources in tuning machine learning models also places a heavy burden on the environment.<\/p>\n

<h3>Solution<\/h3>\n

<p>We build an economical AutoML system that handles tuning tasks robustly and efficiently. FLAML leverages the structure of the search space to choose a search order optimized for both cost and model quality. Overall, the search tends to move gradually from cheap trials to expensive ones while improving model accuracy.<\/p>\n
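<p>To make the cheap-to-expensive search order concrete, here is a minimal toy sketch in Python. It is not FLAML's actual algorithm: the two-parameter search space (a model size and a learning rate), the evaluate function, the cost_frugal_search helper, and every number in it are made-up assumptions for illustration only. The search starts from the cheapest configuration and moves to a neighboring one only when the error improves, so trial cost tends to grow as the error falls.<\/p>\n

```python
import random


def evaluate(config):
    # Hypothetical stand-in for training and validating a model.
    # Returns (error, cost); all numbers are made up for illustration.
    size, rate = config['size'], config['rate']
    cost = float(size)                        # bigger models cost more to train
    error = 1.0 / size + (rate - 0.1) ** 2    # bigger models fit better; best rate is 0.1
    return error, cost


def cost_frugal_search(budget, seed=0):
    # Toy local search: start cheap, move only when the error improves.
    # The budget is soft: the last trial may overshoot it.
    rng = random.Random(seed)
    best = {'size': 1, 'rate': 0.5}           # cheapest configuration in the toy space
    best_error, spent = evaluate(best)
    history = [(spent, spent, best_error)]    # (trial cost, total cost so far, best error so far)
    while spent < budget:
        # Propose a neighbor: halve or double the size, nudge the rate.
        candidate = {
            'size': max(1, int(best['size'] * rng.choice([0.5, 2.0]))),
            'rate': min(1.0, max(0.01, best['rate'] + rng.uniform(-0.1, 0.1))),
        }
        error, cost = evaluate(candidate)
        spent += cost
        if error < best_error:                # accept only improvements
            best, best_error = candidate, error
        history.append((cost, spent, best_error))
    return best, best_error, history


best, best_error, history = cost_frugal_search(budget=200)
print(best, round(best_error, 4), len(history))
```

<p>Each entry in history records the cost of one trial, the total cost so far, and the best error so far. The expensive configurations are only reached after cheaper ones have already improved the model.<\/p>\n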

<p>Figure: Trial cost vs. total time spent in AutoML. Each marker corresponds to one trial of configuration evaluation. Triangles mark FLAML; circles mark a typical existing AutoML library.<\/p>\n

<p>Figure: Model AUC regret vs. total time spent in AutoML. Each marker corresponds to one trial of configuration evaluation. Triangles mark FLAML; circles mark a typical existing AutoML library.<\/p>\n

<p>This design lets FLAML adapt robustly to an ad hoc dataset out of the box.<\/p>\n

<p>Figure: Box plot of the normalized score difference between FLAML and (1) Auto-sklearn, (2) a cloud-based AutoML service, (3) HpBandSter, (4) H2O AutoML, and (5) TPOT under an equal budget, tested on 53 AutoML benchmark datasets covering classification and regression tasks at a wide range of scales. A positive difference means FLAML is better.<\/p>\n

<h3>How to get started<\/h3>\n

<p>FLAML can be installed with <code>pip install flaml<\/code>.<\/p>\n