{"id":701041,"date":"2021-02-06T16:45:33","date_gmt":"2021-02-07T00:45:33","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=701041"},"modified":"2021-08-30T09:24:46","modified_gmt":"2021-08-30T16:24:46","slug":"flaml-a-fast-and-lightweight-automl-library","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/flaml-a-fast-and-lightweight-automl-library\/","title":{"rendered":"FLAML: A Fast and Lightweight AutoML Library"},"content":{"rendered":"
FLAML is a Python package that automatically finds accurate machine learning models at low computational cost. It frees data scientists from worrying about hyperparameter tuning and model selection, and it enables developers to build self-tuning software that adjusts itself as new training data arrives. It is fast, economical, and easy to use.<\/p>\n As more and more businesses build millions of ML-embedded applications, manually choosing the right training algorithm and tuning the hyperparameters for every task and every dataset adds up to a large cost. The massive consumption of computational resources in tuning machine learning models also places a heavy burden on the environment.<\/p>\n We build an economical AutoML system that handles tuning tasks robustly and efficiently. FLAML leverages the structure of the search space to choose a search order optimized for both cost and model quality. Overall, the search tends to move gradually from cheap trials to expensive trials while improving model accuracy.<\/p>\n\t\t\t Trial cost vs. total time spent in AutoML. 
Each marker corresponds to one trial of configuration evaluation. Triangles mark FLAML; circles mark a typical existing AutoML library.<\/p><\/div> \t<\/div>\n\t \t Model AUC regret vs. total time spent in AutoML. Each marker corresponds to one trial of configuration evaluation. Triangles mark FLAML; circles mark a typical existing AutoML library.<\/p><\/div> \t<\/div>\n\t<\/p>\t\t\t<\/div>\n\t\t<\/div>\n\t\t\n Our design enables FLAML to adapt robustly to an ad-hoc dataset out of the box.<\/p>\n Box plot of normalized score differences between FLAML and (1) Auto-sklearn, (2) a cloud-based AutoML service, (3) HpBandSter, (4) H2O AutoML, and (5) TPOT under an equal budget, tested on 53 AutoML benchmark datasets covering classification and regression tasks across a wide range of scales. Positive values mean FLAML is better.<\/p><\/div>\n FLAML can be easily installed with pip install flaml.<\/p>\n The package has become an indispensable tool for our GBM model builds. I highly recommend it to all statistical modelers and data scientists.<\/p>\n \u2014 Bingyi Yang, VP. 
Decisions Science at Global Lending Services LLC.<\/p>\n<\/blockquote>\n To learn more about FLAML, please check out our GitHub repo and the notebook examples. We look forward to hearing your feedback, questions, and stories, and welcome all contributions.<\/p>\n","protected":false},"excerpt":{"rendered":" Accelerate development of machine learning applications for engineers and data scientists<\/p>\n","protected":false},"author":31406,"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-content-parent":620280,"msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-locale":[268875],"msr-post-option":[],"class_list":["post-701041","msr-blog-post","type-msr-blog-post","status-publish","hentry","msr-locale-en_us"],"msr_assoc_parent":{"id":620280,"type":"project"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/701041","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-blog-post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/31406"}],"version-history":[{"count":20,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/701041\/revisions"}],"predecessor-version":[{"id":770518,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/701041\/revisions\/770518"}],"wp:attachment":[{"href":"https:\/\/www.micr
osoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=701041"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=701041"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=701041"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=701041"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}Problem<\/h3>\n
Solution<\/h3>\n
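To make the idea of moving from cheap trials to expensive trials more concrete, here is a toy sketch in plain Python. It is not FLAML's actual search algorithm: the `simulated_trial` function, its error and cost formulas, and the doubling rule are all assumptions made up for illustration. It only shows the general shape of a search that starts at the cheapest configuration and pays for larger ones while they keep improving the error.

```python
import math


def simulated_trial(n_trees):
    """Hypothetical stand-in for training a model; returns (error, cost)."""
    # Assumed shapes: error flattens out as the model grows, cost grows linearly.
    error = 0.10 + 0.5 / math.sqrt(n_trees)
    cost = 0.01 * n_trees
    return error, cost


def cheap_first_search(budget, start=4, max_trees=4096, min_gain=1e-3):
    """Toy local search: begin with the cheapest config, then double its size."""
    current = start
    best_error, cost = simulated_trial(current)
    spent = cost
    history = [(current, best_error, cost)]
    while spent < budget and current * 2 <= max_trees:
        candidate = current * 2
        error, cost = simulated_trial(candidate)
        spent += cost
        history.append((candidate, error, cost))
        if error < best_error - min_gain:
            # The more expensive config paid off, so move to it.
            current, best_error = candidate, error
        else:
            # No meaningful gain: stop spending on larger models.
            break
    return current, best_error, history


best_size, best_error, history = cheap_first_search(budget=5.0)
for n_trees, error, cost in history:
    print(f"n_trees={n_trees:5d}  error={error:.4f}  cost={cost:.2f}")
```

Each printed row is one trial; the cost column only grows as the error shrinks, which is the qualitative pattern the trial-cost figure describes.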



How to get started<\/h3>\n
pip install flaml.<\/p>\n\n
from flaml import AutoML
\nautoml = AutoML()
\nautoml.fit(X_train, y_train, task=\"classification\")<\/p>\n\n
automl.fit(X_train, y_train, task=\"regression\", estimator_list=[\"lgbm\"])<\/p>\n\n
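The snippets above assume `X_train` and `y_train` already exist. A fuller sketch, under some assumptions of ours, might look like the following: the scikit-learn breast cancer dataset, the 75/25 split, the 10-second `time_budget`, and the helper name `fit_quick_classifier` are illustrative choices, not part of the original post. It also assumes `flaml`, `scikit-learn`, and `lightgbm` are installed; recent FLAML releases may need `pip install "flaml[automl]"` for the AutoML extras.

```python
def fit_quick_classifier(time_budget_s=10):
    """Fit FLAML's AutoML on a small dataset; return (best learner, test accuracy)."""
    # Imported inside the function so this file still loads when the
    # optional packages are not installed.
    from flaml import AutoML
    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )

    automl = AutoML()
    # time_budget is the search budget in seconds; estimator_list restricts
    # the search to LightGBM, as in the regression snippet above.
    automl.fit(
        X_train=X_train,
        y_train=y_train,
        task="classification",
        time_budget=time_budget_s,
        estimator_list=["lgbm"],
    )
    predictions = automl.predict(X_test)
    return automl.best_estimator, accuracy_score(y_test, predictions)
```

Calling `fit_quick_classifier()` runs a short search and reports the selected learner together with its held-out accuracy; a larger `time_budget_s` gives the search more room to try expensive configurations.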
fit().<\/li>\n<\/ul>\nfrom flaml import tune
\ntune.run(training_function, config={\"learning_rate\": tune.loguniform(lower=1e-5, upper=1.0), \"num_epochs\": tune.loguniform(lower=1, upper=100)}, init_config={\"num_epochs\": 1}, time_budget_s=3600)
\n<\/p>\nCustomer quote<\/h3>\n
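The `tune.run` call shown earlier leaves two things implicit: what a log-uniform range means and what the `config` passed to `training_function` looks like. The sketch below illustrates both in plain Python without calling FLAML. The `training_function` body and its loss formula are hypothetical, and the loop is ordinary random search, which is a deliberate simplification and not how FLAML explores the space.

```python
import math
import random


def sample_loguniform(lower, upper, rng):
    """Draw a value whose logarithm is uniform on [log(lower), log(upper)]."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


def training_function(config):
    """Hypothetical objective: pretends the best learning rate is near 1e-2."""
    learning_rate = config["learning_rate"]
    num_epochs = config["num_epochs"]
    # Simulated loss: distance from 1e-2 in log space, plus a term that
    # shrinks as the (more expensive) number of epochs grows.
    return (math.log10(learning_rate) + 2.0) ** 2 + 1.0 / num_epochs


rng = random.Random(0)
# Same ranges as the tune.run call. Values are floats, so a real training
# loop would likely round num_epochs to an integer.
samples = [
    {
        "learning_rate": sample_loguniform(1e-5, 1.0, rng),
        "num_epochs": sample_loguniform(1, 100, rng),
    }
    for _ in range(200)
]
best = min(samples, key=training_function)
print(best, training_function(best))
```

Log-uniform sampling spends as many draws between 1e-5 and 1e-4 as between 0.1 and 1.0, which suits parameters such as learning rates whose useful values span several orders of magnitude.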