FLAML: A Fast and Lightweight AutoML Library

February 6, 2021

FLAML is a Python package that automatically finds accurate machine learning models at low computational cost. It frees data scientists from worrying about hyperparameter tuning and model selection, and it enables developers to build self-tuning software that adapts itself to new training data. It is fast, economical, and easy to use.

Problem

A growing number of businesses are building millions of ML-embedded applications. Manually choosing the right training algorithm and tuning the hyperparameters for every task and every dataset adds up to a large cost. The massive consumption of computational resources in tuning machine learning models also places a heavy burden on the environment.

Solution

We build an economical AutoML system that handles tuning tasks robustly and efficiently. FLAML leverages the structure of the search space to choose a search order optimized for both cost and model quality. Overall, the search tends to gradually move from cheap trials to expensive trials while improving model accuracy.

Figure: Trial cost vs. total time spent in AutoML. Each marker corresponds to one trial of configuration evaluation. Triangles mark FLAML; circles mark a typical existing AutoML library.

Figure: Model AUC regret vs. total time spent in AutoML. Each marker corresponds to one trial of configuration evaluation. Triangles mark FLAML; circles mark a typical existing AutoML library.
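The cheap-to-expensive search order described above can be illustrated with a toy sketch. This is not FLAML's actual search algorithm; it is a minimal randomized local search written only to show the idea. The functions toy_error, toy_cost, and cost_aware_search, the single "number of trees" hyperparameter, and all constants are hypothetical. The search starts from the cheapest configuration and only proposes neighbors within a factor of two of the current best, so an expensive trial is attempted only after cheaper ones have already paid off.

```python
import math
import random


def toy_error(n_trees: int) -> float:
    """Hypothetical validation error: falls as the model grows, then flattens."""
    return 0.1 + 0.4 / math.sqrt(n_trees)


def toy_cost(n_trees: int) -> float:
    """Hypothetical trial cost, proportional to model size."""
    return float(n_trees)


def cost_aware_search(budget: float, seed: int = 0):
    """Randomized local search that starts cheap and grows only on improvement."""
    rng = random.Random(seed)
    best = 4  # cheapest configuration in this toy search space
    best_err = toy_error(best)
    spent = toy_cost(best)
    history = [(best, best_err)]
    while True:
        # Propose a neighbor in log space: between half and double the current best.
        factor = 2.0 ** rng.uniform(-1.0, 1.0)
        candidate = max(4, min(4096, round(best * factor)))
        cost = toy_cost(candidate)
        if spent + cost > budget:
            break  # the next trial would exceed the budget
        spent += cost
        err = toy_error(candidate)
        history.append((candidate, err))
        if err < best_err:
            # Move to the better configuration; later proposals scale with it.
            best, best_err = candidate, err
    return best, best_err, history


best, best_err, history = cost_aware_search(budget=2000.0)
print(f"best n_trees={best}, error={best_err:.3f}, trials={len(history)}")
```

Because each proposal is bounded by twice the current best, the cost of a single trial rises gradually over the run, which is the qualitative pattern the first figure describes for FLAML.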

Our design enables FLAML to adapt robustly to an ad-hoc dataset out of the box.

Figure: Box plot of the normalized score difference between FLAML and (1) Auto-sklearn, (2) a cloud-based AutoML service, (3) HpBandSter, (4) H2O AutoML, and (5) TPOT, each given an equal budget. The comparison covers 53 AutoML benchmark datasets, including classification and regression tasks at a wide variety of scales. A positive value means FLAML is better.

How to get started

FLAML can be installed with pip install flaml.

With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.

from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")

You can restrict the learners and use FLAML as a fast hyperparameter tuning tool for XGBoost, LightGBM, Random Forest, and so on, or for a custom learner.

automl.fit(X_train, y_train, task="regression", estimator_list=["lgbm"])

You can also run generic model tuning beyond the scikit-learn style fit().

from flaml import tune

# training_function is your own function that trains and evaluates one configuration.
tune.run(
    training_function,
    config={
        "learning_rate": tune.loguniform(lower=1e-5, upper=1.0),
        "num_epochs": tune.loguniform(lower=1, upper=100),
    },
    init_config={"num_epochs": 1},
    time_budget_s=3600,
)
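To make the search space above more concrete: a log-uniform range such as the one for learning_rate means that values are drawn uniformly on a logarithmic scale, so each decade between 1e-5 and 1.0 is equally likely. The sketch below reproduces that idea with the standard library only. The helper sample_loguniform is hypothetical and is not part of FLAML. Note that a log-uniform draw is a float, so a value such as num_epochs may need rounding before the training code uses it.

```python
import math
import random


def sample_loguniform(lower: float, upper: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(lower), log(upper)]."""
    value = math.exp(rng.uniform(math.log(lower), math.log(upper)))
    # Clamp to guard against floating-point drift at the range boundaries.
    return min(upper, max(lower, value))


rng = random.Random(0)
config = {
    "learning_rate": sample_loguniform(1e-5, 1.0, rng),
    # Round the float draw because an epoch count must be an integer.
    "num_epochs": round(sample_loguniform(1, 100, rng)),
}
print(config)
```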

Customer quote

The package has become an indispensable tool for our GBM model builds. I highly recommend it to all statistical modelers and data scientists.

— Bingyi Yang, VP. Decisions Science at Global Lending Services LLC.

To learn more about FLAML, please check out our GitHub repo and the notebook examples. We look forward to hearing your feedback, questions, and stories, and welcome all contributions.