{"id":486965,"date":"2018-05-19T13:01:26","date_gmt":"2018-05-19T20:01:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=486965"},"modified":"2018-10-16T22:26:07","modified_gmt":"2018-10-17T05:26:07","slug":"mlbench-benchmarking-machine-learning-services-human-experts","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/mlbench-benchmarking-machine-learning-services-human-experts\/","title":{"rendered":"MLBench: Benchmarking Machine Learning Services Against Human Experts"},"content":{"rendered":"

Modern machine learning services and systems are complicated <\/span>data systems \u2014 the process of designing such systems is an art <\/span>of compromising between functionality, performance, and quality. <\/span>Providing different levels of system supports for different functionalities, <\/span>such as automatic feature engineering, model selection and <\/span>ensemble, and hyperparameter tuning, could improve the quality, <\/span>but also introduce additional cost and system complexity. In this <\/span>paper, we try to facilitate the process of asking the following type <\/span>of questions: How much will the users lose if we remove the support <\/span>of functionality x from a machine learning service?<\/span>

Answering this question using existing datasets, such as the UCI <\/span>datasets, is challenging. The main contribution of this work is <\/span>a novel dataset, mlbench, harvested from Kaggle competitions. <\/span>Unlike existing datasets, mlbench contains not only the raw features <\/span>for a machine learning task, but also those used by the winning <\/span>teams of Kaggle competitions. The winning features serve as a <\/span>baseline of best human effort that enables multiple ways to measure <\/span>the quality of machine learning services that cannot be supported <\/span>by existing datasets, such as relative ranking on Kaggle and relative <\/span>accuracy compared with best-effort systems.<\/span>

We then conduct an empirical study using mlbench to understand <\/span>two example machine learning services from Azure and Amazon, <\/span>and showcase how mlbench enables a comparative study revealing <\/span>the strength and weakness of these existing machine learning <\/span>services quantitatively and systematically.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"

Modern machine learning services and systems are complicated data systems \u2014 the process of designing such systems is an art of compromising between functionality, performance, and quality. Providing different levels of system supports for different functionalities, such as automatic feature engineering, model selection and ensemble, and hyperparameter tuning, could improve the quality, but also introduce […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13563],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-486965","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-data-platform-analytics","msr-locale-en_us"],"msr_publishername":"","msr_edition":"Proceedings of the VLDB Endowment (VLDB 2018)","msr_affiliation":"","msr_published_date":"2018-05-17","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"http:\/\/www.vldb.org\/pvldb\/vol11.html","msr_doi":"","msr_publication_uploader":[{"type":"url","title":"http:\/\/www.vldb.org\/pvldb\/vol11.html","viewUrl":false,"id":false,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":0,"url":"http:\/\/www.vldb.org\/pvldb\/vol11.html"}],"msr-author-ordering":[{"type":"text","value":"Yu Liu","user_id":0,"rest_url":false},{"type":"text","value":"Hantian Zhang","user_id":0,"rest_url":false},{"type":"text","value":"Luyuan Zeng","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Wentao Wu","user_id":34824,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Wentao Wu"},{"type":"text","value":"Ce Zhang","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[957177],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/486965"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/486965\/revisions"}],"predecessor-version":[{"id":486974,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/486965\/revisions\/486974"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=486965"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=486965"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=486965"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=486965"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=486965"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=486965"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=486965"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=486965"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=486965"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=486965"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=486965"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=486965"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=486965"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=486965"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=486965"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=486965"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}