{"id":568524,"date":"2019-02-19T16:08:01","date_gmt":"2019-02-20T00:08:01","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=568524"},"modified":"2021-12-07T14:33:45","modified_gmt":"2021-12-07T22:33:45","slug":"an-algorithmic-framework-for-geo-distributed-analytics","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/an-algorithmic-framework-for-geo-distributed-analytics\/","title":{"rendered":"An Algorithmic Framework for Geo-Distributed Analytics"},"content":{"rendered":"

Large scale cloud enterprises operate tens to hundreds of datacenters, running a variety of services that produce enormous amounts of data, such as search clicks and infrastructure operation logs. A recent research direction in both academia and industry is to attempt to process the \u201cbig data\u201d in multiple datacenters, as the
\nalternative of centralized processing might be too slow and costly (e.g., due to transferring all the data to a single location). Running such geo-distributed analytics jobs at scale gives rise to key resource management decisions: Where should each of the computations take place? Accordingly, which data should be moved to which location, and when? Which network paths should be used for moving the data, etc. These decisions are complicated not only because they involve the scheduling of multiple types of resources (e.g., compute and network), but also due to the complicated internal data flow of the jobs \u2013 typically structured as a DAG of tens of stages, each of which with up to thousands of tasks. Recent work has dealt with the resource management problem by abstracting away certain aspects of the problem, such as the physical network connecting the datacenters, the DAG structure of the jobs and\/or the compute capacity constraints at the (possibly heterogeneous) datacenters. In this paper, we provide the first analytical model that includes all aspects of the problem, with the objective of minimizing the makespan of multiple geo-distributed jobs. We provide exact and approximate algorithms for certain practical scenarios, and suggest principled heuristics for other scenarios of interest.<\/p>\n","protected":false},"excerpt":{"rendered":"

Large scale cloud enterprises operate tens to hundreds of datacenters, running a variety of services that produce enormous amounts of data, such as search clicks and infrastructure operation logs. A recent research direction in both academia and industry is to attempt to process the \u201cbig data\u201d in multiple datacenters, as the alternative of centralized processing […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13561,13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-568524","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2018-11-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":0,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/02\/gdaNETGCOOP.pdf","id":"568527","title":"gdanetgcoop","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":568536,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/02\/gdaNETGCOOP-5c6c9a4ad67e0.pdf"},{"id":568527,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/02\/gdaNETGCOOP.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"Srikanth Kandula","user_id":33707,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Srikanth Kandula"},{"type":"user_nicename","value":"Ishai Menache","user_id":32116,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ishai Menache"},{"type":"text","value":"Joseph (Seffi) Naor","user_id":0,"rest_url":false},{"type":"text","value":"Erez Timnat","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144899],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/568524"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/568524\/revisions"}],"predecessor-version":[{"id":802564,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/568524\/revisions\/802564"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=568524"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=568524"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=568524"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=568524"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=568524"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=568524"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=568524"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=568524"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=568524"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=568524"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=568524"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=568524"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=568524"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=568524"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=568524"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}