{"id":702511,"date":"2020-10-31T17:02:51","date_gmt":"2020-11-01T00:02:51","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=702511"},"modified":"2021-11-13T00:07:53","modified_gmt":"2021-11-13T08:07:53","slug":"prediction-based-power-oversubscription-in-cloud-platforms","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/prediction-based-power-oversubscription-in-cloud-platforms\/","title":{"rendered":"Prediction-Based Power Oversubscription in Cloud Platforms"},"content":{"rendered":"

Datacenter designers rely on conservative estimates of IT equipment power draw to provision resources. This leaves resources underutilized and requires more datacenters to be built. Prior work has used power capping to shave the rare power peaks and add more servers to the datacenter, thereby oversubscribing its resources and lowering capital costs. This works well when the workloads and their server placements are known. Unfortunately, these factors are unknown in public clouds, forcing providers to limit the oversubscription so that performance is never impacted.<\/p>\n

In this paper, we argue that providers can use predictions of workload performance criticality and virtual machine (VM) resource utilization to increase oversubscription. This poses many challenges, such as identifying the performance-critical workloads from black-box VMs, creating support for criticality-aware power management, and increasing oversubscription while limiting the impact of capping. We address these challenges for the hardware and software infrastructures of Microsoft Azure. The results show that we enable a 2x increase in oversubscription with minimum impact to critical workloads.<\/p>\n","protected":false},"excerpt":{"rendered":"

Datacenter designers rely on conservative estimates of IT equipment power draw to provision resources. This leaves resources underutilized and requires more datacenters to be built. Prior work has used power capping to shave the rare power peaks and add more servers to the datacenter, thereby oversubscribing its resources and lowering capital costs. This works well […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-702511","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2021-7-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"USENIX","msr_how_published":"","msr_notes":"Earlier version published as arXiv:2010.15388, October 2020","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2020\/10\/Per-VM-Capping-ATC21.pdf","id":"751480","title":"per-vm-capping-atc21","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":751480,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2021\/06\/Per-VM-Capping-ATC21.pdf"},{"id":702514,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2020\/10\/Per-VM-capping-and-oversubscription-Arxiv.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"Alok Kumbhare","user_id":36086,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Alok Kumbhare"},{"type":"text","value":"Reza Azimi","user_id":0,"rest_url":false},{"type":"text","value":"Ioannis Manousakis","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Anand Bonde","user_id":36068,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Anand Bonde"},{"type":"user_nicename","value":"Felipe Vieira Frujeri","user_id":38139,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Felipe Vieira Frujeri"},{"type":"text","value":"Nithish Mahalingam","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Pulkit Misra","user_id":38496,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Pulkit Misra"},{"type":"text","value":"Seyyed Ahmad Javadi","user_id":0,"rest_url":false},{"type":"text","value":"Bianca Schroeder","user_id":0,"rest_url":false},{"type":"text","value":"Marcus Fontoura","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ricardo Bianchini","user_id":33393,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ricardo Bianchini"}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144927,282170],"msr_project":[615975,573039],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":615975,"post_title":"Power Efficiency and Sustainability","post_name":"power-capping","post_type":"msr-project","post_date":"2019-10-18 11:34:18","post_modified":"2023-06-06 15:30:44","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/power-capping\/","post_excerpt":"Power Capping and Oversubscription is a collaboration between MSR, Azure Compute, CO+I, and AHSI to harvest stranded datacenter resources via smart performance-aware power capping and oversubscription.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/615975"}]}},{"ID":573039,"post_title":"Machine Learning for Systems and Tiered AIOps","post_name":"resource-central","post_type":"msr-project","post_date":"2019-03-12 17:02:33","post_modified":"2023-06-06 15:49:16","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/resource-central\/","post_excerpt":"Resource Central\u00a0is a general ML and prediction-serving system deployed in Azure Compute.\u00a0 It trains ML models offline and uses them to produce predictions online.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/573039"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/702511"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/702511\/revisions"}],"predecessor-version":[{"id":702517,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/702511\/revisions\/702517"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=702511"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=702511"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=702511"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=702511"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=702511"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=702511"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=702511"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=702511"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=702511"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=702511"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=702511"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=702511"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=702511"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=702511"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=702511"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=702511"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}