{"id":743035,"date":"2021-04-29T20:59:19","date_gmt":"2021-04-30T03:59:19","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=743035"},"modified":"2024-04-25T12:25:42","modified_gmt":"2024-04-25T19:25:42","slug":"cost-ef%ef%ac%81cient-overclocking-in-immersion-cooled-datacenters","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/cost-ef%ef%ac%81cient-overclocking-in-immersion-cooled-datacenters\/","title":{"rendered":"Cost-Efficient Overclocking in Immersion-Cooled Datacenters"},"content":{"rendered":"
Cloud providers typically use air-based solutions for cooling servers in datacenters.\u00a0 However, increasing transistor counts and the end of Dennard scaling will result in chips with thermal design power that exceeds the capabilities of air cooling in the near future.\u00a0 Consequently, providers have started to explore liquid cooling solutions (e.g., cold plates, immersion cooling) for the most power-hungry workloads.\u00a0 By keeping the servers cooler, these new solutions enable providers to operate server components beyond the normal frequency range (i.e., overclocking them) all the time.\u00a0 Still, providers must trade off the performance gained through overclocking against its higher power draw and any component reliability implications.\u00a0 In this paper, we argue that two-phase immersion cooling (2PIC) is the most promising technology, and build three prototype 2PIC tanks.\u00a0 Given the benefits of 2PIC, we characterize the impact of overclocking on performance, power, and reliability.\u00a0 Moreover, we propose several new scenarios for taking advantage of overclocking in cloud platforms, including oversubscribing servers and virtual machine (VM) auto-scaling.\u00a0 For the auto-scaling scenario, we build a system that leverages overclocking either to hide the latency of VM creation or to postpone VM creation in the hope that the new VMs will not be needed.\u00a0 Using realistic cloud workloads running on a tank prototype, we show that overclocking can improve performance by 20%, increase VM packing density by 20%, and improve tail latency in auto-scaling scenarios by 54%.\u00a0 The combination of 2PIC and overclocking can reduce platform cost by up to 13% compared to air cooling.<\/p>\n
<\/p>\n","protected":false},"excerpt":{"rendered":"
Cloud providers typically use air-based solutions for cooling servers in datacenters.\u00a0 However, increasing transistor counts and the end of Dennard scaling will result in chips with thermal design power that exceeds the capabilities of air cooling in the near future.\u00a0 Consequently, providers have started to explore liquid cooling solutions (e.g., cold plates, immersion cooling) for […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[264846],"msr-pillar":[],"class_list":["post-743035","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2021-6-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/04\/Zissou-Overclocking-ISCA21.pdf","id":"
743038","title":"zissou-overclocking-isca21","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":743038,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/04\/Zissou-Overclocking-ISCA21.pdf"}],"msr-author-ordering":[{"type":"text","value":"Majid Jalili","user_id":0,"rest_url":false},{"type":"text","value":"Ioannis Manousakis","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Íñigo Goiri","user_id":32102,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Íñigo Goiri"},{"type":"user_nicename","value":"Pulkit Misra","user_id":38496,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Pulkit Misra"},{"type":"text","value":"Ashish Raniwala","user_id":0,"rest_url":false},{"type":"text","value":"Husam Alissa","user_id":0,"rest_url":false},{"type":"text","value":"Bharath Ramakrishnan","user_id":0,"rest_url":false},{"type":"text","value":"Phillip Tuma","user_id":0,"rest_url":false},{"type":"text","value":"Christian Belady","user_id":0,"rest_url":false},{"type":"text","value":"Marcus Fontoura","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ricardo Bianchini","user_id":33393,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ricardo Bianchini"}],"msr_impact_theme":["Computing foundations"],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144927,282170],"msr_project":[757045],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":757045,"post_title":"Zissou: New datacenter, server, and software architectures for liquid-cooled systems","post_name":"zissou-new-datacenter-server-and-software-architectures-for-liquid-cooled-systems","post_type":"msr-project","post_date":"2021-06-27 18:19:18","post_modified":"2025-02-05 
11:05:04","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/zissou-new-datacenter-server-and-software-architectures-for-liquid-cooled-systems\/","post_excerpt":"The Zissou project is exploring immersion cooling in large-scale cloud platforms. Our main motivation is that chip power has been steadily increasing since the end of Dennard scaling.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/757045"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/743035","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/743035\/revisions"}],"predecessor-version":[{"id":1028772,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/743035\/revisions\/1028772"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=743035"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=743035"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=743035"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=743035"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=743035"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\
/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=743035"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=743035"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=743035"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=743035"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=743035"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=743035"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=743035"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=743035"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=743035"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=743035"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=743035"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}