{"id":1016919,"date":"2024-03-20T09:56:21","date_gmt":"2024-03-20T16:56:21","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=1016919"},"modified":"2024-07-18T20:27:57","modified_gmt":"2024-07-19T03:27:57","slug":"smartoclock-workload-and-risk-aware-overclocking-in-the-cloud","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/smartoclock-workload-and-risk-aware-overclocking-in-the-cloud\/","title":{"rendered":"SmartOClock: Workload- and Risk-Aware Overclocking in the Cloud"},"content":{"rendered":"

Operating server components beyond their voltage and power design limits (i.e., overclocking) enables improving performance and lowering cost for cloud workloads. However, overclocking can significantly degrade component lifetime, increase power consumption, and cause power capping events, eventually diminishing the performance benefits.<\/p>\n

In this paper, we characterize the impact of overclocking on cloud workloads by studying their profiles from production deployments. Based on the characterization insights, we propose SmartOClock, the first distributed overclocking management platform specifically designed for cloud environments. SmartOClock is a workload-aware scheme that relies on power predictions to heterogeneously distribute the power budgets across its servers based on their needs and then enforce budget compliance locally, per-server, in a decentralized manner.<\/p>\n

SmartOClock reduces the tail latency by 9%, application cost by 30% and total energy consumption by 10% for latency-sensitive microservices on a 36-server deployment. Simulation analysis using production traces shows that SmartOClock reduces the number of power capping events by up to 95% while increasing the overclocking success rate by up to 62%. We also describe lessons from building a first-of-its-kind overclockable cluster at a cloud provider for production experiments.<\/p>\n","protected":false},"excerpt":{"rendered":"

Operating server components beyond their voltage and power design limits (i.e., overclocking) enables improving performance and lowering cost for cloud workloads. However, overclocking can significantly degrade component lifetime, increase power consumption, and cause power capping events, eventually diminishing the performance benefits. In this paper, we characterize the impact of overclocking on cloud workloads by studying […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-field-of-study":[246691],"msr-conference":[259546],"msr-journal":[],"msr-impact-theme":[264846],"msr-pillar":[],"class_list":["post-1016919","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us","msr-field-of-study-computer-science"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-6-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2024\/03\/Sm
artOClock_ISCA24.pdf","id":"1029366","title":"smartoclock_isca24","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":1029366,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2024\/04\/SmartOClock_ISCA24.pdf"},{"id":1016925,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2024\/03\/SmartOClock_ISCA.pdf"}],"msr-author-ordering":[{"type":"text","value":"Jovan Stojkovic","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Pulkit Misra","user_id":38496,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Pulkit Misra"},{"type":"user_nicename","value":"Íñigo Goiri","user_id":32102,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Íñigo Goiri"},{"type":"user_nicename","value":"Sam Whitlock","user_id":41024,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Sam Whitlock"},{"type":"user_nicename","value":"Esha Choukse","user_id":40417,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Esha Choukse"},{"type":"user_nicename","value":"Mayukh Das","user_id":41140,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Mayukh Das"},{"type":"user_nicename","value":"Chetan Bansal","user_id":31394,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Chetan Bansal"},{"type":"text","value":"Jason Lee","user_id":0,"rest_url":false},{"type":"text","value":"Zoey Sun","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Haoran Qiu","user_id":43428,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Haoran Qiu"},{"type":"text","value":"Reed 
Zimmermann","user_id":0,"rest_url":false},{"type":"text","value":"Savyasachi Samal","user_id":0,"rest_url":false},{"type":"guest","value":"brijesh-warrier","user_id":956994,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=brijesh-warrier"},{"type":"text","value":"Ashish Raniwala","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ricardo Bianchini","user_id":33393,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ricardo Bianchini"}],"msr_impact_theme":["Computing foundations"],"msr_research_lab":[],"msr_event":[],"msr_group":[282170,793670,811276,998211],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016919"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016919\/revisions"}],"predecessor-version":[{"id":1058478,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016919\/revisions\/1058478"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1016919"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=1016919"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=1016919"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-a
rea?post=1016919"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=1016919"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=1016919"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=1016919"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=1016919"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=1016919"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1016919"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=1016919"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=1016919"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=1016919"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1016919"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1016919"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}