\u00a0<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"Cloud providers and datacenter operators are grappling with massive demand for graphics processing units (GPUs) due to surging use of large language models (LLMs).\u00a0To try to keep up, enterprises are building new GPU clusters to run LLM workloads, which in turn are running into an energy wall worldwide.\u00a0Power oversubscription and adding more servers to existing […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246691],"msr-conference":[259537],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1016907","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us","msr-field-of-study-computer-science"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-4-27","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/03\/GPU_Power_ASPLOS_24.pdf","id":"1016916","title":"gpu_power_asplos_24","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":1016916,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/03\/GPU_Power_ASPLOS_24.pdf"}],"msr-author-ordering":[{"type":"text","value":"Pratyush Patel","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Esha Choukse","user_id":40417,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Esha Choukse"},{"type":"user_nicename","value":"Chaojie Zhang","user_id":42705,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Chaojie Zhang"},{"type":"user_nicename","value":"\u00cd\u00f1igo Goiri","user_id":32102,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=\u00cd\u00f1igo Goiri"},{"type":"guest","value":"brijesh-warrier","user_id":956994,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=brijesh-warrier"},{"type":"text","value":"Nithish Mahalingam","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ricardo Bianchini","user_id":33393,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ricardo Bianchini"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[282170],"msr_project":[1017939],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":1017939,"post_title":"Efficient AI","post_name":"efficient-ai","post_type":"msr-project","post_date":"2024-03-22 17:14:57","post_modified":"2024-09-06 14:53:30","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/efficient-ai\/","post_excerpt":"Making Azure's big bet possible Recent innovations in generative large language models (LLMs) have made their applications and use-cases ubiquitous. This has led to large-scale deployments of these models, using complex, expensive, and power-hungry AI accelerators, most commonly GPUs. These developments make LLM training and inference efficiency an important challenge. In the Azure Research - Systems (opens in new tab) group we are working on improving the Azure infrastructure including hardware, power, and serving. Check…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1017939"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016907","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016907\/revisions"}],"predecessor-version":[{"id":1019928,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/1016907\/revisions\/1019928"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1016907"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=1016907"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=1016907"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1016907"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=1016907"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=1016907"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=1016907"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=1016907"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=1016907"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1016907"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1016907"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=1016907"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=1016907"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=1016907"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1016907"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1016907"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}