{"id":1029186,"date":"2024-04-29T13:30:01","date_gmt":"2024-04-29T20:30:01","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1029186"},"modified":"2024-04-29T13:30:03","modified_gmt":"2024-04-29T20:30:03","slug":"microsoft-at-asplos-2024-advancing-hardware-and-software-for-high-scale-secure-and-efficient-modern-applications","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-at-asplos-2024-advancing-hardware-and-software-for-high-scale-secure-and-efficient-modern-applications\/","title":{"rendered":"Microsoft at ASPLOS 2024: Advancing hardware and software for high-scale, secure, and efficient modern applications"},"content":{"rendered":"\n
\"ASPLOS<\/figure>\n\n\n\n

Modern computer systems and applications, with unprecedented scale, complexity, and security needs, require careful co-design and co-evolution of hardware and software. The ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)<\/a> is the main forum where researchers bridge the gap between architecture, programming languages, and operating systems to advance the state of the art.<\/p>\n\n\n\n

ASPLOS 2024 is taking place in San Diego between April 27 and May 1, and Microsoft researchers and collaborators have a strong presence, with members of our team taking on key roles in organizing the event. This includes participation in the program and external review committees and leadership as program co-chair.<\/p>\n\n\n\n

We are pleased to share that eight papers from Microsoft researchers and their collaborators have been accepted to the conference, spanning a broad spectrum of topics. In the field of AI and deep learning, subjects include power and frequency management for GPUs and LLMs, the use of processing-in-memory for deep learning, and instrumentation frameworks. Regarding infrastructure, topics include memory safety with CHERI, I\/O prefetching in modern storage, and smart oversubscription of burstable virtual machines. This post highlights some of this work.<\/p>\n\n\n\n\t

\n\t\t\n\n\t\t


Paper highlights<\/h2>\n\n\n\n

Characterizing Power Management Opportunities for LLMs in the Cloud<\/a><\/h3>\n\n\n\n

The rising popularity of LLMs and generative AI has led to an unprecedented demand for GPUs. However, the availability of power is a key limiting factor in expanding a GPU fleet. This paper characterizes the power usage in LLM clusters, examines the power consumption patterns across multiple LLMs, and identifies the differences between inference and training power consumption patterns. This investigation reveals that average and peak power consumption in inference clusters are not very high, leaving substantial headroom for power oversubscription. Consequently, the authors propose POLCA: a framework for power oversubscription that is robust, reliable, and readily deployable for GPU clusters. POLCA makes it possible to deploy 30% more servers in the same GPU clusters for inference tasks, with minimal performance degradation.<\/p>\n\n\n\n

PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization<\/a><\/h3>\n\n\n\n

PIM-DL is the first deep learning framework specifically designed for off-the-shelf processing-in-memory (PIM) systems, capable of offloading most computations in neural networks. Its goal is to surmount the computational limitations of PIM hardware by replacing traditional compute-heavy matrix multiplication operations with lookup tables (LUTs). This allows neural networks to operate efficiently on PIM architectures while significantly reducing the need for complex arithmetic operations. PIM-DL achieves up to ~37x faster performance than traditional GEMM-based systems and shows competitive speedups against CPUs and GPUs.<\/p>\n\n\n\n
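To make the idea of swapping matrix multiplication for table lookups concrete, here is a minimal sketch of one well-known way to do it: a product-quantization-style approximation. This is an illustrative assumption, not PIM-DL's actual implementation; the names nearest, build_tables, lut_matvec, and the codebooks layout are all hypothetical. The input vector is split into sub-vectors of length V, each sub-vector is snapped to its nearest centroid, and the output is formed by summing precomputed centroid-times-weight rows, so the multiply-accumulate over the weights happens once at table-build time rather than per input.

```python
def nearest(sub, centroids):
    # Index of the centroid closest (squared Euclidean distance) to sub.
    best, best_d = 0, float('inf')
    for k, cen in enumerate(centroids):
        d = sum((a - b) ** 2 for a, b in zip(sub, cen))
        if d < best_d:
            best, best_d = k, d
    return best


def build_tables(W, codebooks, V):
    # Precompute tables[c][k][n] = dot(codebooks[c][k], W[c*V:(c+1)*V][.][n]).
    # W is a list of D rows with N columns; D must equal len(codebooks) * V.
    N = len(W[0])
    tables = []
    for c, cb in enumerate(codebooks):
        rows = W[c * V:(c + 1) * V]
        tables.append([
            [sum(cen[v] * rows[v][n] for v in range(V)) for n in range(N)]
            for cen in cb
        ])
    return tables


def lut_matvec(x, codebooks, tables, V):
    # Approximate x @ W using only nearest-centroid search, lookups, and adds.
    N = len(tables[0][0])
    y = [0.0] * N
    for c, cb in enumerate(codebooks):
        k = nearest(x[c * V:(c + 1) * V], cb)
        for n in range(N):
            y[n] += tables[c][k][n]
    return y
```

The result is exact when every sub-vector of the input coincides with a centroid and approximate otherwise; how the centroids are learned and how lookups are mapped onto PIM hardware are outside the scope of this sketch.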

Cornucopia Reloaded: Load Barriers for CHERI Heap Temporal Safety<\/a><\/h3>\n\n\n\n

Memory safety bugs have persistently plagued software for over 50 years and underpin some 70% of common vulnerabilities and exposures (CVEs) every year. The CHERI capability architecture<\/a> is an emerging technology<\/a> (especially through Arm\u2019s Morello<\/a> and Microsoft\u2019s CHERIoT<\/a> platforms) for spatial memory safety and software compartmentalization. In this paper, the authors demonstrate the viability of object-granularity heap temporal safety built atop CHERI with considerably lower overheads than prior work.<\/p>\n\n\n\n

AUDIBLE: A Convolution-Based Resource Allocator for Oversubscribing Burstable Virtual Machines<\/a><\/h3>\n\n\n\n

Burstable virtual machines (BVMs) are a type of virtual machine in the cloud that allows temporary increases in resource allocation. This paper shows how to oversubscribe BVMs. It first studies the characteristics of BVMs on Microsoft Azure and explains why traditional approaches based on using a fixed oversubscription ratio or based on the Central Limit Theorem do not work well for BVMs: they lead to either low utilization or high server capacity violation rates. Based on the lessons learned from the workload study, the authors developed a new approach, called AUDIBLE, using a nonparametric statistical model. This makes the approach lightweight and workload independent. This study shows that AUDIBLE achieves high system utilization while enforcing stringent requirements on server capacity violations.<\/p>\n\n\n\n
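As a rough illustration of why a convolution-based, nonparametric model suits this problem, the sketch below estimates the probability that the combined demand of co-located VMs exceeds server capacity by convolving their empirical usage histograms. It is not the AUDIBLE algorithm itself: the function names are hypothetical, demand is discretized into integer units, and VM demands are assumed to be independent, which is a simplification made here for illustration.

```python
def empirical_pmf(samples, max_units):
    # Histogram of observed integer demand samples, normalized to sum to 1.
    counts = [0] * (max_units + 1)
    for s in samples:
        counts[s] += 1
    n = len(samples)
    return [c / n for c in counts]


def convolve(p, q):
    # PMF of the sum of two independent discrete demands.
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def violation_probability(pmfs, capacity):
    # Probability that total demand exceeds capacity (in the same units).
    total = [1.0]  # PMF of a demand that is always zero
    for pmf in pmfs:
        total = convolve(total, pmf)
    return sum(total[capacity + 1:])


def admit(pmfs, new_pmf, capacity, max_violation):
    # Place the new VM only if the violation probability stays within budget.
    return violation_probability(pmfs + [new_pmf], capacity) <= max_violation
```

For example, two VMs that each use 0 or 1 unit with equal probability exceed a capacity of 1 unit a quarter of the time, so admit accepts the second VM under a 30% violation budget and rejects it under a 20% budget. No distributional shape is assumed, in contrast to a fixed oversubscription ratio or a normal approximation.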

Complete list of accepted publications by Microsoft researchers<\/h2>\n\n\n\n

Amanda: Unified Instrumentation Framework for Deep Neural Networks<\/strong>
<\/a>Yue Guan, Yuxian Qiu, and Jingwen Leng;
Fan Yang<\/a>, Microsoft Research; Shuo Yu, Shanghai Jiao Tong University; Yunxin Liu, Tsinghua University; Yu Feng and Yuhao Zhu, University of Rochester; Lidong Zhou<\/a>, Microsoft Research; Yun Liang, Peking University; Chen Zhang, Chao Li, and Minyi Guo, Shanghai Jiao Tong University<\/p>\n\n\n\n

AUDIBLE: A Convolution-Based Resource Allocator for Oversubscribing Burstable Virtual Machines<\/strong><\/a>
Seyedali Jokar Jandaghi and Kaveh Mahdaviani, University of Toronto; Amirhossein Mirhosseini, University of Michigan;
Sameh Elnikety<\/a>, Microsoft Research; Cristiana Amza and Bianca Schroeder, University of Toronto<\/p>\n\n\n\n

Characterizing Power Management Opportunities for LLMs in the Cloud<\/strong>
<\/a>Pratyush Patel, Microsoft Azure and University of Washington;
Esha Choukse<\/a>, Chaojie Zhang<\/a>, and \u00cd\u00f1igo Goiri<\/a>, Azure Research; Brijesh Warrier<\/a>, Nithish Mahalingam, and Ricardo Bianchini<\/a>, Microsoft Azure Research<\/p>\n\n\n\n

Cornucopia Reloaded: Load Barriers for CHERI Heap Temporal Safety<\/strong>
<\/a>
Nathaniel Wesley Filardo<\/a>, University of Cambridge and Microsoft Research; Brett F. Gutstein, Jonathan Woodruff, Jessica Clarke, and Peter Rugg, University of Cambridge; Brooks Davis, SRI International; Mark Johnston, University of Cambridge; Robert Norton<\/a>, Microsoft Research; David Chisnall, SCI Semiconductor; Simon W. Moore, University of Cambridge; Peter G. Neumann, SRI International; Robert N. M. Watson, University of Cambridge<\/p>\n\n\n\n

CrossPrefetch: Accelerating I\/O Prefetching for Modern Storage<\/strong><\/a>
Shaleen Garg and Jian Zhang, Rutgers University; Rekha Pitchumani, Samsung; Manish Parashar, University of Utah;
Bing Xie<\/a>, Microsoft; Sudarsun Kannan, Rutgers University<\/p>\n\n\n\n

Kimbap: A Node-Property Map System for Distributed Graph Analytics<\/strong><\/a>
Hochan Lee, University of Texas at Austin; Roshan Dathathri, Microsoft Research; Keshav Pingali, University of Texas at Austin<\/p>\n\n\n\n

PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization<\/strong>
<\/a>Cong Li and Zhe Zhou, Peking University;
Yang Wang<\/a>, Microsoft Research; Fan Yang, Nankai University; Ting Cao<\/a> and Mao Yang<\/a>, Microsoft Research; Yun Liang and Guangyu Sun, Peking University<\/p>\n\n\n\n

Predict; Don\u2019t React for Enabling Efficient Fine-Grain DVFS in GPUs<\/strong>
<\/a>
Srikant Bharadwaj<\/a>, Microsoft Research; Shomit Das, Qualcomm; Kaushik Mazumdar and Bradford M. Beckmann, AMD; Stephen Kosonocky, Uhnder<\/p>\n\n\n\n

Conference organizers from Microsoft<\/h2>\n\n\n\n

Program Co-Chair<\/h3>\n\n\n\n

Madan Musuvathi<\/a><\/p>\n\n\n\n

Submission Chairs<\/h3>\n\n\n\n

Jubi Taneja<\/a>
Olli Saarikivi<\/a><\/p>\n\n\n\n

Program Committee<\/h3>\n\n\n\n

Abhinav Jangda<\/a>
Aditya Kanade<\/a>
Ashish Panwar<\/a>
Jacob Nelson<\/a>
Jay Lorch<\/a>
Jilong Xue<\/a>
Paolo Costa<\/a>
Rodrigo Fonseca<\/a>
Shan Lu<\/a>
Suman Nath<\/a>
Tim Harris<\/a><\/p>\n\n\n\n

External Review Committee<\/h3>\n\n\n\n

Rujia Wang<\/a><\/p>\n\n\n\n

Career opportunities<\/h2>\n\n\n\n

Microsoft welcomes talented individuals across various roles at Microsoft Research, Azure Research, and other departments. We are always pushing the boundaries of computer systems to improve the scale, efficiency, and security of all our offerings. You can review our open research-related positions here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"

From AI and deep learning to innovations in infrastructure, researchers from Microsoft are bridging the gap between architecture, programming languages, and operating systems to advance the state of the art at ASPLOS 2024.<\/p>\n","protected":false},"author":37583,"featured_media":1029198,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"categories":[1],"tags":[],"research-area":[13552,13560,13547],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1029186","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-hardware-devices","msr-research-area-programming-languages-software-engineering","msr-research-area-systems-and-networking","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Rodrigo Fonseca","user_id":40429,"display_name":"Rodrigo Fonseca","author_link":"Rodrigo Fonseca<\/a>","is_active":false,"last_first":"Fonseca, Rodrigo","people_section":0,"alias":"rofons"},{"type":"user_nicename","value":"Madan Musuvathi","user_id":32766,"display_name":"Madan Musuvathi","author_link":"Madan Musuvathi<\/a>","is_active":false,"last_first":"Musuvathi, Madan","people_section":0,"alias":"madanm"}],"msr_type":"Post","featured_image_thumbnail":"\"ASPLOS","byline":"Rodrigo Fonseca<\/a> and Madan 
Musuvathi<\/a>","formattedDate":"April 29, 2024","formattedExcerpt":"From AI and deep learning to innovations in infrastructure, researchers from Microsoft are bridging the gap between architecture, programming languages, and operating systems to advance the state of the art at ASPLOS 2024.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1029186"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/37583"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1029186"}],"version-history":[{"count":27,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1029186\/revisions"}],"predecessor-version":[{"id":1029729,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1029186\/revisions\/1029729"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1029198"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1029186"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1029186"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1029186"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1029186"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1029186"},{"
taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1029186"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1029186"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1029186"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1029186"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1029186"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1029186"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}