{"id":566493,"date":"2019-02-05T23:40:03","date_gmt":"2019-02-06T07:40:03","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=566493"},"modified":"2019-07-30T01:43:47","modified_gmt":"2019-07-30T08:43:47","slug":"astra-exploiting-predictability-to-optimize-deep-learning","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/astra-exploiting-predictability-to-optimize-deep-learning\/","title":{"rendered":"Astra: Exploiting Predictability to Optimize Deep Learning"},"content":{"rendered":"
We present Astra, a compilation and execution framework that optimizes
\nexecution of a deep learning training job. Instead of treating the computation as a generic data flow graph, Astra exploits domain knowledge about deep learning to adopt a custom approach to compiler optimization.<\/p>\n
The key insight in Astra is to exploit the unique repetitiveness and predictability of a deep learning job, to perform online exploration of the optimization state space in a work-conserving manner while making progress on the training job. This dynamic state space exploration in Astra uses lightweight profiling and indexing of profile data, coupled with several techniques to prune the exploration state space. Effectively, the execution layer custom-wires the infrastructure end-to-end for each job and hardware, while keeping the compiler simple and maintainable.<\/p>\n
We have implemented Astra in two popular deep learning frameworks, PyTorch and Tensorflow. On state-of-the-art deep learning models, we show that Astra improves end-to-end performance of deep learning training by up to 3x, while approaching the performance of hand-optimized implementations such as cuDNN where available. Astra also significantly outperforms static compilation frameworks such as Tensorflow XLA both in performance and robustness.<\/p>\n","protected":false},"excerpt":{"rendered":"
We present Astra, a compilation and execution framework that optimizes execution of a deep learning training job. Instead of treating the computation as a generic data flow graph, Astra exploits domain knowledge about deep learning to adopt a custom approach to compiler optimization. The key insight in Astra is to exploit the unique repetitiveness and […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13556,13547],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-566493","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"ACM","msr_edition":"","msr_affiliation":"","msr_published_date":"2019-4-14","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/02\/astra-asplos19.pdf","id":"578938","title":"astra-asplos19","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":578938,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/04\/astra-asplos19.pdf"},{"id":566496,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/02\/asplos493-sivathanuA-hm.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"Muthian Sivathanu","user_id":36320,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Muthian Sivathanu"},{"type":"text","value":"Tapan Chugh","user_id":0,"rest_url":false},{"type":"text","value":"Sanjay S. Singapuram","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Lidong Zhou","user_id":32673,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Lidong Zhou"}],"msr_impact_theme":[],"msr_research_lab":[199560,199562],"msr_event":[],"msr_group":[],"msr_project":[600474,968667,557118],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":600474,"post_title":"Micro Co-design","post_name":"micro-codesign","post_type":"msr-project","post_date":"2020-02-23 21:23:12","post_modified":"2020-02-23 21:23:13","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/micro-codesign\/","post_excerpt":"The massive scale of cloud infrastructure services enables, and often necessitates, vertical co-design of the infrastructure stack by cloud providers. Micro\u00a0co-design is a\u00a0Minimally invasive,\u00a0Cheap, and retro-fittable approach to co-design that extracts efficiency out of existing software infrastructure layers. It does this by making lightweight changes to generic software interfaces. We are exploring micro co-design in the context of four systems: Instalytics, Astra, Gandiva, and Quiver, aimed at improving the efficiency and functionality of key infrastructure…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/600474"}]}},{"ID":968667,"post_title":"AI Infrastructure","post_name":"ai-infrastructure","post_type":"msr-project","post_date":"2023-10-04 00:44:27","post_modified":"2024-08-01 03:26:39","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/ai-infrastructure\/","post_excerpt":"Towards efficient AI\/ML deployment The AI Infrastructure team at Microsoft Research India works on cutting-edge systems optimizations for improving the efficiency of a variety of AI\/ML workloads, including an emerging class of workloads, namely, serving large language models (LLMs). AI\/ML models are expensive to train and serve at scale and therefore, systems optimizations are crucial for unlocking the true potential of AI-powered applications. The key principle behind many of our projects is co-design of Systems and…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/968667"}]}},{"ID":557118,"post_title":"Astra: Custom-Wired DNNs","post_name":"astra-custom-wired-dnns","post_type":"msr-project","post_date":"2018-12-14 02:56:39","post_modified":"2019-11-19 08:45:06","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/astra-custom-wired-dnns\/","post_excerpt":"Astra is a compilation and execution framework that optimizes execution of a deep learning training job. Instead of treating the computation as a generic data flow graph, Astra exploits domain knowledge about deep learning training to adopt a custom approach to compiler optimization.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/557118"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/566493"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/566493\/revisions"}],"predecessor-version":[{"id":566499,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/566493\/revisions\/566499"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=566493"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=566493"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=566493"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=566493"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=566493"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=566493"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=566493"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=566493"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=566493"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=566493"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=566493"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=566493"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=566493"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=566493"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=566493"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=566493"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}