{"id":476322,"date":"2018-03-25T15:24:05","date_gmt":"2018-03-25T22:24:05","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=476322"},"modified":"2020-04-13T09:53:44","modified_gmt":"2020-04-13T16:53:44","slug":"serving-dnns-real-time-datacenter-scale-project-brainwave","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/serving-dnns-real-time-datacenter-scale-project-brainwave\/","title":{"rendered":"Serving DNNs in Real Time at Datacenter Scale with Project Brainwave"},"content":{"rendered":"

To meet the computational demands required of deep learning, cloud operators are turning toward specialized hardware for improved efficiency and performance. Project Brainwave, Microsoft’s principal infrastructure for AI serving in real time, accelerates deep neural network (DNN) inferencing in major services such as Bing\u2019s intelligent search features and Azure. Exploiting distributed model parallelism and pinning over low-latency hardware microservices, Project Brainwave serves state-of-the-art, pre-trained DNN models with high efficiencies at low batch sizes. A high-performance, precision-adaptable FPGA soft processor is at the heart of the system, achieving up to 39.5 TFLOPs of effective performance at Batch 1 on a state-of-the-art Intel Stratix 10 FPGA.<\/p>\n","protected":false},"excerpt":{"rendered":"

To meet the computational demands required of deep learning, cloud operators are turning toward specialized hardware for improved efficiency and performance. Project Brainwave, Microsoft’s principal infrastructure for AI serving in real time, accelerates deep neural network (DNN) inferencing in major services such as Bing\u2019s intelligent search features \u00a0and Azure. Exploiting distributed model parallelism and pinning […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13556,13552],"msr-publication-type":[193715],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-476322","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-hardware-devices","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2018-3-25","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"IEEE 
Micro","msr_volume":"38","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"476325","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/03\/mi0218_Chung-2018Mar25.pdf","id":"476325","title":"mi0218_Chung-2018Mar25","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[],"msr-author-ordering":[{"type":"user_nicename","value":"Eric Chung","user_id":31746,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Eric Chung"},{"type":"user_nicename","value":"Jeremy Fowers","user_id":32249,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Jeremy Fowers"},{"type":"user_nicename","value":"Kalin Ovtcharov","user_id":36134,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Kalin Ovtcharov"},{"type":"user_nicename","value":"Michael Papamichael","user_id":33191,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Michael Papamichael"},{"type":"user_nicename","value":"Adrian Caulfield","user_id":30808,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Adrian Caulfield"},{"type":"user_nicename","value":"Todd Massengill","user_id":34236,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Todd Massengill"},{"type":"user_nicename","value":"Ming 
Liu","user_id":37056,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ming Liu"},{"type":"user_nicename","value":"Mahdi Ghandi","user_id":37506,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Mahdi Ghandi"},{"type":"user_nicename","value":"Daniel Lo","user_id":31646,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Daniel Lo"},{"type":"user_nicename","value":"Steve Reinhardt","user_id":37488,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Steve Reinhardt"},{"type":"user_nicename","value":"Shlomi Alkalay","user_id":37479,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Shlomi Alkalay"},{"type":"guest","value":"hari-angepat","user_id":431040,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=hari-angepat"},{"type":"guest","value":"derek-chiou","user_id":375089,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=derek-chiou"},{"type":"user_nicename","value":"Alessandro Forin","user_id":33513,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Alessandro Forin"},{"type":"user_nicename","value":"Doug Burger","user_id":31582,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Doug Burger"},{"type":"user_nicename","value":"Lisa Woods","user_id":32701,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Lisa Woods"},{"type":"user_nicename","value":"Gabriel 
Weisz","user_id":37500,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Gabriel Weisz"},{"type":"user_nicename","value":"Michael Haselman","user_id":37482,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Michael Haselman"},{"type":"user_nicename","value":"Dan Zhang","user_id":37497,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Dan Zhang"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[691494,649749,486102,171431],"publication":[],"video":[],"download":[],"msr_publication_type":"article","related_content":{"projects":[{"ID":691494,"post_title":"Project Turing","post_name":"project-turing","post_type":"msr-project","post_date":"2020-09-13 20:41:57","post_modified":"2021-11-01 18:05:54","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-turing\/","post_excerpt":"A deep learning initiative inside Microsoft to build the best-in-class models for use by Microsoft and power AI applications across entire Microsoft product family (Word, PowerPoint, Office, Dynamics, etc.) and make them available for use through Azure.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/691494"}]}},{"ID":649749,"post_title":"AI at Scale","post_name":"ai-at-scale","post_type":"msr-project","post_date":"2020-05-19 08:01:11","post_modified":"2024-09-09 08:40:22","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/ai-at-scale\/","post_excerpt":"AI at Scale is an applied research initiative that works to evolve Microsoft products with the adoption of deep learning for both natural language text and image processing. 
Our work is actively being integrated into Microsoft products, including Bing, Office, and Xbox.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/649749"}]}},{"ID":486102,"post_title":"Project Brainwave","post_name":"project-brainwave","post_type":"msr-project","post_date":"2018-08-14 09:49:27","post_modified":"2023-07-10 07:52:57","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-brainwave\/","post_excerpt":"Project Brainwave is a deep learning platform for real-time AI inference in the cloud and on the edge, transforming computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486102"}]}},{"ID":171431,"post_title":"Project Catapult","post_name":"project-catapult","post_type":"msr-project","post_date":"2015-02-02 08:20:51","post_modified":"2021-12-06 21:07:49","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-catapult\/","post_excerpt":"Project Catapult is the code name for a Microsoft Research (MSR) enterprise-level initiative that is transforming cloud computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable 
silicon.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171431"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/476322"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":4,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/476322\/revisions"}],"predecessor-version":[{"id":484839,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/476322\/revisions\/484839"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=476322"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=476322"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=476322"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=476322"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=476322"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=476322"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=476322"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=476322"},{"taxonomy":"msr-download-source","embeddable":true,"hre
f":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=476322"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=476322"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=476322"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=476322"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=476322"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=476322"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=476322"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=476322"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}