{"id":305990,"date":"2011-01-26T10:00:24","date_gmt":"2011-01-26T18:00:24","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=305990"},"modified":"2016-10-15T15:27:27","modified_gmt":"2016-10-15T22:27:27","slug":"customers-get-dryad-dryadlinq","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/customers-get-dryad-dryadlinq\/","title":{"rendered":"Customers Get Dryad, DryadLINQ"},"content":{"rendered":"

By Douglas Gantenbein, Senior Writer, Microsoft News Center<\/em><\/p>\n

Researchers and businesspeople around the world now have at their disposal a new way to perform massive computations over large quantities of unstructured data more quickly and easily than they\u2019ve ever imagined.<\/p>\n

The reason: a Microsoft Research-developed computing tool called Dryad, a name derived from shy tree deities found in Greek mythology. Dryad and a related programming model called DryadLINQ constitute technology that simplifies running complex data-analysis applications across hundreds or even thousands of servers on familiar, widely used Windows software.<\/p>\n

After nearly six years of research into Dryad and DryadLINQ\u2014as well as their use in-house on Microsoft projects such as Kinect and Bing\u2014the two technologies are entering commercial use. Starting Jan. 26, a technology preview of Dryad and DryadLINQ will be built into the Windows HPC Server 2008 R2 high-performance computing line and eventually will be integrated with Microsoft SQL Server and Windows Azure. HPC Server is designed to give customers tremendous computing power and an easy management experience, all using off-the-shelf hardware.<\/p>\n

Michael Isard, a Microsoft Research Silicon Valley principal researcher instrumental in launching the Dryad project, says the new technology is an excellent example of how Microsoft views computing.<\/p>\n

\u201cThis is an opportunity to democratize large-scale, data-intensive computing,\u201d he says. \u201cIn areas such as customer-relationship management, business intelligence, planning, and infrastructure\u2014all those tasks where companies now have access to a vast amount of data\u2014Dryad and DryadLINQ can make sense of that data.\u201d<\/p>\n

How Dryad Works<\/h2>\n

The Dryad project consists of two key components. The Dryad tool itself provides reliable computing across thousands of servers. DryadLINQ, built on Microsoft\u2019s .NET Language Integrated Query (LINQ), enables developers to write their applications in a SQL-like query language, using familiar programming tools such as Microsoft Visual Studio. Most programmers will work only with DryadLINQ; once they have launched their application into the cloud, Dryad will do the rest, invisibly.<\/p>\n
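
To give a feel for what a SQL-like query written in an ordinary programming language looks like, here is a minimal sketch in plain Python. It is not DryadLINQ and does not use any Dryad API; the purchase records, the field names, and the function total_by_customer are all invented for illustration. The point is only that the programmer states what result is wanted (filter, group, sum) and says nothing about which machine does the work.

```python
# Illustrative only: plain Python standing in for a LINQ-style query.
# The records and field names below are invented for this sketch.
from collections import defaultdict

purchases = [
    {'customer': 'ann', 'amount': 30.0},
    {'customer': 'bob', 'amount': 12.5},
    {'customer': 'ann', 'amount': 7.5},
    {'customer': 'cat', 'amount': 3.0},
]


def total_by_customer(records, minimum):
    # Roughly: SELECT customer, SUM(amount) FROM records
    #          WHERE amount >= minimum GROUP BY customer
    totals = defaultdict(float)
    for row in (r for r in records if r['amount'] >= minimum):
        totals[row['customer']] += row['amount']
    return dict(totals)


result = total_by_customer(purchases, minimum=5.0)
print(result)
```

In a system of the kind the article describes, a query of this shape would be handed to the runtime, which decides how to split it across servers.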

A third piece, the Distributed Storage Catalog (DSC), is a distributed file system built for Dryad. It manages the data that Dryad is processing, keeping it stored reliably and safely with user-configurable redundancy. The DSC also keeps the data close to the servers processing it, so time is not wasted transmitting the data to a server.<\/p>\n
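
The two ideas in this paragraph, configurable redundancy and keeping work close to the data, can be sketched in a few lines of Python. This is a hypothetical illustration, not the DSC design: the round-robin placement rule, the least-loaded selection rule, and every name in the code are assumptions made for the example.

```python
# Hypothetical sketch of replica placement and locality-aware scheduling.
# None of these names come from the DSC; they are made up for illustration.

def place_replicas(partitions, servers, redundancy):
    # Put each partition on `redundancy` distinct servers, round-robin.
    if redundancy > len(servers):
        raise ValueError('redundancy cannot exceed the number of servers')
    placement = {}
    for index, partition in enumerate(partitions):
        placement[partition] = [
            servers[(index + k) % len(servers)] for k in range(redundancy)
        ]
    return placement


def pick_server(partition, placement, load):
    # Prefer the least-loaded server that already holds a copy of the data,
    # so the partition never has to be shipped across the network.
    return min(placement[partition], key=lambda server: load[server])


servers = ['s0', 's1', 's2']
placement = place_replicas(['p0', 'p1', 'p2', 'p3'], servers, redundancy=2)
load = {server: 0 for server in servers}
assignment = {}
for partition in placement:
    chosen = pick_server(partition, placement, load)
    assignment[partition] = chosen
    load[chosen] += 1
print(assignment)
```

Every task in this sketch ends up on a server that already stores its partition, which is the property the paragraph describes.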

Dryad and DryadLINQ make it easier for programmers to take advantage of the power of parallel computing, in which rows of servers or multicore processors within a single machine tackle a single computing problem. Such computing is extremely powerful, especially with so-called \u201cunstructured\u201d data such as information on buying habits that a retailer might collect from tens of thousands of customers but that has not been tagged or annotated, in contrast to structured data found, for instance, in a SQL database.<\/p>\n

It is difficult, though, to harness the power afforded by parallel computing. Most programmers are more familiar with writing sequential programs, in which Action A is followed by Action B, then Action C. It is challenging to think and program in parallel.<\/p>\n

While DryadLINQ enables developers to write their applications in a query language using Visual Studio, Dryad breaks up the program and assigns it across clusters of servers or processors. In effect, Dryad acts as a computing traffic cop, sending data down potentially millions of computing pathways. It helps make sure that when one piece of data is modified, other servers don\u2019t also change that data. It balances the computing load between many computers, and it re-routes computing traffic if an error or communications problem temporarily takes one or even several servers offline.<\/p>\n
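
The traffic-cop behavior described above (split the job, spread the pieces across machines, and re-route when a machine drops out) can be sketched as follows. This is a simplified illustration and not how Dryad is implemented: the worker names, the WorkerDown exception, and the rotation rule are assumptions, and the chunks run one after another here rather than truly in parallel.

```python
# Illustrative sketch: split a job into chunks and re-run a chunk on another
# worker if the first one fails. Worker names and the failure model are
# invented for this example.

class WorkerDown(Exception):
    pass


def run_on(worker, chunk, offline):
    # Simulate a server that is temporarily unreachable.
    if worker in offline:
        raise WorkerDown(worker)
    return sum(chunk)


def run_job(data, workers, offline, chunk_size):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    partials = []
    for index, chunk in enumerate(chunks):
        # Start each chunk on a different worker to spread the load, then
        # fall through to the next worker whenever one is down.
        for attempt in range(len(workers)):
            worker = workers[(index + attempt) % len(workers)]
            try:
                partials.append(run_on(worker, chunk, offline))
                break
            except WorkerDown:
                continue
        else:
            raise RuntimeError('no worker available for chunk %d' % index)
    return sum(partials)


total = run_job(list(range(10)), ['w0', 'w1', 'w2'], offline={'w1'}, chunk_size=3)
print(total)
```

The caller only sees the final total; the fact that one worker was offline and its chunk was re-routed stays hidden inside run_job.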

That removes a huge burden from programmers and lets them focus on the problem they are trying to solve, not how the computers will act in parallel.<\/p>\n

\u201cWe want programmers to be able to write their programs without having to think about things like fault tolerance [a byproduct of parallel computing\u2019s complexity],\u201d says Yuan Yu, a principal researcher at Microsoft Research Silicon Valley who led the creation of the DryadLINQ component.<\/p>\n

Yuan Yu<\/p><\/div>\n

\u201cWe want them to be able to write sequential and declarative code, and then, that same code can be run on a single machine, on a multicore machine, or on a cluster of machines. That\u2019s the beauty of the DryadLINQ programming model.\u201d<\/p>\n
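
Yu's point that the same code can run in different settings can be illustrated with a small Python sketch. It is an analogy only, not DryadLINQ: the function names are hypothetical, and a thread pool stands in for the cores or cluster nodes that a real runtime would use. The query function is written once, and only the mapper passed to run changes.

```python
# Sketch: one query function, two execution strategies.
# Names are hypothetical; threads stand in for cores or cluster nodes.
from concurrent.futures import ThreadPoolExecutor


def count_long_words(lines):
    # The query: how many words are longer than four letters?
    return sum(1 for line in lines for word in line.split() if len(word) > 4)


def run(query, partitions, mapper=map):
    # `mapper` decides where the work runs; the query itself never changes.
    return sum(mapper(query, partitions))


partitions = [
    ['shy tree deities', 'large scale data'],
    ['commodity servers and switches'],
]

sequential = run(count_long_words, partitions)
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = run(count_long_words, partitions, mapper=pool.map)
print(sequential, parallel)
```

Both calls return the same count, because the query never refers to how or where it is executed.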

A second benefit is that Dryad gives programmers supercomputer-level power with everyday programming tools and relatively inexpensive hardware.<\/p>\n

\u201cThis is a much cheaper way of doing things,\u201d Yu says. \u201cEverything is a commodity\u2014a commodity operating system, using commodity servers and switches. Dryad deals with the reliability and the bandwidth issues.\u201d<\/p>\n

Dryad also utilizes Microsoft\u2019s big investment in the cloud. As Dryad is integrated with Azure, all a programmer will need to take advantage of Dryad is a client and an Azure connection. Whether they are working on a cluster or the cloud, programmers can store their data and then manipulate it through their DryadLINQ-written applications. On a cluster, the DSC unit manages the data to keep it close to the processors working on it, so time is not lost in communicating data between servers.<\/p>\n

\u201cThe only thing we\u2019ll give the customer is some client software for writing DryadLINQ programs,\u201d Isard says. \u201cThey\u2019ll basically write the program on their machine and submit it to Windows Azure, where Dryad is running internally.\u201d<\/p>\n

The Evolution of Dryad<\/h2>\n

Dryad had its roots in an idea developed in October 2004 by Isard\u2014then working on search for Microsoft\u2014when he recognized the need for a large-scale data-intensive computation platform and began discussions with researchers at Microsoft to build on the idea.<\/p>\n

Not long afterward, the newly created Dryad came into widespread use within Microsoft\u2019s search offering, where it was used on thousands of servers. But while the tool worked well, the programming interface was awkward. Yu recognized the potential of LINQ to serve as the front-end programming tool for Dryad, and started the DryadLINQ project in September 2006. By early 2008, the Dryad\/DryadLINQ combination was made available within Microsoft. A release to a small collection of academic researchers followed. Dryad also was adopted as a key tool for the development of the Xbox 360 Kinect gaming device. The DryadLINQ research paper won a best-paper award in 2008 during the eighth USENIX Symposium on Operating Systems Design and Implementation.<\/p>\n

\u201cIt was easily the largest project in our lab,\u201d Yu says. \u201cAnd this was a long-term project, so management had to believe in it. But they said, \u2018We believe in you guys, so here is the money you need to build a server cluster to do the research.\u2019 Also, the entire lab was very supportive\u2014we built the (Dryad) system, and many researchers are using it for real work. Their feedback, in particular, has been invaluable in refining the DryadLINQ programming model.\u201d<\/p>\n

Isard adds that while it might seem Dryad had a long gestation, the timing of its release is right.<\/p>\n

\u201cI think the HPC product group moved at the right time\u2014when they saw the opportunity,\u201d he says. \u201cWe were a year or two ahead of the curve on the research side, but we were ready when the product group saw a need for it.\u201d<\/p>\n

Dryad Enters the Market<\/h2>\n

A big step is coming, as Dryad and DryadLINQ become fully productized as part of the Microsoft HPC Server suite. They also will be integrated with Microsoft SQL Server and Windows Azure to give customers from academia to the business community a new, powerful computing tool.<\/p>\n

Isard is confident that Dryad\u2019s ease of use and familiar Microsoft tools will win over developers.<\/p>\n

\u201cDryad will particularly appeal to customers who would love to keep using Windows and Excel and Visual Studio and all the tools they already use,\u201d he says, \u201cand need a technology for unstructured data analysis that really scales.\u201d<\/p>\n

John Dunagan, a principal architect for Microsoft\u2019s High Performance Computing group, thinks HPC Server customers who use Dryad will find that they now can solve problems that had been challenging.<\/p>\n

\u201cWe\u2019re convinced that we will delight our customers, both with the pure capability of the system, as well as its ease of use,\u201d he says. \u201cWhat I really like about Dryad is that it is not just about handling a problem in a better way, it is also about new possibilities in computing that you couldn\u2019t imagine before.\u201d<\/p>\n

The Microsoft Research team that worked on Dryad is pleased to see its project in a position to seek a larger audience.<\/p>\n

\u201cOffering an easy-to-use but powerful, data-intensive computing tool is exciting to see,\u201d Isard says. \u201cIt will benefit a whole new set of Microsoft customers.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"

By Douglas Gantenbein, Senior Writer, Microsoft News Center Researchers and businesspeople around the world now have at their disposal a new way to perform massive computations over large quantities of unstructured data more quickly and easily than they\u2019ve ever imagined. The reason: a Microsoft Research-developed computing tool called Dryad, a name derived from shy tree […]<\/p>\n","protected":false},"author":39507,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[194488],"tags":[214451,186604,187348,193659,196484,196501,214448,197838],"research-area":[13560],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-305990","post","type-post","status-publish","format-standard","hentry","category-program-languages-and-software-engineering","tag-net-language-integrated-query","tag-bing","tag-kinect","tag-microsoft-azure","tag-microsoft-sql-server","tag-microsoft-visual-studio","tag-windows-hpc-2008-server","tag-xbox-360","msr-research-area-programming-languages-software-engineering","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[169537,169536],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"January 26, 2011","formattedExcerpt":"By Douglas Gantenbein, Senior Writer, Microsoft News Center Researchers and businesspeople around the world now have at their disposal a new way to perform massive computations over large 
quantities of unstructured data more quickly and easily than they\u2019ve ever imagined. The reason: a Microsoft Research-developed…","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/305990"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/39507"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=305990"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/305990\/revisions"}],"predecessor-version":[{"id":306014,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/305990\/revisions\/306014"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=305990"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=305990"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=305990"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=305990"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=305990"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=305990"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=305990"},{"taxonomy":"ms
r-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=305990"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=305990"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=305990"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=305990"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}