{"id":15351,"date":"2016-03-22T13:07:16","date_gmt":"2016-03-22T20:07:16","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/dataplatforminsider\/?p=15351"},"modified":"2024-01-22T22:50:27","modified_gmt":"2024-01-23T06:50:27","slug":"expanding-the-data-footprint-of-sql-server-2016-with-polybase","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2016\/03\/22\/expanding-the-data-footprint-of-sql-server-2016-with-polybase\/","title":{"rendered":"Expanding the data footprint of SQL Server 2016 with PolyBase"},"content":{"rendered":"

This post was authored by Casey Karst, Program Manager, SQL Server.<\/em><\/p>\n

Changing data landscape<\/h2>\n

A lot has changed in the world of data over the last 10 years. The rise of connected devices and unstructured event data, together with ever-decreasing hardware prices, has caused a Big Data boom. Solutions built on commodity hardware, such as Hadoop and HDFS (Hadoop Distributed File System), were developed to land machine-born, semi-structured data and deliver insights. This created new opportunities for generating value, but at the cost of added complexity in enterprise data solutions. The additional data also brought the problem of two or more disjoint datasets, some relational in SQL Server and some non-relational in HDFS. A data analyst who wanted to combine relational data with semi-structured data had to spend time and resources copying the data from one environment into the other, which delayed the time to insight.<\/p>\n

With PolyBase in SQL Server 2016, the days of disjoint relational and semi-structured data are over. By combining PolyBase with T-SQL, users can query data stored in HDFS as if it were local to SQL Server, enabling a wide variety of new insights and scenarios.<\/strong><\/p>\n


Key scenarios<\/h2>\n

Because PolyBase lets you work with both SQL Server and Hadoop, three new scenarios become possible:<\/p>\n