{"id":34478,"date":"2021-02-16T09:00:59","date_gmt":"2021-02-16T17:00:59","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/?p=34478"},"modified":"2024-01-22T22:51:28","modified_gmt":"2024-01-23T06:51:28","slug":"whats-new-with-sql-server-big-data-clusters","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/","title":{"rendered":"What\u2019s new with SQL Server Big Data Clusters"},"content":{"rendered":"<p><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/big-data-cluster-overview?view=sql-server-ver15\" target=\"_blank\" rel=\"noopener\">SQL Server Big Data Clusters<\/a> (BDC) is a new capability brought to market as part of the SQL Server 2019 release. BDC extends SQL Server\u2019s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform. BDC runs exclusively on Linux containers orchestrated by Kubernetes and can be deployed across multiple cloud providers or on-premises.<\/p>\n<p>Today, we\u2019re announcing the release of the latest cumulative update (CU9) for SQL Server Big Data Clusters, which includes the following important capabilities:<\/p>\n<ul>\n<li>Support for configuring BDC after deployment.<\/li>\n<li>Improved experience for encryption at rest.<\/li>\n<li>Ability to install Python packages at Spark job submission time.<\/li>\n<li>Upgraded software versions for most of our OSS components (Grafana, Kibana, FluentBit, etc.) 
to ensure Big Data Clusters images are up to date with the latest enhancements and fixes.<\/li>\n<li>Miscellaneous improvements and bug fixes.<\/li>\n<\/ul>\n<p>This announcement highlights some of the major improvements, provides additional context to better understand the design behind these capabilities, and points you to relevant resources to learn more and get started.<\/p>\n<h2>Configuring SQL Server Big Data Clusters to meet your business needs<\/h2>\n<p>SQL Server Big Data Clusters, a feature released as part of SQL Server 2019, is a data platform for operational and analytical workloads. We are announcing new configuration management functionality as part of today\u2019s CU9 release. Workload requirements are constantly changing, and these enhancements will help customers ensure that their Big Data Cluster is always prepared for their needs.<\/p>\n<p>Configuration management is the ability to alter or tune various parts of the Big Data Cluster after deployment and to give users visibility into the cluster\u2019s configuration. This allows administrators to tune the Big Data Cluster to meet their workloads\u2019 needs. Whether an administrator wants to turn on SQL Agent, define the baseline resources for their organization\u2019s Spark jobs, or even see what settings are configurable at each scope, configuration management is the one-stop solution to meet these needs.<\/p>\n<p>To enable this functionality, we are adding new commands to the azdata command-line interface (CLI). The azdata CLI, an interface for managing a BDC, now includes post-deployment configuration functionality to set, diff, and apply configuration settings. To start, customers can set configuration values at the cluster, service, and resource scope; these changes remain pending until they are applied. After applying pending configuration changes, customers can monitor the process through azdata or Azure Data Studio. 
Once the update is completed, the Big Data Cluster is ready for the next workload.<\/p>\n<p>Learn more and get started with <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/configure-bdc-overview?view=sql-server-ver15\" target=\"_blank\" rel=\"noopener\">configuration management<\/a>.<\/p>\n<h2>Spark job library management<\/h2>\n<p>Data engineers and data scientists often want to experiment with and use a variety of libraries and packages as part of their workflows. There are separate ways to do this for each language, including importing from Maven, installing from the Python Package Index (PyPI) or conda, or installing from the Microsoft R Application Network (MRAN). Before today, customers could import jars from Maven or reference custom packages stored in the Hadoop Distributed File System (HDFS) through Spark job configurations.<\/p>\n<p>Starting in CU9, data engineers and data scientists have added flexibility for their PySpark jobs through job-level virtual environments. They can easily configure a conda virtual environment and get to work with their favorite Python libraries.<\/p>\n<p>Learn how to configure a <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/spark-install-packages?view=sql-server-ver15\" target=\"_blank\" rel=\"noopener\">job-level Spark environment<\/a>.<\/p>\n<h2>Improving the experience on encryption at rest<\/h2>\n<p>In SQL Server Big Data Clusters CU8, we introduced a comprehensive encryption at rest feature set that focused on system-managed keys. This brought application-level encryption to all data stored in the platform, on both SQL Server and HDFS. At that time, the HDFS experience for administrators centered on using Azure Data Studio notebooks to control all aspects of the feature. Starting with CU9, in addition to expanding the notebook experience, we are enabling HDFS encryption zones and HDFS key management through azdata. 
This enables HDFS administrators to automate encryption at rest administrative tasks, a highly desirable capability that is consistent with the rest of the SQL Server Big Data Clusters platform.<\/p>\n<p>To learn more about the new notebooks and the new azdata commands, visit <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/release-notes-big-data-cluster?view=sql-server-ver15\" target=\"_blank\" rel=\"noopener\">the release notes<\/a>.<\/p>\n<h2>Ready to learn more?<\/h2>\n<p>Check out the SQL Server CU9 <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/release-notes-big-data-cluster?view=sql-server-ver15\" target=\"_blank\" rel=\"noopener\">release notes<\/a> for Big Data Clusters to learn more about all of the improvements available with the latest update. For a technical deep dive on Big Data Clusters, read the documentation and visit our <a href=\"https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/features\/sql-big-data-cluster\" target=\"_blank\" rel=\"noopener\">GitHub repository<\/a>.<\/p>\n<p>Follow the instructions on our <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/big-data-cluster-overview?view=sql-server-ver15\" target=\"_blank\" rel=\"noopener\">documentation page<\/a> to get started and deploy Big Data Clusters.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>SQL Server Big Data Clusters (BDC) is a new capability brought to market as part of the SQL Server 2019 release. 
BDC extends SQL Server\u2019s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform.<\/p>\n","protected":false},"author":1457,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","footnotes":""},"post_tag":[],"product":[],"content-type":[2448],"topic":[2451],"coauthors":[2487],"class_list":["post-34478","post","type-post","status-publish","format-standard","hentry","content-type-updates","topic-big-data","review-flag-alway-1593580309-407","review-flag-new-1593580247-437"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>What\u2019s new with SQL Server Big Data Clusters - Microsoft SQL Server Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What\u2019s new with SQL Server Big Data Clusters - Microsoft SQL Server Blog\" \/>\n<meta property=\"og:description\" content=\"SQL Server Big Data Clusters (BDC) is a new capability brought to market as part of the SQL Server 2019 release. 
BDC extends SQL Server\u2019s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft SQL Server Blog\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/sqlserver\" \/>\n<meta property=\"article:published_time\" content=\"2021-02-16T17:00:59+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-01-23T06:51:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-content\/uploads\/2018\/08\/cropped-microsoft_logo_element.png\" \/>\n\t<meta property=\"og:image:width\" content=\"512\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"SQL Server Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SQLServer\" \/>\n<meta name=\"twitter:site\" content=\"@SQLServer\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"SQL Server Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 min read\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\"},\"author\":[{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/author\/sql-server-team\/\",\"@type\":\"Person\",\"@name\":\"SQL Server Team\"}],\"headline\":\"What\u2019s new with SQL Server Big Data Clusters\",\"datePublished\":\"2021-02-16T17:00:59+00:00\",\"dateModified\":\"2024-01-23T06:51:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\"},\"wordCount\":763,\"publisher\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\",\"url\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\",\"name\":\"What\u2019s new with SQL Server Big Data Clusters - Microsoft SQL Server 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#website\"},\"datePublished\":\"2021-02-16T17:00:59+00:00\",\"dateModified\":\"2024-01-23T06:51:28+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"What\u2019s new with SQL Server Big Data Clusters\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#website\",\"url\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/\",\"name\":\"Microsoft SQL Server Blog\",\"description\":\"Official News from Microsoft\u2019s Information Platform\",\"publisher\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#organization\",\"name\":\"Microsoft SQL Server 
Blog\",\"url\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png\",\"contentUrl\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png\",\"width\":259,\"height\":194,\"caption\":\"Microsoft SQL Server Blog\"},\"image\":{\"@id\":\"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"http:\/\/www.facebook.com\/sqlserver\",\"https:\/\/x.com\/SQLServer\",\"https:\/\/www.youtube.com\/user\/MSCloudOS\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"What\u2019s new with SQL Server Big Data Clusters - Microsoft SQL Server Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/","og_locale":"en_US","og_type":"article","og_title":"What\u2019s new with SQL Server Big Data Clusters - Microsoft SQL Server Blog","og_description":"SQL Server Big Data Clusters (BDC) is a new capability brought to market as part of the SQL Server 2019 release. 
BDC extends SQL Server\u2019s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform.","og_url":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/","og_site_name":"Microsoft SQL Server Blog","article_publisher":"http:\/\/www.facebook.com\/sqlserver","article_published_time":"2021-02-16T17:00:59+00:00","article_modified_time":"2024-01-23T06:51:28+00:00","og_image":[{"width":512,"height":512,"url":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-content\/uploads\/2018\/08\/cropped-microsoft_logo_element.png","type":"image\/png"}],"author":"SQL Server Team","twitter_card":"summary_large_image","twitter_creator":"@SQLServer","twitter_site":"@SQLServer","twitter_misc":{"Written by":"SQL Server Team","Est. reading time":"3 min read"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/#article","isPartOf":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/"},"author":[{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/author\/sql-server-team\/","@type":"Person","@name":"SQL Server Team"}],"headline":"What\u2019s new with SQL Server Big Data 
Clusters","datePublished":"2021-02-16T17:00:59+00:00","dateModified":"2024-01-23T06:51:28+00:00","mainEntityOfPage":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/"},"wordCount":763,"publisher":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/","url":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/","name":"What\u2019s new with SQL Server Big Data Clusters - Microsoft SQL Server Blog","isPartOf":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#website"},"datePublished":"2021-02-16T17:00:59+00:00","dateModified":"2024-01-23T06:51:28+00:00","breadcrumb":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2021\/02\/16\/whats-new-with-sql-server-big-data-clusters\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/"},{"@type":"ListItem","position":2,"name":"What\u2019s new with SQL Server Big Data Clusters"}]},{"@type":"WebSite","@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#website","url":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/","name":"Microsoft SQL Server Blog","description":"Official News from Microsoft\u2019s Information 
Platform","publisher":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#organization","name":"Microsoft SQL Server Blog","url":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png","contentUrl":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-content\/uploads\/2019\/08\/Microsoft-Logo.png","width":259,"height":194,"caption":"Microsoft SQL Server 
Blog"},"image":{"@id":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/#\/schema\/logo\/image\/"},"sameAs":["http:\/\/www.facebook.com\/sqlserver","https:\/\/x.com\/SQLServer","https:\/\/www.youtube.com\/user\/MSCloudOS"]}]}},"msxcm_display_generated_audio":false,"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/posts\/34478","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/users\/1457"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/comments?post=34478"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/posts\/34478\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/media?parent=34478"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/post_tag?post=34478"},{"taxonomy":"product","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/product?post=34478"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/content-type?post=34478"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/topic?post=34478"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/wp-json\/wp\/v2\/coauthors?post=34478"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}