SQL Server 2019 - Microsoft SQL Server Blog

Update on the support of DBCC CLONEDATABASE for production use

Madhumita Tripathy — Mon, 25 Mar 2024 15:00:00 +0000

The DBCC CLONEDATABASE command generates a schema-only clone, or copy, of a database. Effective March 1, 2025, Microsoft will no longer support creating a copy of a database with the DBCC CLONEDATABASE command and using it as a new database in a production environment. However, the command will remain available for generating schema-only copies solely for diagnostic and troubleshooting purposes. This change impacts all editions of SQL Server 2016 and later versions.

To generate a production-ready schema-only copy of a database, we highly recommend using tools such as Microsoft SQL Server Data Tools (SSDT); the Generate and Publish Scripts Wizard; or data-tier application extraction, which includes either the Extract Data-tier Application (DAC) Wizard or a PowerShell script. These tools provide a reliable way to create a copy of a database for use in production environments.

Frequently asked questions

What is SSDT?

SQL Server Data Tools (SSDT) is a modern development tool that integrates with Microsoft Visual Studio and provides design, debugging, and deployment capabilities for building SQL Server relational databases, databases in Azure SQL, Analysis Services (AS) data models, Integration Services (IS) packages, and Reporting Services (RS) reports. With SSDT, developers can perform necessary tasks without needing the admin-focused SQL Server Management tool on their developer computer. Essentially, Visual Studio removes unnecessary features like admin tools, and focuses on tools that are useful for developers, like database and schema comparison tools.

What is the Generate and Publish Scripts Wizard?

The Generate and Publish Scripts Wizard is a feature in SQL Server Management Studio (SSMS) that allows you to create scripts for transferring a database between instances of the SQL Server Database Engine or Azure SQL Database. You can generate scripts for a database on an instance of the Database Engine in your local network, or from SQL Database. The generated scripts can be run on another instance of the Database Engine or SQL Database. You can also use the wizard to publish the contents of a database directly to a Web service created by using the Database Publishing Services. You can create scripts for an entire database or limit it to specific objects.

What is a data-tier application (DAC)?

A data-tier application (DAC) is a logical database entity that defines all of the SQL Server objects—such as tables, views, and instance objects, including logins—associated with a user’s database. A DAC is a self-contained unit of the entire database model and is portable in an artifact known as a DAC package, or .dacpac. Tooling support for data-tier applications enables developers and database administrators to deploy dacpacs to new or existing databases. Deployments to an existing database update the database model from the existing state to match the contents of the dacpac. Developers build DACs from SQL database projects, a declarative development concept for building SQL objects that enables source control on the database schema.

A .bacpac is a related artifact that, by default, encapsulates the database schema and the data stored in the database. The primary use cases for a BACPAC are moving a database from one server to another, migrating a database from a local server to the cloud, and archiving an existing database in an open format.

What is the DBCC CLONEDATABASE command?

DBCC CLONEDATABASE creates a new database that contains the schema of all the objects and statistics from the specified source database. Cloned databases copy all schema and metadata of the source database without copying any data.
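
As a rough illustration, the basic form of the command takes the source database name and the name of the clone to create. The sketch below only builds that statement as text; the helper name and the database names are hypothetical, and it deliberately accepts only simple identifiers so that nothing unexpected ends up in the generated T-SQL.

```python
import re

# Accept only plain identifiers (letters, digits, underscore) in this sketch.
_SIMPLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_clone_statement(source_db: str, clone_db: str) -> str:
    """Build a basic DBCC CLONEDATABASE statement (hypothetical helper)."""
    for name in (source_db, clone_db):
        if not _SIMPLE_NAME.match(name):
            raise ValueError(f"unsupported database name: {name!r}")
    return f"DBCC CLONEDATABASE ({source_db}, {clone_db});"


# "SalesDb" and "SalesDb_Clone" are made-up names for illustration.
print(build_clone_statement("SalesDb", "SalesDb_Clone"))
```

The resulting clone is intended for diagnostics and troubleshooting only, in line with the support change described above.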

How do I use Schema Compare to compare different database definitions?

SSDT includes a Schema Compare utility that you can use to compare two database definitions. The source and target of the comparison can be any combination of connected database, SQL Server database project or snapshot, or .dacpac file. The results of the comparison appear as a set of actions that must be taken with the target to make it the same as the source. Once the comparison is complete, you can update the target directly (if the target is a project or a database) or generate an update script that has the same effect.

The differences between source and target appear in a grid for easy review. You can drill into and review each difference in the results grid or in script form. You can then selectively exclude specific differences.
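
To make the idea of "a set of actions that must be taken with the target" concrete, here is a toy sketch. It is not how SSDT implements Schema Compare; it simply treats each schema as a mapping from object name to definition text and derives create, alter, and drop actions, with an optional set of excluded objects. All names and definitions are invented for the example.

```python
def compare_schemas(source: dict[str, str], target: dict[str, str],
                    exclude: frozenset[str] = frozenset()) -> list[tuple[str, str]]:
    """Return (action, object_name) pairs that would make target match source."""
    actions = []
    for name in sorted(source.keys() | target.keys()):
        if name in exclude:
            continue  # difference deliberately skipped by the reviewer
        if name not in target:
            actions.append(("create", name))
        elif name not in source:
            actions.append(("drop", name))
        elif source[name] != target[name]:
            actions.append(("alter", name))
    return actions


# Hypothetical object definitions, reduced to short strings.
source = {"dbo.Orders": "id INT, total MONEY", "dbo.Customers": "id INT"}
target = {"dbo.Orders": "id INT", "dbo.Legacy": "id INT"}

for action, name in compare_schemas(source, target, exclude=frozenset({"dbo.Legacy"})):
    print(action, name)
```

Excluding dbo.Legacy here mirrors selectively excluding a difference before updating the target or generating an update script.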

You can save comparisons either as part of a SQL Server Database project or as a standalone file. You can also set options that control the scope of the comparison and aspects of the update. Saving a comparison lets you easily repeat it later or use it as the starting point for a new comparison.

Why do I need to generate a schema-only clone of a database with statistics in SQL Server?

You will need to generate a schema-only clone of a database with statistics to investigate performance issues.

The query optimizer in SQL Server uses the following types of information to determine an optimal query plan:

Database metadata
Hardware environment
Database session state

Typically, you must simulate all these same types of information to reproduce the behavior of the query optimizer on a test system.

Microsoft Customer Support Services might ask you to generate a schema script of the database along with statistics to investigate a query optimizer issue.

The post Update on the support of DBCC CLONEDATABASE for production use appeared first on Microsoft SQL Server Blog.

The path forward for SQL Server analytics

SQL Server Team — Fri, 25 Feb 2022 18:00:00 +0000

Today, we are announcing changes to SQL Server analytics. This post covers:

Customer feedback
Retirement of SQL Server 2019 Big Data Clusters
Retirement of PolyBase scale-out groups
Path forward

Customer feedback

We continue to see increased migration to the cloud, with analytical workloads leading that charge.

Customers have indicated that analytics in the cloud best aligns to employee skillsets, deployment simplicity and manageability, and cloud flexibility and scalability.

When we first introduced cloud analytics in 2017, many were still investing in on-premises analytical workloads. Today, we offer a wealth of cloud-based services that provide users with similar functionality, including Azure Data Lake Storage (ADLS), Azure Synapse Analytics, Azure SQL, and Azure Machine Learning.

According to the Gartner® 2020 Data and Analytics survey:

Analytics, BI, and data science are the most common use cases being accelerated to the cloud due to COVID-19. The organization needs faster delivery of analytics insights to take action. Cloud, with its fast provision and prototyping ability, is an ideal place to start analytics and data science initiatives to nimbly react to the fast pace of changes.¹
In the 2020 Gartner Data and Analytics Cloud survey, 74 percent of organizations use or plan to use cloud for analytics, BI and data science.¹

Retirement of SQL Server Big Data Clusters

Today, we are announcing the retirement of SQL Server 2019 Big Data Clusters. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform for the next three years, through February 28, 2025. This software will continue to be maintained through SQL Server cumulative updates until that time. In the latest version of SQL Server, we are engineering the best mix of on-premises and in-cloud relational workloads and connectivity to Azure Synapse Analytics for advanced analytics in a flexible, scalable, and integrated environment. Please see below and read our documentation on SQL Server Big Data Clusters to learn more.

Changes to PolyBase support in SQL Server

Today, we are announcing the retirement of PolyBase scale-out groups in Microsoft SQL Server. Scale-out group functionality will be removed from the product in SQL Server 2022. In-market SQL Server 2019, 2017, and 2016 will continue to support the functionality to the end of support for those products.

PolyBase data virtualization will continue to be fully supported as a scale-up feature in SQL Server.

In addition, Cloudera (CDP) and Hortonworks (HDP) external data sources will be retired for all in-market versions of SQL Server and will not be included in SQL Server 2022. Moving forward, support for external data sources will be limited to product versions in mainstream support by the respective vendor. You are encouraged to use the new object storage integration functionality available in SQL Server 2022.

In SQL Server 2022, users will need to configure their external data sources to use new connectors when connecting to Azure Storage. The table below summarizes the change:

External Data Source	From	To
Azure Blob Storage	wasb[s]	abs
ADLS Gen 2	abfs[s]	adls
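
When auditing existing external data source definitions, the prefix change in the table can be applied mechanically. The following is a minimal sketch that assumes only the prefix needs to change; the function name, storage account, and container are hypothetical, and the rest of the location string and the associated credentials may also need adjusting for the new connectors.

```python
# Prefix changes taken from the table above (old prefix -> new prefix).
PREFIX_MAP = {
    "wasb": "abs",
    "wasbs": "abs",
    "abfs": "adls",
    "abfss": "adls",
}


def migrate_location(location: str) -> str:
    """Swap a legacy connector prefix for its SQL Server 2022 equivalent.

    Only the prefix is rewritten; the rest of the location is left as-is.
    """
    scheme, sep, rest = location.partition("://")
    if not sep:
        raise ValueError(f"not a prefixed location: {location!r}")
    new_scheme = PREFIX_MAP.get(scheme.lower())
    if new_scheme is None:
        return location  # unknown or already-migrated prefix: leave unchanged
    return f"{new_scheme}://{rest}"


# Made-up account and container names for illustration.
print(migrate_location("wasbs://container@account.blob.core.windows.net"))
```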

The path forward

If you wish to run analytics on-premises, SQL Server 2022 also provides important new capabilities, building upon its data virtualization suite of connectors by providing object storage integration over REST APIs. We will also continue to invest in the Spark SQL connector to ensure first-class connectivity from Apache Spark to all our SQL products. Additionally, we continue to invest in expanding hybrid capabilities with Azure Arc-enabled data services.

Integrating SQL Server with cloud analytics solutions is a critical capability, which is why we are introducing Azure Synapse Link for SQL Server 2022, the latest release of SQL Server, which will be generally available to purchase later this year. This is a major investment in helping you realize cloud-scale analytics in near real-time on your operational data.

Our priority is to empower you with the tools and services that ensure SQL Server integrates seamlessly into the world of analytic workloads in the cloud by blending operational, analytical, and virtual use cases in our flagship database engine. Please contact your Microsoft account manager if you need assistance in exploring how Microsoft can best empower your analytical needs.

¹Gartner Inc.: Use Cloud to Compose Analytics, BI and Data Science Capabilities for Reusability and Resilience, Julian Sun, Joao Tapadinhas, June 10, 2021.

What’s new with SQL Server Big Data Clusters—CU13 Release

Daniel Coelho — Wed, 06 Oct 2021 15:00:09 +0000

SQL Server Big Data Clusters (BDC) is a capability brought to market as part of the SQL Server 2019 release. Big Data Clusters extends SQL Server’s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform. It runs exclusively on Linux containers, orchestrated by Kubernetes, and can be deployed in multiple cloud providers or on-premises.

Today, we’re proud to announce the release of the latest cumulative update, CU13, for SQL Server Big Data Clusters which includes important changes and capabilities:

Hadoop Distributed File System (HDFS) distributed copy capabilities through azdata
Apache Spark 3.1.2
SQL Server Big Data Clusters runtime for Apache Spark release 2021.1
Password rotation for Big Data Cluster’s auto-generated Active Directory service accounts during BDC deployment
Optional parameter to enable Advanced Encryption Standard (AES) on the automatically generated AD accounts

Major improvements in this update are highlighted below, along with resources for you to learn more and get started.

HDFS distributed copy capabilities through azdata

Hadoop HDFS DistCp is a command-line tool that enables high-performance distributed data copy between HDFS clusters. In SQL Server Big Data Clusters CU13, we are surfacing the distcp capability through the new azdata bdc hdfs distcp command to enable distributed data copy between Big Data Clusters. This enables data migration scenarios between SQL Server Big Data Clusters, supporting both secure and non-secure cluster deployment configurations.

Apache Spark 3.1.2

Up to cumulative update 12, Big Data Clusters relied on the Apache Spark 2.4 line, which reached its end of life in May 2021. Consistent with our continuous improvement commitment to the Big Data and Machine Learning capabilities of the Apache Spark engine, CU13 brings in the current release of Apache Spark, version 3.1.2.

This new version of Apache Spark brings stellar performance benefits on big data processing workloads. Using the reference TPC-DS 10 TB workload in our tests, we were able to reduce runtime from 4.19 hours to 2.96 hours, a 29.36 percent improvement achieved just by switching engines while using the same hardware and configuration profiles, with no additional application optimizations. The mean improvement in individual query runtime is 36 percent.
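
As a quick check of the headline figure, the relative runtime reduction can be recomputed from the two runtimes quoted above. This is a minimal sketch; the function name is illustrative only.

```python
def percent_runtime_reduction(before_hours: float, after_hours: float) -> float:
    """Return the relative reduction in runtime as a percentage."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return (before_hours - after_hours) / before_hours * 100


# Runtimes quoted in the post for the 10 TB benchmark run.
reduction = percent_runtime_reduction(4.19, 2.96)
print(f"{reduction:.2f} percent")  # 29.36 percent
```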

Spark 3 is a major release and, as such, contains breaking changes. Following the established best practice in the SQL Server universe, perform a side-by-side deployment of SQL Server Big Data Clusters to validate your current workload with Spark 3 before upgrading. You can use the new azdata HDFS distributed copy capability to copy the subset of your data needed to validate this workload. Assess your scenario carefully before upgrading to the CU13 release.

SQL Server Big Data Clusters runtime for Apache Spark release 2021.1

With this release of SQL Server Big Data Clusters, we doubled down on our commitment to release cadence, binary compatibility, and consistency of experiences for data engineers and data scientists through the SQL Server Big Data Clusters runtime for Apache Spark initiative.

The SQL Server Big Data Clusters runtime for Apache Spark is a consistent versioned block of programming language distributions, engine optimizations, core libraries, and packages for Apache Spark.

Here is a summary of the SQL Server Big Data Clusters runtime for Apache Spark release 2021.1 shipped with SQL Server Big Data Clusters CU13:

Apache Spark 3.1.2
Scala 2.12 for Scala Spark
Python 3.8 for PySpark
Microsoft R Open 3.5.2 for SparkR and sparklyr

Password rotation for Big Data Cluster’s Active Directory service accounts

When a big data cluster is deployed with Active Directory integration for security, SQL Server creates Active Directory (AD) accounts and groups during the deployment. See auto-generated Active Directory objects for further information.

Security-sensitive customers usually require additional hardening, such as password expiration policies that let the administrator set user passwords to never expire or to expire after a certain number of days. Previously, SQL Server Big Data Cluster deployments required manually rotating the passwords for those auto-generated Active Directory objects.

With SQL Server Big Data Clusters CU13, we are now releasing the azdata bdc rotate command to rotate passwords for all auto-generated accounts except for the DSA account. In order to update the DSA password for SQL Server Big Data Clusters we are releasing a specific operation notebook.

Enable Advanced Encryption Standard (AES) on the automatically generated AD accounts

Today’s enterprise environments face many more security challenges than they used to. Using secure and encrypted connections when authenticating with Kerberos significantly lowers the risk of attacks such as Kerberoasting, a type of attack targeting service accounts in Active Directory. Starting with SQL Server Big Data Clusters CU13, we’re enabling Advanced Encryption Standard (AES) support on the auto-generated AD accounts by allowing users to set an optional boolean parameter in the BDC deployment profile to indicate that the AD account supports Kerberos AES 128-bit and 256-bit encryption.
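
For background, Active Directory records the Kerberos encryption types an account supports as bit flags in the msDS-SupportedEncryptionTypes attribute. The sketch below shows how the AES 128-bit and 256-bit flags combine. It is an assumption that the BDC setting maps onto these bits; the post does not describe how the deployment applies it, and the helper name is hypothetical.

```python
from enum import IntFlag


class KerberosEncType(IntFlag):
    """Bit flags used by the msDS-SupportedEncryptionTypes attribute."""
    DES_CBC_CRC = 0x01
    DES_CBC_MD5 = 0x02
    RC4_HMAC = 0x04
    AES128_CTS_HMAC_SHA1_96 = 0x08
    AES256_CTS_HMAC_SHA1_96 = 0x10


# Value for an account that supports both AES key sizes.
AES_BOTH = int(KerberosEncType.AES128_CTS_HMAC_SHA1_96
               | KerberosEncType.AES256_CTS_HMAC_SHA1_96)


def supports_both_aes(attribute_value: int) -> bool:
    """Return True when both AES flags are set in the attribute value."""
    return (attribute_value & AES_BOTH) == AES_BOTH


print(AES_BOTH)  # 24
```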

Ready to learn more?

Check out the SQL Server Big Data Clusters CU13 release notes to learn more about all the improvements available with the latest update. For a technical deep-dive on Big Data Clusters, read the documentation and visit our GitHub repository.

Follow the instructions on our documentation page to get started and deploy Big Data Clusters.

Open sourcing the .NET 5 C# Language Extension for SQL Server

Nikita Takru — Wed, 08 Sep 2021 16:30:44 +0000

For over two decades, the C# programming language has allowed developers to build secure and robust applications within the .NET ecosystem.

SQL Server 2019 supports the R, Python, and Java Language Extensions. These language extensions provide many benefits to developers. They provide data security, rapid speed for deployment, and ease of integration.

Previously, we announced the release of Java, R, and Python extensions. Today we are thrilled to share that we are open sourcing the .NET 5 C# Language Extension for SQL Server on GitHub.

This extension is another example built on the programming language extensibility architecture, which allows integration with new types of language extensions. The architecture gives customers the freedom to pass existing SQL Server tables to a C# application as a DataFrame. They can then perform operations in C#, use its rich libraries, and get back a result set. One reason to use C# is to reuse existing customer C# code, calculations, logic, or extensive libraries that provide functionality you cannot get in T-SQL. We encourage developers to use the .NET 5 C# Language Extension and build on it.

Let us look at what use cases C# can enable inside SQL Server. Bringing C# workloads closer to the data opens a variety of possibilities:

Data cleaning.
Fast data querying.
Any processing in C# can now occur through a DataFrame.
Customers are not limited to the T-SQL language surface area.
C# application development teams that leverage SQL Server as backend storage can now even embed C# code in stored procedures which enables pushing business logic down into the database for better performance.
Furthermore, this will help avoid unnecessary data movement and latency when data must be retrieved from SQL Server and moved into the app tier to do the business logic processing.

Why open source?

Just as Java, Python, and R have open-source language extensions, open sourcing the .NET 5 C# Language Extension is a natural next step. This language extension uses the Extensibility Framework API for SQL Server, which is publicly documented. The API, along with the open-source code of the .NET Core C# language extension, shows an example of how a programming language extension can be built. Now, more developers in the SQL Server community can continue to develop additional programming language extensions.

Get started

If you want to start using the .NET 5 C# language extension for SQL Server, follow this tutorial. If you are interested in creating your own language extension, explore more information on SQL Server Language Extensions.

Microsoft at Data Platform Virtual Summit 2021

SQL Server Team — Tue, 24 Aug 2021 16:00:02 +0000

Data Platform Virtual Summit 2021 (DPS 2021) is just a few days away. A free, global learning event for data professionals, DPS 2021 will feature over 150 breakout sessions and 15 training classes. This content will be delivered by Azure Data Engineering, partner organizations, community leaders, and Data Platform MVPs. The event is fine-tuned for global time zones (Americas, EMEA, and APAC), making it a truly global and inclusive learning event. Attendees will learn about the latest SQL Server and Azure Data innovations and gain deep technical skills to move ahead in their careers.

DPS 2021 will feature five parallel tracks focusing on Azure Data (Development and Administration), Advanced Analytics, Power BI, and Artificial Intelligence. The virtual platform will give attendees a truly immersive experience by offering live Q&A, a networking lounge, Azure Data + AI Gurukul (Technical Round Tables), and a community zone. Attendees can attend sessions and network with peers and speakers at times convenient to their time zones. Delegates will get 12 months of on-demand access to the session recordings. DPS 2021 offers an incredible opportunity to learn directly from our engineering teams, who will be sharing the latest advances and insights on the data platform, providing in-depth training across key products and technologies.

Bob Ward, Anna Hoffman, and Buck Woody will be the keynote speakers for DPS 2021. The trio will showcase SQL from edge to cloud.

Microsoft Azure Data Teams are delivering over 40 sessions at DPS 2021. Hear the latest from the teams who develop the tools you use every day, and engage in live discussions. Visit the virtual expo hall where you can connect with our team across SQL Server, Azure SQL, Azure Synapse Analytics, Power BI, and more.

DPS 2021 features a variety of learning formats including breakouts, deep-dives, short-drives, and demo-only sessions.

Register today

Register for Data Platform Virtual Summit today and be part of an amazing week of training convenient to your time zone and receive 12 months of on-demand access to world-class learning.

What’s new with SQL Server Big Data Clusters—CU11 Release

Daniel Coelho — Thu, 08 Jul 2021 16:00:23 +0000

SQL Server Big Data Clusters (BDC) is a capability brought to market as part of the SQL Server 2019 release. BDC extends SQL Server’s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform. BDC runs exclusively on Linux containers, orchestrated by Kubernetes, and can be deployed in multiple cloud providers or on-premises.

Today, we’re announcing the release of the latest cumulative update, CU11, for SQL Server Big Data Clusters, which includes important capabilities:

Encryption at Rest with external key providers, commonly known as “bring your own key” (BYOK).
Several SQL Server PolyBase Hadoop fixes and additional SQL Server PolyBase support for many data sources.

Major improvements in this update are highlighted below, along with resources for you to learn more and get started.

Data Encryption at Rest

SQL Server 2019 CU8 introduced the Encryption at Rest initial feature set, bringing together a system-managed experience across both SQL Server and HDFS components. With each additional release shaped by our community and insightful customer feedback, many features were added. With the release of the latest cumulative update, CU11, we get to a complete Encryption at Rest feature set, with seamless application-level encryption for the SQL Server and HDFS components.

In CU11, we introduced BYOK functionality through integration with external key providers, such as Hardware Security Modules (HSM) or services like Azure Key Vault or HashiCorp Vault. With that capability, the SQL Server Big Data Clusters Encryption at Rest feature set now contains both system-managed and user-managed Encryption at Rest for SQL Server and HDFS components.

To learn more about the complete Encryption at Rest feature set, see the in-depth documentation.

SQL Server Big Data Clusters PolyBase improvements

Consistent with our commitment to continuous improvement of the Data Virtualization and scale-out capabilities, CU11 brings fixes and new support for the following data sources: Hortonworks HDP 3.1, Cloudera CDH 6.1, 6.2, 6.3, Azure Blob Storage (WASB[S]), and Azure Data Lake Storage Gen2 (ABFS[S]).

Ready to learn more?

Check out the SQL Server Big Data Clusters CU11 release notes to learn more about all the improvements available with the latest update. For a technical deep-dive on Big Data Clusters, read the documentation and visit our GitHub repository.

Follow the instructions on our documentation page to get started and deploy Big Data Clusters.

Microsoft Azure at Data Platform Virtual Summit 2020

SQL Server Team — Thu, 19 Nov 2020 17:00:48 +0000

Data Platform Virtual Summit 2020 (DPS 2020) is just a couple of weeks away. A global learning event for data professionals, DPS 2020 features a keynote from Rohan Kumar, Microsoft Corporate Vice President of Azure Data, as well as 200 breakout sessions and 30 training classes delivered by Azure Data engineering, partner organizations, and community leaders. With content delivered around-the-clock, DPS 2020 empowers Azure Data professionals worldwide with the deep technical skills they need to move ahead in their careers and digitally transform their organizations.

This year, DPS 2020 features five parallel tracks focusing on Azure Data:

Advanced Analytics
Artificial Intelligence
Azure Data Administration
Azure Data Development
Power BI

The virtual platform offers live Q&A, a networking lounge, a community zone, and technical round tables. Additionally, attendees will receive 12 months of on-demand access to session recordings.

DPS 2020 offers an incredible opportunity to learn directly from our engineering teams, who will share the latest advances and insights on the Azure Data platform.

Rohan Kumar will deliver the keynote. Rohan will highlight the latest innovations across the Microsoft Azure Data platform and share customer case studies. The keynote will also feature demos from multiple Microsoft engineers, including Anitha Adusumilli, Anna Hoffman, Buck Woody, Travis Wright, and Vasiya Krishnan.
Microsoft Azure Data engineering teams will deliver over 35 sessions at DPS 2020. Hear the latest from the people who develop the tools you use every day, and engage in live discussions.
Visit the virtual expo hall where you can connect with our team across SQL Server, Azure SQL, Azure Synapse Analytics, Power BI, and more.

Register now for a week of training at Data Platform Virtual Summit and receive twelve months of on-demand access to the DPS 2020 sessions.

Microsoft Azure at PASS Virtual Summit 2020

SQL Server Team — Tue, 20 Oct 2020 17:00:49 +0000

It is almost time for PASS Virtual Summit 2020, the premier learning event for the data professional. Similar to the in-person event, this virtual conference will include the latest SQL Server and Azure Data innovations, practical training, and networking—to empower you to transform your career and your company.

This year, PASS is taking full advantage of the virtual platform and partnering with Microsoft to make your PASS Virtual Summit experience second to none. Learn directly from our engineering team, who will be sharing the latest insights and advances on the data platform, providing in-depth training across key tools and topics, and engaging live for one-to-one support to help you solve your challenges in real time.

If you haven’t already heard, Rohan Kumar, Microsoft Corporate Vice President of Azure Data, will be back this year for the Day 1 keynote. Rohan will showcase the latest innovation across the Microsoft Data Platform, including exclusive announcements at PASS Virtual Summit. Additionally, in the Day 2 keynote, Technical Fellow Hanuma Kodavalla will discuss SQL ecosystem transformation and innovation at scale.

After the keynotes, ground your learning with in-depth training in one of the over thirty sessions Microsoft will be delivering. Hear the latest from the teams who develop the tools you use every day, and ask or upvote questions through the Interactive Platform. After your sessions, don’t forget to visit the virtual exhibit hall where you can connect with our team across SQL Server, Azure SQL, Azure Synapse Analytics, Power BI, and more.

The Microsoft Azure Data Clinic will be back again this year, providing you with one-to-one access to the engineers behind the products you love. This year we are taking full advantage of the virtual platform to let you pre-book your clinic appointment, schedule it around those must-attend networking activities, and ensure that you are connected to the right expert for your challenge.

Register for PASS Virtual Summit today for an incredible week of training and 12 months of on-demand access to world-class learning.

Expanding SQL Server Big Data Clusters capabilities, now on Red Hat OpenShift

Mihaela Blendea — Tue, 23 Jun 2020 17:00:15 +0000

SQL Server Big Data Clusters (BDC) is a new capability brought to market as part of the SQL Server 2019 release. BDC extends SQL Server’s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform. BDC runs exclusively on Linux containers, orchestrated by Kubernetes, and can be deployed in multiple cloud providers or on-premises.

Today, we’re announcing the availability of the latest cumulative update (CU5) for SQL Server 2019, which includes important capabilities for SQL Server and BDC:

Support for deploying BDC on Red Hat OpenShift Kubernetes platform.
Support for running applications within BDC as non-root users.
Support for deploying multiple BDCs against the same Active Directory domain.
Enriched data virtualization experiences.
Enhanced and open sourced Spark SQL connector.
Miscellaneous improvements and bug fixes.

This announcement blog highlights some of the major improvements, provides additional context to better understand the design behind these capabilities, and points you to relevant resources to learn more and get you started.

Deploy Big Data Clusters on Red Hat OpenShift Kubernetes platform

Red Hat OpenShift provides an enterprise-grade, commercially supported distribution of Kubernetes as the foundation of its container platform across hybrid and multi-cloud environments. Through a close partnership with the Red Hat team, today we’re announcing support for SQL Server BDC deployments on OpenShift, for version 4.3 and up, on-premises or in public cloud environments with Azure Red Hat OpenShift (ARO). You can now leverage a fully supported stack to operationalize your next unified analytics platform using BDC, ensuring design and development best practices, and enterprise-grade security guidelines that are core to OpenShift.

We have enhanced the security design of BDC to take full advantage of the OpenShift Container Platform. Privileged containers are no longer required, and containers also run as a non-root user by default. This includes enabling enhanced process isolation within a container. The white paper produced in collaboration by the SQL Server and Red Hat security teams describes the design in detail, highlighting which security policies we require when deploying BDC on OpenShift, and why.

The BDC deployment model and experiences were enhanced so that you can follow the prescribed guidance, in an integrated manner, with built-in deployment profiles targeting OpenShift environments or UX enhancements in Azure Data Studio that include OpenShift as a target platform. With containers and Kubernetes-powered Red Hat OpenShift, organizations can achieve the desired agility, scalability, flexibility, security, and portability for Big Data Clusters.

Bringing SQL Server and Big Data Clusters to the OpenShift Container Platform has been a real team effort. Red Hat provided our team with valuable help, bootstrapping our initial efforts, as well as providing best practice guidance during implementation. Security and trust are critical for both companies and so we appreciate the valuable input and contributions of Dan Walsh, Senior Distinguished Engineer at Red Hat, and Michael Nelson, Principal Software Engineering Manager at Microsoft, who collaborated on the security design for Big Data Clusters on OpenShift.

For more information on the BDC deployment process on OpenShift, follow the instructions on our documentation page.

Secure by default containers, running as non-root users

As a modern data platform, BDC ensures enterprise-grade secure data access by enabling Active Directory authentication through innovative implementations for applications running in containers. In addition, we are now making the platform implementation safer by ensuring that all container applications running within BDC are started as non-root users by default, on all supported platforms. These capabilities are available for all new deployments that use the image tag corresponding to SQL Server 2019 CU5. Existing pre-CU5 BDC deployments will not be impacted, and applications in these clusters will continue to run as the root user. Support for migrating these clusters to a non-root configuration will be added in a future cumulative update.

Deploy multiple BDCs against the same Active Directory domain

To complement the above platform enhancements regarding secure big data clusters, we are pleased to announce that we added support for deploying multiple BDCs against a single Active Directory domain. You can now leverage multiple BDC deployments in your secure enterprise environment, to accommodate multiple use cases like development/test, pre-production or production, CI/CD pipelines or HADR.

To learn more about Active Directory integration for BDC and deploying multiple BDCs against the same domain, see the security-related topics on our documentation page.

Announcing new data virtualization enhancements

In addition to the improvements above, we have also improved our data virtualization capabilities. Namely, we’ve introduced two new stored procedures, sp_data_source_objects and sp_data_source_table_columns, to support introspection of certain External Data Sources. They can be used by customers directly via T-SQL for schema discovery and to see what tables are available to be virtualized. We leverage these in the External Table Wizard of the Data Virtualization Extension for Azure Data Studio, which allows you to create external tables from SQL Server, Oracle, MongoDB, and Teradata.
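
As a rough illustration of the schema discovery flow described above, the following Python sketch builds the T-SQL calls for the two procedures. It only composes statement text and does not connect to a server. The parameter names (@data_source, @table_location) reflect our reading of the procedure documentation and should be verified there; the data source name and table location are hypothetical placeholders.

```python
def tsql_literal(value: str) -> str:
    """Quote a Python string as a T-SQL Unicode string literal.

    Embedded single quotes are doubled, which is the T-SQL escaping rule.
    """
    return "N'" + value.replace("'", "''") + "'"


def list_objects_sql(data_source: str) -> str:
    """Build a call that lists the objects exposed by an external data source.

    Assumption: the procedure accepts a @data_source parameter.
    """
    return (
        "EXEC sp_data_source_objects "
        f"@data_source = {tsql_literal(data_source)};"
    )


def list_columns_sql(data_source: str, table_location: str) -> str:
    """Build a call that lists the columns of one remote table.

    Assumption: the procedure accepts @data_source and @table_location.
    """
    return (
        "EXEC sp_data_source_table_columns "
        f"@data_source = {tsql_literal(data_source)}, "
        f"@table_location = {tsql_literal(table_location)};"
    )


if __name__ == "__main__":
    # "MyOracleSource" and "HR.EMPLOYEES" are hypothetical names.
    print(list_objects_sql("MyOracleSource"))
    print(list_columns_sql("MyOracleSource", "HR.EMPLOYEES"))
```

The printed statements can be pasted into a query window against an instance where the external data source already exists. Discovering objects first and then inspecting columns mirrors the order in which an external table definition is typically assembled.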

For more information on the external table wizard, visit the documentation page.

SQL Server and Azure SQL Connector for Apache Spark Open Sourcing

BDC includes the SQL Server and Azure SQL Connector for Apache Spark. Based on the Apache Spark DataSource V1 APIs and SQL Server Bulk APIs, this connector enables you to read from and write to any SQL Server instance using Apache Spark. As part of Microsoft’s commitment to open-source technology, we will be releasing this connector under the Apache 2.0 license for anyone to use and contribute to. Stay tuned for more updates once the connector is live!

SQL Server BDC team hears your feedback

If you would like to help make BDC an even better analytics platform, please share any recommendations or report issues through our feedback page. The SQL Server engineering team reviews every reported suggestion, and this input is considered when we plan and prioritize the next set of improvements. We are committed to ensuring that SQL Server enhancements are based on customer experiences, so that we build robust solutions that meet real production requirements in terms of functionality, security, scalability, and performance.

Ready to learn more?

With the SQL Server 2019 CU5 updates, BDC continues to simplify the security, deployment, and management of your key data workloads. Industry-leading security and compliance features and support for market-leading Kubernetes-based platforms like Red Hat’s OpenShift will help our mutual customers achieve the expected agility, scalability, flexibility, and portability to develop and operationalize intelligent applications.

Check out the SQL Server CU5 release notes for BDC to learn more about all the improvements available with the latest update. For a technical deep-dive on Big Data Clusters, read the documentation and visit our GitHub repository.

To get started with deploying BDC on OpenShift, follow the instructions on our documentation page. Make sure to read the Security Best Practices whitepaper to better understand the security requirements.


Open innovation, customer choice, and reliability with SQL Server on SUSE

Victoria Nwobodo — Tue, 26 May 2020 17:00:05 +0000

With nearly two decades of delivering joint innovation to meet changing business demands, Microsoft and SUSE continue to focus on enabling digital transformation for our customers, building on open source solutions, and a seamless collaborative support model for SUSE workloads on SQL Server and Azure.

To broaden deployment options for our customers, we support running SQL Server on-premises with SUSE Linux Enterprise Server (SLES) 12, in the cloud with SLES-based Azure Virtual Machines, and in SQL Server SLES containers running on the SUSE Container as a Service (CaaS) Platform, with a consistent experience from on-premises to cloud. With SLES 12 SP5, customers have no separate repository or package requirements, and the quickstart installation guidance remains current. Support levels are also updated in the release notes. Additionally, last week at SUSECON Digital we made announcements related to Azure Arc hybrid capabilities. Azure Arc enables deployment of Azure services anywhere and extends Azure management to any infrastructure across on-premises, edge, and multi-cloud environments.

SQL Server 2019 on SLES 12 on-premises

SQL Server 2019 on SLES 12 SP5 is now fully supported for production use. With SQL Server 2019, you can take advantage of features such as intelligent query processing, data virtualization, accelerated database recovery, improved developer experiences, and much more, deployed on the SUSE SLES 12 environment of your choice. See how SQL Server 2019 customers like Itaú Unibanco, a banking leader in Latin America, use the intelligent query processing capabilities in SQL Server 2019 with virtually no code changes to achieve incredible performance for several business-critical processes.

One major performance improvement in SUSE SLES 12 SP5 is support for Forced Unit Access (FUA) in user-mode I/O calls on the XFS file system. Because FUA was previously unavailable in user mode, users of SQL Server on Linux may have had to apply certain storage and I/O flush-related configurations. With FUA support in user mode for the XFS file system, added by the SUSE engineering team, users can realize significant performance gains for I/O-bound workloads on SQL Server.

Learn more about how you can get the best performance with SQL Server 2019 on SUSE SLES 12.

SQL Server 2019 on SLES 12 Azure Virtual Machines

For customers interested in running SQL Server as Infrastructure as a Service (IaaS), the Azure Marketplace offers pre-configured SQL Server 2019 on SUSE SLES 12 Azure Virtual Machines. The SUSE Linux Enterprise Server with High Availability Extension provides mission-critical uptime, fast failover, improved manageability, and easy configuration of Always On availability groups (AG) for SQL Server high availability on Azure. Although you can already set up a Pacemaker cluster on a SUSE SLES VM on Azure, high availability in the SUSE repository will be enabled in a future SQL Server cumulative update.

Currently, to build a highly available environment with SUSE on Azure, you can bring your own subscriptions for SLES 12 SP5 and the SUSE High Availability Extension to Azure.

Customers can bring their SQL Server licenses with Software Assurance to the cloud with Azure Hybrid Benefit to maximize cost savings, and may also consider Azure Reserved VM Instances.

Learn how to get started running SQL Server on Azure Virtual Machines on SUSE Linux in this on-demand webinar.

SQL Server on SUSE CaaS Platform

Deploying SQL Server in containers simplifies and speeds up deployments, which makes application development, database DevOps, and production rollouts easier. You can run SQL Server on the Kubernetes-based SUSE CaaS Platform at scale in your on-premises environment.

Learn more and get started today

There are a number of exciting free virtual sessions from SUSECON Digital for you to watch either live or on-demand. Here’s a list of all content from Microsoft at SUSECON Digital.

Learn more about SQL Server on Azure Virtual Machines, and SQL Server 2019 Big Data Clusters. For a technical deep-dive on Big Data Clusters, read the documentation and visit our GitHub repository.

Get started today by downloading SQL Server on SUSE Linux on-premises or provisioning a pre-configured Azure Virtual Machine image.
