Data analytics - Microsoft SQL Server Blog
http://approjects.co.za/?big=en-us/sql-server/blog/topic/data-analytics/
Official News from Microsoft’s Information Platform
Fri, 19 Apr 2024 17:47:03 +0000

The path forward for SQL Server analytics
http://approjects.co.za/?big=en-us/sql-server/blog/2022/02/25/the-path-forward-for-sql-server-analytics/
Fri, 25 Feb 2022 18:00:00 +0000

The post The path forward for SQL Server analytics appeared first on Microsoft SQL Server Blog.

Today, we are announcing changes to SQL Server analytics. This post covers:

  • Customer feedback
  • Retirement of SQL Server 2019 Big Data Clusters
  • Retirement of PolyBase scale-out groups
  • Path forward

Customer feedback

We continue to see increased migration to the cloud, with analytical workloads leading that charge.

Customers have indicated that analytics in the cloud best aligns to employee skillsets, deployment simplicity and manageability, and cloud flexibility and scalability.

When we first introduced cloud analytics in 2017, many were still investing in on-premises analytical workloads. Today, we offer a wealth of cloud-based services that provide users with similar functionality, including Azure Data Lake Storage (ADLS), Azure Synapse Analytics, Azure SQL, and Azure Machine Learning.

According to the Gartner® 2020 Data and Analytics survey:

  • Analytics, BI, and data science are the most common use cases being accelerated to the cloud due to COVID-19. The organization needs faster delivery of analytics insights to take action. Cloud, with its fast provision and prototyping ability, is an ideal place to start analytics and data science initiatives to nimbly react to the fast pace of changes.¹
  • In the 2020 Gartner Data and Analytics Cloud survey, 74 percent of organizations use or plan to use cloud for analytics, BI and data science.¹

Retirement of SQL Server Big Data Clusters

Today, we are announcing the retirement of SQL Server 2019 Big Data Clusters. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform for the next three years, through February 28, 2025. This software will continue to be maintained through SQL Server cumulative updates until that time. In the latest version of SQL Server, we are engineering the best mix of on-premises and in-cloud relational workloads and connectivity to Azure Synapse Analytics for advanced analytics in a flexible, scalable, and integrated environment. Please see below and read our documentation on SQL Server Big Data Clusters to learn more.

Changes to PolyBase support in SQL Server

Today, we are announcing the retirement of PolyBase scale-out groups in Microsoft SQL Server. Scale-out group functionality will be removed from the product in SQL Server 2022. In-market SQL Server 2019, 2017, and 2016 will continue to support the functionality to the end of support for those products.

PolyBase data virtualization will continue to be fully supported as a scale-up feature in SQL Server.

In addition, Cloudera (CDP) and Hortonworks (HDP) external data sources will be retired for all in-market versions of SQL Server and will not be included in SQL Server 2022. Moving forward, support for external data sources will be limited to product versions in mainstream support by the respective vendor. You are encouraged to use the new object storage integration functionality available in SQL Server 2022.

In SQL Server 2022, users will need to configure their external data sources to use new connectors when connecting to Azure Storage. The table below summarizes the change:

External Data Source    From       To
Azure Blob Storage      wasb[s]    abs
ADLS Gen 2              abfs[s]    adls
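
As a rough illustration of the prefix change in the table above, the following Python sketch rewrites the scheme portion of a storage location string. It is only a sketch under stated assumptions: the function name `rewrite_location` and the sample account and container names are hypothetical, and the sketch changes nothing but the prefix. Other parts of an external data source definition may also need attention, so check the SQL Server 2022 documentation before relying on it.

```python
# Prefix mapping taken from the table above:
# legacy connector prefix -> SQL Server 2022 connector prefix.
PREFIX_MAP = {
    "wasb": "abs",
    "wasbs": "abs",
    "abfs": "adls",
    "abfss": "adls",
}


def rewrite_location(location: str) -> str:
    """Swap a legacy Azure Storage prefix for its SQL Server 2022 equivalent.

    Only the scheme before "://" is changed; the remainder of the location
    is left untouched. Locations with any other prefix are returned as-is.
    (Hypothetical helper, not part of any Microsoft tool.)
    """
    scheme, separator, rest = location.partition("://")
    if not separator:
        return location  # No scheme present, nothing to rewrite.
    new_scheme = PREFIX_MAP.get(scheme.lower())
    if new_scheme is None:
        return location  # Not one of the prefixes listed in the table.
    return f"{new_scheme}://{rest}"


if __name__ == "__main__":
    # Hypothetical account and container names, for illustration only.
    print(rewrite_location("wasbs://mycontainer@myaccount.blob.core.windows.net"))
    print(rewrite_location("abfss://mycontainer@myaccount.dfs.core.windows.net"))
```

A helper like this could be used to scan existing data source definitions and flag the locations that still use the old prefixes before an upgrade.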

The path forward

If you wish to run analytics on-premises, SQL Server 2022 also provides important new capabilities, building upon its data virtualization suite of connectors by providing object storage integration over REST APIs. We will also continue to invest in the Spark SQL connector to ensure first-class connectivity from Apache Spark to all our SQL products. Additionally, we continue to invest in expanding hybrid capabilities with Azure Arc-enabled data services.

Integrating SQL Server with cloud analytics solutions is a critical capability, which is why we are introducing Azure Synapse Link for SQL Server 2022, the latest release of SQL Server, which will be generally available to purchase later this year. This is a major investment in helping you realize cloud-scale analytics in near real-time on your operational data.

Our priority is to empower you with the tools and services that ensure SQL Server integrates seamlessly into the world of analytic workloads in the cloud by blending operational, analytical, and virtual use cases in our flagship database engine. Please contact your Microsoft account manager if you need assistance in exploring how Microsoft can best empower your analytical needs.


¹Gartner Inc.: Use Cloud to Compose Analytics, BI and Data Science Capabilities for Reusability and Resilience, Julian Sun, Joao Tapadinhas, June 10, 2021.

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved. 

Meet us at SQLBits 2022 and level up as a data professional
http://approjects.co.za/?big=en-us/sql-server/blog/2022/02/10/meet-us-at-sqlbits-2022-and-level-up-as-a-data-professional/
Thu, 10 Feb 2022 16:00:00 +0000

The post Meet us at SQLBits 2022 and level up as a data professional appeared first on Microsoft SQL Server Blog.

It has been over two years since we have had the opportunity to meet face-to-face with our data community at a large event and we miss it. From hallway conversations to the energy that comes from solving problems and helping people understand complex concepts, we cannot wait to teach, meet and greet everyone. This is why we are excited to be the premium sponsor at this year’s SQLBits 2022, March 8 – 12, in-person in London and virtually.

As the lead sponsor, we will deliver content including the keynote, five full-day training days, and over fifty general sessions. With so many opportunities to educate, we are bringing the full Azure data team including folks from across the data platform, such as SQL Server, Azure SQL, Cosmos DB, Azure Purview, Azure Synapse Analytics, and Power BI.

Start the week with my team for two day-long training sessions where you will have a unique chance to work directly with Microsoft engineering:

The Hands-on Azure SQL Workshop on March 8 will help you translate your existing SQL Server skills to Azure SQL. Bring your laptop and get ready to learn hands-on. You will gain a foundational knowledge of what to use when, as well as how to configure, monitor, and troubleshoot the “meat and potatoes” of SQL Server in Azure: security, performance, and availability.

Migrate SQL Server to Azure on Wednesday, March 9 will help you migrate your SQL Server environments to Azure. In this session, the Microsoft engineering team will show you everything you need to know, including the tools and knowledge you need to make your migrations seamless, cost-efficient, and optimized for speed.

Other training sessions cover topics such as Azure SQL Database, Synapse Analytics, and Power BI.

All speaker proceeds from these sessions will be given back to a local charity.

The SQLBits event theme this year is Video Games—and in the “Level Up With Azure Data” keynote, Buck Woody has asked me to come talk about SQL Server 2022 and Azure Data. He assures me I will have help with some surprise guests so it should be interesting. It is always a fun keynote when Buck and I are on stage, and this year you really do not want to miss it!

You also have the opportunity to attend the Microsoft general sessions to learn about the entire Azure data platform.

Take a look at some of the learning available at SQLBits 2022

  • Unified Data Governance with Azure Purview: Gaurav Malhotra, Evangeline White
  • What’s New in Azure SQL MI: Niko Neugebauer
  • The fundamentals of building a lakehouse with Synapse: Luke Moloney
  • SQL Server in Azure Virtual Machine Reimagined: Pam Lahoud
  • Microsoft Database Innovations: Anna Hoffman
  • Azure Arc-Enabled Data Services: Jes Schultz, Buck Woody
  • Azure SQL Database customer success stories for IoT workloads: Silvano Coriani
  • Azure SQL availability and resiliency: Emily Lisa
  • Microsoft SQL Server 2022 Deep Dive (two parts): Pedro Lopes
  • Modernize your Oracle workloads to Azure Data: Alexandra Ciortea
  • Empowering every individual with Power BI: Mohammad Ali, Patrick LeBlanc
  • AMA with the Microsoft Engineering team: hosted by Bob Ward, “Rockstars” of the engineering team

See all the opportunities to engage with Microsoft engineering by heading over to our blog on Microsoft Tech Community, Ready for SQLBits 2022. And don’t forget to stop by our booth, where you can get your questions answered by members of the Microsoft team.

SQLBits is a marathon of top-quality training from global specialists, with two days of full-day training sessions and three days of general sessions. As always with SQLBits, Saturday, March 12 is free to attend. Meet with community leaders sharing their real-world experience and Microsoft product teams providing deep insights into innovations that meet your needs.

Register today for SQLBits 2022

Join Microsoft at this hybrid event for the latest on the data platform and a chance to see whether Buck Woody or I have the best arcade game skills!

Register to attend, and we’ll see you there, in-person, or virtually!

PASS Data Community Summit keynote: a bridge to a new universe
http://approjects.co.za/?big=en-us/sql-server/blog/2021/11/08/pass-data-community-summit-keynote-a-bridge-to-a-new-universe/
Mon, 08 Nov 2021 18:00:40 +0000

The post PASS Data Community Summit keynote: a bridge to a new universe appeared first on Microsoft SQL Server Blog.

It is almost time for PASS Data Community Summit 2021, a free online conference for the Microsoft data platform professional. The conference, hosted by Redgate, will include the latest SQL Server and Azure data innovations, practical training, and networking to empower you to transform your career and your organization. This year’s event is coming to you online for free from November 8 – 12, 2021, and we will continue the tradition of a Microsoft day one keynote.

Deliver faster performance than ever before with SQL Server and Azure

Hear directly from Microsoft’s Rohan Kumar and senior Microsoft engineering leaders during the day one kick-off keynote as they take you on a journey to a new universe shaped by our past—and built to take us into a limitless future. The cloud has created a whole new universe and advancements in Microsoft data products and services are your bridge.

You’ll see how you can use your existing SQL Server and Azure skills, and learn about new tools and platforms available from Microsoft to deliver faster performance than ever before. You’ll see how to shape your data so you can harness its power to find a new galaxy of insights, answers, and predictions. And you will hear about new innovations that continue Microsoft’s rich heritage of data integrity and governance.

Additionally, in the special on-demand keynote, Microsoft Azure Data CTO Raghu Ramakrishnan and team will share a technical keynote and demos showing Azure Purview and SQL.

Register for the PASS Data Community Summit

Don’t miss this opportunity to see how Microsoft is uniquely positioned to provide you with an end-to-end data platform seamlessly integrating limitless database scale and performance, unmatched analytics and intelligence, and unified data governance.

After the keynotes, ground your learning with in-depth training in one of more than two dozen sessions Microsoft will be delivering. Hear the latest from the Engineering teams who develop the tools you use every day. After your sessions, don’t forget to visit the virtual exhibit hall where you can connect with our team across SQL Server 2022, Azure SQL, Azure Synapse Analytics, Microsoft Power BI, Azure Arc, and more.

Register for PASS Data Community Summit today.

What’s new with SQL Server Big Data Clusters—CU13 Release
http://approjects.co.za/?big=en-us/sql-server/blog/2021/10/06/whats-new-with-sql-server-big-data-clusters-cu13-release/
Wed, 06 Oct 2021 15:00:09 +0000

The post What’s new with SQL Server Big Data Clusters—CU13 Release appeared first on Microsoft SQL Server Blog.

SQL Server Big Data Clusters (BDC) is a capability brought to market as part of the SQL Server 2019 release. Big Data Clusters extends SQL Server’s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform. It runs exclusively on Linux containers orchestrated by Kubernetes, and can be deployed on multiple cloud providers or on-premises.

Today, we’re proud to announce the release of the latest cumulative update, CU13, for SQL Server Big Data Clusters which includes important changes and capabilities:

  • Hadoop Distributed File System (HDFS) distributed copy capabilities through azdata
  • Apache Spark 3.1.2
  • SQL Server Big Data Clusters runtime for Apache Spark release 2021.1
  • Password rotation for Big Data Cluster’s auto-generated Active Directory service accounts during BDC deployment
  • Optional parameter to enable Advanced Encryption Standard (AES) on the automatically generated AD accounts

Major improvements in this update are highlighted below, along with resources for you to learn more and get started.

HDFS distributed copy capabilities through azdata

Hadoop HDFS DistCp is a command-line tool that enables high-performance distributed data copy between HDFS clusters. In SQL Server Big Data Clusters CU13, we are surfacing the capability of distcp through the new azdata bdc hdfs distcp command to enable distributed data copy between Big Data Clusters. This enables data migration scenarios between SQL Server Big Data Clusters and supports both secure and non-secure cluster deployment configurations.

Apache Spark 3.1.2

Up to cumulative update 12, Big Data Clusters relied on the Apache Spark 2.4 line, which reached its end of life in May 2021. Consistent with our continuous improvement commitment to the Big Data and Machine Learning capabilities of the Apache Spark engine, CU13 brings in the current release of Apache Spark, version 3.1.2.

This new version of Apache Spark brings significant performance benefits for big data processing workloads. Using the reference TPC-DS 10 TB workload in our tests, we were able to reduce runtime from 4.19 hours to 2.96 hours, a 29.36 percent improvement achieved just by switching engines while using the same hardware and configuration profiles, with no additional application optimizations. The mean improvement in individual query runtime is 36 percent.

Individual TPC-DS 10 TB query runtimes between Spark 2.4 and Spark 3.1. Chart shows that average runtimes across all queries are 30 percent lower, highlighting the benefits of using Spark 3.1 with CU13.
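
The runtime figures quoted above can be checked with a few lines of arithmetic. The Python sketch below reproduces the 29.36 percent figure from the 4.19 and 2.96 hour totals, and uses a small set of made-up per-query runtimes to illustrate one reason a mean of per-query improvements can differ from the improvement in total runtime: long-running queries dominate the total. The function names and the sample per-query numbers are assumptions for illustration only and are not taken from the actual benchmark.

```python
def percent_improvement(before: float, after: float) -> float:
    """Return the relative runtime reduction, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


def mean_query_improvement(runtimes: list[tuple[float, float]]) -> float:
    """Average the per-query improvements, weighting every query equally."""
    improvements = [percent_improvement(b, a) for b, a in runtimes]
    return sum(improvements) / len(improvements)


def total_improvement(runtimes: list[tuple[float, float]]) -> float:
    """Improvement in the summed runtime, where long queries weigh more."""
    total_before = sum(b for b, _ in runtimes)
    total_after = sum(a for _, a in runtimes)
    return percent_improvement(total_before, total_after)


if __name__ == "__main__":
    # Totals quoted in the post: 4.19 hours before, 2.96 hours after.
    print(round(percent_improvement(4.19, 2.96), 2))  # 29.36

    # Hypothetical (before, after) runtimes in minutes, NOT benchmark data:
    # one long query that improves a little, one short query that improves a lot.
    sample = [(100.0, 80.0), (10.0, 5.0)]
    print(round(mean_query_improvement(sample), 2))
    print(round(total_improvement(sample), 2))
```

In the hypothetical sample, the short query halves its runtime but contributes little to the total, so the equally weighted mean comes out higher than the total-runtime improvement.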

Spark 3 is a major release and, as such, contains breaking changes. Following the established best practice in the SQL Server universe, perform a side-by-side deployment of SQL Server Big Data Clusters to validate your current workload with Spark 3 before upgrading. You can use the new azdata HDFS distributed copy capability to copy the subset of your data needed to validate this workload. Assess your scenario before upgrading to the CU13 release.

SQL Server Big Data Clusters runtime for Apache Spark release 2021.1

With this release of SQL Server Big Data Clusters, we doubled down on our commitment to release cadence, binary compatibility, and consistency of experience for data engineers and data scientists through the SQL Server Big Data Clusters runtime for Apache Spark initiative.

The SQL Server Big Data Clusters runtime for Apache Spark is a consistent versioned block of programming language distributions, engine optimizations, core libraries, and packages for Apache Spark.

Here is a summary of the SQL Server Big Data Clusters runtime for Apache Spark release 2021.1 shipped with SQL Server Big Data Clusters CU13:

  • Apache Spark 3.1.2
  • Scala 2.12 for Scala Spark
  • Python 3.8 for PySpark
  • Microsoft R Open 3.5.2 for SparkR and sparklyr

Password rotation for Big Data Cluster’s Active Directory service accounts

When a big data cluster is deployed with Active Directory integration for security, SQL Server creates Active Directory (AD) accounts and groups during the deployment. See auto-generated active directory objects for further information.

Security-sensitive customers usually require security reinforcement such as password expiration policies, which allow the administrator to set user passwords to never expire or to expire after a certain number of days. For SQL Server Big Data Cluster deployments, it was previously necessary to manually rotate the passwords for those auto-generated Active Directory objects.

With SQL Server Big Data Clusters CU13, we are now releasing the azdata bdc rotate command to rotate passwords for all auto-generated accounts except for the DSA account. In order to update the DSA password for SQL Server Big Data Clusters we are releasing a specific operation notebook.

Enable Advanced Encryption Standard (AES) on the automatically generated AD accounts

Today’s enterprise environments face many more challenges than they used to. Using secure and encrypted connections when authenticating with Kerberos significantly lowers the risk of attacks such as Kerberoasting, a type of attack targeting service accounts in Active Directory. Starting with SQL Server Big Data Clusters CU13, we’re enabling Advanced Encryption Standard (AES) support on the auto-generated AD accounts by allowing users to set an optional boolean parameter in the BDC deployment profile to indicate that the AD account supports Kerberos AES 128-bit and 256-bit encryption.

Ready to learn more?

Check out the SQL Server Big Data Clusters CU13 release notes to learn more about all the improvements available with the latest update. For a technical deep-dive on Big Data Clusters, read the documentation and visit our GitHub repository.

Follow the instructions on our documentation page to get started and deploy Big Data Clusters.

Microsoft at Data Platform Virtual Summit 2021
http://approjects.co.za/?big=en-us/sql-server/blog/2021/08/24/microsoft-at-data-platform-virtual-summit-2021/
Tue, 24 Aug 2021 16:00:02 +0000

The post Microsoft at Data Platform Virtual Summit 2021 appeared first on Microsoft SQL Server Blog.

Data Platform Virtual Summit 2021 (DPS 2021) is just a few days away. A free, global learning event for data professionals, DPS 2021 will feature over 150 breakout sessions and 15 training classes. This content will be delivered by Azure Data Engineering, partner organizations, community leaders, and Data Platform MVPs. The event is fine-tuned for global time zones (Americas, EMEA, and APAC), making it a truly global and inclusive learning event. Attendees will learn about the latest SQL Server and Azure Data innovations and gain deep technical skills to move ahead in their careers.

DPS 2021 will feature five parallel tracks focusing on Azure Data (Development and Administration), Advanced Analytics, Power BI, and Artificial Intelligence. The virtual platform will give attendees a truly immersive experience by offering live Q&A, a networking lounge, Azure Data + AI Gurukul (Technical Round Tables), and a community zone. Attendees can attend sessions and network with peers and speakers at times convenient to their time zones. Delegates will get 12 months of on-demand access to the session recordings. DPS 2021 offers an incredible opportunity to learn directly from our engineering teams, who will be sharing the latest advances and insights on the data platform, providing in-depth training across key products and technologies.

Bob Ward, Anna Hoffman, and Buck Woody will be the keynote speakers for DPS 2021. The trio will showcase SQL from edge to cloud.

Microsoft Azure Data Teams are delivering over 40 sessions at DPS 2021. Hear the latest from the teams who develop the tools you use every day, and engage in live discussions. Visit the virtual expo hall where you can connect with our team across SQL Server, Azure SQL, Azure Synapse Analytics, Power BI, and more.

DPS 2021 features a variety of learning formats including breakouts, deep-dives, short-drives, and demo-only sessions.

Register today

Register for Data Platform Virtual Summit today and be part of an amazing week of training convenient to your time zone and receive 12 months of on-demand access to world-class learning.

Analyze data in Azure Data Explorer using Kusto Query Language (KQL) extension in Azure Data Studio
http://approjects.co.za/?big=en-us/sql-server/blog/2020/09/24/analyze-data-in-azure-data-explorer-using-kusto-query-language-kql-extension-in-azure-data-studio/
Thu, 24 Sep 2020 17:00:24 +0000

The post Analyze data in Azure Data Explorer using Kusto Query Language (KQL) extension in Azure Data Studio appeared first on Microsoft SQL Server Blog.

The Kusto (KQL) extension in Azure Data Studio is now available in preview. This native Kusto (KQL) support brings another modern data experience to Azure Data Studio, a cross-platform client for Windows, macOS, and Linux. Users can now connect to and browse their Azure Data Explorer clusters and databases, write and run KQL, and author notebooks with the Kusto kernel, all equipped with IntelliSense.

Azure Data Explorer connectivity support in Azure Data Studio

By enabling native Kusto (KQL) experiences in Azure Data Studio, users such as data engineers, data scientists, or data analysts can now quickly discover insights as well as identify trends and anomalies against a massive amount of data stored in Azure Data Explorer.

Here are four key benefits of using the Kusto (KQL) extension in Azure Data Studio:

1. Efficiency in data exploration and data analysis

Users working with heterogeneous data sources can now do data exploration and data analysis from SQL and Big Data Clusters to Azure Data Explorer without breaking their flow. With native KQL support and IntelliSense, users benefit from an optimized experience with fast, rich functionality over large volumes of real-time streaming data in Azure Data Explorer.

Writing KQL query in Azure Data Studio

For more interactive data exploration, users can visualize the resultset from the KQL query in SandDance.

Using SandDance to visualize a KQL query resultset.

2. Reproducible analyses

The addition of the Kusto kernel to notebooks in Azure Data Studio makes it easy to create reproducible analyses. Notebooks capture code, results, and context for the analysis in one place. When writing KQL queries in code cells, users can also be more productive with the IntelliSense support in notebooks.

Below is an example of pattern detection in Storm Events data using the autocluster plugin in a Kusto notebook in Azure Data Studio, accessing data from Azure Data Explorer databases:

Pattern detection in Storm Events data using autocluster plugin in Kusto notebook in Azure Data Studio

3. Improved DevOps troubleshooting experience with KQL notebooks

Engineers working on apps with telemetry connected to Azure Data Explorer can easily create a troubleshooting runbook or playbook in Azure Data Studio with the Kusto kernel. These runbooks or playbooks, detailing how to troubleshoot apps via telemetry data and how to mitigate issues, can be stored as notebooks with different kernel types, organized as a Jupyter Book. For example, diagnosis steps and pattern or anomaly detection may be expressed as notebooks with the Kusto kernel, and mitigation steps as notebooks with PowerShell or other kernels.

4. Enriching your DevOps flow with KQL files

Azure Data Studio supports a Git source control manager (SCM). Now, users can take advantage of adding their KQL files and KQL notebook files to their Git repositories. This also enables users to add these files as part of their CI/CD pipelines in GitHub or Azure DevOps.

How to get started

This preview release is the beginning of a strategic journey to bring rich native Kusto (KQL) experiences to Azure Data Studio. Please feel free to submit your suggestions and bugs on GitHub.

New in Azure Synapse Analytics: CICD for SQL Analytics using SQL Server Data Tools
http://approjects.co.za/?big=en-us/sql-server/blog/2019/11/07/new-in-azure-synapse-analytics-cicd-for-sql-analytics-using-sql-server-data-tools/
Thu, 07 Nov 2019 17:00:08 +0000

The post New in Azure Synapse Analytics: CICD for SQL Analytics using SQL Server Data Tools appeared first on Microsoft SQL Server Blog.

At Microsoft Ignite 2019, we announced Azure Synapse Analytics, a major evolution of Azure SQL Data Warehouse. The same industry-leading data warehouse now provides a whole new level of performance, scale, and analytics capabilities. One of these capabilities is SQL Analytics, which provides a rich set of enterprise data warehousing features.

Today we are announcing the general availability of the most requested feature for SQL Analytics in Azure Synapse: SQL Server Data Tools (SSDT) database projects. This release includes support for SQL Server Data Tools with Visual Studio 2019, along with native platform integration with Azure DevOps, providing built-in continuous integration and deployment (CI/CD) capabilities for enterprise-level deployments. This announcement also comes with support for the Schema Compare extension in Azure Data Studio for SQL Analytics. You can now expect a frictionless development and deployment experience on any platform for your analytics solution.

Flow diagram showing changes promoted across Development, Test, and Production environments using SSDT and Azure DevOps.

Since announcing preview support for SQL Server Data Tools (SSDT), customers have been able to use popular SQL Server Data Tools features such as Schema Compare, build, and publish for local development of their data warehouse. Although this has helped customers accelerate project development, an automated build, test, and deployment infrastructure is still critical for continuous integration and deployment (CI/CD) scenarios. Without native integration with Azure DevOps, customers were still forced to manually write PowerShell and T-SQL scripts integrated with Azure DevOps for an automated release process.

With SQL Server Data Tools generally available and native Azure DevOps support, you can now set up stable release pipelines without any custom code, and changes to your data warehouse model can be safely and automatically promoted across development, testing, and production environments. Preview customers such as T-Mobile will now be able to accelerate their feature development with Azure Synapse.

“In our current environment, we would have needed hundreds of custom scripts to validate and promote changes across our test and production environments. We’re excited to now simply use SSDT, MSBuild, and the Publish task in Azure DevOps to deploy and release features to production on a consistent and faster cadence.” – Anthony Sabol, Director, Reporting & Analytics at T-Mobile.

Integrate with Microsoft Azure Repos for continuous integration 

Data engineers and developers can easily integrate their SQL Server Data Tools database projects with Microsoft Azure Repos. 

Using Schema Compare in SSDT showing how changes can be tracked using a Git repository in Azure Repos.

Configure continuous deployment using Microsoft Azure Pipelines 

Changes committed to source control in Azure Repos can automatically be pre-validated using MSBuild and promoted to target environments using Microsoft Azure Pipelines and the built-in SQL Analytics deployment task extension.

Downloading the SQL analytics deployment task in the Azure DevOps marketplace.

Cross platform support for Schema Compare with Azure Data Studio 

Azure Data Studio is a cross-platform database tool that now allows you to compare the schema between two data warehouse definitions.

Using Schema Compare to generate change scripts in Azure Data Studio.

Azure Data Studio – Setting up your environment
http://approjects.co.za/?big=en-us/sql-server/blog/2019/02/06/azure-data-studio-setting-up-your-environment/
Wed, 06 Feb 2019 17:00:07 +0000

The post Azure Data Studio – Setting up your environment appeared first on Microsoft SQL Server Blog.

]]>
This blog entry comes from Buck Woody, who recently rejoined the SQL Server team from the Machine Learning and AI team.

For those of you who haven’t met me or read any of my books or blog entries, it’s great to meet you! I’ve been a data professional for over 35 years, and I’ve worked at a variety of places, including NASA, various consulting firms, and, since 2006, here at Microsoft. I started on the SQL Server team, and then helped ship Microsoft Azure. After that I spent some time in Microsoft Consulting Services, then moved over to the Machine Learning team in Microsoft Research, and then the Machine Learning and AI team. I’ve rejoined the SQL Server team to help with the inclusion of Apache Spark, Kubernetes, and the Machine Learning and AI features. I’ll still be blogging at my regular location, and from time to time I’ll chime in here on the SQL Server blog. As I learn something new, I’ll share it. This time I’ve learned about a great new tool the team has put together. Azure Data Studio is a new tool that you can use to work with SQL Server.

You may be thinking wait – don’t we already have a lot of those? Isn’t that the SQL Server Management Studio (SSMS), or “Data Dude” (SQL Server Tools for Visual Studio) or even Visual Studio Code with the add-in for SQL Server?

Well, yes. And those still work just fine – but Azure Data Studio goes further than those tools. And it does some things differently.

Multi-Source

In Azure Data Studio, you can connect to multiple data systems beyond SQL Server, such as Apache Hadoop HDFS, Apache Spark, and others. And if you don’t find what you need, you can make more.

Multi-Platform

SQL Server Management Studio is an amazing tool. When I started at Microsoft in 2006, that’s the product I worked on building. I know it VERY well. But it only runs on Windows. Is that a problem? Well, yes, and no. If you live in Windows all day exclusively, then that’s fine. But if you run a Mac or Linux, then you need a tool that runs on those platforms, and Azure Data Studio does just that. In fact, it does it quite well. I’m typing this on a Mac right now (in Azure Data Studio, no less) and it even maps the keyboard to a Mac-like paradigm. You can also run an add-on to map the keyboard to SSMS.

On every platform, Azure Data Studio checks itself and all your extensions when it starts, and gives you the option to update them.

Extensibility

Perhaps the biggest argument for a new tool for working with data is that Azure Data Studio has a fully functional extension feature. This means that Azure Data Studio works more as a platform where you can add new functionality, themes, and so on: Microsoft, vendors, and even you can create extensions to do what you need. Do you need a tool that manages SQL Server Agent? We have that. Need search tools, check scripts, reports, or Jupyter Notebooks? Check. Need something else? Write it. We’ll show you how. If there isn’t something for a platform you regularly work with, contact that vendor, send them the link to create extensions, and have them publish it on GitHub.

I’m a huge fan of Visual Studio Code. I use it for everything, and Visual Studio Code has LOTS of extensions. And here’s a cool tip: You can use almost all of them in Azure Data Studio. At the end of this post, I’ll show you how to install an extension along with a few of my favorites I always use in my day-to-day coding.

You can also create widgets to show information about your system. Learn how to do that in this guide.

Environment and Workspaces

If you’re sold on trying out Azure Data Studio, get it installed, and get started setting up.

Workspaces

You can connect to servers and open a directory easily from inside the tool. The top left two icons will handle that for you. Azure Data Studio also has the concept of Workspaces, which is similar to how you might think about a solution or project in SSMS or Visual Studio. The key is that Workspaces aren’t as rigid as a solution or project. You can reference a lot of files located in many places from a single file, such as MyFavoriteFilesAndFolders.code-workspace, that you can simply double-click to open them all at once in Azure Data Studio. Use the File | Save Workspace As menu item to create a Workspace. This feature alone sells me on using this for day-to-day work. Pair that with GitHub integration, and it’s where I spend most of my time.
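As a rough illustration of what such a workspace file holds, here is a minimal Python sketch that builds one. It assumes Azure Data Studio reads the same JSON layout that Visual Studio Code uses for `.code-workspace` files (a `folders` list plus optional `settings`). The function name and the folder names are hypothetical.

```python
import json


def build_workspace(folders):
    """Return the JSON text of a minimal .code-workspace file.

    Assumption: the layout follows the VS Code workspace format,
    a "folders" list of {"path": ...} entries and a "settings" object.
    """
    workspace = {
        # One entry per folder to show in the Explorer pane.
        "folders": [{"path": str(folder)} for folder in folders],
        # Workspace-level settings; left empty in this sketch.
        "settings": {},
    }
    return json.dumps(workspace, indent=2)


# Hypothetical folder names, for illustration only.
print(build_workspace(["sql-scripts", "notebooks"]))
```

Saving the printed text as MyFavoriteFilesAndFolders.code-workspace should give a file that can be opened like one created through File | Save Workspace As, provided the format assumption holds.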

Task List

As your system works, you’ll see information about what it’s doing and has done in the Task History icon on the left.

Integrated Terminal(s) and output panels

In every operating system, you have a shell (or terminal) of some type that you often need to work with. In Azure Data Studio, you can run those right in the Azure Data Studio environment – and you can run several of them. You can have PowerShell and CMD running, or even the Azure Command-line interface (CLI) shell in the terminal where you can run a quick command.

Setting up your environment

I use Azure Data Studio as my primary work environment for everything from SQL Server code, to Python, R, PowerShell, Apache Spark, and more. It’s become my go-to integrated development environment (IDE). Here’s how I set mine up.

The first thing I care about is how my environment is laid out, and how it looks. When I work, I like a dark environment with brightly colored fonts. So, my primary theme is dark-plus-syntax. For presenting, which I now do in Azure Data Studio, even replacing PowerPoint in some cases, I use Quiet Light, a built-in theme. You can change those after you install the ones you like by opening the Color Theme picker with File > Preferences > Color Theme. You can open this via Code > Preferences > Color Theme on macOS.

Next, I add in the extensions I want. To do this, click the Extensions icon on the left, and select the Install button on the extension you want from the list. This will either install the extension or open a web page. If the former happens, wait a moment while the extension loads, and then click the Reload button to activate it.

If the extension installation opens a web page, then download the .vsix file you see on that page. Save that to a location on your hard drive, and then open Azure Data Studio, click the File menu item, then Install Extension from VSIX Package. Point to the file you downloaded and when it finishes, click the Reload button next to the extension.

We already have several extensions in the list for you, but here’s a cool trick – you can actually load most extensions on the Visual Studio Code Marketplace. All you have to do is the following:

  1. Open the VS Code Marketplace
  2. Search for the extension you want
  3. Click the icon for that extension to go to its source page
  4. On the right-hand side of that page, look for the section marked Resources
  5. Click on the Download Extension link
  6. Save the file on your hard drive
  7. Open Azure Data Studio
  8. Click the File menu item, then Install Extension from VSIX Package
  9. Point to the file you downloaded
  10. When the install finishes, click the Reload button next to the extension
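
Before pointing Azure Data Studio at a downloaded extension package, it can be worth checking that the download is intact. The Python sketch below is one way to do that, assuming the package is a ZIP archive with an `extension.vsixmanifest` entry at its root, as packages built for Visual Studio Code are. The function name is hypothetical.

```python
import zipfile


def looks_like_vsix(path):
    """Return True if the file is a ZIP archive that contains a VSIX manifest.

    Assumption: the manifest is stored at the archive root under the
    name "extension.vsixmanifest".
    """
    if not zipfile.is_zipfile(path):
        # Truncated or non-ZIP downloads are rejected here.
        return False
    with zipfile.ZipFile(path) as archive:
        return "extension.vsixmanifest" in archive.namelist()
```

A call such as `looks_like_vsix("my-extension.vsix")` (a hypothetical file name) returns False for an incomplete download or an HTML error page saved under the wrong name.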

Here is a list of the ones I use a lot:

  • SQL Server 2019 extension (preview)
  • Copy Markdown to HTML
  • Docker Explorer
  • Docs-markdown
  • Excel Viewer
  • First Responder Kit
  • GitLens
  • HCQ
  • Kubernetes Support
  • Live HTML Previewer
  • Markdown Preview Enhanced
  • Open in Browser
  • Python
  • R
  • Redgate SQL Search
  • Server Reports
  • SQL Server Agent
  • SQL Server Import
  • SQL Server Profiler
  • SQLite
  • TODO Highlight
  • Vscode-reveal
  • Whoisactive

I’ll be blogging more about Azure Data Studio as I continue the journey. If you’re ready to get started, you can download Azure Data Studio here.

We welcome feedback and suggestions. Just click Help and then Report Issue to file a bug or a feature request. This puts it right in our tracking system. Be verbose, include screenshots, and most of all add a gif of what you’re doing! It helps us fix things faster.

The post Azure Data Studio – Setting up your environment appeared first on Microsoft SQL Server Blog.

]]>
Announcing SQL Server 2019 community technology preview 2.2 http://approjects.co.za/?big=en-us/sql-server/blog/2018/12/10/announcing-sql-server-2019-community-technology-preview-2-2/ Mon, 10 Dec 2018 23:35:21 +0000

The post Announcing SQL Server 2019 community technology preview 2.2 appeared first on Microsoft SQL Server Blog.

]]>
In September we announced SQL Server 2019 preview, the first release of SQL Server to create a unified data platform by packaging Apache Spark™ and Hadoop Distributed File System (HDFS) together with SQL Server as a single, integrated solution. SQL Server 2019 promises to simplify big data analytics for SQL Server users, break down data silos using data virtualization, and provide industry-leading performance and advanced security for your mission-critical applications. Today, Microsoft is pleased to announce SQL Server 2019 community technology preview 2.2, the third in a monthly cadence of preview releases.

This preview brings the following new features and capabilities to SQL Server 2019:

  • Customers can now use SparkR from Azure Data Studio on a big data cluster.
  • Customers can use UTF-8 character encoding with SQL Server Replication. Depending on the character set in use, switching to UTF-8 encoding has the potential for large storage savings.
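
To see why the savings depend on the character set, the Python sketch below compares the raw byte counts of the same text in UTF-16 and UTF-8. It only measures the encodings themselves; it does not model how SQL Server stores rows, and the function name and sample strings are illustrative.

```python
def encoded_sizes(text):
    """Return the number of bytes the text needs in UTF-16 and in UTF-8."""
    return {
        # Little-endian UTF-16 without a byte order mark.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-8": len(text.encode("utf-8")),
    }


# ASCII-heavy text shrinks in UTF-8, while Japanese text grows.
for sample in ("SQL Server 2019", "データベース"):
    print(sample, encoded_sizes(sample))
```

Text that is mostly ASCII needs half as many bytes in UTF-8, while the Japanese sample needs more, which is why the benefit varies with the data.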

Getting started

Ready to learn more?  Here are some ways to get started with SQL Server 2019:

The post Announcing SQL Server 2019 community technology preview 2.2 appeared first on Microsoft SQL Server Blog.

]]>
Harness the future with the ultimate hybrid platform for data and AI http://approjects.co.za/?big=en-us/sql-server/blog/2018/11/07/harness-the-future-with-the-ultimate-hybrid-platform-for-data-and-ai/ Wed, 07 Nov 2018 16:00:45 +0000

The post Harness the future with the ultimate hybrid platform for data and AI appeared first on Microsoft SQL Server Blog.

]]>
Today I’m excited to give the Day 1 keynote at PASS Summit v.20, a gathering of our longtime community of SQL Server users and data professionals.  PASS Summit is an amazing chance to see the faces of old and new friends.  It’s a place to meet with customers and fans to continually learn about their evolving needs and to help us grow as a SQL community and develop the best data platform products in the market.

Hybrid connects all your data

Now more than ever, we are architecting for hybrid, because we are hearing from customers that they will be running data workloads on-premises and in the cloud – rarely just one or the other. We believe that the value Microsoft can add is to provide a great and consistent experience wherever they deploy. One example of this commitment is Azure SQL Database Managed Instance, which was recently made generally available. Managed Instance enables organizations to migrate their SQL Server workloads to Azure with zero code changes and offers an easy path to the cloud at an incredible value – and with security, intelligent performance and management tools that are unique to our cloud database services. The end of extended support for SQL Server 2008 and 2008 R2 next year is a great opportunity for customers to rehost to Azure SQL Database Managed Instance, a fully-managed solution that eliminates the need for future upgrades. It’s easy to get there using comprehensive yet easy-to-use migration tools like Azure Database Migration Service.

Microsoft is excited to announce the general availability of Azure SQL Database Managed Instance Business Critical tier on December 1. Designed for mission-critical business apps with high I/O requirements, the business critical tier supports high availability with the highest level of storage and compute redundancy.  This new tier provides support for in-memory processing, a range of sizes up to 80 cores on Gen5, and zone-redundant HA using several isolated replicas to provide the highest resilience to failure.  We offer all this performance at an incredibly compelling price point—up to 85% less expensive than AWS.  Programs such as the Azure Hybrid Benefit, which allows customers to re-use their on-premises SQL Server licenses for discounts in Azure, and upcoming Reserved capacity pricing for Managed Instance Business Critical which allows you to prepay for a 1 or 3-year term commitment, further help you manage costs in the cloud.

In addition, we’re announcing a limited preview of Machine Learning services in Azure SQL Database. You can now use the Azure SQL Database support for Microsoft Machine Learning Services with R language to complete data processing, model training, and scoring all inside your SQL Database. This means you no longer need to move data out of the database to train and operationalize machine learning models. The R code can be deployed in production by embedding it in T-SQL stored procedures.

Azure is also a great destination for open source database migration. We recently announced an expansion to our relational OSS database managed service offerings with a preview of Azure Database for MariaDB. With MariaDB joining MySQL and PostgreSQL, Azure now offers the community versions of all the most popular OSS relational databases as managed services in the cloud, with advantages like built-in high availability, dynamic scaling and world-class security features. When looking to migrate NoSQL databases like MongoDB and Cassandra to the cloud, Cosmos DB provides an excellent destination. Customers gain access to managed NoSQL at a lower total cost of ownership (TCO) vs. on-premises and cloud competitors, not to mention the industry-leading SLA, global distribution, and features that take all the work out of DevOps. You can get started today by using the Azure Cosmos DB API of your choice to migrate NoSQL data and apps from MongoDB, Cassandra, HBase, and more, using a free trial of Azure Cosmos DB.

With SQL Server 2019, organizations can now seamlessly manage their structured and unstructured data in a single, integrated solution. It comes with big data capabilities built-in, including support for Apache Spark™ and Hadoop Distributed File System (HDFS) – everything you need to build a data lake with your SQL Server skills. Today we announce SQL Server 2019 community technology preview (CTP) 2.1, which has a number of new features for the database engine and big data clusters:

  • Ability to deploy R and Python apps inside a SQL Server big data cluster
  • Scalar UDF inlining feature in Intelligent query processing, optimizing a common performance problem scenario for User Defined Functions
  • Derived table or view aliases in graph match queries
  • Improved diagnostic data for long-running queries, helping to pinpoint when a query is blocked by stats background processing
  • Ability to put buffer pool in persistent memory, dramatically speeding up I/O operations

And we have more planned for upcoming previews of SQL Server 2019, including Accelerated Data Recovery to speed up recovery processing, transaction rollback, readable secondaries, and adding availability groups for system databases which enables users to replicate linked server definitions, logins, and SQL Agent jobs to the secondary replicas.

Hybrid enables comprehensive AI and analytics

Having the most consistent data platform across on-premises and cloud enables us to offer AI and analytics over all your data. Azure SQL Data Warehouse is a cloud data warehouse that combines lightning-fast query performance with advanced security features to turn all your data into actionable insights. Azure SQL Data Warehouse has been recognized as the fastest cloud data warehouse by third-party benchmarks. Building upon its industry-leading performance, today we announced significant security and usability updates. Customers can now take advantage of a new workload importance feature to influence query execution by priority, making sure high business value work gets first access to system resources. In addition, SQL Data Warehouse now offers native row-level security, enabling customers to implement the most stringent security policies for fine-grained access control. Other new capabilities include support for SQL Server Data Tools, enhanced performance monitoring, advanced tuning and accelerated database recovery to significantly improve service usability. Learn more about this and other new features in today’s Azure SQL Data Warehouse blog. Experience the performance of Azure SQL Data Warehouse by creating your first data warehouse.
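
The idea behind workload importance, that higher-priority work is served before lower-priority work waiting in the same queue, can be illustrated with a small Python sketch. This is a conceptual model only and not how the service schedules queries. The importance level names, the class, and the query names are all assumptions made for illustration.

```python
import heapq
import itertools

# Assumed importance levels; a lower rank is served first.
IMPORTANCE_RANK = {
    "high": 0,
    "above_normal": 1,
    "normal": 2,
    "below_normal": 3,
    "low": 4,
}


class QueryQueue:
    """Toy queue that releases waiting queries by importance, then arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def submit(self, name, importance="normal"):
        # The arrival counter breaks ties so equal-importance work stays first in, first out.
        entry = (IMPORTANCE_RANK[importance], next(self._arrival), name)
        heapq.heappush(self._heap, entry)

    def next_query(self):
        # Pop the most important query; the earliest submitted wins ties.
        return heapq.heappop(self._heap)[2]


queue = QueryQueue()
queue.submit("nightly_archive", "low")
queue.submit("executive_dashboard", "high")
queue.submit("ad_hoc_report")
print(queue.next_query())
```

Here the dashboard query is released first even though it was submitted after the archive job.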

Azure SQL Data Warehouse also offers efficient and scalable structured streaming write support through a native Azure Databricks connector. Azure Databricks is an Apache® Spark™-based analytics platform that enables you to accelerate and simplify the process of building big data and AI solutions to drive the business forward, all backed by industry-leading SLAs. We recently announced the preview of Azure Databricks Delta, a powerful transactional storage layer built on Apache Spark to provide better consistency of data and faster read access. Organizations also benefit from Azure Databricks’ native integration with other services like Azure Blob Storage, Azure Data Factory, and Azure Cosmos DB. This enables new analytics solutions that support modern data warehousing, advanced analytics, and real-time analytics scenarios.

Announced at Ignite, the Azure Data Explorer preview is a fast, highly scalable data exploration service for log and telemetry data. It helps you handle many data streams, so you can collect, store, and analyze data. Azure Data Explorer is ideal for analyzing large volumes of diverse data from any data source, such as websites, applications, IoT devices, and more. Azure Data Explorer makes it simple to ingest this data and enables you to perform complex ad-hoc queries on the data in seconds. Today at PASS, we’re excited to host one of our customers, Taboola, to demonstrate how their organization is using Azure Data Explorer to crunch large amounts of web data in near real time in order to provide the best content recommendations to webpage readers.

Customers who want to stream data or analyze in real-time to get valuable insights faster need a massively scalable, distributed, event-driven messaging platform with multiple producers and consumers. Apache Kafka and Azure Event Hubs provide such distributed platforms.  Azure Event Hubs for Apache Kafka, now generally available, provides a Kafka endpoint that can be used by your existing Kafka-based applications as an alternative to running your own Kafka cluster. With Event Hubs for Kafka, you get the best of both worlds—the ecosystem and tools of Kafka, along with Azure’s security and global scale—in a fully managed solution.
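
For an existing Kafka application, switching to the Event Hubs endpoint is mostly a matter of client configuration. The Python sketch below builds a settings dictionary in the style of librdkafka-based clients. It assumes the endpoint follows the `<namespace>.servicebus.windows.net:9093` pattern with SASL PLAIN over TLS, and that the literal user name `$ConnectionString` is paired with the namespace connection string as the password. The function name and the sample values are hypothetical.

```python
def event_hubs_kafka_config(namespace, connection_string):
    """Build Kafka client settings that point at an Event Hubs Kafka endpoint.

    Assumptions: port 9093, SASL_SSL with the PLAIN mechanism, and the
    literal user name "$ConnectionString".
    """
    return {
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": "$ConnectionString",
        # The connection string is a secret; load it from secure storage in practice.
        "sasl.password": connection_string,
    }


# Placeholder values, for illustration only.
config = event_hubs_kafka_config("my-namespace", "<connection string>")
print(config["bootstrap.servers"])
```

The resulting dictionary could then be passed to a Kafka producer or consumer in place of the settings for a self-managed cluster, leaving the rest of the application unchanged.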

Experience your data

Microsoft’s Business Intelligence (BI) tools are evolving as well. Power BI already includes robust self-service data preparation capabilities in Power BI Desktop through the familiar Power Query based experiences that are used by millions of users worldwide. With the new public preview of dataflows in Power BI, we’re taking self-service data preparation to the next level, enabling business analysts to create data preparation logic that can be reused across multiple Power BI reports and dashboards and linked together to create sophisticated data transformation pipelines. Dataflows can be configured to store the data in the customer’s Azure Data Lake Storage Gen2 instance, and dataflows support the Microsoft Common Data Model, giving organizations the ability to leverage a standardized and extensible collection of data schemas (entities, attributes, and relationships).

For our long-time BI customers with investments in SQL Server Reporting Services, with the public preview today, you can include pixel-perfect paginated reports alongside Power BI’s existing interactive reports. This provides a unified, secure, enterprise-wide reporting platform accessible to any user across devices. You can read more about these innovations in a blog from Arun Ulagaratchagan, General Manager, Power BI Engineering.

Getting started

In conclusion, I’m excited to share with you that hybrid is here:  Microsoft’s consistent data platform across on-premises and cloud connects all your data and makes intelligence over all your data possible. We are proud to provide customers with the widest range of options to run SQL on Azure at the best price.  And our best-of-breed data analytics options bring AI to all your data.

  •  If you’d like to watch my talk at PASS, you can sign in on the PASS website.  Registration is free, and the keynote content starts at 8:15 AM Pacific.
  •  If you’re ready to get started, here are a few great places to get going:

 

The post Harness the future with the ultimate hybrid platform for data and AI appeared first on Microsoft SQL Server Blog.

]]>