Azure Data Lake - Microsoft SQL Server Blog

Announcing the retirement of SQL Server Stretch Database (July 3, 2024)

Ever since Microsoft introduced SQL Server Stretch Database in 2016, our guiding principles for hybrid data storage solutions have been affordability, security, and native Azure integration. Customers have told us they want to reduce maintenance and storage costs for on-premises data, with options to scale up or down as needed; greater peace of mind from advanced security features such as Always Encrypted and row-level security; and ways to unlock value from warm and cold data stretched to the cloud using Microsoft Azure analytics services.

In recent years, Azure has evolved significantly, marked by innovations like Microsoft Fabric and Azure Data Lake Storage. As we continue this journey, we must keep evolving our approach to hybrid data storage so that SQL Server customers can take full advantage of what Azure offers.

Retirement of SQL Server Stretch Database 

On November 16, 2022, the SQL Server Stretch Database feature was deprecated in SQL Server 2022. For in-market versions of SQL Server 2019 and 2017, we added an improvement that allowed the Stretch Database feature to stretch a table to an Azure SQL Database. Effective July 9, 2024, the supporting Azure service, known as the SQL Server Stretch Database edition, is retired. Affected versions of SQL Server include SQL Server 2022, 2019, 2017, and 2016.

As of July 2024, SQL Server Stretch Database is discontinued for SQL Server 2022, 2019, 2017, and 2016. We understand that retiring an Azure service may affect your current workloads and use of Stretch Database, so we kindly request that you either migrate to Azure or bring your data back from Azure to your on-premises version of SQL Server. Additionally, if you're exploring alternatives for archiving data to cold and warm storage in the cloud, SQL Server 2022 introduces significant new capabilities through its data virtualization suite.

The path forward 

SQL Server 2022 supports the CREATE EXTERNAL TABLE AS SELECT (CETaS) statement, which helps customers archive and store cold data in Azure Storage. The data is stored in Parquet, an open-source file format that handles large volumes of complex data well; thanks to its efficient compression, Parquet is one of the most cost-effective storage formats available. Using OneLake shortcuts, customers can then use Microsoft Fabric to run cloud-scale analytics on the archived data.
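As a rough sketch of the pattern (the credential, data source, storage URL, and table names below are hypothetical, and your authentication setup may differ), a CETaS archive might look like this:

```sql
-- Hypothetical names throughout; CETaS on SQL Server 2022 also requires
-- 'allow polybase export' to be enabled via sp_configure.
CREATE DATABASE SCOPED CREDENTIAL ArchiveCred
WITH IDENTITY = 'Managed Identity';  -- assumes a managed identity with storage access
GO

CREATE EXTERNAL DATA SOURCE ColdStorage
WITH (
    LOCATION = 'abs://archive@mystorageaccount.blob.core.windows.net',
    CREDENTIAL = ArchiveCred
);
GO

CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);
GO

-- Export orders older than five years to Parquet files in Azure Storage.
CREATE EXTERNAL TABLE dbo.OrdersArchive
WITH (
    LOCATION = '/orders/',
    DATA_SOURCE = ColdStorage,
    FILE_FORMAT = ParquetFormat
)
AS
SELECT OrderID, CustomerID, OrderDate, TotalDue
FROM dbo.Orders
WHERE OrderDate < DATEADD(year, -5, GETDATE());
```

Once the external table exists, the archived rows remain queryable with ordinary T-SQL, and the cold rows can then be removed from the local table.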

Our priority is to empower our SQL Server customers with the tools and services that leverage the latest and greatest from Azure. If you need assistance in exploring how Microsoft can best empower your hybrid data archiving needs, please contact us.

New solution FAQs

What’s CETaS? 

CETaS creates an external table and then exports, in parallel, the results of a Transact-SQL SELECT statement.

  • In Azure Synapse Analytics and Analytics Platform System, CETaS supports exporting to Hadoop or Azure Blob Storage.
  • SQL Server 2022 (16.x) and later versions support CETaS to create an external table and then export, in parallel, the result of a Transact-SQL SELECT statement to Azure Data Lake Storage Gen2, Azure Storage Account v2, and S3-compatible object storage. 

What is Fabric? 

Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution. It encompasses data movement, processing, ingestion, transformation, real-time event routing, and report building. Fabric offers a comprehensive suite of services including Data Engineering, Data Factory, Data Science, Real-Time Analytics, Data Warehouse, and Databases.

With Fabric, you don’t need to assemble different services from multiple vendors. Instead, it offers a seamlessly integrated, user-friendly platform that simplifies your analytics requirements. Operating on a software as a service (SaaS) model, Fabric brings simplicity and integration to your solutions. 

Fabric integrates separate components into a cohesive stack. Instead of relying on different databases or data warehouses, you can centralize data storage with Microsoft OneLake. AI capabilities are seamlessly embedded within Fabric, eliminating the need for manual integration. With Fabric, you can easily transition your raw data into actionable insights for business users. 

What are OneLake shortcuts?

Shortcuts in OneLake allow you to unify your data across domains, clouds, and accounts by creating a single virtual data lake for your entire enterprise. All Fabric experiences and analytical engines can directly connect to your existing data sources such as Azure, Amazon Web Services (AWS), and OneLake through a unified namespace. OneLake manages all permissions and credentials, so you don’t need to separately configure each Fabric workload to connect to each data source. Additionally, you can use shortcuts to eliminate edge copies of data and reduce process latency associated with data copies and staging. 

Shortcuts are objects in OneLake that point to other storage locations. The location can be internal or external to OneLake. The location that a shortcut points to is known as the target path of the shortcut. The location where the shortcut appears is known as the shortcut path. Shortcuts appear as folders in OneLake and any workload or service that has access to OneLake can use them. Shortcuts behave like symbolic links. They’re an independent object from the target. If you delete a shortcut, the target remains unaffected. If you move, rename, or delete a target path, the shortcut can break. 

Learn more

Microsoft Fabric: Bring your data into the era of AI

Serving AI with data: A summary of Build 2017 data innovations (May 10, 2017)

This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group

This week at the annual Microsoft Build conference, we are discussing how, more than ever, organizations are relying on developers to create breakthrough experiences. With big data, cloud, and AI converging, innovation and disruption are accelerating at a pace never seen before. Data is the key strategic asset at the heart of this convergence. When combined with the limitless computing power of the cloud and new capabilities like machine learning and AI, it enables developers to build the next generation of intelligent applications. As a developer, you are looking for faster, easier ways to embrace these converging technologies and transform your app experiences.

Today at Build, we made several product announcements, adding to the recent momentum announced last month at Microsoft Data Amp, that will help empower every organization on the planet with data-driven intelligence. Across these innovations, we are pursuing three key themes:

  1. Infusing AI within our data platform
  2. Turnkey global distribution to push intelligence wherever your users are
  3. Choice of database platforms and tools for developers

Infusing AI within our data platform

A thread of innovation you will see in our products is the deep integration of AI with data. In the past, a common application pattern was to create machine learning models outside the database in the application layer or in specialty statistical tools, and deploy these models in custom-built production systems. This results in a lot of developer heavy lifting, and the development and deployment lifecycle can take months. Our approach dramatically simplifies the deployment of AI by bringing intelligence into existing well-engineered data platforms through a new extensibility model for databases.

SQL Server 2017

We started this journey by introducing R support within the SQL Server 2016 release and we are deepening this commitment with the upcoming release of SQL Server 2017. In this release, we have introduced support for a rich library of machine learning functions and introduced Python support to give you more choices across popular languages. SQL Server can also leverage GPU accelerated computing through the Python/R interface to power even the most intensive deep learning jobs on images, text and other unstructured data. Developers can implement GPU accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput.
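As an illustrative sketch of what this looks like in practice (assuming Machine Learning Services is installed and external scripts are enabled; the table and column names here are hypothetical), in-database Python runs as a stored procedure call:

```sql
-- One-time setup: allow external script execution.
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;
GO

-- Run Python over a T-SQL result set, returning a new result set.
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
df = InputDataSet                  # rows streamed in from @input_data_1
df["total"] = df["qty"] * df["price"]
OutputDataSet = df                 # streamed back to SQL Server
',
    @input_data_1 = N'SELECT qty, price FROM dbo.Sales'
WITH RESULT SETS ((qty INT, price FLOAT, total FLOAT));
```

The same pattern hosts R scripts, and a model can be trained or scored inside the procedure so the data never leaves the server.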

Additionally, as data becomes more complex and the relationships across data are many-to-many, developers are looking for easier ways to ingest and manage this data. With SQL Server 2017, we have introduced Graph support to deliver the best of both relational and graph databases in a single product, including the ability to query across all data using a single platform.
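As a brief sketch of the new graph syntax (the schema here is hypothetical), node and edge tables are created with ordinary DDL and traversed with MATCH:

```sql
-- Hypothetical schema: people connected by a "friend of" relationship.
CREATE TABLE dbo.Person (ID INT PRIMARY KEY, name NVARCHAR(100)) AS NODE;
CREATE TABLE dbo.FriendOf AS EDGE;

INSERT INTO dbo.Person VALUES (1, N'Alice'), (2, N'Bob');

-- Connect Alice to Bob using the nodes' internal $node_id values.
INSERT INTO dbo.FriendOf ($from_id, $to_id)
SELECT p1.$node_id, p2.$node_id
FROM dbo.Person AS p1, dbo.Person AS p2
WHERE p1.ID = 1 AND p2.ID = 2;

-- MATCH expresses the traversal pattern declaratively.
SELECT p2.name
FROM dbo.Person AS p1, dbo.FriendOf AS f, dbo.Person AS p2
WHERE MATCH(p1-(f)->p2) AND p1.name = N'Alice';
```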

We have made it easy for you to try SQL Server with R, Python, and Graph support today whether you are working with C#, Java, Node, PHP, or Ruby.

Azure SQL Database

We're continuing to simultaneously ship SQL Server 2017 enhancements to Azure SQL Database, so you get a consistent programming surface area across on-premises and cloud. Today, I am excited to announce that Graph support is also coming to Azure SQL Database, so you can get the best of both relational and graph in a single proven service on Azure.

SQL Database is built for developer productivity, with most database management tasks built in. We have also built AI directly into the service itself, making it an intelligent database service. The service runs millions of customer databases, learns, and then adapts to offer customized experiences for each database. With Database Advisor, you can choose to let the service learn your unique patterns and make performance and tuning recommendations, or automatically take action on your behalf. Today, I am also excited to announce the general availability of Threat Detection, which uses machine learning around the clock to learn, profile, and detect anomalous activity in your database. It sends alerts within minutes so you can take immediate action, where discovery has historically taken organizations days, months, or even years.

Also, we are making it even easier for you to move more of your existing SQL Server apps as-is to Azure SQL Database. Today we announced the private preview of a new deployment option within the service, Managed Instance. It gives you all the managed benefits of SQL Database, now at the instance level, with support for SQL Server Agent, three-part names, Database Mail, change data capture (CDC), and other instance-level capabilities.

To streamline this migration effort, we also introduced a preview for Azure Database Migration Service that will dramatically accelerate the migration of on-premises third-party and SQL Server databases into Azure SQL Database.

Eric Fleischman, Vice President & Chief Architect from DocuSign notes, “Our transaction volume doubles every year. We wanted the best of what we do in our datacenter…with the best of what Azure could bring to it. For us, we found that Azure SQL Database was the best way to do it. We deploy our SQL Server schema elements into a Managed Instance, and we point the application via connection string change directly over to the Managed Instance. We basically picked up our existing build infrastructure and we’re able to deploy to Azure within a few seconds. It allows us to scale the business very quickly with minimal effort.”

Learn more about our investments in Azure SQL Database in this deeper blog.

Turnkey global distribution to push intelligence wherever your users are

With the intersection of mobile apps, internet of things, cloud and AI, users and data can come from anywhere around the globe. To deliver transformative intelligent apps that support the global nature of modern applications, and the volume, velocity, variety of data, you need more than a relational database, and more than a simple NoSQL database. You need a flexible database that can ingest massive volumes of data and data types, and navigate the challenges of space and time to ensure millisecond performance to any user anywhere on earth. And you want this with simplicity and support for the languages and technologies you know.

I'm also excited to share that today, Microsoft announced Azure Cosmos DB, the industry's first globally distributed, multi-model database service. Azure Cosmos DB was built from the ground up with global distribution and horizontal scale at its core. It offers turnkey global distribution across any number of Azure regions by transparently scaling and distributing your data wherever your users are, worldwide. Azure Cosmos DB builds on the work of Turing Award winner Dr. Leslie Lamport, including the Paxos algorithm for distributed systems and TLA+, a high-level modeling language. Check out a new interview with Dr. Lamport on Azure Cosmos DB.

Azure Cosmos DB started as "Project Florence" in 2010 to address the pain points developers faced with large-scale applications inside Microsoft. Observing that the challenges of building globally distributed apps are not unique to Microsoft, in 2015 we made the first generation of this technology available to Azure developers in the form of Azure DocumentDB. Since that time, we've added new features and introduced significant new capabilities. Azure Cosmos DB is the result. It is the next big leap in globally distributed, at-scale cloud databases.

Now, with more innovation and value, Azure Cosmos DB delivers a schema-agnostic database service with turnkey global distribution, support for multiple models across popular NoSQL technologies, elastic scale of throughput and storage, five well-defined consistency models, and financially-backed SLAs across uptime, throughput, consistency, and millisecond latency.

“Domino’s Pizza chose Azure to rebuild their ordering system and a key component in this design is Azure Cosmos DB—delivering the capability to regionally distribute data, to scale easily, and support peak periods which are critical to the business. Their online solution is deployed across multiple regions around the world—even with the global scaling they can also rely on Azure Cosmos DB millisecond load latency and fail over to a completely different country if required.”

Learn more about Azure Cosmos DB in this deeper blog.

Choice of database platforms and tools for developers

We understand that SQL Server isn't the only database technology developers want to build with. Therefore, I'm excited to share that today we also announced two new relational database services, Azure Database for MySQL and Azure Database for PostgreSQL, to join our database service offerings.

These new services are built on the proven database services platform that has been powering Azure SQL Database, and they offer high availability, data protection and recovery, and scale with minimal downtime, all built in at no extra cost or configuration. Starting today, you can develop on MySQL and PostgreSQL database services on Azure. Microsoft manages the MySQL and PostgreSQL technology you know, love, and expect, backed by an enterprise-grade, highly available, and fault-tolerant cloud services platform, so you can focus on developing great apps instead of on management and maintenance.

Each month, up to 2 million people turn to the GeekWire website for the latest news on tech innovation. Now, GeekWire is making news itself by migrating its popular WordPress site to the Microsoft Azure platform. Kevin Lisota, web developer at GeekWire, notes, "The biggest benefit of Azure Database for MySQL will be to have Microsoft manage and back up that resource for us so that we can focus on other aspects of the site. Plus, we will be able to scale up temporarily as traffic surges and then bring it back down when it is not needed. That's a big deal for us."

Learn more about these new services and try them today.

Azure Data Lake Tools for Visual Studio Code (VSCode)

Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape, and speed, and do all types of processing and analytics across platforms and languages. Additionally, Azure Data Lake includes a set of cognitive capabilities built-in, making it seamless to execute AI over petabytes of data. On our journey to make it easier for every developer to become an AI and data science developer, we are investing in bringing more great tooling for data into the tools you know and love.

Today, I'm excited to announce the general availability of Azure Data Lake Tools for Visual Studio Code (VSCode), which gives developers a light but powerful code editor for big data analytics. The new Azure Data Lake Tools for VSCode supports U-SQL language authoring, scripting, and extensibility with C# to process different types of data efficiently at any scale. The new tooling integrates with Azure Data Lake Analytics for U-SQL job submission, with job output written to Azure Data Lake Store or Azure Blob Storage. In addition, a U-SQL local run service has been added to allow developers to locally validate scripts and test data. Learn more and download these tools today.

Getting started

It has never been easier to get started with the latest advances in the intelligent data platform. We invite you to watch our Microsoft Build 2017 online event for streaming and recorded coverage of these innovations, including SQL Server 2017 on Windows, Linux and Docker; scalable data transformation and intelligence from Azure Cosmos DB, Azure Data Lake Store and Azure Data Lake Analytics; the Azure SQL Database approach to proactive Threat Detection and intelligent database tuning; new Azure Database for MySQL and Azure Database for PostgreSQL. I look forward to a great week at Build and your participation in this exciting journey of infusing AI into every software application.

Delivering AI with data: the next generation of the Microsoft data platform (April 19, 2017)

This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group

Leveraging intelligence out of the ever-increasing amounts of data can make the difference between being the next market disruptor or being relegated to the pages of history. Today at the Microsoft Data Amp online event, we will make several product announcements that can help empower every organization on the planet with data-driven intelligence. We are delivering a comprehensive data platform for developers and businesses to create the next generation of intelligent applications that drive new efficiencies, help create better products, and improve customer experiences.

I encourage you to attend the live broadcast of the Data Amp event, starting at 8 AM Pacific, where Scott Guthrie, executive VP of Cloud and Enterprise, and I will describe product innovations that integrate data and artificial intelligence (AI) to transform your applications and your business. You can stream the keynotes and access additional on-demand technical content to learn more about the announcements of the day.

Today, you’ll see three key innovation themes in our product announcements. The first is the close integration of AI functions into databases, data lakes, and the cloud to simplify the deployment of intelligent applications. The second is the use of AI within our services to enhance performance and data security. The third is flexibility—the flexibility for developers to compose multiple cloud services into various design patterns for AI, and the flexibility to leverage Windows, Linux, Python, R, Spark, Hadoop, and other open source tools in building such systems.

Hosting AI where the data lives

A novel thread of innovation you’ll see in our products is the deep integration of AI with data. In the past, a common application pattern was to create statistical and analytical models outside the database in the application layer or in specialty statistical tools, and deploy these models in custom-built production systems. That results in a lot of developer heavy lifting, and the development and deployment lifecycle can take months. Our approach dramatically simplifies the deployment of AI by bringing intelligence into existing well-engineered data platforms through a new computing model: GPU deep learning. We have taken that approach with the upcoming release of SQL Server, and deeply integrated deep learning and machine learning capabilities to support the next generation of enterprise-grade AI applications.

So today it's my pleasure to announce the first RDBMS with built-in AI: a production-quality Community Technology Preview (CTP 2.0) of SQL Server 2017. In this preview release, we are introducing in-database support for a rich library of machine learning functions, and now, for the first time, Python support (in addition to R). SQL Server can also leverage NVIDIA GPU-accelerated computing through the Python/R interface to power even the most intensive deep-learning jobs on images, text, and other unstructured data. Developers can implement NVIDIA GPU-accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput. In addition, developers can use all the rich features of the database management system for concurrency, high availability, encryption, security, and compliance to build and deploy robust enterprise-grade AI applications.


We have also released Microsoft R Server 9.1, which takes the concept of bringing intelligence to where your data lives to Hadoop and Spark, as well as SQL Server. In addition to several advanced machine learning algorithms from Microsoft, R Server 9.1 introduces pretrained neural network models for sentiment analysis and image featurization, supports SparklyR, SparkETL, and SparkSQL, and GPU for deep neural networks. We are also making model management easier with many enhancements to production deployment and operationalization. R Tools for Visual Studio provides a state-of-the-art IDE for developers to work with Microsoft R Server. An Azure Microsoft R Server VM image is also available, enabling developers to rapidly provision the server on the cloud.


In the cloud, Microsoft Cognitive Services enable you to infuse your apps with cognitive intelligence. Today I am excited to announce that the Face API, Computer Vision API, and Content Moderator are now generally available in the Azure Portal. Here are some of the different types of intelligence that cognitive services can bring to your application:

  • Face API helps detect and compare human faces, organize faces into groups according to visual similarity, and identify previously tagged people in images.
  • Computer Vision API gives you the tools to understand the contents of any image: it creates tags that identify objects, people such as celebrities, and actions in an image, and crafts coherent sentences to describe it. You can now detect landmarks and handwriting in images. Handwriting detection remains in preview.
  • Content Moderator provides machine-assisted moderation of text and images, augmented with human review tools.

Azure Data Lake Analytics (ADLA) is a breakthrough serverless analytics job service where you can easily develop and run massively parallel petabyte-scale data transformation programs that compose U-SQL, R, Python, and .NET. With no infrastructure to manage, you can process data on demand, scale instantly, and pay per job only. Furthermore, we've incorporated the technology behind the Cognitive Services directly into U-SQL as functions. Now you can process massive amounts of unstructured data, such as text and images; extract sentiment, age, and other cognitive features using Azure Data Lake; and query and analyze the results by content. This enables what I call "Big Cognition." It's not just extracting one piece of cognitive information at a time, and not just understanding an emotion or whether there's an object in an individual image; it's about integrating all the extracted cognitive data with other types of data, so you can perform powerful joins, analytics, and integrated AI.

Azure Data Lake Store (ADLS) is a no-limit cloud HDFS storage system that works with ADLA and other big data services for petabyte-scale data. We are announcing the general availability of Azure Data Lake Analytics and Azure Data Lake Store in the Azure North Europe region.

Yet another powerful integration of data and AI is the seamless integration of DocumentDB with Spark to enable machine learning and advanced analytics on top of globally distributed data. To recap, DocumentDB is a unique, globally distributed, limitless NoSQL database service in Azure designed for mission-critical applications. Designed as such from the ground up, it allows customers to distribute their data across any number of Azure regions worldwide, guarantees low read and write latencies, and offers comprehensive SLAs covering data loss, latency, availability, consistency, and throughput. You can use it as either your primary operational database or as an automatically indexed, virtually infinite data lake. The Spark connector understands the physical structure of the DocumentDB store (indexing and partitioning) and enables computation pushdown for efficient processing. This combination can significantly simplify the process of building distributed and intelligent applications at global scale.


I’m also excited to announce the general availability of Azure Analysis Services. Built on the proven business intelligence (BI) engine in Microsoft SQL Server Analysis Services, it delivers enterprise-grade BI semantic modeling capabilities with the scale, flexibility, and management benefits of the cloud. Azure Analysis Services helps you integrate data from a variety of sources—for example, Azure Data Lake, Azure SQL DW, and a variety of databases on-premises and in the cloud—and transform them into actionable insights. It speeds time to delivery of your BI projects by removing the barrier of procuring and managing infrastructure. And by leveraging the BI skills, tools, and data your team has today, you can get more from the investments you’ve already made.

Stepping up performance and security

Performance and security are central to databases. SQL Server continues to lead in database performance benchmarks, and every release brings significant improvements. SQL Server 2016 on Windows Server 2016 holds a number of records on Transaction Processing Performance Council (TPC) benchmarks for operational and analytical workload performance, and SQL Server 2017 does even better. I'm also proud to announce that the upcoming version of SQL Server will run just as fast on Linux as on Windows, as you'll see in the newly published 1TB TPC-H benchmark world record for nonclustered data warehouse performance, achieved with SQL Server 2017 on Red Hat Enterprise Linux and HPE ProLiant hardware.

SQL Server 2017 will also bring breakthrough performance, scale, and security features to data warehousing. With up to 100x faster analytical queries using in-memory Columnstores, PolyBase for single T-SQL querying across relational and Hadoop systems, capability to scale to hundreds of terabytes of data, modern reporting, plus mobile BI and more, it provides a powerful integrated data platform for all your enterprise analytics needs.
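As a sketch of the PolyBase pattern mentioned above (the cluster address, file layout, and table names are hypothetical, and PolyBase must be installed and enabled), an external table makes Hadoop-resident files queryable and joinable in plain T-SQL:

```sql
-- Hypothetical Hadoop cluster and file layout.
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (TYPE = HADOOP, LOCATION = 'hdfs://namenode:8020');

CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

CREATE EXTERNAL TABLE dbo.WebClicks (
    UserID    INT,
    Url       NVARCHAR(400),
    ClickTime DATETIME2
)
WITH (LOCATION = '/logs/clicks/',
      DATA_SOURCE = HadoopCluster,
      FILE_FORMAT = CsvFormat);

-- One T-SQL query spanning relational and Hadoop data.
SELECT c.CustomerName, COUNT(*) AS Clicks
FROM dbo.Customers AS c
JOIN dbo.WebClicks AS w ON w.UserID = c.CustomerID
GROUP BY c.CustomerName;
```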

In the cloud, Azure SQL Database is bringing intelligence to securing your data and increasing database performance. Threat Detection in Azure SQL Database works around the clock, using machine learning to detect anomalous database activities that indicate unusual and potentially harmful attempts to access or exploit databases. Simply turning on Threat Detection helps make databases resilient to intrusion. Other features of Azure SQL Database, such as automatic performance tuning, implement, tune, and validate changes to deliver optimal query performance. Together, our intelligent database management features help make your database more secure and faster automatically, freeing up scarce DBA capacity for more strategic work.

Simple, flexible multiservice AI solutions in the cloud

We are very committed to simplifying the development of AI systems. Cortana Intelligence is a collection of fully managed big data and analytics services that can be composed together to build sophisticated enterprise-grade AI and analytics applications on Azure. Today we are announcing Cortana Intelligence solution templates that make it easy to compose services and implement common design patterns. These solution templates are built on best-practice designs motivated by real-world customer implementations done by our engineering team, and include Personalized Offers (for example, for retail applications), Quality Assurance (for example, for manufacturing applications), and Demand Forecasting. These templates accelerate your time to value for an intelligent solution, allowing you to deploy a complex architecture within minutes instead of days. The templates are flexible and scalable by design. You can customize them for your specific needs, and they're backed by a rich partner ecosystem trained on the architecture and data models. Get started today by going to the Azure gallery for Cortana Intelligence solutions.


Also, AppSource is a single destination to discover and seamlessly try business apps built by partners and verified by Microsoft. Partners like KenSci have already begun to showcase their intelligent solutions targeting business decision-makers in AppSource. Partners can now submit Cortana Intelligence apps through the AppSource "List an app" page.

Cross-platform and open source flexibility

Whether on-premises or in the cloud, cross-platform compatibility is increasingly important in our customers’ diverse and rapidly changing data estates. SQL Server 2017 will be the first version of SQL Server compatible with Windows, Linux, and Linux-based container images for Docker. In addition to running on Windows Server, the new version will also run on Red Hat Enterprise Linux, SUSE Enterprise Linux Server, and Ubuntu. It can also run inside Docker containers on Linux or Mac, which can help your developers spend more time developing and less on DevOps.

Getting started

It has never been easier to get started with the latest advances in the intelligent data platform. We invite you to join us to learn more about SQL Server 2017 on Windows, Linux, and in Linux-based container images for Docker; Cognitive Services for smart, flexible APIs for AI; scalable data transformation and intelligence from Azure Data Lake Store and Azure Data Lake Analytics; the Azure SQL Database approach to proactive threat detection and intelligent database tuning; new solution templates from Cortana Intelligence; and precalibrated models for Linux, Hadoop, Spark, and Teradata in R Server 9.1.

Join our Data Amp event to learn more! You can go now to the Microsoft Data Amp online event for live coverage starting at 8 AM Pacific on April 19. You’ll also be able to stream the keynotes and watch additional on-demand technical content after the event ends. I look forward to your participation in this exciting journey of infusing intelligence and AI into every software application.

Five ways Microsoft helps you do amazing things with data in the cloud (April 12, 2017)


Microsoft can help you do amazing things with your data in the cloud! Here are five examples to help you get started. If you'd like more information about using the cloud to get the most from your data, please join us for the upcoming Microsoft Data Amp event on April 19 at 8 AM Pacific. The online event will showcase how data is the nexus between application innovation and artificial intelligence, and how data and analytics powered by the most trusted and intelligent cloud can help companies differentiate and out-innovate their competition.

1: Build data-driven apps that learn and adapt

Applications show intelligence when they can spot trends, react to events, predict outcomes, or recommend choices, often leading to richer customer experiences, improved business processes, and issues addressed before they arise. The three key ingredients for creating an intelligent app are:

  1. Ingest data in real time
  2. Query across historical and real-time data
  3. Analyze patterns and make predictions with machine learning

With Azure, you can make your applications intelligent by establishing feedback loops, and applying big data and machine learning techniques to classify, predict, or otherwise analyze explicit and implicit signals. Today, apps for consumers and enterprises can deliver greater customer or business benefit by learning from user behavior and other signals.

Pier 1 Imports launched a mobile-friendly pier1.com, making shopping online easier. It enabled the selection of delivery options like direct shipment, picking up products in the local store, or white-glove delivery from any mobile device. "Although the Pier 1 Imports brand is the same as it has been for more than 50 years, we are continually getting better at identifying what our customer wants, using Microsoft Azure Machine Learning and resulting data insights," says Sharon Leite, EVP of Sales and Customer Experience.

Get started with sample code

If you want to learn more about building an intelligent app, try the AdventureWorks Ski App. This sample application can be used to demonstrate the value of building intelligence into an existing application. Learn more by going to GitHub and watching the application being built here.

2: Run big cognition for human-like intelligence over petabyte scale

Microsoft's Cognitive Services APIs allow developers to integrate vision, speech, language, knowledge, and search capabilities into their apps. To run these services at petabyte scale, we've integrated the capabilities directly into Azure Data Lake. You can join emotions from image content with any other type of data you have and do incredibly powerful analytics and intelligence over it. This is what we call "Big Cognition." It goes beyond extracting one piece of cognitive information at a time, or understanding an emotion or whether there's an object in an image: Big Cognition joins all the extracted cognitive data with other types of data, so you can do some really powerful analytics with it.

On a global scale, Azure Data Lake is also being used at the Fleet Operations Centers of Carnival Corp., the world's largest leisure travel company, which operates more than 100 ships across 10 global cruise line brands. "We chose to partner with Microsoft to kick off a project of the Internet of Things, because it was strategic for us to rely on a platform that would allow us to collect, analyse, and display data from sensors in a simple, integrated and immediate way on our ships and make them available both to the officers on board and to our operations centre on the ground," says Franco Caraffi, IT Marine Systems Director of Costa Cruises.

Get started with sample code

We have demonstrated Big Cognition at Microsoft Ignite and PASS Summit, by showing a demo in which we used U-SQL inside Azure Data Lake Analytics to process a million images and understand what’s inside those images. You can watch this demo here and try it yourself using a sample project on GitHub or discover more ways to get started with Azure Data Lake on GitHub.

3: Deliver <10ms latency to any customer, anywhere on the planet

In today's globally connected world, developers and organizations alike have three simple requirements for their customer-facing applications: millisecond performance, global distribution, and application availability, with no hard tradeoffs among them. NoSQL can be a great technology for tackling these tough challenges, especially when facing increasing data volume and variety.

Most NoSQL technologies force customers to make binary choices among global performance, availability, and transactional consistency. With Azure DocumentDB, Microsoft's fully managed NoSQL database service, you get four tunable consistency levels that reduce the friction of these tradeoffs and unlock application patterns previously not possible, without ever trading off availability or the guaranteed <10ms latency. For example, session consistency gives an ideal blend of performance and consistency for multitenant applications: tenants achieve strong consistency within the scope of their own session, without trading off performance for other tenants. IoT devices emit events at an extremely high rate, so a scale-out database is required to handle heavy write ingestion and persist the full fidelity of unaggregated event streams. The events from each generation of device look slightly different as new capabilities and sensors are added. DocumentDB can ingest a high rate of events with varying schema, index them automatically, and serve them back out through rich, low-latency queries, enabling applications to react with real-time anomaly detection.
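As a small sketch of the kind of query this enables (the collection and property names are hypothetical; the syntax is DocumentDB's SQL dialect for JSON documents, not T-SQL):

```sql
-- Find recent high-temperature readings across all device generations.
-- Properties that older document versions lack simply evaluate as
-- undefined, so schema drift does not break the query.
SELECT c.deviceId, c.temperature, c.firmwareVersion
FROM c
WHERE c.temperature > 90
  AND c.eventTime >= '2017-01-01T00:00:00Z'
ORDER BY c.eventTime DESC
```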

Citrix delivers solutions used by more than 400,000 organizations and more than 100 million individuals globally. The Citrix web portal was getting a lot of traffic, which was good news, but it was running into challenges integrating the web identity into its SaaS portals. It turned to Azure Service Fabric and DocumentDB to run its Citrix Identity Platform to deliver against its availability, durability, and performance requirements.

Get started with sample code

There are so many great code samples available on GitHub for DocumentDB that we aggregated our 10 favorite GitHub samples into a single blog for you. Check out these samples across .NET, Node.js, and Python for an array of app scenarios and start playing with DocumentDB today.

4: Serve up a first-class search experience with just a few lines of code

Azure Search is a cloud search-as-a-service solution that delegates server and infrastructure management to Microsoft, leaving you with a ready-to-use service you can populate with your data. Using a simple REST API or the .NET SDK, you can add a first-class, robust search experience to your web, mobile, or cognitive-based application with only a few lines of code, without managing search infrastructure or becoming an expert in search.

autoTRADER.ca, Canada’s largest automobile search site, uses Azure Search to help dealers advertise and inventory products, determine the best pricing, and provide market data on which vehicles are in high demand. “We’re really excited about using Azure Search for marketplace. It gives us an opportunity to provide better and better services to our customers with instant, seamless experiences across all devices,” says Shane Sullivan, director of Software Engineering.

Get started with sample code

Try the First Response app code on GitHub, an online collaboration platform built to support first responders, which lets police, firefighters, and paramedics share critical data with each other in real time. This scenario's demo and toolkit combine App Service, DocumentDB, and Search, with Xamarin for cross-device support, into a real-time mobile app.

5: Scale your business, protect your margins

For software builders looking to extend an existing packaged app to SaaS, or building a new business app as SaaS, the number one question we get is, "How do I run and grow my business on a cloud while ensuring operating costs don't accidentally consume my margins?" When we dig into this app pattern with customers, the concern boils down to managing the costs of isolating and managing each customer's data while ensuring every customer gets the best performance despite varying demands. This creates two challenges: first, managing and maintaining an isolated database for each customer would require more staff as you grow; second, over-provisioning resources so that spikes in demand don't cause a poor experience means overspending on operating costs. We dove into this problem with customers and as a result introduced SQL Database Elastic Pools, a unique solution that helps you manage thousands of databases as one while maintaining isolation and security, at dramatic cost savings.

SQL Database Elastic Pools are a simple, cost-effective solution for managing and scaling multiple databases that have varying and unpredictable usage demands. The databases in an elastic pool are on a single Azure SQL Database server and share a set number of resources at a set price.
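As a brief sketch (the pool and database names are hypothetical, and the pool itself is created beforehand through the Azure portal, PowerShell, or the REST API), databases are placed into a pool with ordinary T-SQL:

```sql
-- Run while connected to the logical server's master database.
-- Create a new tenant database directly inside an existing elastic pool.
CREATE DATABASE TenantDb42
( SERVICE_OBJECTIVE = ELASTIC_POOL ( name = MyTenantPool ) );

-- Move an existing standalone database into the same pool.
ALTER DATABASE TenantDb17
MODIFY ( SERVICE_OBJECTIVE = ELASTIC_POOL ( name = MyTenantPool ) );
```

Because all databases in the pool draw from a shared resource allocation, one tenant's occasional spike is absorbed by capacity other tenants are not using at that moment.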

SnelStart makes popular financial- and business-management software for small and medium-sized businesses in the Netherlands. Its 55,000 customers are serviced by a staff of 110 employees, including an IT staff of 35. By moving from desktop software to a software-as-a-service (SaaS) offering built on Azure, SnelStart made the most of built-in services, automating management using the familiar C# environment and optimizing performance and scalability with elastic pools, neither over- nor under-provisioning. "By using elastic pools, we can optimize performance based on the needs of our customers, without over-provisioning. If we had to provision based on peak load, it would be quite costly. Instead, the option to share resources between multiple, low-usage databases allows us to create a solution that performs well and is cost effective," says Henry Been, solution architect.

Get started with sample code

We built this Contoso shopkeeper app to demonstrate just how easy it is to build a multitenant SaaS app using SQL Database Elastic Pools. You'll see how easy it is to scale out to support your growing customer base, with no schema changes required for your app, and how easy it is to manage many databases as one.

Azure can help you do amazing things with data in the cloud. Organizations have used Azure to transform their business, providing compelling customer experiences while managing costs. Try one of these five new amazing things you can do with Azure today! And to learn more, join us for the upcoming Microsoft Data Amp event on April 19 at 8 AM Pacific.

Announcing the Next Generation of Databases and Data Lakes from Microsoft (November 16, 2016)

This post was authored by Joseph Sirosh, Corporate Vice President of the Microsoft Data Group.


For the past two years, we’ve unveiled several of our cutting-edge technologies and innovative solutions at Connect(); which will be livestreaming globally from New York City starting November 16. This year, I am thrilled to announce the next generation of SQL Server and Azure Data Lake, and several new capabilities to help developers build intelligent applications.

1. Next release of SQL Server with Support for Linux and Docker (Preview)

I am excited to announce the public preview of the next release of SQL Server which brings the power of SQL Server to both Windows – and for the first time ever – Linux. Now you can also develop applications with SQL Server on Linux, Docker, or macOS (via Docker) and then deploy to Linux, Windows, Docker, on-premises, or in the cloud.  This represents a major step in our journey to making SQL Server the platform of choice across operating systems, development languages, data types, on-premises and the cloud.  All major features of the relational database engine, including advanced features such as in-memory OLTP, in-memory columnstores, Transparent Data Encryption, Always Encrypted, and Row-Level Security now come to Linux. Getting started is easier than ever. You’ll find native Linux installations (more info here) with familiar RPM and APT packages for Red Hat Enterprise Linux, Ubuntu Linux, and SUSE Linux Enterprise Server. The public preview on Windows and Linux will be available on Azure Virtual Machines and as images available on Docker Hub, offering a quick and easy installation within minutes.  The Windows download is available on the Technet Eval Center.

We have also added significant improvements into R Services inside SQL Server, such as a very powerful set of machine learning functions that are used by our own product teams across Microsoft. This brings new machine learning and deep neural network functionality with increased speed, performance and scale, especially for handling a large corpus of text data and high-dimensional categorical data. We have just recently showcased SQL Server running more than one million R predictions per second and encourage you all to try out R examples and machine learning templates for SQL Server on GitHub.
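As a minimal illustration of R Services in this release (the data is hypothetical, and a production system would load a pre-trained, serialized model rather than fitting one inline), R executes inside the database through a stored procedure call:

```sql
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
# Fit a toy linear model on the rows streamed in, then score them.
model <- lm(y ~ x, data = InputDataSet)
InputDataSet$predicted <- predict(model, InputDataSet)
OutputDataSet <- InputDataSet
',
    @input_data_1 = N'SELECT x, y FROM dbo.TrainingData'
WITH RESULT SETS ((x FLOAT, y FLOAT, predicted FLOAT));
```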

The choice of application development stack with the next release of SQL Server is absolutely amazing – it includes .NET, Java, PHP, Node.JS, etc. on Windows, Linux and Mac (via Docker). Native application development experience for Linux and Mac developers has been a key focus for this release. Get started with the next release of SQL Server on Linux, macOS (via Docker) and Windows with our developer tutorials that show you how to install and use the next release of SQL Server on macOS, Docker, Windows, RHEL and Ubuntu and quickly build an app in a programming language of your choice.


2. SQL Server 2016 SP1

We are announcing SQL Server 2016 SP1, a unique service pack: for the first time, we introduce a consistent programming model across SQL Server editions. With this model, programs written to exploit powerful SQL features such as in-memory OLTP, in-memory columnstore analytics, and partitioning work across the Enterprise, Standard, and Express editions. Developers will find it easier than ever to take advantage of innovations such as in-memory databases and advanced analytics: you can use these advanced features in Standard edition and then step up to Enterprise for mission-critical performance, scale, and availability, without having to rewrite your application.
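For example, a script like the following (the database, filegroup, file path, and table names are hypothetical) now runs unchanged on Express, Standard, and Enterprise with SP1:

```sql
-- In-memory OLTP requires a memory-optimized filegroup and container.
ALTER DATABASE CURRENT
ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE CURRENT
ADD FILE (NAME = 'imoltp_data', FILENAME = 'C:\Data\imoltp_data')
TO FILEGROUP imoltp_fg;

-- A durable memory-optimized table, previously an Enterprise-only feature.
CREATE TABLE dbo.SessionCache (
    SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Payload   VARBINARY(2000),
    ExpiresAt DATETIME2
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```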

Our software partners are excited about the flexibility that this change gives them to adopt advanced features while supporting multiple editions of SQL Server.

“With SQL Server 2016 SP1, we can run the same code entirely on both platforms and customers who need Enterprise scale buy Enterprise, and customers who don’t need that can buy Standard and run just fine. From a programming point of view, it’s easier for us and easier for them,” said Nick Craver, Architecture Lead at Stack Overflow.

To be even more productive with SQL Server, you can now take advantage of improved developer experiences on Windows, Mac, and Linux for Node.js, Java, PHP, Python, Ruby, .NET Core, and C/C++. Our JDBC connector is now published as 100% open source, giving developers more access to information and more flexibility in how they contribute to and work with the JDBC driver. Additionally, we've updated the ODBC driver for PHP and launched a new ODBC connector for Linux, making it much easier for developers to work with Microsoft SQL-based technologies. Visual Studio Code users can now connect to SQL Server, including SQL Server on Linux, Azure SQL Database, and Azure SQL Data Warehouse. In addition, we've released updates to SQL Server Management Studio, SQL Server Data Tools, and the command-line tools, which now support SQL Server on Linux.


3. Azure Data Lake Analytics and Store GA

Today, I am excited to announce the general availability of Azure Data Lake Analytics and Azure Data Lake Store.

Azure Data Lake Analytics is a cloud analytics service that allows you to develop and run massively parallel data transformation and processing programs in U-SQL, R, Python, and .NET over petabytes of data with just a few lines of code. There is no infrastructure to manage; you can process data on demand, scale in seconds, and pay only for the resources used. U-SQL is a simple, expressive, and super-extensible language that combines the power of C# with the simplicity of SQL. Developers can write their code in either Visual Studio or Visual Studio Code, and the execution environment gives you debugging and optimization recommendations to improve performance and reduce cost.

Azure Data Lake Store is a cloud analytics data lake for enterprises that is secure, massively scalable, and built to the open HDFS standard. You can store trillions of files, and single files can be greater than a petabyte in size. It provides massive throughput optimized for running big analytic jobs. It offers data encryption in motion and at rest; single sign-on (SSO), multi-factor authentication, and identity management built in through Azure Active Directory; and fine-grained POSIX-based ACLs for role-based access control.


Furthermore, we've incorporated the technology behind the Microsoft Cognitive Services directly into U-SQL. Now you can process any amount of unstructured data, such as text and images; extract emotions, age, and all sorts of other cognitive features using Azure Data Lake; and query by content. You can join emotions from image content with any other type of data you have and do incredibly powerful analytics and intelligence over it. This is what I call Big Cognition. It's not just extracting one piece of cognitive information at a time, and not just understanding an emotion or whether there's an object in an image; it's about joining all the extracted cognitive data with other types of data, so you can do some really powerful analytics with it. We demonstrated this capability at Microsoft Ignite and PASS Summit with a Big Cognition demo in which we used U-SQL inside Azure Data Lake Analytics to process a million images and understand what's inside them. You can watch this demo (starting at minute 38) and try it yourself using a sample project on GitHub.

4. DocumentDB Emulator

We live on a Planet of the Apps, and the best back-end system for building modern, intelligent mobile or web apps is Azure DocumentDB: a planet-scale, globally distributed, managed NoSQL service with 99.99% availability, guarantees for low latency and consistency, and enterprise-grade security and SLAs.

Today I am happy to announce a public preview of the DocumentDB Emulator, which provides a local development experience for Azure DocumentDB. Using the DocumentDB Emulator, you can develop and test your application locally without an internet connection, without creating an Azure subscription, and without incurring any costs. This has long been the most requested feature on the UserVoice site, so we are thrilled to roll it out to everyone.

Furthermore, we've added .NET Core support in DocumentDB. .NET Core is a lightweight, modular platform for creating applications and services that run on Linux, Mac, and Windows. With DocumentDB support for .NET Core, developers can now build cross-platform applications and services that use the DocumentDB API.


5. Other Announcements

  • Today we also are announcing the General Availability of R Server for Azure HDInsightHDInsight is the only fully managed Cloud Hadoop offering that provides optimized open source analytic clusters for Spark, Hive, Map Reduce, HBase, Storm, and R Server backed by a 99.9% SLA. Running Microsoft R Server as a service on top of Apache Spark, customers can achieve unprecedented scale and performance by combining enterprise-scale analytics in R with the power of Spark. With transparently parallelized analytic functions, it’s now possible to handle up to 1000x more data with up to 50x faster speeds than open source R – helping you train more accurate models for better predictions than previously possible. Plus, because R Server is built to work with the open source R language, all of your R scripts can run without significant changes.
  • We are also announcing the public preview of Kafka for HDInsightan enterprise-grade, open-source streaming ingestion service which is cost-effective, easy to provision, manage and use. This service enables you to build real-time solutions like IoT, fraud detection, click-stream analysis, financial alerts, and social analytics. Using out-of-the-box integration with Storm for HDInsight or Spark Stream for HDInsight, you can architect powerful streaming pipelines to drive intelligent real-time actions.
  • Also exciting is the availability of Operational Analytics in Azure SQL Database, the first fully managed Hybrid Transactional and Analytical Processing (HTAP) database service in the cloud. The ability to run both analytics (OLAP) and OLTP workloads on the same database tables at the same time lets developers build a new level of analytical sophistication into their applications. Developers can eliminate the need for ETL and, in some cases, a data warehouse (using one system for OLAP and OLTP instead of creating two separate systems), helping to reduce complexity, cost, and data latency. The in-memory technologies in Azure SQL Database help achieve phenomenal performance, e.g., 75,000 transactions per second for order processing (an 11x performance gain) and query execution time reduced from 15 seconds to 0.26 seconds (a 57x performance gain). This capability is now a standard feature of Azure SQL Database at no additional cost.
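
As referenced above, here is a minimal Kafka producer and consumer sketch using the kafka-python package; the broker endpoint, topic name, and message shape are placeholders rather than anything specific to HDInsight:

```python
# A minimal Kafka producer/consumer sketch using the kafka-python package
# (pip install kafka-python). The broker address below is a placeholder
# for the worker-node endpoints of a Kafka for HDInsight cluster.
from kafka import KafkaConsumer, KafkaProducer

BROKERS = ["wn0-kafka.example.net:9092"]  # hypothetical broker endpoint

producer = KafkaProducer(bootstrap_servers=BROKERS)
producer.send("clickstream", b'{"user": 42, "action": "click"}')
producer.flush()

# A downstream consumer (e.g., feeding Storm or Spark Streaming) would read:
consumer = KafkaConsumer("clickstream", bootstrap_servers=BROKERS,
                         auto_offset_reset="earliest",
                         consumer_timeout_ms=5000)
for message in consumer:
    print(message.value)
```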

We are making our products and innovations more accessible to all developers – on any platform, on-premises and in the cloud. We are building for a future where our data platform is dwarfed by the aggregate value of the solutions built on top of it. This is the true measure of success for a platform: when the number of apps built on top of it, and the value they create, far exceed the platform itself.

The live broadcast of Connect(); begins on November 16th at 9:45am EST, and continues with interactive Q&A and immersive on-demand content. Join us to learn more about these amazing innovations.

@josephsirosh

The post Announcing the Next Generation of Databases and Data Lakes from Microsoft appeared first on Microsoft SQL Server Blog.

]]>
Eight scenarios with Apache Spark on Azure that will transform any business http://approjects.co.za/?big=en-us/sql-server/blog/2016/08/29/eight-scenarios-with-apache-spark-on-azure-that-will-transform-any-business/ Mon, 29 Aug 2016 15:00:23 +0000 This post was authored by Rimma Nehme, Technical Assistant, Data Group. Since its birth in 2009, and the time it was open sourced in 2010, Apache Spark has grown to become one of the largest open source communities in big data with over 400 organizations from 100 companies contributing to it. Spark stands out for

The post Eight scenarios with Apache Spark on Azure that will transform any business appeared first on Microsoft SQL Server Blog.

]]>
This post was authored by Rimma Nehme, Technical Assistant, Data Group.

Spark-Azure

Since its birth in 2009 and its open sourcing in 2010, Apache Spark has grown to become one of the largest open source communities in big data, with over 400 contributors from 100 companies. Spark stands out for its ability to process large volumes of data up to 100x faster, because data is persisted in memory. The Azure cloud makes Apache Spark remarkably easy and cost-effective to deploy, with no hardware to buy, no software to configure, a full notebook experience for authoring compelling narratives, and integration with partner business intelligence tools. In this blog post, I am going to review some of the truly game-changing usage scenarios that companies can employ with Apache Spark on Azure in their own context.

Scenario #1: Streaming data, IoT and real-time analytics

One of Apache Spark's key use cases is processing streaming data. With so much data being generated daily, it has become essential for companies to stream and analyze it all in real time. Spark Streaming handles this type of workload exceptionally well. As shown in the image below, a user can create an Azure Event Hub (or an Azure IoT Hub) to ingest rapidly arriving data into the cloud; both Event Hubs and IoT Hubs can take in millions of events and sensor updates per second, which Spark can then process in real time.

Scenario 1_Spark Streaming
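
As a minimal sketch of the programming model (PySpark), the example below uses a socket source to stay self-contained; in production the input stream would come from the Event Hubs or IoT Hub connector, and the line format is assumed:

```python
# A minimal Spark Streaming sketch (PySpark). A socket source keeps the
# example self-contained; in production you would use the Event Hubs or
# IoT Hub connector as the input stream.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="streaming-sketch")
ssc = StreamingContext(sc, 5)  # 5-second micro-batches

lines = ssc.socketTextStream("localhost", 9999)  # placeholder source

# Count events per device, assuming lines like "device42,temp,21.3".
counts = (lines.map(lambda line: (line.split(",")[0], 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()
```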

Businesses can use this scenario today for:

  • Streaming ETL: In traditional ETL (extract, transform, load) scenarios, tools are built for batch processing: data must first be read in its entirety, converted to a database-compatible format, and then written to the target database. With streaming ETL, data is continually cleaned and aggregated before being pushed into data stores or on for further analysis.
  • Data enrichment: Streaming capability can be used to enrich live data by combining it with static or 'stationary' data, allowing businesses to conduct more complete real-time data analysis. Online advertisers use data enrichment to combine historical customer data with live customer behavior data and deliver more personalized, targeted ads in real time, in the context of what customers are doing. Since advertising is so time-sensitive, companies have to move fast if they want to capture mindshare, and Spark on Azure is one way to help achieve that.
  • Trigger event detection: Spark Streaming allows companies to detect and respond quickly to rare or unusual behaviors ('trigger events') that could indicate a potentially serious problem within a system. For instance, financial institutions can use triggers to detect fraudulent transactions and stop fraud in its tracks. Hospitals can use triggers to detect potentially dangerous health changes while monitoring patient vital signs, sending automatic alerts to the right caregivers, who can then take immediate and appropriate action.
  • Complex session analysis: Using Spark Streaming, events relating to live sessions, such as user activity after logging into a website or application, can be grouped together and quickly analyzed. Session information can also be used to continuously update machine learning models. Companies can use this functionality to gain immediate insight into how users are engaging on their site and provide more personalized experiences in real time.

Scenario #2: Visual data exploration and interactive analysis

Using Spark SQL running against data stored in Azure, companies can use BI tools such as Power BI, PowerApps, Flow, SAP Lumira, QlikView, and Tableau to analyze and visualize their big data. Spark's interactive analytics capability is fast enough to run exploratory queries without sampling. By combining Spark with visualization tools, complex data sets can be processed and visualized interactively. These easy-to-use interfaces allow even non-technical users to visually explore data, create models, and share results. Because a wider audience can analyze big data without preconceived notions, companies can test new ideas and surface important findings in their data earlier than ever before. They can identify trends and relationships that were not apparent before, quickly drill down into them, ask new questions, and find smarter ways to innovate.

Scenario 2_Spark visual data exploration and interactive analysis
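
For example, a minimal PySpark sketch of this kind of exploration (Spark 2.x API; the storage path, columns, and table name are illustrative, not from the original post):

```python
# A minimal Spark SQL exploration sketch (PySpark, Spark 2.x API).
# The file path and column names are illustrative placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("explore").getOrCreate()

df = spark.read.json("wasb:///data/orders.json")  # illustrative blob path
df.createOrReplaceTempView("orders")

# Interactive, full-data aggregation -- no sampling required.
spark.sql("""
    SELECT country, COUNT(*) AS orders, AVG(total) AS avg_total
    FROM orders
    GROUP BY country
    ORDER BY orders DESC
""").show()
```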

This scenario is even more powerful when interactive data discovery is combined with predictive analytics (more on this later in this blog). Based on relationships and trends identified during discovery, companies can use logistic regression or decision tree techniques to predict the probability of certain events in the future (e.g., customer churn probability). Companies can then take specific, targeted actions to control or avert certain events.

Scenario #3: Spark with NoSQL (HBase and Azure DocumentDB)

This scenario provides scalable and reliable Spark access to NoSQL data stored either in HBase or in our blazing-fast, planet-scale Azure DocumentDB, through "native" data access APIs. Apache HBase is an open-source NoSQL database built on Hadoop and modeled after Google Bigtable. DocumentDB is a true schema-free managed NoSQL database service running in Azure, designed for modern mobile, web, gaming, and IoT scenarios. DocumentDB ensures that 99% of your reads are served in under 10 milliseconds and 99% of your writes in under 15 milliseconds. It also provides schema flexibility and the ability to easily scale a database up and down on demand.

The Spark with NoSQL scenario enables ad-hoc, interactive queries on big data. NoSQL can be used for capturing data that is collected incrementally from various sources across the globe. This includes social analytics, time series, game or application telemetry, retail catalogs, up-to-date trends and counters, and audit log systems. Spark can then be used for running advanced analytics algorithms at scale on top of the data coming from NoSQL.

Scenario 3_Spark NoSQL

Companies can employ this scenario for online shopping recommendations, spam classifiers for real-time communication applications, predictive analytics for personalization, and fraud detection models for mobile applications that need to make instant decisions to accept or reject a payment. I would also include in this category a broad group of applications that are really "next-gen" data warehousing, where large amounts of data need to be processed inexpensively and then served interactively to many users globally. Finally, Internet of Things scenarios fit here as well, with the obvious difference that the data represents the actions of machines instead of people.
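
One hedged way to combine the two, absent a native connector, is to query DocumentDB with the pydocumentdb SDK and hand the results to Spark; every endpoint, key, and resource link below is a placeholder:

```python
# Hedged sketch: query DocumentDB with the pydocumentdb SDK, then analyze
# the results in Spark. Endpoint, key, and links are placeholders; a native
# Spark connector, where available, avoids pulling results through the
# client like this.
import pydocumentdb.document_client as document_client
from pyspark.sql import Row, SparkSession

client = document_client.DocumentClient(
    "https://myaccount.documents.azure.com", {"masterKey": "<key>"})
docs = list(client.QueryDocuments(
    "dbs/telemetry/colls/events",
    "SELECT c.deviceId, c.reading FROM c"))

spark = SparkSession.builder.appName("nosql-analytics").getOrCreate()
df = spark.createDataFrame([Row(**d) for d in docs])
df.groupBy("deviceId").avg("reading").show()
```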

Scenario #4: Spark with Data Lake

Spark on Azure can be configured to use Azure Data Lake Store (ADLS) as additional storage. ADLS is an enterprise-class, hyper-scale repository for big data analytic workloads. Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts in an enterprise environment to store data of any size, shape, and speed, and to do all types of processing and analytics across platforms and languages. Because ADLS is a file system compatible with the Hadoop Distributed File System (HDFS), it is easy to combine with Spark and run computations at scale using pre-existing Spark queries.

Scenario 4_Spark with Data Lake

The data lake scenario arose because companies needed to capture and exploit new types of data while still preserving enterprise-level requirements such as security, availability, compliance, and failover. The Spark with Data Lake scenario enables truly scalable advanced analytics on healthcare data, financial data, business-sensitive data, geo-location coordinates, clickstream data, server logs, social media, and machine and sensor data. For companies that want an easy way to build data pipelines, get unparalleled performance, ensure data quality, manage access control, perform change data capture (CDC) processing, get enterprise-level security seamlessly, and have world-class management and debugging tools, this is the scenario to implement.
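
As an illustration, here is a minimal PySpark sketch of reading ADLS-resident data; the adl:// account name and path are placeholders, and it assumes the cluster has already been configured with ADLS credentials:

```python
# A minimal sketch of reading from Azure Data Lake Store in Spark, assuming
# the cluster is already configured with ADLS credentials. The account name
# and path below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("adls-sketch").getOrCreate()

clicks = spark.read.csv(
    "adl://myaccount.azuredatalakestore.net/clickstream/2016/08/*.csv",
    header=True, inferSchema=True)

# Pre-existing Spark queries work unchanged because ADLS is HDFS-compatible.
clicks.groupBy("page").count().orderBy("count", ascending=False).show(10)
```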

Scenario #5: Spark with SQL Data Warehouse

While there is still a lot of confusion on this point, Spark and big data analytics are not a replacement for traditional data warehousing. Instead, Spark on Azure can complement and enhance a company's data warehousing efforts by modernizing its approach to analytics. A data warehouse can be viewed as an "information archive" that supports business intelligence (BI) users and reporting tools for the mission-critical functions of a company. My definition of mission-critical is any system that supports revenue generation or cost control; if such a system fails, companies would have to perform these tasks manually to prevent loss of revenue or increased cost. Big data analytics systems like Spark augment such systems by running more sophisticated computations and smarter analytics, and by delivering deeper insights using larger and more diverse datasets.

Azure SQL Data Warehouse (SQLDW) is a cloud-based, scale-out database capable of processing massive volumes of data, both relational and non-relational. Built on our massively parallel processing (MPP) architecture, SQLDW combines the power of the SQL Server relational database with Azure's cloud scale-out capabilities. You can increase, decrease, pause, or resume a data warehouse in seconds, and you save costs by scaling out compute when you need it and cutting back during non-peak times. SQLDW is the manifestation of the elastic future of data warehousing in the cloud.

Scenario 5_Spark with SQLDW

Use cases for the Spark with SQLDW scenario include using the data warehouse to better understand customers across product groups and then applying Spark for predictive analytics on top of that data, or running advanced analytics with Spark on top of an enterprise data warehouse containing sales, marketing, store management, point-of-sale, customer loyalty, and supply chain data to drive more informed business decisions at the corporate, regional, and store levels. Using Spark with data warehouse data, companies can do anything from risk modeling to parallel processing of large graphs, advanced analytics, and text processing, all on top of their elastic data warehouse.
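
As a rough illustration, the following PySpark sketch pulls a SQL Data Warehouse table into Spark over JDBC; the server, database, credentials, and table names are placeholders, and it assumes the SQL Server JDBC driver is on the cluster classpath:

```python
# A minimal sketch of reading SQL Data Warehouse tables into Spark over
# JDBC for predictive analytics. Server, database, table, and credentials
# are placeholders; it assumes the SQL Server JDBC driver is available.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sqldw-sketch").getOrCreate()

jdbc_url = ("jdbc:sqlserver://myserver.database.windows.net:1433;"
            "database=mydw;user=sparkreader;password=<password>")

sales = (spark.read.format("jdbc")
         .option("url", jdbc_url)
         .option("dbtable", "dbo.FactSales")
         .load())

# Feature engineering and advanced analytics then proceed on the DataFrame.
sales.groupBy("StoreKey").sum("SalesAmount").show(5)
```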

Scenario #6: Machine Learning using R Server, MLlib

Probably the most prominent Spark use case in Azure is machine learning. By keeping datasets in memory during a job, Spark delivers great performance for the iterative queries common in machine learning workloads. Common machine learning tasks that can be run with Spark in Azure include (but are not limited to) classification, regression, clustering, topic modeling, singular value decomposition (SVD), principal component analysis (PCA), hypothesis testing, and calculating sample statistics.

Typically, if you want to train a statistical model on very large amounts of data, you need three things:

  • A storage platform capable of holding all of the training data
  • A computational platform capable of efficiently performing the heavy-duty mathematical computations required
  • A statistical computing language with algorithms that can take advantage of the storage and computation power

Microsoft R Server, running on HDInsight with Apache Spark, provides all three. Microsoft R Server runs within the HDInsight Hadoop nodes on Microsoft Azure. Better yet, the big-data-capable algorithms of ScaleR take advantage of Spark's in-memory architecture, dramatically reducing the time needed to train models on large data. With multi-threaded math libraries and transparent parallelization, R Server can handle up to 1000x more data at up to 50x faster speeds than open source R. And if your data grows or you just need more power, you can dynamically add nodes to the Spark cluster using the Azure portal. Spark in Azure also includes MLlib for a variety of scalable machine learning algorithms, or you can use your own libraries. Some common applications of the machine learning scenario with Spark on Azure are listed below, by vertical.

Across sales and marketing, finance and risk, customer and channel, and operations and workforce, common applications by vertical include:

  • Retail: demand forecasting, loyalty programs, cross-sell and upsell, customer acquisition, fraud detection, pricing strategy, personalization, lifetime customer value, product segmentation, store location demographics, supply chain management, inventory management
  • Financial services: customer churn, loyalty programs, cross-sell and upsell, customer acquisition, fraud detection, risk and compliance, loan defaults, personalization, lifetime customer value, call center optimization, pay for performance
  • Healthcare: marketing mix optimization, patient acquisition, fraud detection, bill collection, population health, patient demographics, operational efficiency, pay for performance
  • Manufacturing: demand forecasting, marketing mix optimization, pricing strategy, perf risk management, supply chain optimization, personalization, remote monitoring, predictive maintenance, asset management

Scenario 6_Spark Machine Learning

Here is an example, with just a few lines of code, that you can try out right now:
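
The sketch below uses Spark's DataFrame-based MLlib API (PySpark, Spark 2.x); the toy churn-style dataset and column names are illustrative, not taken from the original post:

```python
# A minimal MLlib sketch: train a logistic regression model on toy data
# (e.g., churned vs. not churned) and apply it back to the same data.
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()

# Toy training data: (label, features); real features would come from
# the exploration and enrichment scenarios described above.
train = spark.createDataFrame([
    (1.0, Vectors.dense([0.0, 1.1, 0.1])),
    (0.0, Vectors.dense([2.0, 1.0, -1.0])),
    (0.0, Vectors.dense([2.0, 1.3, 1.0])),
    (1.0, Vectors.dense([0.0, 1.2, -0.5]))], ["label", "features"])

model = LogisticRegression(maxIter=10, regParam=0.01).fit(train)
model.transform(train).select("label", "prediction").show()
```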

Scenario #7: Putting it all together in a notebook experience

For data scientists, we provide out-of-the-box integration with Jupyter (IPython), the most popular open source notebook in the world. Unlike other managed Spark offerings that might require you to install your own notebooks, we worked with the Jupyter OSS community to enhance the kernel to allow Spark execution through a REST endpoint.

We co-led “Project Livy” with Cloudera and other organizations to create an open source Apache licensed REST web service that makes Spark a more robust back-end for running interactive notebooks.  As a result, Jupyter notebooks are now accessible within HDInsight out-of-the-box. In this scenario, we can use all of the services in Azure mentioned above with Spark with a full notebook experience to author compelling narratives and create data science collaborative spaces. Jupyter is a multi-lingual REPL on steroids. Jupyter notebook provides a collection of tools for scientific computing using powerful interactive shells that combine code execution with the creation of a live computational document. These notebook files can contain arbitrary text, mathematical formulas, input code, results, graphics, videos and any other kind of media that a modern web browser is capable of displaying. So, whether you’re absolutely new to R or Python or SQL or do some serious parallel/technical computing, the Jupyter Notebook in Azure is a great choice.
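
Under the covers, the notebook talks to Livy over REST. Here is a minimal sketch of that protocol using the requests package; the cluster URL and credentials are placeholders, and the /livy path is assumed to match HDInsight's endpoint layout:

```python
# A minimal sketch of the Livy REST protocol that backs these notebooks.
# The cluster URL and credentials below are placeholders.
import json, time, requests

LIVY = "https://mycluster.azurehdinsight.net/livy"  # placeholder endpoint
AUTH = ("admin", "<password>")                      # placeholder credentials
HEADERS = {"Content-Type": "application/json"}

def wait(url, done_states):
    # Poll a Livy resource until it reaches one of the desired states.
    while True:
        r = requests.get(url, auth=AUTH, headers=HEADERS).json()
        if r["state"] in done_states:
            return r
        time.sleep(2)

# Open an interactive PySpark session and wait for it to become idle.
sess = requests.post(LIVY + "/sessions", auth=AUTH, headers=HEADERS,
                     data=json.dumps({"kind": "pyspark"})).json()
sess = wait("%s/sessions/%d" % (LIVY, sess["id"]), {"idle"})

# Submit a statement, wait for it to finish, and print the result.
stmt = requests.post("%s/sessions/%d/statements" % (LIVY, sess["id"]),
                     auth=AUTH, headers=HEADERS,
                     data=json.dumps(
                         {"code": "sc.parallelize(range(100)).sum()"})).json()
stmt = wait("%s/sessions/%d/statements/%d" % (LIVY, sess["id"], stmt["id"]),
            {"available"})
print(stmt["output"]["data"]["text/plain"])
```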

Scenario 7_Spark with Notebook

You can also use Zeppelin notebooks on Spark clusters in Azure to run Spark jobs. The Zeppelin notebook offering for HDInsight Spark clusters exists mainly to showcase how to use Zeppelin in an Azure HDInsight Spark environment; if you want to use notebooks to work with HDInsight Spark, I recommend Jupyter notebooks. To make development on Spark easier, we also support IntelliJ Spark Tooling, which introduces native authoring support for Scala and Java, local testing, remote debugging, and the ability to submit Spark applications to the Azure cloud.

Scenario #8: Using Excel with Spark

As a final example, I want to describe the ability to connect Excel to a Spark cluster running in Azure using the Microsoft Open Database Connectivity (ODBC) Spark Driver. You can download it here.

Scenario 8_Spark with Excel

Excel is one of the most popular clients for data analytics on Microsoft platforms. Our primary BI tools, such as PowerPivot for data modeling and Power View for data visualization, are built right into Excel, with no additional downloads required. This enables users of all levels to do self-service BI using the familiar interface of Excel. Through the Spark add-in for Excel, users can easily analyze massive amounts of structured or unstructured data with a very familiar tool.
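
The same ODBC driver can also be reached from other clients. As a hedged illustration, here is a minimal pyodbc sketch, assuming a DSN named HDInsightSpark was configured when the driver was installed and that a clicks table is exposed through the cluster's SQL endpoint:

```python
# A minimal sketch of querying Spark through the same ODBC driver that
# powers the Excel experience. The DSN name and table are placeholders
# configured when the driver was installed.
import pyodbc

conn = pyodbc.connect("DSN=HDInsightSpark", autocommit=True)
cursor = conn.cursor()
cursor.execute("SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page")
for row in cursor.fetchmany(10):
    print(row.page, row.hits)
```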

Conclusion

Above, I’ve described some of the amazing, game-changing scenarios for real-time big data processing with Spark on Azure. Any company across the globe, from a huge enterprise to a small startup can take their business to the next level with these scenarios and solutions. The question is, what are you waiting for?

The post Eight scenarios with Apache Spark on Azure that will transform any business appeared first on Microsoft SQL Server Blog.

]]>
SQL Server 2016: Everything built-in http://approjects.co.za/?big=en-us/sql-server/blog/2015/10/28/sql-server-2016-everything-built-in/ Wed, 28 Oct 2015 15:00:00 +0000 This post was authored by Joseph Sirosh, Corporate Vice President of the Data Group at Microsoft. Announcing SQL Server 2016 CTP 3.0, Azure Data Lake preview and much more. We live in the age of data, and the ability to extract actionable intelligence from data is driving a fundamental transformation in how we live, work

The post SQL Server 2016: Everything built-in appeared first on Microsoft SQL Server Blog.

]]>
This post was authored by Joseph Sirosh, Corporate Vice President of the Data Group at Microsoft.

Announcing SQL Server 2016 CTP 3.0, Azure Data Lake preview and much more.

We live in the age of data, and the ability to extract actionable intelligence from data is driving a fundamental transformation in how we live, work and play. This year, at the PASS Summit, we have several exciting announcements about new products and capabilities that will drive this transformation even further:

  • We are announcing the Community Technology Preview (CTP) 3.0 of SQL Server 2016. To experience the new, exciting features in SQL Server 2016 and the new rapid release model, download the preview. CTP 3.0 includes new innovations for mission-critical performance with In-Memory OLTP and real-time Operational Analytics, first-in-market Always Encrypted, built-in SQL Server R Services, JSON support, federated query from relational data to Hadoop with PolyBase, and active archiving of cold data to Azure with Stretch Database. This preview also includes new Business Intelligence (BI) capabilities for SQL Server Analysis Services and SQL Server Reporting Services, and we plan to include mobile BI capabilities in the coming months to deliver end-to-end BI solutions for on-premises implementations.
  • Azure Data Lake. Previously disclosed at the Strata conference, Azure Data Lake offers unbelievable analytic processing power and an exabyte-scale big data store as a fully managed service. It includes all the capabilities required to make it easy for developers, data scientists and analysts to store data of any size, shape and speed, and do all types of processing and analytics across platforms and languages. Part of the Cortana Analytics Suite, Azure Data Lake includes new previews available today for the Store and the Analytics service and also includes Azure HDInsight, which is already generally available.
  • We are also pleased to announce the public preview of In-Memory OLTP and real-time Operational Analytics in Azure SQL Database. In-Memory OLTP dramatically improves transaction processing performance, and In-Memory Columnstore and In-Memory OLTP can naturally be used together in the same cloud solution for high-throughput transaction processing and real-time operational analytics.

With the upcoming release of SQL Server 2016, our best SQL Server release in history, and the recent availability of the Cortana Analytics Suite, Microsoft is offering unmatched innovation across on-premises and the cloud to help you turn data into intelligent action.

SQL Server, an industry leader, now packs an even bigger punch

SQL Server 2016 builds on this leadership and will come packed with powerful built-in features. With SQL Server rated the least vulnerable database for six years in a row, SQL Server 2016 offers security that no other database can match. It also has the data warehouse with the highest price-performance, and offers end-to-end mobile BI solutions on any device at a fraction of the cost of other vendors. It provides tools to go beyond BI with in-database advanced analytics, integrating the R language and scalable analytics functions from our recent acquisition of Revolution Analytics.

Our cloud-first product development model means that new features get hardened at scale in the cloud, delivering a proven experience on-premises. In addition, we offer a consistent experience across on-premises and cloud with common development and management tools and common T-SQL.

Security with Always Encrypted

The Always Encrypted feature in SQL Server 2016 CTP 3.0, an industry first, is based on technology from Microsoft Research and helps protect data at rest and in motion. Using Always Encrypted, SQL Server can perform operations on encrypted data and, best of all, the encryption key resides with the application in the customer's trusted environment. It offers unparalleled security.

One example of a customer that’s already benefitting from this new feature is Financial Fabric, an ISV that offers a service called DataHub to hedge funds. The service enables a hedge fund to collect data ranging from transactions to accounting and portfolio positions from multiple parties such as prime brokers and fund administrators, store it all in one central location, and make it available via reports and dashboards.

“Data protection is fundamental to the financial services industry and our stakeholders, but it can cause challenges with data driven business models,” said Subhra Bose, CEO, Financial Fabric. “Always Encrypted enables the storage and processing of sensitive data within and outside of business boundaries, without compromising data privacy in both on-premises and cloud databases. At Financial Fabric we are providing DataHub services with “Privacy by Design” for our client’s data, thanks to Always Encrypted in SQL Server 2016. We see this as a huge competitive advantage because this technology enables data science in Financial Services and gives us the tools to ensure we are compliant with jurisdictional regulations.”

Mission Critical Performance

With an expanded surface area, you can use the high-performance In-Memory OLTP technology in SQL Server with a significantly greater number of applications. We are also excited to introduce the unique capability of combining in-memory analytics (columnstore) with In-Memory OLTP and the traditional relational store in the same database to achieve real-time operational analytics. We have also made significant performance and scale improvements across all components of the SQL Server core engine.

Insights on All Your Data

You’ll find significant improvements in both SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) that help deliver business insights faster and improve productivity for BI developers and analysts. The enhanced DirectQuery enables high-performing access to external data sources like SQL Server Columnstore. This capability enhances the use of SSAS as a semantic model over your data for consistency across reporting and analysis without storing the data in Analysis Services.

SQL Server Reporting Services 2016 offers a modernized experience for paginated reports and updated tools as well as new capabilities to more easily design stunning documents. To get more from your investments in SSRS and to provide easy access to on-premises reports to everyone in your organization, you can now pin paginated reports items to the Power BI dashboard. In coming months, we will add new Mobile BI capabilities to Reporting Services, allowing you to create responsive, interactive BI reports optimized for mobile devices.

PolyBase, available today with the Analytics Platform System, is now built into SQL Server, expanding the power to extract value from unstructured and structured data using your existing T-SQL skills. CTP 3.0 brings PolyBase improvements, including better performance and the ability to scale out PolyBase nodes across other SQL Server instances.

Advanced Analytics

CTP 3.0 introduces a new workload for Advanced Analytics with built-in SQL Server R Services. SQL Server R Services bridges the gap between data scientists and DBAs by enabling you to embrace the highly popular open source R language in SQL Server to build intelligent applications and discover new insights about your business. The SQL Server database is the best place to run Advanced Analytics because you can leverage industry leading technologies such as the In-Memory Columnstore and Parallelized R Services for fast predictive in-database analytics.

SQL Developers can learn R skills and build intelligent applications, while Data Scientists can leverage powerful SQL Database tools to create value through predictive and prescriptive enhancements to their applications. SQL Server 2016 enables intelligent applications to be built by hosting analytical models in the database, while reducing complexity and overall costs by moving expensive analytic computations close to the data.

New Hybrid Scenario using Stretch Database

Stretch Database enables stretching a single database between on-premises and Azure. This will enable our customers to take advantage of the cloud economics of lower cost compute and storage without being forced into an all-or-nothing database move. Stretch Database is transparent to your application, and the trickle of data to Azure can be paused and restarted without downtime. You can use Always Encrypted with Stretch Database to extend data in a more secure manner for greater peace of mind.

Azure Data Lake Store and Analytics Service available in preview today

Last month we announced a new and expanded Azure Data Lake that makes big data processing and analytics simpler and more accessible. Azure Data Lake includes the Azure Data Lake Store, a single repository where you can easily capture data of any size, type, and speed; Azure Data Lake Analytics, a new service built on Apache YARN that dynamically scales so you can focus on your business goals, not on distributed infrastructure; and Azure HDInsight, our fully managed Apache Hadoop cluster service. Azure Data Lake is an important part of the Cortana Analytics Suite and a key component of Microsoft's big data and advanced analytics portfolio.

The Azure Data Lake service includes U-SQL, a language that unifies the benefits of SQL with the expressive power of user code. U-SQL’s scalable distributed query capability enables you to efficiently analyze data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse. Customers can use Azure Data Lake tools for Visual Studio, which simplifies authoring, debugging and optimization and provides an integrated development environment for analytics.

ASOS.com, the UK’s largest independent online fashion and beauty retailer, has been using Azure Data Lake to improve customer experience on their website. “At ASOS we are committed to putting the customer first. As a global fashion destination for 20-somethings we need to stay abreast of customer behaviour on our site, enabling us to optimize their shopping experience across all platforms of ASOS.com and wherever they are in the world. Microsoft Azure Data Lake Analytics assists in processing large amounts of unstructured clickstream data to track and optimize their experience. We have been able to get productive immediately using U-SQL because it was easy to use, extend and view and monitor the jobs all within Visual Studio” said Rob Henwood, Enterprise Architect at ASOS.com.

Azure SQL Database In-Memory OLTP and Operational Analytics

Today, we are releasing our next generation in-memory technologies to Azure with the public preview of In-Memory OLTP and real-time Operational Analytics in Azure SQL Database. In-Memory OLTP in the Azure SQL Database preview includes the expanded surface area available in SQL Server 2016, enabling more applications to benefit from higher performance. By bringing this technology to the cloud, customers will be able to take advantage of in-memory OLTP and Operational Analytics in a fully managed database-as-a-service with 99.99% SLA.

Combined with the releases earlier this month of Always Encrypted, Transparent Data Encryption, support for Azure Active Directory, Row-Level security, Dynamic Data Masking and Threat Detection, Azure SQL Database provides unparalleled data security in the cloud with fast performance. As part of our intelligent capabilities, SQL Database also has built-in advisors to help customers get started quickly with in-memory OLTP to optimize performance.

It’s never been easier to capture, transform, mash-up, analyze and visualize any data, of any size, at any scale, in its native format using familiar tools, languages and frameworks in a trusted environment, both on-premises and in the cloud. Share your feedback on the new SQL Server 2016 capabilities using Microsoft’s Connect tool. If you have questions, join discussion forums on SQL Server 2016 at MSDN and Stack Overflow.

Learn more about the Azure Data Lake Store and Analytics service today and try the new In-Memory and security previews of Azure SQL Database now.

Finally, don’t forget to join us, either live or via the livestream feed, at the PASS Summit 2015 keynote and foundational sessions. And be sure to take advantage all the great sessions at the PASS Summit this week.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The post SQL Server 2016: Everything built-in appeared first on Microsoft SQL Server Blog.

]]>
Microsoft expands Azure Data Lake to unleash big data productivity http://approjects.co.za/?big=en-us/sql-server/blog/2015/09/28/microsoft-expands-azure-data-lake-to-unleash-big-data-productivity/ Mon, 28 Sep 2015 13:01:00 +0000 By T. K. “Ranga” Rengarajan, corporate vice president, Data Platform In July of this year, Satya Nadella shared our broad vision for big data and analytics when he announced Cortana Analytics. Building on this vision, today we’re announcing a new and expanded Azure Data Lake that makes big data processing and analytics simpler and more

The post Microsoft expands Azure Data Lake to unleash big data productivity appeared first on Microsoft SQL Server Blog.

]]>
By T. K. “Ranga” Rengarajan, corporate vice president, Data Platform

In July of this year, Satya Nadella shared our broad vision for big data and analytics when he announced Cortana Analytics. Building on this vision, today we’re announcing a new and expanded Azure Data Lake that makes big data processing and analytics simpler and more accessible. The expanded Microsoft Azure Data Lake includes the following:

  • Azure Data Lake Store, previously announced as Azure Data Lake, will be available in preview later this year. The Data Lake Store provides a single repository where you can easily capture data of any size, type and speed without forcing changes to your application as data scales. In the store, data can be securely shared for collaboration and is accessible for processing and analytics from HDFS applications and tools.

  • Azure Data Lake Analytics, a new service built on Apache YARN that dynamically scales so you can focus on your business goals, not on distributed infrastructure. This service will be available in preview later this year and includes U-SQL, a language that unifies the benefits of SQL with the expressive power of user code. U-SQL’s scalable distributed query capability enables you to efficiently analyze data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse.

  • Azure HDInsight, our fully managed Apache Hadoop cluster service with a broad range of open source analytics engines including Hive, Spark, HBase and Storm. Today, we are announcing general availability of managed clusters on Linux with an industry-leading 99.9% uptime SLA. HDInsight will be able to take advantage of capabilities in the Store for increased throughput, scale and security.

Supporting the Azure Data Lake:

  • Azure Data Lake Tools for Visual Studio provide an integrated development environment that spans the Azure Data Lake, dramatically simplifying authoring, debugging, and optimization for processing and analytics at any scale.

  • Leading Hadoop ISV applications that span security, governance, data preparation and analytics can be easily deployed from the Azure Marketplace on top of Azure Data Lake.

Azure Data Lake

Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape and speed, and do all types of processing and analytics across platforms and languages. It removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics.  Azure Data Lake works with existing IT investments for identity, management, and security for simplified data management and governance. It also integrates seamlessly with operational stores and data warehouses so you can extend current data applications.  We’ve drawn on the experience of working with enterprise customers and running some of the largest scale processing and analytics in the world for Microsoft businesses like Office 365, Xbox Live, Azure, Windows, Bing and Skype. Azure Data Lake solves many of the productivity and scalability challenges that prevent you from maximizing the value of your data assets with a service that’s ready to meet your current and future business needs.

“Hortonworks and Microsoft have partnered closely over many years to further the Hadoop platform for big data analytics, including contributions to YARN, Hive, and other Apache projects,” said Rob Bearden, CEO at Hortonworks. “Azure Data Lake, including Azure HDInsight powered by Hortonworks Data Platform, demonstrates our shared commitment to make it easier for everyone to work with big data.”


 

Azure Data Lake Store – A hyper-scale repository for big data processing and analytic workloads

The value of a data lake resides in the ability to develop solutions across data of all types: unstructured, semi-structured, and structured. This begins with the Azure Data Lake Store, a single repository to capture and access any type of data for high-performance processing, analytics, and low-latency workloads with enterprise-grade security. For example, data can be ingested in real time from sensors and devices for IoT solutions, or from online shopping websites, into the store without the restriction of fixed limits on account or file size, unlike current offerings in the market. As part of Azure Data Lake, the store supports development of your big data solutions with the language or framework of your choice. The store in Azure Data Lake is HDFS-compatible, so Hadoop distributions like Cloudera, Hortonworks®, and MapR can readily access the data for processing and analytics.

“Cloudera is pleased to be working closely with Microsoft to integrate our enterprise data hub with the Azure Data Lake Store,” said Mike Olson, founder and chief strategy officer at Cloudera. “Cloudera on Azure benefits from the Data Lake Store which acts as a cloud-based landing zone for data in your enterprise data hub. Because the store is compatible with WebHDFS, Cloudera can leverage Data Lake and provide customers with a secure and flexible big data solution.”

Azure Data Lake Analytics – a new distributed processing and analytics service

Azure Data Lake Analytics lets you focus on the logic of your application, not the distributed infrastructure running it. Instead of deploying, configuring, and tuning hardware, you write queries to transform your data and extract valuable insights. Built on Apache YARN and designed for the cloud, the analytics service can handle jobs of any scale instantly: you simply set the dial for how much power you need. The service is cost-efficient because you pay for your job only while it is running, and support for Azure Active Directory lets you manage access and roles simply while integrating with your on-premises identity system.
 
We know that many developers and data scientists struggle to be successful with big data using existing technologies and tools. Code-based solutions offer great power but require significant investment to master, while SQL-based tools make it easy to get started but are difficult to extend. We faced the same problems inside Microsoft, and that's why we introduced U-SQL, a new query language that unifies the ease of use of SQL with the expressive power of C#. The U-SQL language is built on the same distributed runtime that powers the big data systems inside Microsoft. Millions of SQL and .NET developers can now process and analyze all of their data with the skills they already have. The U-SQL support in Azure Data Lake Tools for Visual Studio includes state-of-the-art support for authoring and debugging, plus advanced performance analysis features for increased productivity when optimizing jobs running across thousands of nodes.

“U-SQL was especially helpful because we were able to get up and running using our existing skills with .NET and SQL,” says Sam Vanhoutte, Chief Technology Officer at Codit. “This made big data easy because we didn’t have to learn a whole new paradigm. With Azure Data Lake, we were able to process data coming in from smart meters and combine it with the energy spot market prices to give our customers the ability to optimize their energy consumption and potentially save hundreds of thousands of dollars.”

Azure HDInsight – Fully Managed Hadoop, Spark, Storm and HBase

Azure Data Lake also includes HDInsight, our Apache Hadoop-based service that allows you to spin up any number of nodes in minutes. As one of the fastest growing services in Azure, HDInsight gives you the breadth of the Hadoop ecosystem in a managed service that's monitored and supported by Microsoft. Furthering our commitment to productivity, we've updated our Visual Studio Tools for authoring, advanced debugging, and tuning of Hive queries and Storm topologies running in HDInsight.

Today, we are announcing the general availability of HDInsight on Linux. We work closely with Hortonworks and Canonical to provide the HDP™ distribution on the Ubuntu operating system that powers the Linux version of HDInsight in the Data Lake. This is another strategic step by Microsoft to meet customers where they are and make it easier for you to run Hadoop workloads in the cloud.

Leading Hadoop ISVs on the Azure Data Lake

There are a growing set of leading data management applications for Azure Data Lake. This includes applications that provide end-to-end big data analytics like Datameer, technologies that address big data security and governance like Dataguise and BlueTalon, unified stream and batch with DataTorrent, and tools that give business users the ability to visualize and analyze data in compelling ways like AtScale and Zoomdata. Support from our partners ensures that you have the best applications available as you get started with Azure Data Lake.

We will continue to invest in solutions for big data processing and analytics to make it easier for everyone to work with data of any type, size, and speed using the tools, languages, and frameworks they want, in a trusted cloud, hybrid, or on-premises environment. Our goal is to make big data technology simpler and more accessible to the greatest number of people possible: developers, data scientists, analysts, businesspeople, and mainstream IT managers.

You can hear more about these announcements during my keynote at our free, virtual event AzureCon tomorrow or on-demand, and at Strata + Hadoop World in NYC.

The post Microsoft expands Azure Data Lake to unleash big data productivity appeared first on Microsoft SQL Server Blog.

]]>
MapR-based Hadoop Clusters Coming to the Azure Marketplace http://approjects.co.za/?big=en-us/sql-server/blog/2015/06/10/mapr-based-hadoop-clusters-coming-to-the-azure-marketplace/ http://approjects.co.za/?big=en-us/sql-server/blog/2015/06/10/mapr-based-hadoop-clusters-coming-to-the-azure-marketplace/#comments Wed, 10 Jun 2015 12:00:00 +0000 Microsoft is committed to continuous innovation to make Azure the best cloud platform for running hyper-scale big data projects. This includes an existing Hadoop-as-a-service solution, Azure HDInsight, a hyper-scale repository for big data, Azure Data Lake, and Hadoop infrastructure-as-a-service offerings from Hortonworks and Cloudera. This week, Hortonworks also announced their most recent milestone with Hortonworks

The post MapR-based Hadoop Clusters Coming to the Azure Marketplace appeared first on Microsoft SQL Server Blog.

]]>
Microsoft is committed to continuous innovation to make Azure the best cloud platform for running hyper-scale big data projects. This includes an existing Hadoop-as-a-service solution (Azure HDInsight), a hyper-scale repository for big data (Azure Data Lake), and Hadoop infrastructure-as-a-service offerings from Hortonworks and Cloudera. This week, Hortonworks also announced their most recent milestone, Hortonworks Data Platform 2.3, which will be available on Azure this summer.

Today, we are excited to announce that MapR will also be available this summer as an option for customers deploying Hadoop from the Azure Marketplace. MapR is a leader in the Hadoop community that offers the MapR Distribution including Apache Hadoop, which comprises MapR-FS, an HDFS- and POSIX-compliant file store, and MapR-DB, a NoSQL key-value store. The Distribution also includes Hadoop ecosystem projects such as Hive, Impala, Spark SQL, and Drill, as well as MapR Control System, a comprehensive management system. When MapR is available in the Azure Marketplace, customers will be able to launch a full MapR-based Hadoop cluster on Azure Virtual Machines with a few clicks. Together with Azure Data Lake, SQL Server, and Power BI, this will allow organizations to build big data solutions quickly and easily using the best of Microsoft and MapR.

Our partnership with MapR allows customers to use the Hadoop distribution of their choice while getting the cloud benefits of Azure. It is also a sign of our continued commitment to make Hadoop more accessible to customers by supporting the ability to run big data workloads anywhere: on hosted VMs and managed services in the public cloud, on-premises, or in hybrid scenarios.

We are very excited to be on this journey of making big data more readily accessible and accelerating its ubiquitous adoption. We hope you join us for the ride!

T.K. “Ranga” Rengarajan

Corporate Vice President, Data Platform at Microsoft

The post MapR-based Hadoop Clusters Coming to the Azure Marketplace appeared first on Microsoft SQL Server Blog.

]]>
http://approjects.co.za/?big=en-us/sql-server/blog/2015/06/10/mapr-based-hadoop-clusters-coming-to-the-azure-marketplace/feed/ 1
Microsoft Announces Azure SQL Database elastic database, Azure SQL Data Warehouse, Azure Data Lake http://approjects.co.za/?big=en-us/sql-server/blog/2015/04/29/microsoft-announces-azure-sql-database-elastic-database-azure-sql-data-warehouse-azure-data-lake/ http://approjects.co.za/?big=en-us/sql-server/blog/2015/04/29/microsoft-announces-azure-sql-database-elastic-database-azure-sql-data-warehouse-azure-data-lake/#comments Wed, 29 Apr 2015 21:00:00 +0000 In this mobile-first, cloud-first world, we’re creating and consuming data through new devices and services – and developers are building applications and analytics solutions at a rapid pace to take advantage of the new forms, types and sizes of data. As Scott Guthrie talked about in his keynote this morning, a big piece of what

The post Microsoft Announces Azure SQL Database elastic database, Azure SQL Data Warehouse, Azure Data Lake appeared first on Microsoft SQL Server Blog.

]]>
In this mobile-first, cloud-first world, we're creating and consuming data through new devices and services, and developers are building applications and analytics solutions at a rapid pace to take advantage of the new forms, types, and sizes of data. As Scott Guthrie talked about in his keynote this morning, a big piece of what we've been working on, and will continue to invest in, is making it easier to work with all your data, no matter how big or complex, and to build new applications utilizing data to take advantage of the intelligent cloud. Today, we're pleased to share three major data platform announcements: Azure SQL Database elastic database, Microsoft's new offering to support SaaS applications; Azure SQL Data Warehouse, a fully managed relational data-warehouse-as-a-service; and Azure Data Lake, Microsoft's hyper-scale data store optimized for big data analytic workloads.

Azure SQL Database elastic databases

As customers look to ease and expedite building and managing applications, the scale, simplicity and economics of the cloud are impossible to ignore. With new capabilities and enhanced security features, Microsoft’s relational database-as-a-service, Azure SQL Database, can support robust enterprise applications in the cloud as well new SaaS applications, including:

  • Elastic databases – available in preview today – allow you to build SaaS applications to manage large numbers of databases that have unpredictable resource demands. Managing dynamic resource needs can be more art than science, and with these new capabilities, you can pool resources across databases to support explosive growth and profitable business models. Instead of overprovisioning to accommodate peak demand, cloud ISVs and developers can use an elastic database pool to share resources across hundreds – or thousands – of databases within a budget that they control. Additionally, we are making tools available to help query and aggregate results across these databases as well as implement policies and perform transactions across the database pool. 



    Create a pool of elastic databases to scale and share resources across unpredictable demands.

  • New security capabilities for managing data and applications in Azure: Row-level security and Dynamic data masking are already in preview, and new in preview today is Transparent data encryption. Transparent data encryption has been a top request from customers, and we are excited to bring it to market, building on the other advanced security features already available in preview.
  • Preview of Full-text search capabilities in Azure SQL Database to support richer search in new cloud applications. With this and other features such as the in-memory columnstore and parallel query, we continue to bring the benefits of decades of innovation in query processing technologies to the cloud, making it even easier to migrate existing on-premises SQL Server applications.

Azure SQL Data Warehouse

As customers move more applications and structured data to the cloud, we've seen strong demand for additional options for cloud-based data warehousing and analytics. Scott also announced Azure SQL Data Warehouse, a new, first-of-its-kind elastic data warehouse in the cloud. It's the first enterprise-class cloud data warehouse that can dynamically grow, shrink, and pause compute in seconds independently of storage, enabling you to pay for the query performance you need, when you need it. Azure SQL Data Warehouse is based on the massively parallel processing architecture currently available in both SQL Server and the Analytics Platform System appliance, and will work with existing data tools including Power BI for data visualization, Azure Machine Learning for advanced analytics, Azure Data Factory for data orchestration, and Azure HDInsight, our 100% Apache Hadoop managed big data service. The preview for Azure SQL Data Warehouse will be available later this calendar year.

Introducing Azure SQL Data Warehouse

Azure Data Lake

For customers looking to maximize value from unstructured, semi-structured, and structured data, we announced Azure Data Lake, a hyper-scale data store for big data analytic workloads. Azure Data Lake is built to remove the restrictions found in traditional analytics infrastructure and realize the idea of a "data lake": a single place to store every type of data in its native format, with no fixed limits on account size or file size, high throughput to increase analytic performance, and native integration with the Hadoop ecosystem. Azure Data Lake is compatible with the Hadoop Distributed File System (HDFS), is integrated with Azure HDInsight, and will be integrated with Microsoft offerings such as Revolution R Enterprise and industry-standard distributions like Hortonworks and Cloudera. The preview for Azure Data Lake will be available later this calendar year.

Microsoft Azure data lake supports multiple big data analytic workloads

 

Try and sign up for new previews today

The move to the cloud is accelerating across industries, and we are proud to provide a comprehensive database and analytics platform that enables you to work with big data more easily and extract as much value as possible from your data to accelerate your business. Over the last few months we've shared a wave of new platform offerings and innovations, from the general availability of the latest Azure SQL Database release, bringing near-complete compatibility with SQL Server, to our preview of the first managed service running on Linux with HDInsight, and the general availability of new cloud services such as Azure DocumentDB and Azure Search. With today's announcements, we're building on our existing investments and continuing to make it easier for customers to capture, transform, and analyze any data, of any size, at any scale, using the tools, languages, and frameworks they know and want, in a trusted environment on-premises and in the cloud.

Try out the Azure SQL Database previews made available today and sign up to be notified as the Azure SQL Data Warehouse and Azure Data Lake previews become available. Stay tuned for more on Microsoft’s data platform at next week’s Ignite conference in Chicago.

The post Microsoft Announces Azure SQL Database elastic database, Azure SQL Data Warehouse, Azure Data Lake appeared first on Microsoft SQL Server Blog.

]]>
http://approjects.co.za/?big=en-us/sql-server/blog/2015/04/29/microsoft-announces-azure-sql-database-elastic-database-azure-sql-data-warehouse-azure-data-lake/feed/ 3