Azure Data Lake - Microsoft SQL Server Blog http://approjects.co.za/?big=en-us/sql-server/blog/product/azure-data-lake/ Official News from Microsoft’s Information Platform

FabCon and SQLCon 2026: Unifying databases and Fabric on a single data platform https://azure.microsoft.com/en-us/blog/fabcon-and-sqlcon-2026-unifying-databases-and-fabric-on-a-single-data-platform/ Wed, 18 Mar 2026 12:45:00 +0000

The post FabCon and SQLCon 2026: Unifying databases and Fabric on a single data platform appeared first on Microsoft SQL Server Blog.

Welcome to the third annual FabCon and our first ever SQLCon here in Atlanta, Georgia. With nearly 300 workshops and sessions, this joint event will highlight how we are bringing the power of Microsoft SQL and Microsoft Fabric together to create a single, unified platform. But FabCon 2026 and SQLCon 2026 are about more than product innovation. They are about providing space for our 8,000 attendees to come together and share real experiences, learn from each other, and solve challenges side-by-side. Only together can we move beyond the hype and into meaningful results.

Learn more about FabCon and SQLCon 2026
The excitement surrounding this event reflects the same momentum we’re seeing across our data portfolio. Just two and a half years after Microsoft Fabric reached general availability, it’s already serving more than 31,000 customers and remains the fastest-growing data platform in Microsoft’s history. Fortune 500 companies like The Coca-Cola Company are already using Fabric at scale across their organizations.

Microsoft Fabric is helping us evolve our data foundation into a more unified, AI-ready platform. Combined with Power BI and capabilities like Fabric IQ, it enables the enterprise to turn data into intelligence and act on it faster.

Shekhar Gowda, Vice President of Global Marketing Technologies at The Coca-Cola Company
Our databases are accelerating just as quickly, with SQL Server 2025 growing more than twice as fast as the previous version.

Today, we’re thrilled to share how we are bringing the power of databases and Fabric together to form a truly converged data platform—one that unifies transactional, operational, and analytical data under a single, consistent architecture. I’ll also highlight how we’ve enhanced Fabric to help you transform data into the semantic knowledge AI needs to understand your business, powered by Fabric IQ and Power BI’s industry-leading semantic model technology.

Introducing the Database Hub in Microsoft Fabric
Databases sit at the heart of the enterprise data estate—a system of record powering applications, transactions, and mission‑critical insights. Yet as organizations scale across cloud, on‑premises, and edge environments, database estates have become increasingly fragmented and isolated. As AI places even greater demands on data estates, unifying databases under a single access point and control plane has become essential.

To address this challenge, Fabric is expanding its role as the central access point for enterprise data with the Database Hub in Fabric, now available in early access. Database Hub in Fabric provides a unified database management experience that brings together databases across edge, cloud, and Fabric into a single, coherent view. Teams now have one place to explore, observe, govern, and optimize their entire database estate—including Azure SQL, Azure Cosmos DB, Azure Database for PostgreSQL, SQL Server (enabled by Azure Arc), Azure Database for MySQL, and Fabric Databases—without changing how each service is deployed.

Built for scale, the Database Hub in Fabric introduces an agent‑assisted, human-in-the-loop approach to database management. With built-in observability, delegated governance, and Microsoft Copilot-powered insights, teams can deploy intelligent agents to continuously reason over estate‑wide signals and surface what changed, explain why it matters, and guide teams toward what to do next. The result is a simpler, more confident way to manage databases at scale. Over time, this model enables database estates to become more proactive, resilient, and intelligent, laying the foundation for greater autonomy, while keeping humans firmly in control of goals, boundaries, and trust.

Learn more about Database Hub in Fabric and what’s new across Databases
Bringing databases together under a single management layer is a critical step as you prepare your estates for AI at scale. But it’s not the end of the journey. The challenge shifts from where data lives to how data is understood, connected, and activated across the enterprise.

Getting your data estate ready for AI with Fabric
As organizations move from traditional applications to AI‑powered, multi‑agent systems, the advantage is shifting away from the specific model you deploy. It now lies in the intelligence and context that allow agents to understand how your business runs, the current state of your business, and your institutional knowledge, so they can take meaningful action.

This is the challenge Microsoft IQ is designed to address. Unlike point solutions on the market today, Microsoft IQ provides an intelligence layer that delivers shared, enterprise-grade business context to every agent. That context is built from three complementary sources: productivity signals from Work IQ, institutional knowledge from Foundry IQ, and live business data from Fabric IQ.

However, like the database layer, the IQ context layer is a critical part of a successful and healthy AI foundation, but it is not the full story. Building a complete AI-ready data foundation requires investing in four core steps:

Unifying your data estate to eliminate silos and reduce architectural complexity.
Processing and harmonizing data so it becomes AI-ready, clean, connected, and structured for both operational and analytical use.
Curating semantic meaning to give agents contextual understanding, enabling them to interpret data the way your teams already do. This is where Microsoft IQ comes into play.
Empowering AI agents to act, applying that context to automate workflows, accelerate decisions, and transform operations end‑to‑end.
Unifying your data estate with Microsoft OneLake
Every AI initiative starts with the same fundamental challenge: understanding where your data lives and how to bring it together. Microsoft OneLake was built to solve that problem by unifying data across clouds, on-premises environments, and third-party platforms into a single logical data lake without unnecessary extracting, transforming, and loading (ETL), fragmentation, or duplicated copies.

Connecting to more sources than ever before
Today, we’re expanding Mirroring in Fabric to support even more systems our customers rely on. Mirroring for SharePoint lists and Dremio is now in preview, with Azure Monitor coming soon, while mirroring for Oracle and SAP Datasphere is generally available, all as part of the core mirroring capabilities. We are also introducing extended capabilities in mirroring designed to help you operationalize mirrored sources at scale, including Change Data Feed (CDF) and the ability to create views on top of mirrored data, starting with Snowflake. Extended capabilities for mirroring will be offered as a paid option.

Shortcut transformations are also now generally available, allowing data to be shaped automatically as it connects to or moves within OneLake. You can convert formats such as Excel to Delta tables, now in preview, and apply AI-powered transformations.

Additionally, we are continuing to invest in open interoperability, ensuring OneLake works seamlessly with the platforms organizations already use. We are excited to announce the ability to natively read from OneLake through Azure Databricks Unity Catalog is now in public preview. We also recently announced the general availability of our interoperability with Snowflake.

I’m also excited to share that Auger, a rapidly growing supply chain platform designed to bring intelligence and automation to global operations, has built its platform on Fabric, with all data stored natively in OneLake. This architecture enables Auger customers to seamlessly access their operations data through OneLake shortcuts within their own Fabric environments and use the full power of the platform including Power BI, Fabric data agents, and more. Learn more in my blog, co-authored with Auger Chief Executive Officer Dave Clark.

Protect your data with OneLake security, now generally available
Security and governance remain foundational to OneLake. I’m thrilled to announce OneLake security will be generally available in the coming weeks, enabling data owners to define roles, enforce row- and column-level controls, and manage permissions through a single unified model that follows the data.

To learn more about these announcements, read the OneLake blog and the Fabric Data Factory blog.

Processing and harmonizing data with Fabric analytics
AI agents are only as reliable as the data you feed them. Before data can train or ground an agent, it must be integrated, cleaned, and structured, so the agent operates from consistent, trusted information. With industry-leading engines in Fabric like Spark, T-SQL, KQL, and Analysis Services, we can equip data teams to do exactly that.

Now, we are expanding these capabilities with the introduction of Runtime 2.0 in preview, purpose-built for large-scale data computation. It incorporates Apache Spark 4.x, Delta Lake 4.x, Scala 2.13, and Azure Linux Mariner 3.0 to power advanced enterprise workloads. Materialized lake views are also now generally available, simplifying medallion architecture implementation in Spark SQL and PySpark and enabling always up-to-date pipelines with no manual orchestration. In addition, a new agentic Copilot experience in notebooks delivers deeper context awareness, reasons over your workspace, and generates code with greater speed and precision.
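As a rough sketch of what a materialized lake view looks like in Spark SQL (the schema and table names here are placeholders, and the exact syntax may vary by runtime version; consult the Fabric documentation):

```sql
-- Define a silver-layer view over raw bronze data; Fabric keeps it
-- up to date automatically, with no manual pipeline orchestration.
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS silver.cleaned_orders
AS
SELECT order_id,
       customer_id,
       CAST(order_ts AS DATE) AS order_date,
       amount
FROM bronze.raw_orders
WHERE amount IS NOT NULL;
```

Views defined this way can be chained, letting a full medallion flow (bronze to silver to gold) be expressed declaratively.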

For real-time scenarios, we’re launching Microsoft Fabric Maps into general availability. Maps add geospatial context to your agents and operations by turning large volumes of location-based data into interactive, real-time visual insights.

For a comprehensive overview of these announcements and much more, read the Fabric Analytics announcement blog and the Fabric Real-Time Intelligence announcement blog.

Creating semantic meaning with Fabric IQ
Preparing raw data for AI is essential. The next step is transforming that data into meaningful, unified business context. That is where Fabric IQ comes in.

Fabric IQ unifies analytical data and operational data, including telemetry, time series, graph, and geospatial data, within a shared semantic framework of business entities, relationships, properties, rules, and actions. Instead of thinking in terms of tables and schemas, your teams and agents can operate on this framework, or ontology, aligned to how the business actually runs.

Fabric IQ ontologies will soon become accessible through an MCP server in preview, enabling agents to discover, understand, and act on this semantic layer. Ontologies can also serve as context sources for maps and, soon, for operations agents in Fabric, extending shared business context directly into operational decision-making and execution.

We are also excited to announce planning in Fabric IQ, a new enterprise planning capability that enables organizations to create plans, budgets, forecasts, and scenario models directly on top of Fabric’s semantic models. By complementing Fabric IQ’s ontologies with integrated planning, you get a complete, contextual view of your historical, real-time, and forward planning data. This allows users and agents to quickly answer what has happened, what is happening, and what should happen, all from a single source.

Finally, we recently announced a strategic partnership with NVIDIA to power the next generation of Physical AI by integrating Real-Time Intelligence and Fabric IQ with NVIDIA Omniverse libraries. The combined platform unifies real‑time operational data, business semantics, and physical simulation to enable organizations to optimize their physical operations in scenarios like intelligent digital twins, predictive maintenance, autonomous logistics, and energy optimization.

To learn more about all of our partner announcements, read the Fabric ISV announcement blog and the planning in Fabric IQ blog.

Enhancing the underlying Fabric IQ technology
Powering much of Fabric IQ’s rich experience is a combination of Power BI’s industry-leading semantic model technology and graph in Fabric, our highly scalable graph database. Already delivering insights to more than 35 million active users, semantic models provide the ideal foundation for training agents through Fabric IQ. Now, with the general availability of Direct Lake on OneLake, your tables can be read directly from OneLake with native security enforcement, richer cross-item modeling, and import-class performance without data movement or refresh.

I’m also excited to share that graph in Fabric will be generally available in the coming weeks, enabling teams to visualize and query complex relationships across customers, partners, and supply chains.

To learn more, check out the Fabric IQ announcement blog and the Power BI announcement blog.

Empowering agents to act with Fabric data and operations agents
Frontier organizations are moving beyond general-purpose assistants and instead adopting multi-agent systems composed of specialized agents. These agents are each grounded in specific data and reusable across different systems, allowing you to deliver more accurate, accelerated, and scalable outcomes.

To support your multi-agent systems, Fabric comes with built-in agent creation capabilities with Fabric data agents and operations agents. I’m excited to share that Fabric data agents are now generally available. Fabric data agents can be thought of as virtual analysts, aligned to specific domain data to support deeper analysis and deliver insights. Operations agents complement them by monitoring real-time data, detecting patterns, and taking proactive action.


These agents can be used across Fabric or as foundational knowledge sources in leading AI tools like Microsoft Foundry, Copilot Studio or even Microsoft 365 Copilot. To learn more about our AI announcements, check out the Fabric analytics blog covering data agents and the Fabric IQ blog covering operations agents.

Building mission-critical applications with developer experiences in Fabric
Developers building the next generation of AI applications need a comprehensive, cost-effective data platform that’s already integrated with your existing tools and workflows. Today, we are expanding Fabric’s developer tooling to meet that demand.

First, Fabric Model Context Protocol (MCP) is advancing with two major milestones. Fabric local MCP is now generally available, providing an open-source local server that connects AI coding assistants such as GitHub Copilot directly to Fabric. Alongside this, we’re introducing the public preview of Fabric remote MCP, a secure, cloud‑hosted execution engine that enables AI agents and automation tools to perform authenticated actions in Fabric.

We’re also enhancing our Git integration with selective branching, allowing developers to branch out for a specific feature and pull only the items they need. You also get improved change comparisons to more easily review recent updates, and new folder relationships which show how feature workspaces connect to source workspaces.

We’re also launching two open-source projects to help teams move faster with Fabric: Agent Skills for Fabric and Fabric Jumpstart. Agent Skills for Fabric is an open-source set of purpose-built plugins that let you use natural language in the GitHub Copilot terminal to harness the full power of Microsoft Fabric. Additionally, Fabric Jumpstart is designed to help you get off the ground with detailed guidance, reference architectures, and single‑click deployments for sample datasets, notebooks, pipelines, and reports.

Finally, we are announcing that the Fabric Extensibility Toolkit (FET), an evolution of the Workload Development Kit (WDK), is now generally available. Along with this release, we are enabling support for full CI/CD, variable library, and a new management experience in the Admin portal.

Read the Fabric Platform announcement blog
Migrating your existing Azure service to Fabric
As Fabric continues to grow in functionality, we are also simplifying the migration from other Azure services. In addition to our existing Synapse tooling, we are bringing new migration assistants for Azure Data Factory, Azure Synapse Analytics, and Azure SQL in public preview.

The new Fabric migration assistant for Azure Data Factory and Synapse Analytics helps move your existing pipelines and artifacts like Spark pools and notebooks into Fabric with minimal disruption. It’s designed to support incremental modernization, allowing teams to evaluate, convert, and optimize pipelines as they transition to Fabric. The migration assistant for SQL databases helps move SQL Server into Fabric by importing schemas through DACPACs, identifying and resolving compatibility issues with AI assistance, and guiding teams through assessment and data copy workflows for a smoother cutover.

See more Fabric innovation
In addition to the announcements above, we are also rolling out a broad set of Fabric innovations across the platform. For a deeper look at the updates and what’s new this month, visit the Fabric March 2026 Feature summary blog, the Power BI March 2026 feature summary blog, and the latest posts on the Fabric Updates channel.

Explore additional resources for Microsoft Fabric
Sign up for the Fabric free trial.
View the updated Fabric Roadmap.
Try the Microsoft Fabric SKU Estimator.
Visit the Fabric website.
Join the Fabric community.
Read other in-depth, technical blogs on the Microsoft Fabric Updates Blog.
Read additional blogs by industry-leading partners
Sonata Software: Building an AI-ready data platform with data agents, ontology, and governance in Microsoft Fabric
Quadrant Technologies LLC: Real-Time Operational Intelligence in Microsoft Fabric: Deep Dive into RTI Capabilities, Anomaly Detection and Activator Alerting
Inspark: Why switch from Azure Synapse to Microsoft Fabric?
Esri: Unlock the power of location intelligence with ArcGIS for Microsoft Fabric
Dream IT Consulting Services: 8 Real-World Use Cases of Data Agents in Microsoft Fabric
UB Technology Innovations Inc.: From Data Platform to Decision Platform: How Microsoft Fabric and Copilot are Redefining Enterprise Analytics
Simpson Associates: Fabric Data Warehouse: Bringing Structure to Modern Data Strategies
Synapx Ltd.: Migrating Power BI to Microsoft Fabric Lakehouse with Medallion Architecture: A Strategic Imperative for Modern Construction Enterprises
Cloud Services: Real-Time Intelligence in Action: How Microsoft Fabric Helped Delfi Transform Its Newsroom
Cloud Services: Microsoft Fabric Data Agents: A New Reality
iLink Digital: Detect to Act in Seconds: How Real-Time Intelligence Is Rewriting the Rules of Emissions Management
Valorem Reply: How Nonprofits Are Rethinking Data with Microsoft Fabric

Accelerating SQL Server 2025 momentum: Announcing the first release candidate http://approjects.co.za/?big=en-us/sql-server/blog/2025/08/22/accelerating-sql-server-2025-momentum-announcing-the-first-release-candidate/ Fri, 22 Aug 2025 15:00:00 +0000 We are moving toward general availability of SQL Server 2025 and focusing on delivering enhanced stability, performance, and product improvements.

The post Accelerating SQL Server 2025 momentum: Announcing the first release candidate appeared first on Microsoft SQL Server Blog.

The first release candidate (RC0) of SQL Server 2025 is now available. As we move toward general availability, our focus shifts to delivering enhanced stability, performance, and product improvements based on your feedback.  

Adoption gains speed 

We’re seeing incredible momentum with SQL Server 2025 since its public preview debut at Microsoft Build. From lighting up community events like SQL Saturdays to being featured at SQLBits 2025 with CTP 2.1, the excitement is electric. SQL Server 2025 isn’t just keeping pace; it’s setting a new standard. Based on downloads of the public preview, customers are adopting SQL Server 2025 twice as fast as SQL Server 2022.


In the early adoption program, participants were asked to rank the SQL Server 2025 features they were most interested in testing. Built-in AI emerged as one of the top priorities, alongside performance and scalability enhancements. Preview customers have also responded positively to developer-friendly enhancements, especially native JSON support, and to powerful T-SQL additions such as regular expression support, which streamline data processing and boost developer efficiency. Enterprise customers like Entain, Mediterranean Shipping Company, Kramer & Crew, Schultz, and Bühler are already hands-on, exploring how SQL Server 2025 can power their next-gen applications.

“As one of the largest SQL Server consulting firms in Brazil, we are excited about the AI features in SQL Server 2025, especially the potential for text processing that can benefit companies of all sizes. AI brings new ways to process and extract insights from data and with SQL Server being the core repository for many businesses, native AI features like embeddings, REST API support, and vector indexes are game changers. They eliminate the need for external vector databases, making AI integration more seamless and efficient.”

Rodrigo Ribeiro Gomes, Head of Innovation, Power Tuning

“SQL Server 2025 introduces seamless Azure and Arc integration and features, enhanced JSON and RegEx capabilities, and enhancements to the database engine.”

Shailesh Panday, Deputy Manager, IT, Buhler AG

Explore capabilities with new preview features

SQL Server 2025 introduces a new preview feature option, giving customers the flexibility to balance production stability with early access to innovation. When turned on, it unlocks access to upcoming features still in preview, enabling developers to test and evaluate new capabilities like vector indexing, improved text chunking, and change event streaming without impacting production workloads (a complete list of preview features is here).  

This opt-in model brings the agility of the cloud to on-premises SQL Server, empowering customers to innovate on their terms. Preview features are provided in alignment with Microsoft’s supportability guidelines. They are intended for evaluation and testing purposes only and are not recommended for use in production environments. The database engine itself in SQL Server 2025 remains fully supported and is an essential component of the general availability release. Preview features are optional and designed to operate independently in preview mode. Enabling these features does not impact the stability or supportability of your database.

SQL Server has traditionally used trace flags to enable or disable specific behaviors within the database engine. The new preview feature switch in SQL Server 2025 is fundamentally different from traditional trace flags. While trace flags are primarily used for debugging and diagnostics, often by DBAs or support engineers to control internal engine behavior, the preview feature switch is designed for developers to explore and test new, user-facing capabilities. Trace flags typically operate at the instance level, affecting the entire server, whereas the preview feature switch is a database-scoped configuration, offering more granular control and safer experimentation without impacting other workloads. Learn more about the preview features in the frequently asked questions.
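To make the contrast concrete, here is a minimal T-SQL sketch. The trace flag number is just a generic diagnostic example, and the configuration name follows the preview documentation; confirm the exact syntax against your build:

```sql
-- Traditional trace flag: instance-scoped, affects every database
-- on the server; typically used for diagnostics by DBAs.
DBCC TRACEON (460, -1);

-- SQL Server 2025 preview switch: database-scoped, opts only the
-- current database into preview features.
ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON;

-- Check the current setting for this database.
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = 'PREVIEW_FEATURES';
```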

New feature highlights

As SQL Server adoption on Linux continues to grow, we’re excited to introduce preview support for Ubuntu 24.04, one of the most widely used and trusted Linux distributions. This marks a significant step forward in our commitment to cross-platform flexibility and developer choice. By embracing the latest Ubuntu release, SQL Server 2025 ensures developers and IT teams can build and run modern, cloud-connected applications on a familiar and up-to-date Linux environment. 

PolyBase plays a critical role in enabling analytics scenarios by allowing SQL Server to query external data sources like Microsoft Azure Data Lake or Azure Blob Storage using familiar T-SQL. As many of SQL Server’s modern analytics capabilities are deeply integrated with Microsoft Azure services, secure and seamless access to cloud storage is essential. With preview support for Managed Identity authentication to Azure Storage, SQL Server 2025 takes a step forward in simplifying security and access management. This enhancement aligns with SQL Server’s decade-long track record as the most secure database and reinforces our commitment to enterprise-grade security. By eliminating the need for storing secrets or keys, Managed Identity makes it easier and safer for customers to build scalable, cloud-connected analytics solutions using PolyBase. 
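As an illustrative sketch of what this simplifies, a managed identity credential replaces any stored storage key or SAS token. The storage URL and object names below are placeholders, the exact preview syntax may differ, and a database master key is assumed to exist:

```sql
-- No secret to store or rotate: SQL Server authenticates to
-- Azure Storage as its managed identity.
CREATE DATABASE SCOPED CREDENTIAL msi_cred
WITH IDENTITY = 'Managed Identity';

-- External data source over an Azure Blob Storage container.
CREATE EXTERNAL DATA SOURCE sales_lake
WITH (
    LOCATION   = 'abs://sales@mystorageacct.blob.core.windows.net/',
    CREDENTIAL = msi_cred
);

-- Query Parquet files in the lake with familiar T-SQL.
SELECT TOP (10) *
FROM OPENROWSET(
    BULK '/archive/2025/*.parquet',
    DATA_SOURCE = 'sales_lake',
    FORMAT = 'PARQUET'
) AS s;
```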

Mirroring in Microsoft Fabric is a game-changing capability that unlocks seamless, near real-time analytics on operational data from SQL Server 2025. To help customers manage compute resources efficiently during the mirroring process, SQL Server now supports creating a dedicated Resource Governor (RG) pool. Each phase of mirroring—such as ingestion, transformation, and synchronization—can be assigned to a specific workload group, giving administrators fine-grained control over resource allocation. These workload groups can be placed in the same or different pools depending on capacity planning needs.  
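In plain Resource Governor terms, that setup might look like the following sketch. The pool and group names are placeholders, and the step that binds mirroring phases to a workload group is configured separately; see the mirroring documentation:

```sql
-- Dedicated pool capping the resources mirroring can consume.
CREATE RESOURCE POOL mirroring_pool
WITH (MAX_CPU_PERCENT = 25, MAX_MEMORY_PERCENT = 25);

-- Workload group for one phase of mirroring, placed in that pool.
CREATE WORKLOAD GROUP mirroring_ingest
USING mirroring_pool;

-- Apply the new Resource Governor configuration.
ALTER RESOURCE GOVERNOR RECONFIGURE;
```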

Discover more 


SQL Server 2025

An AI-ready enterprise database with best-in-class security, performance, and availability.

Announcing the retirement of SQL Server Stretch Database http://approjects.co.za/?big=en-us/sql-server/blog/2024/07/03/announcing-the-retirement-of-sql-server-stretch-database/ Wed, 03 Jul 2024 16:00:00 +0000 In July 2024, SQL Server Stretch Database will be discontinued for SQL Server 2022, 2019, 2017, and 2016.

The post Announcing the retirement of SQL Server Stretch Database appeared first on Microsoft SQL Server Blog.

Ever since Microsoft introduced SQL Server Stretch Database in 2016, our guiding principles for such hybrid data storage solutions have been affordability, security, and native Azure integration. Customers have told us they want to reduce maintenance and storage costs for on-premises data, with options to scale up or down as needed; greater peace of mind from advanced security features such as Always Encrypted and row-level security; and the ability to unlock value from warm and cold data stretched to the cloud using Microsoft Azure analytics services.

In recent years, Azure has undergone significant evolution, marked by innovations like Microsoft Fabric and Azure Data Lake Storage. As we continue this journey, we must keep evolving our approach to hybrid data storage so that SQL Server customers can take full advantage of the best Azure has to offer.

Retirement of SQL Server Stretch Database 

On November 16, 2022, the SQL Server Stretch Database feature was deprecated in SQL Server 2022. For in-market versions of SQL Server 2019 and 2017, we added an improvement that allowed the Stretch Database feature to stretch a table to an Azure SQL Database. Effective July 9, 2024, the supporting Azure service, known as SQL Server Stretch Database edition, is retired. Impacted versions of SQL Server include SQL Server 2022, 2019, 2017, and 2016.

We understand that retiring an Azure service may impact your current workload and use of Stretch Database. Therefore, we kindly request that you either migrate to Azure or bring your data back from Azure to your on-premises version of SQL Server. Additionally, if you’re exploring alternatives for archiving data to cold and warm storage in the cloud, we’ve introduced significant new capabilities in SQL Server 2022, leveraging its data virtualization suite.

The path forward 

SQL Server 2022 supports CREATE EXTERNAL TABLE AS SELECT (CETaS), which helps customers archive cold data to Azure Storage. The data is stored in Parquet, an open-source file format that handles complex data in large volumes well. With its efficient compression, Parquet is one of the most cost-effective data storage options. Using OneLake shortcuts, customers can then use Microsoft Fabric to run cloud-scale analytics on the archived data.

Our priority is to empower our SQL Server customers with the tools and services that leverage the latest and greatest from Azure. If you need assistance in exploring how Microsoft can best empower your hybrid data archiving needs, please contact us.

New solution FAQs

What’s CETaS? 

Creates an external table and then exports, in parallel, the results of a Transact-SQL SELECT statement. 

  • Azure Synapse Analytics and Analytics Platform System support Hadoop or Azure Blob Storage.
  • SQL Server 2022 (16.x) and later versions support CETaS to create an external table and then export, in parallel, the result of a Transact-SQL SELECT statement to Azure Data Lake Storage Gen2, Azure Storage Account v2, and S3-compatible object storage. 
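Putting the pieces together, archiving cold data with CETaS on SQL Server 2022 might look like the following sketch. All object names and the storage URL are placeholders, and PolyBase must be installed with the `allow polybase export` configuration enabled:

```sql
-- Target: an Azure Data Lake Storage Gen2 container (placeholder URL).
CREATE EXTERNAL DATA SOURCE archive_lake
WITH (LOCATION = 'adls://archive@mystorageacct.dfs.core.windows.net/');

-- Parquet file format for the exported data.
CREATE EXTERNAL FILE FORMAT parquet_ff
WITH (FORMAT_TYPE = PARQUET);

-- CETaS: creates the external table and exports the SELECT results
-- to compressed Parquet files in parallel.
CREATE EXTERNAL TABLE dbo.orders_archive
WITH (
    LOCATION    = '/orders/pre2021/',
    DATA_SOURCE = archive_lake,
    FILE_FORMAT = parquet_ff
)
AS
SELECT *
FROM dbo.orders
WHERE order_date < '2021-01-01';
```

Once exported, a OneLake shortcut pointing at the same container lets Fabric engines query the archived Parquet directly.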

What is Fabric? 

Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution. It encompasses data movement, processing, ingestion, transformation, real-time event routing, and report building. Fabric offers a comprehensive suite of services including Data Engineering, Data Factory, Data Science, Real-Time Analytics, Data Warehouse, and Databases.

With Fabric, you don’t need to assemble different services from multiple vendors. Instead, it offers a seamlessly integrated, user-friendly platform that simplifies your analytics requirements. Operating on a software as a service (SaaS) model, Fabric brings simplicity and integration to your solutions. 

Fabric integrates separate components into a cohesive stack. Instead of relying on different databases or data warehouses, you can centralize data storage with Microsoft OneLake. AI capabilities are seamlessly embedded within Fabric, eliminating the need for manual integration. With Fabric, you can easily transition your raw data into actionable insights for business users. 

What are OneLake shortcuts?

Shortcuts in OneLake allow you to unify your data across domains, clouds, and accounts by creating a single virtual data lake for your entire enterprise. All Fabric experiences and analytical engines can directly connect to your existing data sources such as Azure, Amazon Web Services (AWS), and OneLake through a unified namespace. OneLake manages all permissions and credentials, so you don’t need to separately configure each Fabric workload to connect to each data source. Additionally, you can use shortcuts to eliminate edge copies of data and reduce process latency associated with data copies and staging. 

Shortcuts are objects in OneLake that point to other storage locations. The location can be internal or external to OneLake. The location that a shortcut points to is known as the target path of the shortcut. The location where the shortcut appears is known as the shortcut path. Shortcuts appear as folders in OneLake and any workload or service that has access to OneLake can use them. Shortcuts behave like symbolic links. They’re an independent object from the target. If you delete a shortcut, the target remains unaffected. If you move, rename, or delete a target path, the shortcut can break. 
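To make the symbolic-link behavior concrete, here is a toy Python model of shortcut semantics. This is purely illustrative; the `OneLakeModel` class and its methods are hypothetical names, not the OneLake API.

```python
class OneLakeModel:
    """Toy model of OneLake shortcut semantics (illustrative only)."""

    def __init__(self):
        self.targets = {}    # target path -> data
        self.shortcuts = {}  # shortcut path -> target path

    def write_target(self, path, data):
        self.targets[path] = data

    def create_shortcut(self, shortcut_path, target_path):
        # A shortcut is an independent object that points at a target path.
        self.shortcuts[shortcut_path] = target_path

    def read(self, path):
        # Reading through a shortcut resolves to the target's data.
        if path in self.shortcuts:
            target = self.shortcuts[path]
            if target not in self.targets:
                raise FileNotFoundError(f"broken shortcut: {path} -> {target}")
            return self.targets[target]
        return self.targets[path]

    def delete_shortcut(self, path):
        # Deleting a shortcut never touches the target.
        del self.shortcuts[path]

    def delete_target(self, path):
        # Deleting a target silently breaks any shortcut pointing at it.
        del self.targets[path]
```

The two delete methods capture the key asymmetry: removing the shortcut leaves the target untouched, while removing the target leaves a dangling shortcut that fails on the next read.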

Learn more 


Microsoft Fabric

Bring your data into the era of AI

The post Announcing the retirement of SQL Server Stretch Database appeared first on Microsoft SQL Server Blog.

]]>
Serving AI with data: A summary of Build 2017 data innovations http://approjects.co.za/?big=en-us/sql-server/blog/2017/05/10/serving-ai-with-data-a-summary-of-build-2017-data-innovations/ Wed, 10 May 2017 19:00:49 +0000 This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group This week at the annual Microsoft Build conference, we are discussing how, more than ever, organizations are relying on developers to create breakthrough experiences. With big data, cloud and AI converging, innovation & disruption is accelerating to a pace never seen before.

The post Serving AI with data: A summary of Build 2017 data innovations appeared first on Microsoft SQL Server Blog.

]]>
This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group

This week at the annual Microsoft Build conference, we are discussing how, more than ever, organizations are relying on developers to create breakthrough experiences. With big data, cloud, and AI converging, innovation and disruption are accelerating at a pace never seen before. Data is the key strategic asset at the heart of this convergence. When combined with the limitless computing power of the cloud and new capabilities like machine learning and AI, it enables developers to build the next generation of intelligent applications. As a developer, you are looking for faster, easier ways to embrace these converging technologies and transform your app experiences.

Today at Build, we made several product announcements, adding to the recent momentum announced last month at Microsoft Data Amp, that will help empower every organization on the planet with data-driven intelligence. Across these innovations, we are pursuing three key themes:

  1. Infusing AI within our data platform
  2. Turnkey global distribution to push intelligence wherever your users are
  3. Choice of database platforms and tools for developers

Infusing AI within our data platform

A thread of innovation you will see in our products is the deep integration of AI with data. In the past, a common application pattern was to create machine learning models outside the database in the application layer or in specialty statistical tools, and deploy these models in custom-built production systems. This results in a lot of developer heavy lifting, and the development and deployment lifecycle can take months. Our approach dramatically simplifies the deployment of AI by bringing intelligence into existing well-engineered data platforms through a new extensibility model for databases.

SQL Server 2017

We started this journey by introducing R support within the SQL Server 2016 release, and we are deepening this commitment with the upcoming release of SQL Server 2017. In this release, we have introduced support for a rich library of machine learning functions and introduced Python support to give you more choices across popular languages. SQL Server can also leverage GPU-accelerated computing through the Python/R interface to power even the most intensive deep learning jobs on images, text, and other unstructured data. Developers can implement GPU-accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput.

Additionally, as data becomes more complex and the relationships across data are many-to-many, developers are looking for easier ways to ingest and manage this data. With SQL Server 2017, we have introduced Graph support to deliver the best of both relational and graph databases in a single product, including the ability to query across all data using a single platform.
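As a rough sketch of the node-and-edge pattern behind graph queries (not SQL Server's actual node/edge tables or its T-SQL MATCH syntax), consider this toy Python model of a many-to-many relationship:

```python
# Toy model of the node/edge pattern behind graph queries.
# Illustrative only; node names and edge data are hypothetical.
persons = {1: "Alice", 2: "Bob", 3: "Carol"}   # a "node table"
products = {10: "Bike", 20: "Helmet"}          # another "node table"
bought = [(1, 10), (1, 20), (2, 20), (3, 10)]  # an "edge table": person -> product


def buyers_of(product_id):
    """One-hop query across the many-to-many edge: who bought this product?"""
    if product_id not in products:
        raise KeyError(product_id)
    return sorted(persons[p] for (p, prod) in bought if prod == product_id)


def also_bought_with(person_id):
    """Two-hop traversal: who else bought any product this person bought?"""
    mine = {prod for (p, prod) in bought if p == person_id}
    return sorted({persons[p] for (p, prod) in bought
                   if prod in mine and p != person_id})
```

The two-hop query is where the graph model pays off: the same traversal expressed with joins alone gets progressively harder to write and optimize as relationships deepen.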

We have made it easy for you to try SQL Server with R, Python, and Graph support today whether you are working with C#, Java, Node, PHP, or Ruby.

Azure SQL Database

We’re continuing to simultaneously ship SQL Server 2017 enhancements to Azure SQL Database, so you get a consistent programming surface area across on-premises and cloud. Today, I am excited to announce that support for Graph is also coming to Azure SQL Database, so you can get the best of both relational and graph in a single proven service on Azure.

SQL Database is built for developer productivity with most database management tasks built-in. We have also built AI directly into the service itself, making it an intelligent database service. The service runs millions of customer databases, learns, and then adapts to offer customized experiences for each database. With Database Advisor, you can choose to let the service learn your unique patterns and make performance and tuning recommendations or automatically take action on your behalf. Today, I am also excited to announce the general availability of Threat Detection, which uses machine learning around the clock to learn, profile, and detect anomalous activity in your database, sending alerts within minutes so you can take immediate action instead of the days, months, or even years such intrusions have historically taken to discover.
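To give a feel for the kind of profiling involved, here is a deliberately simplified anomaly check in Python. It is not Microsoft's detection model, just a minimal baseline-and-deviation sketch:

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag an access count far outside a learned baseline.

    A minimal stand-in for the kind of profiling a threat-detection
    service performs: learn normal behavior from history, then alert
    on large deviations (here, a simple z-score test).
    """
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold
```

With hourly login counts of roughly 95 to 110, an hour with 1,000 logins sits far beyond three standard deviations and would be flagged, while ordinary fluctuation would not.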

Also, we are making it even easier for you to move more of your existing SQL Server apps as-is to Azure SQL Database. Today we announced the private preview for a new deployment option within the service, Managed Instance: you get all the managed benefits of SQL Database, now at the instance level, with support for SQL Server Agent, three-part names, Database Mail, change data capture (CDC), and other instance-level capabilities.

To streamline this migration effort, we also introduced a preview for Azure Database Migration Service that will dramatically accelerate the migration of on-premises third-party and SQL Server databases into Azure SQL Database.

Eric Fleischman, Vice President & Chief Architect from DocuSign notes, “Our transaction volume doubles every year. We wanted the best of what we do in our datacenter…with the best of what Azure could bring to it. For us, we found that Azure SQL Database was the best way to do it. We deploy our SQL Server schema elements into a Managed Instance, and we point the application via connection string change directly over to the Managed Instance. We basically picked up our existing build infrastructure and we’re able to deploy to Azure within a few seconds. It allows us to scale the business very quickly with minimal effort.”

Learn more about our investments in Azure SQL Database in this deeper blog.

Turnkey global distribution to push intelligence wherever your users are

With the intersection of mobile apps, internet of things, cloud and AI, users and data can come from anywhere around the globe. To deliver transformative intelligent apps that support the global nature of modern applications, and the volume, velocity, variety of data, you need more than a relational database, and more than a simple NoSQL database. You need a flexible database that can ingest massive volumes of data and data types, and navigate the challenges of space and time to ensure millisecond performance to any user anywhere on earth. And you want this with simplicity and support for the languages and technologies you know.

I’m also excited to share that today, Microsoft announced Azure Cosmos DB, the industry’s first globally distributed, multi-model database service. Azure Cosmos DB was built from the ground up with global distribution and horizontal scale at its core; it offers turnkey global distribution across any number of Azure regions by transparently scaling and distributing your data wherever your users are, worldwide. Azure Cosmos DB leverages the work of Turing Award winner Dr. Leslie Lamport: the Paxos algorithm for distributed systems and TLA+, a high-level modeling language. Check out a new interview with Dr. Lamport on Azure Cosmos DB.

Azure Cosmos DB started as “Project Florence” in 2010 to address the pain points that developers of large-scale applications inside Microsoft faced. Observing that the challenges of building globally distributed apps are not unique to Microsoft, in 2015 we made the first generation of this technology available to Azure developers in the form of Azure DocumentDB. Since that time, we’ve added new features and introduced significant new capabilities. Azure Cosmos DB is the result. It is the next big leap in globally distributed, at-scale cloud databases.

Now, with more innovation and value, Azure Cosmos DB delivers a schema-agnostic database service with turnkey global distribution, support for multiple models across popular NoSQL technologies, elastic scale of throughput and storage, five well-defined consistency models, and financially-backed SLAs across uptime, throughput, consistency, and millisecond latency.

“Domino’s Pizza chose Azure to rebuild their ordering system and a key component in this design is Azure Cosmos DB—delivering the capability to regionally distribute data, to scale easily, and support peak periods which are critical to the business. Their online solution is deployed across multiple regions around the world—even with the global scaling they can also rely on Azure Cosmos DB millisecond load latency and fail over to a completely different country if required.”

Learn more about Azure Cosmos DB in this deeper blog.

Choice of database platforms and tools for developers

We understand that SQL Server isn’t the only database technology developers want to build with. Therefore, I’m excited to share that today we also announced two new relational database services; Azure Database for MySQL and Azure Database for PostgreSQL to join our database services offerings.

These new services are built on the proven database services platform that has been powering Azure SQL Database, and they offer high availability, data protection and recovery, and scale with minimal downtime—all built in at no extra cost or configuration. Starting today, you can develop on MySQL and PostgreSQL database services on Azure. Microsoft manages the MySQL and PostgreSQL technology you know, love, and expect, backed by an enterprise-grade, highly available, and fault-tolerant cloud services platform that lets you focus on developing great apps rather than on management and maintenance.

Each month, up to 2 million people turn to the GeekWire website for the latest news on tech innovation. Now, GeekWire is making news itself by migrating its popular WordPress site to the Microsoft Azure platform. Kevin Lisota, Web Developer at GeekWire, notes, “The biggest benefit of Azure Database for MySQL will be to have Microsoft manage and back up that resource for us so that we can focus on other aspects of the site. Plus, we will be able to scale up temporarily as traffic surges and then bring it back down when it is not needed. That’s a big deal for us.”

Learn more about these new services and try them today.

Azure Data Lake Tools for Visual Studio Code (VSCode)

Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape, and speed, and do all types of processing and analytics across platforms and languages. Additionally, Azure Data Lake includes a set of cognitive capabilities built-in, making it seamless to execute AI over petabytes of data. On our journey to make it easier for every developer to become an AI and data science developer, we are investing in bringing more great tooling for data into the tools you know and love.

Today, I’m excited to announce the general availability of Azure Data Lake Tools for Visual Studio Code (VSCode), which gives developers a lightweight but powerful code editor for big data analytics. The new Azure Data Lake Tools for VSCode supports U-SQL language authoring, scripting, and extensibility with C# to process different types of data and efficiently scale to any size of data. The new tooling integrates with Azure Data Lake Analytics for U-SQL job submissions, with job output to Azure Data Lake Analytics or Azure Blob Storage. In addition, a U-SQL local run service has been added to allow developers to locally validate scripts and test data. Learn more and download these tools today.

Getting started

It has never been easier to get started with the latest advances in the intelligent data platform. We invite you to watch our Microsoft Build 2017 online event for streaming and recorded coverage of these innovations, including SQL Server 2017 on Windows, Linux and Docker; scalable data transformation and intelligence from Azure Cosmos DB, Azure Data Lake Store and Azure Data Lake Analytics; the Azure SQL Database approach to proactive Threat Detection and intelligent database tuning; new Azure Database for MySQL and Azure Database for PostgreSQL. I look forward to a great week at Build and your participation in this exciting journey of infusing AI into every software application.

The post Serving AI with data: A summary of Build 2017 data innovations appeared first on Microsoft SQL Server Blog.

]]>
Delivering AI with data: the next generation of the Microsoft data platform http://approjects.co.za/?big=en-us/sql-server/blog/2017/04/19/delivering-ai-with-data-the-next-generation-of-microsofts-data-platform/ http://approjects.co.za/?big=en-us/sql-server/blog/2017/04/19/delivering-ai-with-data-the-next-generation-of-microsofts-data-platform/#comments Wed, 19 Apr 2017 15:10:00 +0000 This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group Leveraging intelligence out of the ever-increasing amounts of data can make the difference between being the next market disruptor or being relegated to the pages of history.

The post Delivering AI with data: the next generation of the Microsoft data platform appeared first on Microsoft SQL Server Blog.

]]>
This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group

Leveraging intelligence out of the ever-increasing amounts of data can make the difference between being the next market disruptor or being relegated to the pages of history. Today at the Microsoft Data Amp online event, we will make several product announcements that can help empower every organization on the planet with data-driven intelligence. We are delivering a comprehensive data platform for developers and businesses to create the next generation of intelligent applications that drive new efficiencies, help create better products, and improve customer experiences.

I encourage you to attend the live broadcast of the Data Amp event, starting at 8 AM Pacific, where Scott Guthrie, executive VP of Cloud and Enterprise, and I will describe product innovations that integrate data and artificial intelligence (AI) to transform your applications and your business. You can stream the keynotes and access additional on-demand technical content to learn more about the announcements of the day.

Today, you’ll see three key innovation themes in our product announcements. The first is the close integration of AI functions into databases, data lakes, and the cloud to simplify the deployment of intelligent applications. The second is the use of AI within our services to enhance performance and data security. The third is flexibility—the flexibility for developers to compose multiple cloud services into various design patterns for AI, and the flexibility to leverage Windows, Linux, Python, R, Spark, Hadoop, and other open source tools in building such systems.

Hosting AI where the data lives

A novel thread of innovation you’ll see in our products is the deep integration of AI with data. In the past, a common application pattern was to create statistical and analytical models outside the database in the application layer or in specialty statistical tools, and deploy these models in custom-built production systems. That results in a lot of developer heavy lifting, and the development and deployment lifecycle can take months. Our approach dramatically simplifies the deployment of AI by bringing intelligence into existing well-engineered data platforms through a new computing model: GPU deep learning. We have taken that approach with the upcoming release of SQL Server, and deeply integrated deep learning and machine learning capabilities to support the next generation of enterprise-grade AI applications.

So today it’s my pleasure to announce the first RDBMS with built-in AI: a production-quality Community Technology Preview (CTP 2.0) of SQL Server 2017. In this preview release, we are introducing in-database support for a rich library of machine learning functions, and now for the first time Python support (in addition to R). SQL Server can also leverage NVIDIA GPU-accelerated computing through the Python/R interface to power even the most intensive deep-learning jobs on images, text, and other unstructured data. Developers can implement NVIDIA GPU-accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput. In addition, developers can use all the rich features of the database management system for concurrency, high-availability, encryption, security, and compliance to build and deploy robust enterprise-grade AI applications.


We have also released Microsoft R Server 9.1, which takes the concept of bringing intelligence to where your data lives to Hadoop and Spark, as well as SQL Server. In addition to several advanced machine learning algorithms from Microsoft, R Server 9.1 introduces pretrained neural network models for sentiment analysis and image featurization, supports sparklyr, SparkETL, and SparkSQL, and adds GPU support for deep neural networks. We are also making model management easier with many enhancements to production deployment and operationalization. R Tools for Visual Studio provides a state-of-the-art IDE for developers to work with Microsoft R Server. An Azure Microsoft R Server VM image is also available, enabling developers to rapidly provision the server on the cloud.


In the cloud, Microsoft Cognitive Services enable you to infuse your apps with cognitive intelligence. Today I am excited to announce that the Face API, Computer Vision API, and Content Moderator are now generally available in the Azure Portal. Here are some of the different types of intelligence that cognitive services can bring to your application:

  • Face API helps detect and compare human faces, organize faces into groups according to visual similarity, and identify previously tagged people in images.
  • Computer Vision API gives you the tools to understand the contents of any image: it creates tags that identify objects, beings like celebrities, or actions in an image, and crafts coherent sentences to describe it. You can now detect landmarks and handwriting in images. Handwriting detection remains in preview.
  • Content Moderator provides machine-assisted moderation of text and images, augmented with human review tools.

Azure Data Lake Analytics (ADLA) is a breakthrough serverless analytics job service where you can easily develop and run massively parallel petabyte-scale data transformation programs that compose U-SQL, R, Python, and .NET. With no infrastructure to manage, you can process data on demand, scale instantly, and pay per job only. Furthermore, we’ve incorporated the technology that sits behind the Cognitive Services inside U-SQL directly as functions. Now you can process massive unstructured data, such as text and images, extract sentiment, age, and other cognitive features using Azure Data Lake, and query/analyze these by content. This enables what I call “Big Cognition”: it’s not just extracting one piece of cognitive information at a time, and not just about understanding an emotion or whether there’s an object in an individual image, but rather about integrating all the extracted cognitive data with other types of data, so you can perform powerful joins, analytics, and integrated AI.
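The “Big Cognition” pattern, joining cognitive extractions with other data and then analyzing the result, can be sketched in a few lines of Python. The sample data and the `owners_by_tag` helper below are hypothetical, standing in for the output of a real cognitive extraction pass:

```python
# Hypothetical outputs of a cognitive extraction pass over images,
# joined with ordinary relational data -- the "Big Cognition" pattern.
image_tags = [
    {"image": "img1.jpg", "tag": "dog", "confidence": 0.97},
    {"image": "img2.jpg", "tag": "beach", "confidence": 0.91},
    {"image": "img3.jpg", "tag": "dog", "confidence": 0.88},
]
image_owners = {"img1.jpg": "alice", "img2.jpg": "bob", "img3.jpg": "alice"}


def owners_by_tag(tag, min_confidence=0.8):
    """Join extracted cognitive features with relational data by key."""
    return sorted({image_owners[row["image"]]
                   for row in image_tags
                   if row["tag"] == tag and row["confidence"] >= min_confidence})
```

The point is the join itself: once the extracted features live alongside the rest of your data, tag-based questions become ordinary queries rather than per-image API calls.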

Azure Data Lake Store (ADLS) is a no-limit cloud HDFS storage system that works with ADLA and other big data services for petabyte-scale data. We are announcing the general availability of Azure Data Lake Analytics and Azure Data Lake Store in the Azure North Europe region.

Yet another powerful integration of data and AI is the seamless integration of DocumentDB with Spark to enable machine learning and advanced analytics on top of globally distributed data. To recap, DocumentDB is a unique, globally distributed, limitless NoSQL database service in Azure designed for mission-critical applications. Designed as such from the ground up, it allows customers to distribute their data across any number of Azure regions worldwide, guarantees low read and write latencies, and offers comprehensive SLAs for data-loss, latency, availability, consistency, and throughput. You can use it as either your primary operational database or as an automatically indexed, virtually infinite data lake. The Spark connector understands the physical structure of DocumentDB store (indexing and partitioning) and enables computation pushdown for efficient processing. This service can significantly simplify the process of building distributed and intelligent applications at global scale.


I’m also excited to announce the general availability of Azure Analysis Services. Built on the proven business intelligence (BI) engine in Microsoft SQL Server Analysis Services, it delivers enterprise-grade BI semantic modeling capabilities with the scale, flexibility, and management benefits of the cloud. Azure Analysis Services helps you integrate data from a variety of sources—for example, Azure Data Lake, Azure SQL DW, and a variety of databases on-premises and in the cloud—and transform them into actionable insights. It speeds time to delivery of your BI projects by removing the barrier of procuring and managing infrastructure. And by leveraging the BI skills, tools, and data your team has today, you can get more from the investments you’ve already made.

Stepping up performance and security

Performance and security are central to databases. SQL Server continues to lead in database performance benchmarks, and in every release we make significant improvements. SQL Server 2016 on Windows Server 2016 holds a number of records on the Transaction Processing Performance Council (TPC) benchmarks for operational and analytical workload performance, and SQL Server 2017 does even better. I’m also proud to announce that the upcoming version of SQL Server will run just as fast on Linux as on Windows, as you’ll see in the newly published 1 TB TPC-H benchmark world record for nonclustered data warehouse performance, achieved with SQL Server 2017 on Red Hat Enterprise Linux and HPE ProLiant hardware.

SQL Server 2017 will also bring breakthrough performance, scale, and security features to data warehousing. With up to 100x faster analytical queries using in-memory Columnstores, PolyBase for single T-SQL querying across relational and Hadoop systems, capability to scale to hundreds of terabytes of data, modern reporting, plus mobile BI and more, it provides a powerful integrated data platform for all your enterprise analytics needs.

In the cloud, Azure SQL Database is bringing intelligence to securing your data and increasing database performance. Threat Detection in Azure SQL Database works around the clock, using machine learning to detect anomalous database activities indicating unusual and potentially harmful attempts to access or exploit databases. Simply turning on Threat Detection helps customers make databases resilient to the possibility of intrusion. Other features of Azure SQL Database such as auto-performance tuning automatically implement, tune, and validate performance to guarantee the most optimal query performance. Together, our intelligent database management features help make your database more secure and faster automatically, freeing up scarce DBA capacity for more strategic work.

Simple, flexible multiservice AI solutions in the cloud

We are very committed to simplifying the development of AI systems. Cortana Intelligence is a collection of fully managed big data and analytics services that can be composed together to build sophisticated enterprise-grade AI and analytics applications on Azure. Today we are announcing Cortana Intelligence solution templates that make it easy to compose services and implement common design patterns. These solution templates have been built on best practice designs motivated by real-world customer implementations done by our engineering team, and include Personalized Offers (for example, for retail applications), Quality Assurance (for example, for manufacturing applications), and Demand Forecasting. These templates accelerate your time to value for an intelligent solution, allowing you to deploy a complex architecture within minutes, instead of days. The templates are flexible and scalable by design. You can customize them for your specific needs, and they’re backed by a rich partner ecosystem trained on the architecture and data models. Get started today by going to the Azure gallery for Cortana Intelligence solutions.


Also, AppSource is a single destination to discover and seamlessly try business apps built by partners and verified by Microsoft. Partners like KenSci have already begun to showcase their intelligent solutions targeting business decision-makers in AppSource. Partners can now submit Cortana Intelligence apps through the AppSource “List an app” page.

Cross-platform and open source flexibility

Whether on-premises or in the cloud, cross-platform compatibility is increasingly important in our customers’ diverse and rapidly changing data estates. SQL Server 2017 will be the first version of SQL Server compatible with Windows, Linux, and Linux-based container images for Docker. In addition to running on Windows Server, the new version will also run on Red Hat Enterprise Linux, SUSE Enterprise Linux Server, and Ubuntu. It can also run inside Docker containers on Linux or Mac, which can help your developers spend more time developing and less on DevOps.

Getting started

It has never been easier to get started with the latest advances in the intelligent data platform. We invite you to join us to learn more about SQL Server 2017 on Windows, Linux, and in Linux-based container images for Docker; Cognitive Services for smart, flexible APIs for AI; scalable data transformation and intelligence from Azure Data Lake Store and Azure Data Lake Analytics; the Azure SQL Database approach to proactive threat detection and intelligent database tuning; new solution templates from Cortana Intelligence; and precalibrated models for Linux, Hadoop, Spark, and Teradata in R Server 9.1.

Join our Data Amp event to learn more! You can go now to the Microsoft Data Amp online event for live coverage starting at 8 AM Pacific on April 19. You’ll also be able to stream the keynotes and watch additional on-demand technical content after the event ends. I look forward to your participation in this exciting journey of infusing intelligence and AI into every software application.

The post Delivering AI with data: the next generation of the Microsoft data platform appeared first on Microsoft SQL Server Blog.

]]>
http://approjects.co.za/?big=en-us/sql-server/blog/2017/04/19/delivering-ai-with-data-the-next-generation-of-microsofts-data-platform/feed/ 15
Five ways Microsoft helps you do amazing things with data in the cloud http://approjects.co.za/?big=en-us/sql-server/blog/2017/04/12/five-ways-microsoft-helps-you-do-amazing-things-with-data-in-the-cloud/ http://approjects.co.za/?big=en-us/sql-server/blog/2017/04/12/five-ways-microsoft-helps-you-do-amazing-things-with-data-in-the-cloud/#comments Wed, 12 Apr 2017 19:00:00 +0000 Microsoft can help you do amazing things with your data in the cloud! Here are five examples to help you get started. If you’d like more information about using the cloud to get the most from your data, please join us for the upcoming Microsoft Data Amp event on April 19 at 8 AM Pacific.

The post Five ways Microsoft helps you do amazing things with data in the cloud appeared first on Microsoft SQL Server Blog.

]]>

Microsoft can help you do amazing things with your data in the cloud! Here are five examples to help you get started. If you’d like more information about using the cloud to get the most from your data, please join us for the upcoming Microsoft Data Amp event on April 19 at 8 AM Pacific. The online event will showcase how data is the nexus between application innovation and artificial intelligence—how data and analytics powered by the most trusted and intelligent cloud can help companies differentiate and out-innovate their competition.

1: Build data-driven apps that learn and adapt

Applications show intelligence when they can spot trends, react to events, predict outcomes, or recommend choices, often leading to richer customer experiences, improved business processes, and issues addressed before they arise. The three key ingredients to creating an intelligent app are:

  1. Ingest data in real time
  2. Query across historical and real-time data
  3. Analyze patterns and make predictions with machine learning

With Azure, you can make your applications intelligent by establishing feedback loops, and applying big data and machine learning techniques to classify, predict, or otherwise analyze explicit and implicit signals. Today, apps for consumers and enterprises can deliver greater customer or business benefit by learning from user behavior and other signals.
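The three ingredients above can be sketched together in a toy Python class. This is illustrative only; a production app would use real streaming ingestion and a trained model in place of the moving average:

```python
from collections import deque


class IntelligentAppSketch:
    """Toy feedback loop: ingest events, query history, predict the next value.

    Illustrative only -- real systems would pair streaming ingestion with a
    trained machine learning model rather than a moving average.
    """

    def __init__(self, window=3):
        self.history = []                    # all ingested events
        self.recent = deque(maxlen=window)   # the real-time window

    def ingest(self, value):
        # Ingredient 1: ingest data in real time.
        self.history.append(value)
        self.recent.append(value)

    def query(self):
        # Ingredient 2: query across historical and real-time data.
        return {"total_events": len(self.history), "recent": list(self.recent)}

    def predict_next(self):
        # Ingredient 3: "machine learning" reduced to a moving average.
        return sum(self.recent) / len(self.recent)
```

Even in this reduced form, the shape of the loop is visible: every ingested event updates both the historical record and the live window that feeds the next prediction.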

Pier 1 Imports launched a mobile-friendly pier1.com, making shopping online easier. It enabled the selection of delivery options like direct shipment, picking up products in the local store, or a white-glove delivery option from any mobile device. “Although the Pier 1 Imports brand is the same as it has been for more than 50 years, we are continually getting better at identifying what our customer wants, using Microsoft Azure Machine Learning and resulting data insights,” says Sharon Leite, EVP of Sales and Customer Experience.

Get started with sample code

If you want to learn more about building an intelligent app, try the AdventureWorks Ski App. This sample application can be used to demonstrate the value of building intelligence into an existing application. Learn more by going to GitHub and watching the application being built here.

2: Run big cognition for human-like intelligence over petabyte scale

Microsoft’s Cognitive Services APIs allow developers to integrate vision, speech, language, knowledge and search APIs into your apps. To run these services over petabyte scale, we’ve integrated the capabilities directly into Azure Data Lake. You can join emotions from image content with any other type of data you have and do incredibly powerful analytics and intelligence over it. This is what we call “Big Cognition.” This goes beyond extracting one piece of cognitive information at a time, understanding an emotion or whether there’s an object in an image. Big Cognition joins all the extracted cognitive data with other types of data, so you can do some really powerful analytics with it.

On a global scale, Azure Data Lake is also used at the Fleet Operations Centers of Carnival Corp., the world’s largest leisure travel company, with more than 100 ships across 10 global cruise line brands. “We chose to partner with Microsoft to kick off a project of the Internet of Things, because it was strategic for us to rely on a platform that would allow us to collect, analyse, and display data from sensors in a simple, integrated and immediate way on our ships and make them available both to the officers on board and to our operations centre on the ground,” says Franco Caraffi, IT Marine Systems Director of Costa Cruises.

Get started with sample code

We demonstrated Big Cognition at Microsoft Ignite and PASS Summit with a demo that used U-SQL inside Azure Data Lake Analytics to process a million images and understand what’s inside them. You can watch the demo here, try it yourself using a sample project on GitHub, or discover more ways to get started with Azure Data Lake on GitHub.

3: Deliver <10ms latency to any customer, anywhere on the planet

In today’s globally connected world, developers and organizations alike have three simple requirements for their customer-facing applications: millisecond performance, global distribution, and application availability, all without hard tradeoffs. NoSQL can be a great technology for tackling these tough challenges, especially in the face of increasing data volume and variety.

Most NoSQL technologies force customers to make binary choices among global performance, availability, and transactional consistency. With Azure DocumentDB, Microsoft’s fully managed NoSQL database service, you get four tunable consistency levels that reduce the friction of these tradeoffs and unlock application patterns previously not possible, without ever giving up availability or the guaranteed <10ms latency. For example, session consistency offers an ideal blend of performance and consistency for multitenant applications: each tenant gets strong consistency within the scope of its own session, without trading off performance for other tenants.

Consider IoT workloads. Devices emit events at an extremely high rate, so a scale-out database is required to handle heavy write ingestion and persist the full fidelity of unaggregated event streams. The events from each generation of device look slightly different as new capabilities and sensors are added. DocumentDB can uniquely ingest a high volume of writes with varying schema thanks to automatic indexing, then serve the data back out through rich, low-latency queries, enabling applications to react with real-time anomaly detection.
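Session consistency can be pictured as “read your writes” enforced with a session token. The toy model below is plain Python, not the DocumentDB wire protocol: a write returns a token (a version number), and a read carrying that token is served only by a replica that has caught up to it.

```python
# Toy model of session ("read your writes") consistency. This is an
# illustration of the concept only, not how DocumentDB is implemented.

class Replica:
    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, version, key, value):
        self.version = version
        self.data[key] = value

class SessionStore:
    def __init__(self, n_replicas=3):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.version = 0

    def write(self, key, value):
        self.version += 1
        # Only the primary applies immediately; the others lag behind.
        self.replicas[0].apply(self.version, key, value)
        return self.version          # session token

    def read(self, key, session_token=0):
        # Serve from any replica that satisfies the session token.
        for r in self.replicas:
            if r.version >= session_token:
                return r.data.get(key)
        raise RuntimeError("no replica has caught up yet")

store = SessionStore()
token = store.write("cart", ["skis"])
# Same session: guaranteed to see its own write.
value = store.read("cart", session_token=token)
```

A session without the token (or a different session) could be served by a lagging replica, which is exactly the tradeoff the weaker consistency levels accept in exchange for latency.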

Citrix delivers solutions used by more than 400,000 organizations and more than 100 million individuals globally. The Citrix web portal was getting a lot of traffic, which was good news, but the company was running into challenges integrating web identity across its SaaS portals. It turned to Azure Service Fabric and DocumentDB to run its Citrix Identity Platform and meet its availability, durability, and performance requirements.

Get started with sample code

There are so many great code samples available on GitHub for DocumentDB that we aggregated our 10 favorite GitHub samples into a single blog post for you. Check out these samples across .NET, Node.js, and Python for an array of app scenarios and start playing with DocumentDB today.

4: Serve up a first-class search experience with just a few lines of code

Azure Search is a cloud search-as-a-service solution that delegates server and infrastructure management to Microsoft, leaving you with a ready-to-use service: with only a few lines of code, you can populate it with your data and add a first-class search experience to your web, mobile, or cognitive application. Azure Search exposes a simple REST API and a .NET SDK, so you can add a robust search experience to any application without managing search infrastructure or becoming an expert in search.
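As a sketch of how small that client surface is, the snippet below builds (but does not send) a query request in the shape of the Search REST API. The service name, index name, and API key are placeholders, and the api-version shown is an assumption; check the current REST reference for the version your service expects.

```python
# Sketch: construct an Azure Search query request. Nothing is sent;
# the function only assembles the URL, headers, and JSON body.
import json

def build_search_request(service, index, query, api_key, top=10,
                         api_version="2016-09-01"):
    url = (f"https://{service}.search.windows.net"
           f"/indexes/{index}/docs/search?api-version={api_version}")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    body = json.dumps({"search": query, "top": top})
    return url, headers, body

# Placeholder service/index names and key for illustration.
url, headers, body = build_search_request(
    "contoso-search", "products", "mountain bike", api_key="<key>")
```

From here, any HTTP client can POST the body to the URL with those headers and get ranked results back as JSON.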

autoTRADER.ca, Canada’s largest automobile search site, uses Azure Search to help dealers advertise products and manage inventory, determine the best pricing, and provide market data on which vehicles are in high demand. “We’re really excited about using Azure Search for marketplace. It gives us an opportunity to provide better and better services to our customers with instant, seamless experiences across all devices,” says Shane Sullivan, director of Software Engineering.

Get started with sample code

Try the First Response app code on GitHub, an online collaboration platform built to support first responders, which lets police, firefighters, and paramedics share critical data with each other in real time. This scenario, demo, and toolkit combine App Service, DocumentDB, and Search, with Xamarin for cross-device support, into a real-time mobile app.

5: Scale your business, protect your margins

For software builders looking to extend an existing packaged app to SaaS, or building a new business app as SaaS, the number one question we get is, “How do I run and grow my business in the cloud while ensuring operating costs don’t accidentally consume my margins?” When we dig into this app pattern with customers, the concern boils down to managing the costs of isolating and managing each customer’s data while ensuring every customer gets the best performance despite varying demands. This creates two challenges: first, managing and maintaining an isolated database for each customer requires more staff as you grow; second, over-provisioning resources so that spikes in demand don’t cause a poor experience means overspending on operating costs. We dove into this problem with customers and, as a result, introduced SQL Database Elastic Pools: a unique solution to help you manage thousands of databases as one while maintaining isolation and security, at dramatic cost savings.

SQL Database Elastic Pools are a simple, cost-effective solution for managing and scaling multiple databases that have varying and unpredictable usage demands. The databases in an elastic pool are on a single Azure SQL Database server and share a set number of resources at a set price.
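A back-of-the-envelope sketch shows why pooling protects margins. The per-tenant demand numbers below are made up for illustration (they are not Azure pricing or real DTU figures): databases that spike at different times can share a pool sized for the aggregate peak, rather than each being sized for its own peak.

```python
# Illustrative sizing arithmetic for pooled vs. standalone databases.
# Rows are tenant databases, columns are time slots; each tenant
# spikes in a different slot.
demand = [
    [50,  5,  5,  5],   # tenant A
    [ 5, 50,  5,  5],   # tenant B
    [ 5,  5, 50,  5],   # tenant C
    [ 5,  5,  5, 50],   # tenant D
]

# Standalone: each database must be provisioned for its own peak.
standalone_dtus = sum(max(row) for row in demand)

# Pooled: the pool must cover the highest combined demand at any moment.
pooled_edtus = max(sum(col) for col in zip(*demand))
```

Here four tenants each peak at 50 units but never at the same time, so a shared pool of 65 units replaces 200 units of standalone capacity, which is the intuition behind the SnelStart quote that follows.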

SnelStart makes popular financial- and business-management software for small and medium-sized businesses in the Netherlands. Its 55,000 customers are serviced by a staff of 110 employees, including an IT staff of 35. By moving from desktop software to a software-as-a-service (SaaS) offering built on Azure, SnelStart made the most of built-in services, automated management using the familiar C# environment, and optimized performance and scalability by neither over- nor under-provisioning businesses, thanks to elastic pools. “By using elastic pools, we can optimize performance based on the needs of our customers, without over-provisioning. If we had to provision based on peak load, it would be quite costly. Instead, the option to share resources between multiple, low-usage databases allows us to create a solution that performs well and is cost effective,” says Henry Been, solution architect.

Get started with sample code

We built this Contoso shopkeeper app to demonstrate just how easy it is to build a multitenant SaaS app using SQL Database Elastic Pools. You’ll see how easy it is to scale out to support a growing customer base, with no schema changes required in your app, and how to manage many databases as one.

Azure can help you do amazing things with data in the cloud. Organizations have used Azure to transform their businesses, providing compelling customer experiences while managing costs. Try one of these five amazing things you can do with Azure today! And to learn more, join us for the upcoming Microsoft Data Amp event on April 19 at 8 AM Pacific.

The post Five ways Microsoft helps you do amazing things with data in the cloud appeared first on Microsoft SQL Server Blog.

Announcing the Next Generation of Databases and Data Lakes from Microsoft http://approjects.co.za/?big=en-us/sql-server/blog/2016/11/16/announcing-the-next-generation-of-databases-and-data-lakes-from-microsoft/ Wed, 16 Nov 2016 15:35:58 +0000 This post was authored by Joseph Sirosh, Corporate Vice President of the Microsoft Data Group. For the past two years, we’ve unveiled several of our cutting-edge technologies and innovative solutions at Connect(); which will be livestreaming globally from New York City starting November 16.

The post Announcing the Next Generation of Databases and Data Lakes from Microsoft appeared first on Microsoft SQL Server Blog.

This post was authored by Joseph Sirosh, Corporate Vice President of the Microsoft Data Group.

Microsoft Connect() 2016

For the past two years, we’ve unveiled several of our cutting-edge technologies and innovative solutions at Connect();, which will be livestreamed globally from New York City starting November 16. This year, I am thrilled to announce the next generation of SQL Server and Azure Data Lake, along with several new capabilities to help developers build intelligent applications.

1. Next release of SQL Server with Support for Linux and Docker (Preview)

I am excited to announce the public preview of the next release of SQL Server, which brings the power of SQL Server to both Windows and, for the first time ever, Linux. You can now develop applications with SQL Server on Linux, Docker, or macOS (via Docker) and then deploy to Linux, Windows, or Docker, on-premises or in the cloud. This represents a major step in our journey to make SQL Server the platform of choice across operating systems, development languages, data types, on-premises and the cloud. All major features of the relational database engine, including advanced features such as in-memory OLTP, in-memory columnstore, Transparent Data Encryption, Always Encrypted, and Row-Level Security, now come to Linux. Getting started is easier than ever: you’ll find native Linux installations (more info here) with familiar RPM and APT packages for Red Hat Enterprise Linux, Ubuntu, and SUSE Linux Enterprise Server. The public preview on Windows and Linux is available on Azure Virtual Machines and as images on Docker Hub, offering quick and easy installation within minutes. The Windows download is available on the TechNet Eval Center.

We have also added significant improvements to R Services inside SQL Server, such as a very powerful set of machine learning functions used by our own product teams across Microsoft. This brings new machine learning and deep neural network functionality with increased speed, performance, and scale, especially for handling large corpora of text data and high-dimensional categorical data. We recently showcased SQL Server running more than one million R predictions per second, and we encourage you to try out the R examples and machine learning templates for SQL Server on GitHub.

The choice of application development stack with the next release of SQL Server is absolutely amazing: it includes .NET, Java, PHP, Node.js, and more on Windows, Linux, and Mac (via Docker). A native application development experience for Linux and Mac developers has been a key focus for this release. Get started with the next release of SQL Server on Linux, macOS (via Docker), and Windows with our developer tutorials, which show you how to install and use it on macOS, Docker, Windows, RHEL, and Ubuntu and quickly build an app in a programming language of your choice.


2. SQL Server 2016 SP1

We are announcing SQL Server 2016 SP1, a unique service pack: for the first time, we are introducing a consistent programming model across SQL Server editions. With this model, programs written to exploit powerful SQL features such as in-memory OLTP, in-memory columnstore analytics, and partitioning work across the Enterprise, Standard, and Express editions. Developers will find it easier than ever to take advantage of innovations such as in-memory databases and advanced analytics: you can use these advanced features in Standard Edition and then step up to Enterprise for mission-critical performance, scale, and availability, without having to rewrite your application.

Our software partners are excited about the flexibility that this change gives them to adopt advanced features while supporting multiple editions of SQL Server.

“With SQL Server 2016 SP1, we can run the same code entirely on both platforms and customers who need Enterprise scale buy Enterprise, and customers who don’t need that can buy Standard and run just fine. From a programming point of view, it’s easier for us and easier for them,” said Nick Craver, Architecture Lead at Stack Overflow.

To be even more productive with SQL Server, you can now take advantage of improved developer experiences on Windows, Mac, and Linux for Node.js, Java, PHP, Python, Ruby, .NET Core, and C/C++. Our JDBC connector is now published as 100% open source, giving developers more visibility and more flexibility in how they contribute to and work with the JDBC driver. Additionally, we’ve updated the ODBC driver for PHP and launched a new ODBC connector for Linux, making it much easier for developers to work with Microsoft SQL-based technologies. Visual Studio Code users can also now connect to SQL Server, including SQL Server on Linux, Azure SQL Database, and Azure SQL Data Warehouse. In addition, we’ve released updates to SQL Server Management Studio, SQL Server Data Tools, and the command-line tools, which now support SQL Server on Linux.


3. Azure Data Lake Analytics and Store GA

Today, I am excited to announce the general availability of Azure Data Lake Analytics and Azure Data Lake Store.

Azure Data Lake Analytics is a cloud analytics service that lets you develop and run massively parallel data transformation and processing programs in U-SQL, R, Python, and .NET over petabytes of data with just a few lines of code. There is no infrastructure to manage; you process data on demand, scale in seconds, and pay only for the resources used. U-SQL is a simple, expressive, and super-extensible language that combines the power of C# with the simplicity of SQL. Developers can write their code in Visual Studio or Visual Studio Code, and the execution environment provides debugging and optimization recommendations to improve performance and reduce cost.

Azure Data Lake Store is a cloud data lake for enterprises that is secure, massively scalable, and built to the open HDFS standard. You can store trillions of files, and single files can be greater than a petabyte in size. It provides massive throughput optimized for big analytic jobs. It offers data encryption in motion and at rest, single sign-on (SSO), multi-factor authentication, and identity management built in through Azure Active Directory, plus fine-grained POSIX-based ACLs for role-based access control.


Furthermore, we’ve incorporated the technology behind Microsoft Cognitive Services directly inside U-SQL. Now you can process any amount of unstructured data, such as text and images, extract emotions, age, and all sorts of other cognitive features using Azure Data Lake, and perform query by content. You can join emotions from image content with any other type of data you have and do incredibly powerful analytics and intelligence over it. This is what I call Big Cognition. It’s not just extracting one piece of cognitive information at a time, not just understanding an emotion or whether there’s an object in an image; it’s about joining all the extracted cognitive data with other types of data, so you can do some really powerful analytics with it. We demonstrated this capability at Microsoft Ignite and PASS Summit with a Big Cognition demo in which we used U-SQL inside Azure Data Lake Analytics to process a million images and understand what’s inside them. You can watch this demo (starting at minute 38) and try it yourself using a sample project on GitHub.

4. DocumentDB Emulator

We live on a Planet of the Apps, and the best back-end for modern, intelligent mobile or web apps is Azure DocumentDB: a planet-scale, globally distributed, managed NoSQL service with 99.99% availability and guarantees for low latency and consistency, all backed by enterprise-grade security and an SLA.

Today I am happy to announce the public preview of the DocumentDB Emulator, which provides a local development experience for Azure DocumentDB. Using the emulator, you can develop and test your application locally, without an internet connection, without creating an Azure subscription, and without incurring any costs. This has long been the most requested feature on our UserVoice site, so we are thrilled to roll it out to everyone.

Furthermore, we’ve added .NET Core support to DocumentDB. .NET Core is a lightweight, modular platform for creating applications and services that run on Linux, Mac, and Windows. With DocumentDB support for .NET Core, developers can now build cross-platform applications and services that use the DocumentDB API.


5. Other Announcements

  • Today we are also announcing the general availability of R Server for Azure HDInsight. HDInsight is the only fully managed cloud Hadoop offering that provides optimized open source analytic clusters for Spark, Hive, MapReduce, HBase, Storm, and R Server, backed by a 99.9% SLA. Running Microsoft R Server as a service on top of Apache Spark, customers can achieve unprecedented scale and performance by combining enterprise-scale analytics in R with the power of Spark. With transparently parallelized analytic functions, it’s now possible to handle up to 1000x more data at up to 50x faster speeds than open source R, helping you train more accurate models for better predictions than previously possible. Plus, because R Server is built to work with the open source R language, all of your R scripts can run without significant changes.
  • We are also announcing the public preview of Kafka for HDInsight, an enterprise-grade, open-source streaming ingestion service that is cost-effective and easy to provision, manage, and use. This service enables you to build real-time solutions such as IoT, fraud detection, clickstream analysis, financial alerts, and social analytics. Using out-of-the-box integration with Storm for HDInsight or Spark Streaming for HDInsight, you can architect powerful streaming pipelines that drive intelligent real-time actions.
  • Also exciting is the availability of Operational Analytics for Azure SQL Database, the first fully managed hybrid transactional and analytical processing (HTAP) database service in the cloud. The ability to run both analytics (OLAP) and OLTP workloads on the same database tables at the same time lets developers build a new level of analytical sophistication into their applications. Developers can in some cases eliminate the need for ETL and a data warehouse (using one system for OLAP and OLTP instead of two), helping to reduce complexity, cost, and data latency. The in-memory technologies in Azure SQL DB help achieve phenomenal performance: for example, 75,000 transactions per second for order processing (an 11X performance gain) and query execution time reduced from 15 seconds to 0.26 seconds (a 57X gain). This capability is now a standard feature of Azure SQL DB at no additional cost.

We are making our products and innovations more accessible to all developers – on any platform, on-premises and in the cloud. We are building for a future where our data platform is dwarfed by the aggregate value of the solutions built on top of it. This is the true measure of success of a platform – when the number and the value created by the apps built on top is far larger than the platform itself.

The live broadcast of Connect(); begins on November 16th at 9:45am EST, and continues with interactive Q&A and immersive on-demand content. Join us to learn more about these amazing innovations.

@josephsirosh


Eight scenarios with Apache Spark on Azure that will transform any business http://approjects.co.za/?big=en-us/sql-server/blog/2016/08/29/eight-scenarios-with-apache-spark-on-azure-that-will-transform-any-business/ Mon, 29 Aug 2016 15:00:23 +0000 This post was authored by Rimma Nehme, Technical Assistant, Data Group. Since its birth in 2009, and the time it was open sourced in 2010, Apache Spark has grown to become one of the largest open source communities in big data with over 400 organizations from 100 companies contributing to it.

The post Eight scenarios with Apache Spark on Azure that will transform any business appeared first on Microsoft SQL Server Blog.

]]>
This post was authored by Rimma Nehme, Technical Assistant, Data Group.


Since its birth in 2009 and its open sourcing in 2010, Apache Spark has grown into one of the largest open source communities in big data, with over 400 organizations from 100 companies contributing to it. Spark stands out for its ability to process large volumes of data up to 100x faster, because data is persisted in memory. The Azure cloud makes Apache Spark incredibly easy and cost-effective to deploy: there is no hardware to buy and no software to configure, and you get a full notebook experience for authoring compelling narratives along with integration with partner business intelligence tools. In this blog post, I am going to review some of the truly game-changing usage scenarios for Apache Spark on Azure that companies can employ in their own context.

Scenario #1: Streaming data, IoT and real-time analytics

Apache Spark’s key use case is its ability to process streaming data. With so much data being processed on a daily basis, it has become essential for companies to be able to stream and analyze it all in real time. Spark Streaming has the capability to handle this type of workload exceptionally well. As shown in the image below, a user can create an Azure Event Hub (or an Azure IoT Hub) to ingest rapidly arriving data into the cloud; both Event and IoT Hubs can intake millions of events and sensor updates per second that can then be processed in real-time by Spark.

[Figure: Spark Streaming scenario]

Businesses can use this scenario today for:

  • Streaming ETL: In traditional ETL (extract, transform, load) scenarios, tools are built for batch processing: data must first be read in its entirety, converted to a database-compatible format, and then written to the target database. With streaming ETL, data is continually cleaned and aggregated before being pushed into data stores or on for further analysis.
  • Data enrichment: Streaming capability can be used to enrich live data by combining it with static or ‘stationary’ data, thus allowing businesses to conduct more complete real-time data analysis. Online advertisers use data enrichment to combine historical customer data with live customer behavior data and deliver more personalized and targeted ads in real-time and in the context of what customers are doing. Since advertising is so time-sensitive, companies have to move fast if they want to capture mindshare. Spark on Azure is one way to help achieve that.
  • Trigger event detection: Spark Streaming can allow companies to detect and respond quickly to rare or unusual behaviors (“trigger events”) that could indicate a potentially serious problem within the system. For instance, financial institutions can use triggers to detect fraudulent transactions and stop fraud in its tracks. Hospitals can also use triggers to detect potentially dangerous health changes while monitoring patient vital signs and sending automatic alerts to the right caregivers who can then take immediate and appropriate action.
  • Complex session analysis: Using Spark Streaming, events relating to live sessions, such as user activity after logging in to a website or application, can be grouped together and quickly analyzed. Session information can also be used to continuously update machine learning models. Companies can use this functionality to gain immediate insight into how users are engaging with their site and provide more personalized experiences in real time.
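The streaming-ETL item above can be sketched as a per-batch clean-and-aggregate step. The pure-Python stand-in below mirrors the shape of the computation Spark Streaming runs at scale over micro-batches; the sensor events are invented sample data.

```python
# Stand-in for streaming ETL: events arrive in micro-batches and are
# cleaned and aggregated per batch before being folded into a store.
from collections import defaultdict

def process_batch(events, store):
    """Drop malformed events, then fold counts per sensor into the store."""
    for event in events:
        if "sensor" not in event or "value" not in event:
            continue                      # cleaning step
        store[event["sensor"]] += event["value"]
    return store

store = defaultdict(int)
batches = [
    [{"sensor": "s1", "value": 2}, {"sensor": "s2", "value": 3}],
    [{"sensor": "s1", "value": 4}, {"bad": True}],   # malformed event dropped
]
for batch in batches:
    process_batch(batch, store)
```

In Spark Streaming the same per-batch function would run in parallel across a cluster, with the running totals held as stateful streaming aggregations rather than a local dictionary.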

Scenario #2: Visual data exploration and interactive analysis

Using Spark SQL running against data stored in Azure, companies can use BI tools such as Power BI, PowerApps, Flow, SAP Lumira, QlikView, and Tableau to analyze and visualize their big data. Spark’s interactive analytics capability is fast enough to perform exploratory queries without sampling. By combining Spark with visualization tools, complex data sets can be processed and visualized interactively. These easy-to-use interfaces allow even non-technical users to visually explore data, create models, and share results. Because a wider audience can analyze big data without preconceived notions, companies can test new ideas and visualize important findings in their data earlier than ever before. They can identify trends and relationships that were not apparent before, quickly drill down into them, ask new questions, and find ways to innovate in new and smarter ways.

[Figure: Visual data exploration and interactive analysis scenario]

This scenario is even more powerful when interactive data discovery is combined with predictive analytics (more on this later in this blog). Based on relationships and trends identified during discovery, companies can use logistic regression or decision tree techniques to predict the probability of certain events in the future (e.g., customer churn probability). Companies can then take specific, targeted actions to control or avert certain events.
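The churn example above can be sketched with a tiny logistic regression fit by gradient descent. This is pure Python on made-up data (a single scaled "inactivity" feature); a production workload would use Spark MLlib or R Server rather than a hand-rolled loop.

```python
# Minimal logistic-regression sketch for churn probability.
import math

xs = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]   # scaled inactivity (made-up data)
ys = [0,   0,   0,   1,   1,   1  ]   # 1 = customer churned

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit weight and bias by batch gradient descent on the log loss.
w, b, lr = 0.0, 0.0, 1.0
for _ in range(2000):
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    w -= lr * gw / len(xs)
    b -= lr * gb / len(xs)

def churn_probability(x):
    """Predicted probability that a customer with inactivity x churns."""
    return sigmoid(w * x + b)
```

Once fitted, the model scores new customers instantly, which is exactly the kind of targeted-action signal the discovery-plus-prediction loop is after.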

Scenario #3: Spark with NoSQL (HBase and Azure DocumentDB)

This scenario provides scalable and reliable Spark access to NoSQL data stored either in HBase or in our blazing-fast, planet-scale Azure DocumentDB, through “native” data access APIs. Apache HBase is an open-source NoSQL database built on Hadoop and modeled after Google Bigtable. DocumentDB is a true schema-free, managed NoSQL database service running in Azure, designed for modern mobile, web, gaming, and IoT scenarios. DocumentDB ensures that 99% of your reads are served in under 10 milliseconds and 99% of your writes in under 15 milliseconds. It also provides schema flexibility and the ability to easily scale a database up and down on demand.

The Spark with NoSQL scenario enables ad-hoc, interactive queries on big data. NoSQL can be used for capturing data that is collected incrementally from various sources across the globe. This includes social analytics, time series, game or application telemetry, retail catalogs, up-to-date trends and counters, and audit log systems. Spark can then be used for running advanced analytics algorithms at scale on top of the data coming from NoSQL.

[Figure: Spark with NoSQL scenario]

Companies can employ this scenario in online shopping recommendations, spam classifiers for real time communication applications, predictive analytics for personalization, and fraud detection models for mobile applications that need to make instant decisions to accept or reject a payment. I would also include in this category a broad group of applications that are really “next-gen” data warehousing, where large amounts of data needs to be processed inexpensively and then served in an interactive form to many users globally. Finally, internet of things scenarios fit in here as well, with the obvious difference that the data represents the actions of machines instead of people.

Scenario #4: Spark with Data Lake

Spark on Azure can be configured to use Azure Data Lake Store (ADLS) as additional storage. ADLS is an enterprise-class, hyper-scale repository for big data analytic workloads. Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts in an enterprise environment to store data of any size, shape, and speed and to do all types of processing and analytics across platforms and languages. Because ADLS is compatible with the Hadoop Distributed File System (HDFS), it is easy to combine with Spark to run computations at scale using pre-existing Spark queries.

[Figure: Spark with Data Lake scenario]

The data lake scenario arose because companies needed to capture and exploit new types of data while still preserving enterprise-level requirements such as security, availability, compliance, and failover. The Spark-with-data-lake scenario enables truly scalable advanced analytics on healthcare data, financial data, business-sensitive data, geo-location coordinates, clickstream data, server logs, social media, and machine and sensor data. If companies want an easy way to build data pipelines, unparalleled performance, assured data quality, access control, change data capture (CDC) processing, seamless enterprise-level security, and world-class management and debugging tools, this is the scenario to implement.

Scenario #5: Spark with SQL Data Warehouse

While there is still a lot of confusion on this point, Spark and big data analytics are not a replacement for traditional data warehousing. Instead, Spark on Azure can complement and enhance a company’s data warehousing efforts by modernizing its approach to analytics. A data warehouse can be viewed as an “information archive” that supports business intelligence (BI) users and reporting tools for the mission-critical functions of a company. My definition of mission-critical is any system that supports revenue generation or cost control; if such a system fails, companies would have to perform these tasks manually to prevent loss of revenue or increased cost. Big data analytics systems like Spark augment such systems by running more sophisticated computations and smarter analytics and by delivering deeper insights from larger and more diverse datasets.

Azure SQL Data Warehouse (SQLDW) is a cloud-based, scale-out database capable of processing massive volumes of data, both relational and non-relational. Built on our massively parallel processing (MPP) architecture, SQLDW combines the power of the SQL Server relational database with Azure’s cloud scale-out capabilities. You can increase, decrease, pause, or resume a data warehouse in seconds, and you save costs by scaling out compute when you need it and cutting back during non-peak times. SQLDW is the manifestation of the elastic future of data warehousing in the cloud.

[Figure: Spark with SQL Data Warehouse scenario]

Use cases for the Spark-with-SQLDW scenario include using the data warehouse to better understand customers across product groups and then applying Spark for predictive analytics on top of that data, or running advanced analytics with Spark on top of an enterprise data warehouse containing sales, marketing, store management, point-of-sale, customer loyalty, and supply chain data to drive more informed business decisions at the corporate, regional, and store levels. Combining Spark with data warehouse data, companies can do anything from risk modeling to parallel processing of large graphs to advanced analytics and text processing, all on top of their elastic data warehouse.

Scenario #6: Machine Learning using R Server, MLlib

Another, and probably one of the most prominent, Spark use cases in Azure is machine learning. By keeping datasets in memory during a job, Spark delivers great performance for the iterative queries common in machine learning workloads. Common machine learning tasks that can be run with Spark in Azure include (but are not limited to) classification, regression, clustering, topic modeling, singular value decomposition (SVD), principal component analysis (PCA), hypothesis testing, and calculating sample statistics.

Typically, if you want to train a statistical model on very large amounts of data, you need three things:

  • Storage platform capable of holding all of the training data
  • Computational platform capable of efficiently performing the heavy-duty mathematical computations required
  • Statistical computing language with algorithms that can take advantage of the storage and computation power

Microsoft R Server, running on HDInsight with Apache Spark, provides all three. Microsoft R Server runs within the HDInsight Hadoop nodes on Microsoft Azure. Better yet, the big-data-capable algorithms of ScaleR take advantage of the in-memory architecture of Spark, dramatically reducing the time needed to train models on large data. With multi-threaded math libraries and transparent parallelization in R Server, customers can handle up to 1,000x more data at up to 50x faster speeds than open source R. And if your data grows or you just need more power, you can dynamically add nodes to the Spark cluster using the Azure portal. Spark in Azure also includes MLlib for a variety of scalable machine learning algorithms, or you can use your own libraries. Some common applications of machine learning with Spark on Azure are listed in the table below.

| Vertical | Sales and Marketing | Finance and Risk | Customer and Channel | Operations and Workforce |
| --- | --- | --- | --- | --- |
| Retail | Demand forecasting; loyalty programs; cross-sell and upsell; customer acquisition | Fraud detection; pricing strategy | Personalization; lifetime customer value; product segmentation | Store location demographics; supply chain management; inventory management |
| Financial Services | Customer churn; loyalty programs; cross-sell and upsell; customer acquisition | Fraud detection; risk and compliance; loan defaults | Personalization; lifetime customer value | Call center optimization; pay for performance |
| Healthcare | Marketing mix optimization; patient acquisition | Fraud detection; bill collection | Population health; patient demographics | Operational efficiency; pay for performance |
| Manufacturing | Demand forecasting; marketing mix optimization | Pricing strategy; perf risk management | Supply chain optimization; personalization | Remote monitoring; predictive maintenance; asset management |

[Figure: Scenario 6 - Spark Machine Learning]

There are examples with just a few lines of code that you can try out right away.
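As an illustrative sketch (not from the original post), a minimal MLlib clustering job might look like the following. The customer records, feature choices, and cluster count are hypothetical, and the training step assumes a live SparkContext on an HDInsight cluster:

```python
# Sketch: k-means clustering with Spark MLlib (RDD API) on HDInsight.
# Only the feature-preparation helper runs without Spark; training
# requires a live SparkContext (`sc`) on the cluster.

def to_feature_rows(records):
    """Turn (customer_id, spend, visits) tuples into numeric feature vectors."""
    return [[float(spend), float(visits)] for _, spend, visits in records]

def cluster_customers(sc, features, k=2):
    """Train a k-means model on the feature vectors (runs on the cluster)."""
    from pyspark.mllib.clustering import KMeans
    rdd = sc.parallelize(features)
    return KMeans.train(rdd, k, maxIterations=10)

# On a cluster:
#   features = to_feature_rows([(1, 120.0, 4), (2, 15.0, 1), (3, 130.0, 5)])
#   model = cluster_customers(sc, features, k=2)
#   model.predict([120.0, 4.0])   # cluster index for a new customer
```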

Scenario #7: Putting it all together in a notebook experience

For data scientists, we provide out-of-the-box integration with Jupyter (IPython), the most popular open source notebook in the world. Unlike other managed Spark offerings that may require you to install your own notebooks, we worked with the Jupyter OSS community to enhance the kernel to allow Spark execution through a REST endpoint.

We co-led “Project Livy” with Cloudera and other organizations to create an open source, Apache-licensed REST web service that makes Spark a more robust back-end for running interactive notebooks. As a result, Jupyter notebooks are now accessible within HDInsight out of the box. In this scenario, we can use all of the Azure services mentioned above with Spark and a full notebook experience to author compelling narratives and create collaborative data science spaces. Jupyter is a multi-lingual REPL on steroids. It provides a collection of tools for scientific computing using powerful interactive shells that combine code execution with the creation of a live computational document. These notebook files can contain arbitrary text, mathematical formulas, input code, results, graphics, videos, and any other kind of media that a modern web browser is capable of displaying. So whether you’re brand new to R, Python, or SQL, or you do serious parallel/technical computing, the Jupyter Notebook in Azure is a great choice.
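Under the hood, that REST endpoint is Livy itself. As a hedged sketch (the cluster URL is a placeholder, and a real call needs HTTP basic auth with cluster credentials), the request bodies for starting a PySpark session and submitting a statement look like this:

```python
import json

# Hypothetical HDInsight cluster; Livy is exposed under /livy.
LIVY_URL = "https://mycluster.azurehdinsight.net/livy"

def session_request(kind="pyspark"):
    """JSON body for POST /sessions, which starts an interactive session."""
    return json.dumps({"kind": kind})

def statement_request(code):
    """JSON body for POST /sessions/{id}/statements, which runs a snippet."""
    return json.dumps({"code": code})

# Against a live cluster:
#   POST {LIVY_URL}/sessions                  body: session_request()
#   POST {LIVY_URL}/sessions/0/statements     body: statement_request("sc.parallelize(range(100)).sum()")
#   GET  {LIVY_URL}/sessions/0/statements/0   poll until the result is available
```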

[Figure: Scenario 7 - Spark with Notebook]

You can also use Zeppelin notebooks on Spark clusters in Azure to run Spark jobs. The Zeppelin notebook for HDInsight Spark clusters is offered mainly to showcase how to use Zeppelin in an Azure HDInsight Spark environment; if you want to use notebooks to work with HDInsight Spark, I recommend Jupyter notebooks. To make development on Spark easier, we also support IntelliJ Spark Tooling, which introduces native authoring support for Scala and Java, local testing, remote debugging, and the ability to submit Spark applications to the Azure cloud.

Scenario #8: Using Excel with Spark

As a final example, I wanted to describe the ability to connect Excel to a Spark cluster running in Azure using the Microsoft Open Database Connectivity (ODBC) Spark driver, which is available for download.

[Figure: Scenario 8 - Spark with Excel]

Excel is one of the most popular clients for data analytics on Microsoft platforms. In Excel, our primary BI tools, such as PowerPivot for data modeling and Power View for data visualization, are built right into the software, with no additional downloads required. This enables users of all levels to do self-service BI using the familiar interface of Excel. Through a Spark add-in for Excel, users can easily analyze massive amounts of structured or unstructured data with a very familiar tool.

Conclusion

Above, I’ve described some of the amazing, game-changing scenarios for real-time big data processing with Spark on Azure. Any company across the globe, from a huge enterprise to a small startup, can take its business to the next level with these scenarios and solutions. The question is, what are you waiting for?

The post Eight scenarios with Apache Spark on Azure that will transform any business appeared first on Microsoft SQL Server Blog.

]]>
SQL Server 2016: Everything built-in http://approjects.co.za/?big=en-us/sql-server/blog/2015/10/28/sql-server-2016-everything-built-in/ Wed, 28 Oct 2015 15:00:00 +0000 This post was authored by Joseph Sirosh, Corporate Vice President of the Data Group at Microsoft. Announcing SQL Server 2016 CTP 3.0, Azure Data Lake preview and much more.

The post SQL Server 2016: Everything built-in appeared first on Microsoft SQL Server Blog.

]]>
This post was authored by Joseph Sirosh, Corporate Vice President of the Data Group at Microsoft.

Announcing SQL Server 2016 CTP 3.0, Azure Data Lake preview and much more.

We live in the age of data, and the ability to extract actionable intelligence from data is driving a fundamental transformation in how we live, work and play. This year, at the PASS Summit, we have several exciting announcements about new products and capabilities that will drive this transformation even further:

  • We are announcing the Community Technology Preview (CTP) 3.0 of SQL Server 2016. To experience the new, exciting features in SQL Server 2016 and the new rapid release model, download the preview. CTP 3.0 includes new innovations for mission-critical performance with In-Memory OLTP and real-time Operational Analytics, first-in-market Always Encrypted, built-in SQL Server R Services, JSON support, federated query from relational to Hadoop with PolyBase, and active archive of cold data to Azure with Stretch Database. This preview also includes new Business Intelligence (BI) capabilities for SQL Server Analysis Services and SQL Server Reporting Services, and we plan to include mobile BI capabilities in the coming months to deliver end-to-end BI solutions for on-premises implementations.
  • Azure Data Lake. Previously disclosed at the Strata conference, Azure Data Lake offers unbelievable analytic processing power and an exabyte-scale big data store as a fully managed service. It includes all the capabilities required to make it easy for developers, data scientists and analysts to store data of any size, shape and speed, and do all types of processing and analytics across platforms and languages. Part of the Cortana Analytics Suite, Azure Data Lake includes new previews available today for the Store and the Analytics service and also includes Azure HDInsight, which is already generally available.
  • We are also pleased to announce the public preview of In-Memory OLTP and real-time Operational Analytics in Azure SQL Database. In-Memory OLTP dramatically improves transaction processing performance, and our In-Memory Columnstore and In-Memory OLTP can naturally be used together in the same cloud solution for high-throughput transaction processing and real-time operational analytics.

With the upcoming release of SQL Server 2016, our best SQL Server release in history, and the recent availability of the Cortana Analytics Suite, Microsoft is offering unmatched innovation across on-premises and the cloud to help you turn data into intelligent action.

SQL Server, an industry leader, now packs an even bigger punch

SQL Server 2016 builds on this leadership, and will come packed with powerful built-in features. As the least vulnerable database for six years in a row, SQL Server 2016 offers security that no other database can match. It also has the data warehouse with the highest price-performance, and offers end-to-end mobile BI solutions on any device at a fraction of the cost of other vendors. It provides tools to go beyond BI with in-database Advanced Analytics, integrating the R language and scalable analytics functions from our recent acquisition of Revolution Analytics.

Our cloud-first product development model means that new features get hardened at scale in the cloud, delivering proven on-premises experience. In addition, we offer a consistent experience across on-premises and cloud with common development and management tools and common T-SQL.

Security with Always Encrypted

The Always Encrypted feature in SQL Server 2016 CTP 3.0, an industry first, is based on technology from Microsoft Research and helps protect data at rest and in motion. Using Always Encrypted, SQL Server can perform operations on encrypted data and, best of all, the encryption key resides with the application in the customer’s trusted environment. It offers unparalleled security.

One example of a customer that’s already benefitting from this new feature is Financial Fabric, an ISV that offers a service called DataHub to hedge funds. The service enables a hedge fund to collect data ranging from transactions to accounting and portfolio positions from multiple parties such as prime brokers and fund administrators, store it all in one central location, and make it available via reports and dashboards.

“Data protection is fundamental to the financial services industry and our stakeholders, but it can cause challenges with data driven business models,” said Subhra Bose, CEO, Financial Fabric. “Always Encrypted enables the storage and processing of sensitive data within and outside of business boundaries, without compromising data privacy in both on-premises and cloud databases. At Financial Fabric we are providing DataHub services with “Privacy by Design” for our client’s data, thanks to Always Encrypted in SQL Server 2016. We see this as a huge competitive advantage because this technology enables data science in Financial Services and gives us the tools to ensure we are compliant with jurisdictional regulations.”

Mission Critical Performance

With an expanded surface area, you can use the high-performance In-Memory OLTP technology in SQL Server with a significantly greater number of applications. We are excited to introduce the unique capability of combining in-memory analytics (columnstore) with In-Memory OLTP and the traditional relational store in the same database to achieve real-time operational analytics. We have also made significant performance and scale improvements across all components in the SQL Server core engine.

Insights on All Your Data

You’ll find significant improvements in both SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) that help deliver business insights faster and improve productivity for BI developers and analysts. The enhanced DirectQuery enables high-performing access to external data sources like SQL Server Columnstore. This capability enhances the use of SSAS as a semantic model over your data for consistency across reporting and analysis without storing the data in Analysis Services.

SQL Server Reporting Services 2016 offers a modernized experience for paginated reports and updated tools as well as new capabilities to more easily design stunning documents. To get more from your investments in SSRS and to provide easy access to on-premises reports to everyone in your organization, you can now pin paginated reports items to the Power BI dashboard. In coming months, we will add new Mobile BI capabilities to Reporting Services, allowing you to create responsive, interactive BI reports optimized for mobile devices.

PolyBase, available today with the Analytics Platform System, is now built into SQL Server, expanding the power to extract value from unstructured and structured data using your existing T-SQL skills. CTP 3.0 brings PolyBase improvements, including better performance and the ability to scale out PolyBase nodes using other SQL Server instances.

Advanced Analytics

CTP 3.0 introduces a new workload for Advanced Analytics with built-in SQL Server R Services. SQL Server R Services bridges the gap between data scientists and DBAs by enabling you to embrace the highly popular open source R language in SQL Server to build intelligent applications and discover new insights about your business. The SQL Server database is the best place to run Advanced Analytics because you can leverage industry leading technologies such as the In-Memory Columnstore and Parallelized R Services for fast predictive in-database analytics.

SQL Developers can learn R skills and build intelligent applications, while Data Scientists can leverage powerful SQL Database tools to create value through predictive and prescriptive enhancements to their applications. SQL Server 2016 enables intelligent applications to be built by hosting analytical models in the database, while reducing complexity and overall costs by moving expensive analytic computations close to the data.

New Hybrid Scenario using Stretch Database

Stretch Database enables stretching a single database between on-premises and Azure. This will enable our customers to take advantage of the cloud economics of lower cost compute and storage without being forced into an all-or-nothing database move. Stretch Database is transparent to your application, and the trickle of data to Azure can be paused and restarted without downtime. You can use Always Encrypted with Stretch Database to extend data in a more secure manner for greater peace of mind.

Azure Data Lake Store and Analytics Service available in preview today

Last month we announced a new and expanded Azure Data Lake that makes big data processing and analytics simpler and more accessible. Azure Data Lake includes the Azure Data Lake Store, a single repository where you can easily capture data of any size, type and speed; Azure Data Lake Analytics, a new service built on Apache YARN that dynamically scales so you can focus on your business goals, not on distributed infrastructure; and Azure HDInsight, our fully managed Apache Hadoop cluster service. Azure Data Lake is an important part of the Cortana Analytics Suite and a key component of Microsoft’s big data and advanced analytics portfolio.

The Azure Data Lake service includes U-SQL, a language that unifies the benefits of SQL with the expressive power of user code. U-SQL’s scalable distributed query capability enables you to efficiently analyze data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse. Customers can use Azure Data Lake tools for Visual Studio, which simplifies authoring, debugging and optimization and provides an integrated development environment for analytics.

ASOS.com, the UK’s largest independent online fashion and beauty retailer, has been using Azure Data Lake to improve customer experience on their website. “At ASOS we are committed to putting the customer first. As a global fashion destination for 20-somethings we need to stay abreast of customer behaviour on our site, enabling us to optimize their shopping experience across all platforms of ASOS.com and wherever they are in the world. Microsoft Azure Data Lake Analytics assists in processing large amounts of unstructured clickstream data to track and optimize their experience. We have been able to get productive immediately using U-SQL because it was easy to use, extend and view and monitor the jobs all within Visual Studio” said Rob Henwood, Enterprise Architect at ASOS.com.

Azure SQL Database In-Memory OLTP and Operational Analytics

Today, we are releasing our next generation in-memory technologies to Azure with the public preview of In-Memory OLTP and real-time Operational Analytics in Azure SQL Database. In-Memory OLTP in the Azure SQL Database preview includes the expanded surface area available in SQL Server 2016, enabling more applications to benefit from higher performance. By bringing this technology to the cloud, customers will be able to take advantage of in-memory OLTP and Operational Analytics in a fully managed database-as-a-service with 99.99% SLA.

Combined with the releases earlier this month of Always Encrypted, Transparent Data Encryption, support for Azure Active Directory, Row-Level security, Dynamic Data Masking and Threat Detection, Azure SQL Database provides unparalleled data security in the cloud with fast performance. As part of our intelligent capabilities, SQL Database also has built-in advisors to help customers get started quickly with in-memory OLTP to optimize performance.

It’s never been easier to capture, transform, mash-up, analyze and visualize any data, of any size, at any scale, in its native format using familiar tools, languages and frameworks in a trusted environment, both on-premises and in the cloud. Share your feedback on the new SQL Server 2016 capabilities using Microsoft’s Connect tool. If you have questions, join discussion forums on SQL Server 2016 at MSDN and Stack Overflow.

Learn more about the Azure Data Lake Store and Analytics service today and try the new In-Memory and security previews of Azure SQL Database now.

Finally, don’t forget to join us, either live or via the livestream feed, at the PASS Summit 2015 keynote and foundational sessions. And be sure to take advantage of all the great sessions at the PASS Summit this week.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The post SQL Server 2016: Everything built-in appeared first on Microsoft SQL Server Blog.

]]>
Microsoft expands Azure Data Lake to unleash big data productivity http://approjects.co.za/?big=en-us/sql-server/blog/2015/09/28/microsoft-expands-azure-data-lake-to-unleash-big-data-productivity/ Mon, 28 Sep 2015 13:01:00 +0000 By T. K. “Ranga” Rengarajan, corporate vice president, Data Platform In July of this year, Satya Nadella shared our broad vision for big data and analytics when he announced Cortana Analytics.

The post Microsoft expands Azure Data Lake to unleash big data productivity appeared first on Microsoft SQL Server Blog.

]]>
By T. K. “Ranga” Rengarajan, corporate vice president, Data Platform

In July of this year, Satya Nadella shared our broad vision for big data and analytics when he announced Cortana Analytics. Building on this vision, today we’re announcing a new and expanded Azure Data Lake that makes big data processing and analytics simpler and more accessible. The expanded Microsoft Azure Data Lake includes the following:

  • Azure Data Lake Store, previously announced as Azure Data Lake, will be available in preview later this year. The Data Lake Store provides a single repository where you can easily capture data of any size, type and speed without forcing changes to your application as data scales. In the store, data can be securely shared for collaboration and is accessible for processing and analytics from HDFS applications and tools.

  • Azure Data Lake Analytics, a new service built on Apache YARN that dynamically scales so you can focus on your business goals, not on distributed infrastructure. This service will be available in preview later this year and includes U-SQL, a language that unifies the benefits of SQL with the expressive power of user code. U-SQL’s scalable distributed query capability enables you to efficiently analyze data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse.

  • Azure HDInsight, our fully managed Apache Hadoop cluster service with a broad range of open source analytics engines including Hive, Spark, HBase and Storm. Today, we are announcing general availability of managed clusters on Linux with an industry-leading 99.9% uptime SLA. HDInsight will be able to take advantage of capabilities in the Store for increased throughput, scale and security.

Supporting the Azure Data Lake:

  • Azure Data Lake Tools for Visual Studio provide an integrated development environment that spans the Azure Data Lake, dramatically simplifying authoring, debugging and optimization for processing and analytics at any scale.

  • Leading Hadoop ISV applications that span security, governance, data preparation and analytics can be easily deployed from the Azure Marketplace on top of Azure Data Lake.

Azure Data Lake

Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape and speed, and do all types of processing and analytics across platforms and languages. It removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics.  Azure Data Lake works with existing IT investments for identity, management, and security for simplified data management and governance. It also integrates seamlessly with operational stores and data warehouses so you can extend current data applications.  We’ve drawn on the experience of working with enterprise customers and running some of the largest scale processing and analytics in the world for Microsoft businesses like Office 365, Xbox Live, Azure, Windows, Bing and Skype. Azure Data Lake solves many of the productivity and scalability challenges that prevent you from maximizing the value of your data assets with a service that’s ready to meet your current and future business needs.

“Hortonworks and Microsoft have partnered closely over many years to further the Hadoop platform for big data analytics, including contributions to YARN, Hive, and other Apache projects,” said Rob Bearden, CEO at Hortonworks. “Azure Data Lake, including Azure HDInsight powered by Hortonworks Data Platform, demonstrates our shared commitment to make it easier for everyone to work with big data.”


 

Azure Data Lake Store – A hyper-scale repository for big data processing and analytic workloads

The value of a data lake resides in the ability to develop solutions across data of all types: unstructured, semi-structured and structured. This begins with the Azure Data Lake Store, a single repository to capture and access any type of data for high-performance processing, analytics, and low-latency workloads with enterprise-grade security. For example, data can be ingested in real time from sensors and devices for IoT solutions, or from online shopping websites, into the store without the fixed limits on account or file size that constrain current offerings in the market. As part of Azure Data Lake, the store supports development of your big data solutions with the language or framework of your choice. The store in Azure Data Lake is HDFS-compatible, so Hadoop distributions like Cloudera, Hortonworks®, and MapR can readily access the data for processing and analytics.
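Because the store is WebHDFS-compatible, existing HDFS tooling can reach it over plain REST. A hedged sketch follows; the account name and path are placeholders, and a real call also needs an Azure AD OAuth bearer token:

```python
# Sketch: building a WebHDFS-style REST call against Azure Data Lake Store.
# Account name and path are hypothetical placeholders.

def list_status_url(account, path):
    """WebHDFS LISTSTATUS URL (directory listing) for an ADLS account."""
    return (f"https://{account}.azuredatalakestore.net"
            f"/webhdfs/v1/{path.lstrip('/')}?op=LISTSTATUS")

# With a valid Azure AD token:
#   GET list_status_url("mystore", "/clickstream/2015")
#   Authorization: Bearer <token>
```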

“Cloudera is pleased to be working closely with Microsoft to integrate our enterprise data hub with the Azure Data Lake Store,” said Mike Olson, founder and chief strategy officer at Cloudera. “Cloudera on Azure benefits from the Data Lake Store which acts as a cloud-based landing zone for data in your enterprise data hub. Because the store is compatible with WebHDFS, Cloudera can leverage Data Lake and provide customers with a secure and flexible big data solution.”

Azure Data Lake Analytics – a new distributed processing and analytics service

Azure Data Lake Analytics lets you focus on the logic of your application, not the distributed infrastructure running it. Instead of deploying, configuring and tuning hardware, you write queries to transform your data and extract valuable insights. Built on Apache YARN and designed for the cloud, the analytics service can handle jobs of any scale instantly by simply setting the dial for how much power you need. The analytics service for Azure Data Lake is cost-efficient because you only pay for your job while it is running, and support for Azure Active Directory lets you manage access and roles simply and integrates with your on-premises identity system.
 
We know that many developers and data scientists struggle to be successful with big data using existing technologies and tools. Code-based solutions offer great power, but require significant investments to master, while SQL-based tools make it easy to get started but are difficult to extend. We’ve faced the same problems inside Microsoft, and that’s why we introduced U-SQL, a new query language that unifies the ease of use of SQL with the expressive power of C#. The U-SQL language is built on the same distributed runtime that powers the big data systems inside Microsoft. Millions of SQL and .NET developers can now process and analyze all of their data with the skills they already have. The U-SQL support in Azure Data Lake Tools for Visual Studio includes state-of-the-art support for authoring, debugging and advanced performance analysis features for increased productivity when optimizing jobs running across thousands of nodes.

“U-SQL was especially helpful because we were able to get up and running using our existing skills with .NET and SQL,” says Sam Vanhoutte, Chief Technology Officer at Codit. “This made big data easy because we didn’t have to learn a whole new paradigm. With Azure Data Lake, we were able to process data coming in from smart meters and combine it with the energy spot market prices to give our customers the ability to optimize their energy consumption and potentially save hundreds of thousands of dollars.”

Azure HDInsight – Fully Managed Hadoop, Spark, Storm and HBase

Azure Data Lake also includes HDInsight, our Apache Hadoop-based service that allows you to spin up any number of nodes in minutes. As one of the fastest growing services in Azure, HDInsight gives you the breadth of the Hadoop ecosystem in a managed service that’s monitored and supported by Microsoft. Furthering our commitment to productivity, we’ve updated our Visual Studio Tools for authoring, advanced debugging, and tuning for Hive queries and Storm topologies running in HDInsight.

Today, we are announcing the general availability of HDInsight on Linux. We work closely with Hortonworks and Canonical to provide the HDP™ distribution on the Ubuntu operating system that powers the Linux version of HDInsight in the Data Lake. This is another strategic step by Microsoft to meet customers where they are and make it easier for you to run Hadoop workloads in the cloud.

Leading Hadoop ISVs on the Azure Data Lake

There are a growing set of leading data management applications for Azure Data Lake. This includes applications that provide end-to-end big data analytics like Datameer, technologies that address big data security and governance like Dataguise and BlueTalon, unified stream and batch with DataTorrent, and tools that give business users the ability to visualize and analyze data in compelling ways like AtScale and Zoomdata. Support from our partners ensures that you have the best applications available as you get started with Azure Data Lake.

We will continue to invest in solutions for big data processing and analytics to make it easier for everyone to work with data of any type, size and speed using the tools, languages and frameworks they want in a trusted cloud, hybrid or on-premises environment. Our goal is to make big data technology simpler and more accessible to the greatest number of people possible: developers, data scientists, analysts, application developers, and also businesspeople and mainstream IT managers.

You can hear more about these announcements during my keynote at our free, virtual event AzureCon tomorrow or on-demand, and at Strata + Hadoop World in NYC.

The post Microsoft expands Azure Data Lake to unleash big data productivity appeared first on Microsoft SQL Server Blog.

]]>