Azure Data Explorer News and Insights | Microsoft Fabric Blog

Democratizing FinOps: Transform your practice with FOCUS and Microsoft Fabric
https://azure.microsoft.com/en-us/blog/democratizing-finops-transform-your-practice-with-focus-and-microsoft-fabric/
Tue, 28 Nov 2023 17:00:00 +0000
Cloud computing has revolutionized the way you build, deploy, and scale applications and services. While you have unprecedented flexibility, agility, and scalability, you also face greater challenges in managing cost, security, and compliance. While IT security and compliance are often managed by central teams, cost is a shared responsibility across executive, finance, product, and engineering teams, which is what makes managing cloud cost such a challenge. Having the right tools to enable cross-group collaboration and make data-driven decisions is critical.

Fortunately, you have everything you need in the Microsoft Cloud to implement a streamlined FinOps practice that brings people together and connects them to the data they need to make business decisions. And with new developments like Copilot in Microsoft Cost Management and Microsoft Fabric, there couldn’t be a better time to take a fresh look at how you manage cost within your organization and how you can leverage the FinOps Framework and the FinOps Open Cost and Usage Specification (FOCUS) to accelerate your FinOps efforts.

There’s a lot to cover in this space, so I’ll split this across a series of blog posts. In this first blog post, I’ll introduce the core elements you’ll need to lay the foundation for the rest of the series.


No-code extensibility with Cost Management exports

As your FinOps team grows to cover new services, endpoints, and datasets, you may find they spend more time integrating disparate APIs and schemas than driving business goals. This complexity also keeps simple reports and alerts just out of reach for executive, finance, and product teams. And when your stakeholders can’t get the answers they need, they push more work onto engineering teams to fill those gaps, which, again, takes away from driving business goals.

We envision a future where FinOps teams can empower all stakeholders to stay informed and get the answers they need through turn-key integration and AI-assisted tooling on top of structured guidance and open specifications. And this all starts with Cost Management exports—a no-code extensibility feature that brings data to you.

As of today, you can sign up for a limited preview of Cost Management exports, where you can export five new datasets directly into your storage account without a single line of code. In addition to the actual and amortized cost and usage details you get today, you’ll also see:

  • Cost and usage details aligned to FOCUS
  • Price sheets
  • Reservation details
  • Reservation recommendations
  • Reservation transactions
Screenshot of the Cost Management exports configuration screen with 5 datasets included

Of note, the FOCUS dataset includes both actual and amortized costs in a single dataset, which can drive additional efficiencies in your data ingestion process. You’ll benefit from reduced data processing times and more timely reporting on top of reduced storage and compute costs due to fewer rows and less duplication of data.
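
To make the single-dataset point concrete, here is a minimal, illustrative Python sketch (the row values are invented, not real export data): in a FOCUS-aligned dataset, BilledCost approximates the actual (invoiced) view and EffectiveCost the amortized view, so both totals come from one table instead of two exports.

```python
# Illustrative FOCUS-style rows (values invented for this sketch).
# A reservation purchase bills up front (BilledCost), while its usage
# amortizes over time (EffectiveCost), so the two views redistribute
# the same spend rather than duplicate it.
focus_rows = [
    {"ServiceCategory": "Compute", "BilledCost": 120.0, "EffectiveCost": 95.0},
    {"ServiceCategory": "Storage", "BilledCost": 40.0,  "EffectiveCost": 40.0},
    {"ServiceCategory": "Compute", "BilledCost": 0.0,   "EffectiveCost": 25.0},
]

# Two views, one dataset -- no second export or duplicated rows needed.
actual_total = sum(r["BilledCost"] for r in focus_rows)
amortized_total = sum(r["EffectiveCost"] for r in focus_rows)
print(actual_total, amortized_total)  # 160.0 160.0
```

Note that the two totals agree: amortization shifts when cost is recognized, not how much, which is why one dataset can serve both reporting needs.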

Beyond the new datasets, you’ll also discover optimizations that deliver large datasets more efficiently, reduce storage costs by updating files rather than creating new ones each day, and more. All exports are scheduled at the same time to ensure scheduled refreshes of your reports stay in sync with the latest data. Coupled with file partitioning, which is already available and recommended today, and data compression, which you’ll see in the coming months, the exports preview removes the need to write complex code to extract, transfer, and load large datasets reliably via APIs. This better enables all FinOps stakeholders to build custom reports to get the answers they need without having to learn a single API or write a single line of code.
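
To show why file partitioning keeps ingestion simple, here is a small sketch with an assumed folder layout (the actual naming convention is the export service’s, not shown here): partitioned exports land as multiple part files that share a header and concatenate trivially.

```python
import csv
from pathlib import Path

def read_partitioned_export(folder):
    """Combine every part CSV in an export folder into one row list.

    Assumes each part file shares the same header row, as partitions
    of a single exported dataset do.
    """
    rows = []
    for part in sorted(Path(folder).glob("*.csv")):
        with part.open(newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows
```

In practice you would point this kind of loop (or a Fabric pipeline) at the export container and let each day’s partitions stream into your lakehouse or KQL database without custom API code.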

To learn about all the benefits of the exports preview—yes, there’s more—read the full synopsis in Cost Management updates. And to start exporting your FOCUS cost and usage, price sheet, and reservation data, sign up for the exports preview today.

FOCUS democratizes cloud cost analytics

In case you’re not familiar, FOCUS is a groundbreaking initiative to establish a common provider- and service-agnostic format for billing data that empowers organizations to better understand cost and usage patterns and optimize spending and performance across multiple cloud, software as a service (SaaS), and even on-premises service offerings. FOCUS provides a consistent, clear, and accessible view of cost data, explicitly designed for FinOps needs. As the new “language” of FinOps, FOCUS enables practitioners to collaborate more efficiently and effectively with peers throughout the organization and even maximize transferability and onboarding for new team members, getting people up and running more quickly.
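
To illustrate what a common “language” buys you in practice, here is a hedged sketch: two hypothetical provider schemas (the source column names are invented for this example; only the FOCUS names come from the specification) normalize into shared FOCUS columns, after which one provider-agnostic query answers the question for both.

```python
# Hypothetical provider schemas mapped to FOCUS column names.
# The source column names ("cost", "svc", etc.) are invented; the
# targets (EffectiveCost, ServiceCategory) are FOCUS columns.
FOCUS_MAPPINGS = {
    "provider_a": {"cost": "EffectiveCost", "svc": "ServiceCategory"},
    "provider_b": {"amortized_cost": "EffectiveCost", "service_group": "ServiceCategory"},
}

def to_focus(provider, row):
    """Rename a provider-specific row's columns to FOCUS equivalents."""
    mapping = FOCUS_MAPPINGS[provider]
    return {mapping.get(col, col): val for col, val in row.items()}

rows = [
    to_focus("provider_a", {"cost": 10.0, "svc": "Compute"}),
    to_focus("provider_b", {"amortized_cost": 5.0, "service_group": "Compute"}),
]
# One query over both sources, with no provider-specific logic:
total_compute = sum(r["EffectiveCost"] for r in rows if r["ServiceCategory"] == "Compute")
```

The same idea is what lets reports, alerts, and Copilot prompts written once against FOCUS work across every billing source you ingest.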

FOCUS 0.5 was originally announced in June 2023, and we’re excited to be leading the industry with our announcement of native support for the FOCUS 1.0 preview as part of Cost Management exports on November 13, 2023. We believe FOCUS is an important step forward for our industry, and we look forward to our industry partners joining us and collaboratively evolving the specification alongside FinOps practitioners from our collective customers and partners.

FOCUS 1.0 preview adds new columns for pricing, discounts, resources, and usage, along with prescribed behaviors around how discounts are applied. Soon, you’ll also have a powerful new use case library, which offers a rich set of problems and prebuilt queries to help you get the answers you need without the guesswork. Armed with FOCUS and the FinOps Framework, you have a playbook for understanding and extracting answers from your data effortlessly, enabling you to empower FinOps stakeholders, regardless of their knowledge or experience, to get the answers they need to maximize business value with the Microsoft Cloud.
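
As a sketch of the kind of question those pricing and discount columns unlock (the column names follow the FOCUS 1.0 preview; the row values are invented), commitment-discount savings can be estimated as ListCost minus EffectiveCost, grouped by CommitmentDiscountType:

```python
# Illustrative rows; values are made up for this sketch.
rows = [
    {"CommitmentDiscountType": "Reservation",  "ListCost": 100.0, "EffectiveCost": 60.0},
    {"CommitmentDiscountType": "Savings Plan", "ListCost": 80.0,  "EffectiveCost": 56.0},
    {"CommitmentDiscountType": None,           "ListCost": 50.0,  "EffectiveCost": 50.0},
]

savings = {}
for r in rows:
    key = r["CommitmentDiscountType"] or "On-demand"  # None => no commitment discount
    savings[key] = savings.get(key, 0.0) + (r["ListCost"] - r["EffectiveCost"])
# savings == {"Reservation": 40.0, "Savings Plan": 24.0, "On-demand": 0.0}
```

Because the columns and discount behaviors are prescribed by the specification, this same calculation is portable across any FOCUS-aligned billing source.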

Screenshot of a FinOps toolkit Power BI report showing Azure cost data broken down by FOCUS service category and service name.

For more details about FOCUS or why we believe it’s important, see FOCUS: A new specification for cloud cost transparency. And stay tuned for more updates as we dig into different scenarios where FOCUS can help you.

Microsoft Fabric and Copilot enable self-service analytics

So far, I’ve talked about how you can leverage Cost Management exports as a turn-key solution to extract critical details about your costs, prices, and reservations, using FOCUS as a consistent, open billing data format whose use case library is a veritable treasure map for finding answers to your FinOps questions. While these are all amazing tools that will accelerate your FinOps efforts, the true power of democratizing FinOps lies at the intersection of Cost Management and FOCUS with a platform that enables you to provide your stakeholders with self-serve analytics and alerts. And this is exactly what Microsoft Fabric brings to the table.

Microsoft Fabric is an all-in-one analytics solution that encompasses data ingestion, normalization, cleansing, analysis, reporting, alerting, and more. I could write a separate blog post about how to implement each FinOps capability in Microsoft Fabric, but to get you acclimated, let me introduce the basics.

Data ingestion and normalization

Your first step to leveraging Microsoft Fabric starts in Cost Management, which has done much of the work for you by exporting details about your prices, reservations, and cost and usage data aligned to FOCUS.

Once exported, you’ll ingest your data into a Fabric lakehouse, SQL, or KQL database table and create a semantic model to bring data together for any reports and alerts you’ll want to create. The database option you use will depend on how much data you have and your reporting needs. Below is an example using a KQL database, which uses Azure Data Explorer under the covers, to take advantage of the performance and scale benefits as well as the powerful query language.

Screenshot of a Fabric KQL database with FOCUS, Prices, ReservationDetails, ReservationRecommendations, and ReservationTransactions tables created based on the data ingested from Cost Management exports.
Data analysis, showback, and managing commitment-based discounts

Fabric offers several ways to quickly explore data from a semantic model. You can explore data by simply selecting the columns you want to see, but I recommend trying the auto-create a report option, which takes that one step further by generating a quick summary based on the columns you select. As an example, here’s an auto-generated summary of the FOCUS EffectiveCost broken down by ChargePeriodStart, ServiceCategory, SubAccountName, Region, PricingCategory, and CommitmentDiscountType. You can apply quick tweaks to any visual or switch to the full edit experience to take it even further.

Screenshot of an auto-created report based on a Fabric semantic model that includes a FOCUS table.

Those with a keen eye may notice the Copilot button at the top right. If we switch to edit mode, we can take full advantage of Copilot and even ask it to create the same summary:

A screenshot of an EffectiveCost over time dashboard with Copilot in a panel on the right side

Copilot starts to get a little fancier with the visuals and offers summarized numbers and a helpful filter. I can also go further with more specific questions about commitment-based discounts:

A screenshot of a Commitment Discount Summary dashboard with Copilot in a panel on the right. The user is asking Copilot to create a summary page and copilot is creating the page.

Of course, this is barely scratching the surface. With a richer semantic model including relationships and additional details, Copilot can go even further and save you time by giving you the answers you need and building reports with less time and hassle.

Budget management and managing anomalies

In addition to having unparalleled flexibility in reporting on the data in the way you want, you can also create fine-grained alerts in a more flexible way than ever before with very little effort. Simply select the visual you want to measure and specify when and how you want to be alerted:

Screenshot of a Fabric report with a line chart showing cost over time selected and a Set an alert pane open on the right with options to alert via email or teams when the cost is greater than 1000. The cost and alert threshold values are customizable dropdowns.
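
The alert rule in that screenshot reduces to very simple logic. Here is a minimal sketch, assuming a daily cost series and the 1,000 threshold shown (the real alert is configured in the Fabric UI against a visual's measure, not in code):

```python
def should_alert(daily_costs, threshold=1000.0):
    """Fire when the most recent day's cost exceeds the threshold.

    daily_costs: chronological list of daily cost totals; the last
    entry is the latest day. Returns False for an empty series.
    """
    return bool(daily_costs) and daily_costs[-1] > threshold
```

In Fabric, this check runs on a schedule against the selected visual, delivering the notification over email or Teams without any stakeholder writing code.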

This gets even more powerful when you add custom visuals, measures, and materialized views that offer deeper insights.

This is just a glimpse of what you can do with Cost Management and Microsoft Fabric together. I haven’t even touched on the data flows, machine learning capabilities, and the potential of ingesting data from multiple cloud providers or SaaS vendors also using FOCUS to give you a full, single pane of glass for your FinOps efforts. You can imagine the possibilities of how Copilot and Fabric can impact every FinOps capability, especially when paired with rich collaboration and automation tools like Microsoft Teams, Power Automate, and Power Apps that can help every stakeholder accomplish more together. I’ll share more about these in a future blog post or tutorial.

Next steps to accomplish your FinOps goals

I hope you’re as excited as I am about the potential of low- or even no-code solutions that empower every FinOps stakeholder with self-serve analytics. Whether you’re in finance seeking answers to complex questions that require transforming, cleansing, and joining multiple datasets, in engineering looking for a solution for near-real-time alerts and analytics that can react quickly to unexpected changes, or a FinOps team that now has more time to pursue something like unit cost economics to measure the true value of the cloud, the possibilities are endless. As someone who uses Copilot often, I can say that the potential of AI is real. Copilot saves me time in small ways throughout the day, enabling me to accomplish more with less effort. And perhaps the most exciting part is knowing that the more we leverage Copilot, the better it will get at automating tasks that free us up to solve bigger problems. I look forward to Copilot familiarizing itself with FOCUS and the use case library to see how far we’re able to go with a natural language description of FinOps questions and tasks.

And of course, this is just the beginning. We’re on the cusp of a revolutionary change to how organizations manage and optimize costs in the cloud. Stay tuned for more updates in the coming months as we share tutorials and samples that will help you streamline and accomplish FinOps tasks in less time. In the meantime, familiarize yourself with Microsoft Fabric and Copilot and learn more about how you can accomplish your FinOps goals with an end-to-end analytics platform.

The post Democratizing FinOps: Transform your practice with FOCUS and Microsoft Fabric appeared first on Microsoft Fabric Blog.

Microsoft Fabric, explained for existing Synapse users
https://blog.fabric.microsoft.com/en-us/blog/microsoft-fabric-explained-for-existing-synapse-users?ft=All
Mon, 20 Nov 2023 16:31:04 +0000

Earlier this year, at Microsoft Build, we introduced, in Public Preview, Microsoft Fabric, “the biggest data product announcement since SQL Server”. Today, we are announcing the General Availability of Microsoft Fabric.

Arun explains in detail why we all believe Microsoft Fabric will redefine the current analytics landscape. I will focus here on what it means for customers who are using the current Platform-as-a-Service (PaaS) version of Synapse, explaining what it means for your current investments (spoiler: we fully support them), but also how to think about the future.

What happens with PaaS Azure Synapse Analytics

The PaaS offering of Azure Synapse Analytics is an enterprise analytics service designed to accelerate time to insight across data warehouses and big data systems. It brings together the SQL technologies used in enterprise data warehousing, Azure Data Factory pipelines, Apache Spark technologies for big data, and Azure Data Explorer for log and time series analytics.

Microsoft has no current plans to retire Azure Synapse Analytics. Customers can continue to deploy, operate, and expand the PaaS offering of Azure Synapse Analytics. Rest assured, should these plans change, Microsoft will provide you with advanced notice and will adhere to the support commitments in our Modern Lifecycle Policy in order to ensure our customers’ needs are met.

The evolution of Microsoft’s big data analytics products

The next versions of our big data analytics products are now a core part of Microsoft Fabric.

Fabric opens new architectural horizons for our analytical engines. Fabric offers a unified storage abstraction for all your data, OneLake, organized into a logical data mesh, with federated governance and granular control and an intuitive, personalized data hub. All Fabric engines separate storage from compute, and store data in OneLake using a single, open data format.

On this new foundation, we can invent new, unprecedented ways of deploying pipelines, data warehousing, data engineering, data science, observability and real-time analytics technologies, to ultimately simplify and increase the efficiency of our customers’ solutions. Fabric allows us, and our customers, to do more. This is why most of our innovation efforts will be focused on Fabric.

How to think about your current Azure PaaS Synapse Analytics solutions

As mentioned above, there is no immediate need to change anything, as the current platform is fully supported by Microsoft. Your existing solutions will keep working. Your in-progress deployments can continue, all with our full support.

However, you probably have already started thinking about a Microsoft Fabric future for your analytics solutions. The following steps may help you with this thought process.

Understand Microsoft Fabric

Microsoft Fabric represents a significant upgrade to all our analytics engines. All of them are improved, faster, and more scalable. And there is a lot to learn about the new engines and how to best use them. Fabric reimagines collaboration and empowers business users in unprecedented ways. But it is much more than just better engines or better integration.

The unified, open data format means that there is no need to copy data from one engine to another. You can shape data using the technology of your choice, then query it with any other technology.

Fabric introduces completely new ways to make your data part of your analytics landscape. Shortcuts (within Azure, or cross cloud), database mirroring, seamless access to Dataverse and M365 data, all these solutions are designed to remove friction and costs.

Understanding these technologies will enable you to make the best out of Fabric, in terms of efficiency, agility and costs.

Our teams have worked hard to produce detailed documentation for all the Fabric concepts, and the best complement to the documentation is hands-on experience. The easiest way to understand Fabric in depth is to try the product: Microsoft Fabric free trial. Arun’s blog spells out clearly how to learn more about Microsoft Fabric.

Understand what it means for your solution

Your analytics solution may use different technologies and engines. Fabric is a complete analytics platform, so you will find, inside Microsoft Fabric, new and enhanced analytics capabilities of the products with which you are familiar today.

Fabric brings new capabilities that have no parallel in the current PaaS Synapse Analytics offering. The Fabric SQL Engine can operate, with equal performance, scale, and security, over any OneLake artifact (warehouses, lakehouses, mirrored databases). It also supports cross-artifact operations, removing the need for extra copies, while Power BI, in DirectLake mode, can now analyze real-time streaming data or Spark output.

All these changes enable simpler, more efficient solutions, removing the need for intermediate steps and multiple data copies. Your solution can get significantly simpler and cheaper.

Below, I use one example of common PaaS Azure Synapse Analytics architectures, together with a possibly more efficient solution in Fabric, to demonstrate such potential simplifications.

Example 1: Data Lake, from Synapse to Fabric

Today, you may prepare your data in an Azure Data Lake Storage Gen2 (ADLSg2) lakehouse (typically using Spark, Synapse or Azure Databricks), then use a pipeline to load data into a Synapse SQL Dedicated Pool, then use Power BI or some other BI tool for your report.

Image: Building the Lakehouse - Implementing a Data Lake Strategy with Azure Synapse

You can keep your current solution intact, and upgrade to Fabric engines.

In Fabric, however, this solution can be simplified:

  • A Data Engineering Lakehouse, in Microsoft Fabric, allows you to use your current ADLSg2 data, as prepared with Synapse Spark or Azure Databricks (via shortcuts).
  • The SQL Analytics Endpoint allows you to apply the security rules from the Dedicated Pool directly over the Lakehouse. There is no need for a dedicated capacity, nor for the pipeline copying from the lake to your warehouse.
  • Using the new DirectLake mode, Power BI can now operate directly over the Lakehouse, with performance similar to Import. Your other BI tools can continue to operate over the SQL Analytics Endpoint.
  • By migrating your Notebooks and Spark Jobs to Fabric Spark, your Lakehouse data will be automatically optimized for all the other Fabric engines (while also being stored in an open format).

To learn more about the Lakehouse pattern in Microsoft Fabric, please visit Lakehouse end-to-end scenario: overview and architecture – Microsoft Fabric | Microsoft Learn

Assess our migration tools and processes

We are investing significant development efforts in migration processes and tooling. And our migration efforts are prioritizing current PaaS Synapse Analytics customers.

The processes and tools we are designing are intended to minimize the friction, disruption and cost for our existing customers.

As you will see in the section on Migration Resources, we are developing tools to:

  • Use your data in-place whenever possible
  • Reuse code investments (pipelines, notebooks) when possible
  • Migrate code (stored procedures, views, notebooks)

These investments are not complete. We will keep posting updates to our migration tools. Join the fast-growing Fabric community, and our specialists as well as external experts will be ready to work with you. The Fabric Ideas forum, on the community site, is the best way to suggest new features, and it is closely monitored by the Microsoft Fabric product teams.

Develop, then plan to deploy a migration strategy

After learning about Fabric and evaluating the product, you will have developed enough confidence in the new Fabric engines and the migration technology to move your solution to Fabric. For some of you this may happen soon; for others it may take years.

There is no rush – we will keep supporting your existing solutions – but we are ready for you to migrate whenever the time is right.

When you are ready to move your solution to Fabric, you will be able to exchange your existing 1- or 3-year Synapse Reserved Instance (RI) purchases for 1-year Fabric RI purchases to continue to apply your reservation discounts in Fabric. Additionally, if you want to increase your RI commitment for your Fabric portfolio, you will have access to discounts of more than 40% over the Fabric pay-as-you-go pricing.

In the next sections, the product leaders explain how to think about Fabric from the perspective of different PaaS Synapse Analytics workloads.

Data Factory Pipelines

Data Factory in Microsoft Fabric brings Power Query and Azure Data Factory together into a modern, trusted data integration experience that empowers data and business professionals to extract, load, and transform data for their organization. In addition, powerful data orchestration capabilities enable you to build simple to complex data workflows that orchestrate the steps needed for your data integration needs.

Key concepts in Data Factory in Microsoft Fabric include:

  • Get Data and Transformation with Dataflow Generation 2 is an evolution of Dataflow in Power BI. Dataflow Generation 2 is re-architected to leverage Fabric compute engines for data processing and transformation. This enables Dataflow Generation 2 to ingest and transform data at any scale.
  • Data Orchestration with Data Pipelines – For customers familiar with Azure Data Factory (ADF), data pipelines in Microsoft Fabric use the same technology that powers Azure Data Factory. As part of the GA of Fabric, data pipelines in Microsoft Fabric will have most of the activities available in ADF. See here a list of activities that will be part of data pipelines in Fabric. SSIS activity will be added to data pipelines by Q2 CY2024.
  • Enterprise-ready Data Movement – From petabyte-scale data to small data, Data Factory provides a serverless and intelligent data movement platform that enables you to move data between diverse data sources and data destinations reliably. With support for 170+ connectors, Data Factory in Fabric enables you to move data across multiple clouds, on-premises data sources, and virtual networks (VNets). Intelligent throughput optimization enables the data movement platform to automatically detect the size of the compute needed for data movement.

To enable customers to upgrade to Microsoft Fabric from Azure Data Factory (ADF), we will be supporting the following:

  • Data pipelines activities – For many of the activities that you use in ADF, we have added these into Data Factory in Fabric. In addition, we have added new activities (e.g. Teams, Outlook) for notifications. See here for a list of activities that are available in Data Factory in Fabric.
  • OneLake/Lakehouse connector in Azure Data Factory – For many ADF customers, you can now integrate with Microsoft Fabric and bring data into OneLake.
  • Azure Data Factory Mapping Dataflow to Fabric – We have put together a guide for ADF customers who are looking at building new data transformations in Fabric. Find out more at https://aka.ms/datafactoryfabric/docs/guideformappingdataflowusers

In addition, if you are looking at migrating your ADF mapping dataflows to Fabric, you can leverage sample code from the Fabric Customer Advisory Team (Fabric CAT) to convert mapping dataflows to Spark code. Find out more at https://github.com/sethiaarun/mapping-data-flow-to-spark

As part of Data Factory in Fabric roadmap, we will be working towards the preview of the following by Q2 CY2024:

  • Mounting of Azure Data Factory in Fabric – This enables customers to mount their existing Azure Data Factory in Microsoft Fabric. All ADF pipelines will work as-is and continue running on Azure, while enabling you to explore Fabric and work out an upgrade plan.
  • Upgrade from Azure Data Factory pipelines to Fabric – We will be working with customers and the community on learning how we can best support upgrades of data pipelines from ADF to Fabric. As part of this, we will deliver an upgrade experience that empowers you to test your existing data pipelines in Fabric using mounting and upgrading the data pipelines.

Learn more about how you can upgrade to Data Factory in Fabric – https://aka.ms/datafactoryfabric/upgradetofabric

Synapse Data Warehouse

Fabric Data Warehouse is the next generation of data warehousing in Microsoft Fabric. It is the first transactional data warehouse to natively support an open data format, enabling data engineers and business users to collaborate seamlessly without compromising security or governance. Just like the previous data warehouse generation, it provides multi-table ACID transactional guarantees. It is built on the well-established SQL Server Query Optimizer and Distributed Query Processing engine but comes with major improvements that address many of the challenges customers face in enabling workloads associated with modern analytics. These improvements were driven by rearchitecting the data warehouse, leveraging IP from both Dedicated and Serverless SQL Pools along with:

  • Separation of storage and compute: data is stored in OneLake and is clearly separated from the compute used by the SQL engine. There is an elastic allocation of compute resources based on demand, as well as use of distinct compute resources for different workload types on top of the same data.
  • Leveraging the infinite compute capabilities of Azure Cloud: giving us the capability of going beyond a limited topology offered by the Synapse Gen2 architecture.
  • Support for open data format: allowing a single copy of the data to be used by all the Fabric workloads such as Data Science, Data Engineering, and Power BI.

With this new architecture, the new engine enables numerous new capabilities that were not possible in either Dedicated or Serverless SQL Pools, such as:

  1. Cross database querying without any ETL or data movement.
  2. Cloning without creating copies of the data.
  3. Autoscaling enabling elastic scale up and down of the compute nodes with dynamic resource allocation tailored to data volume, usage, or query complexity.
  4. Enabling a pay-for-what-you-use pricing model.
  5. No-knobs performance via automated query optimizations, statistics, and data distributions.

All of this with the concepts familiar to SQL users such as Views, Stored Procedures, SQL security (row-level security, column-level security, dynamic data masking) and full benefits of the T-SQL tooling ecosystem.

These architectural changes cannot be backported to either one of the old engines. Because of the open format, your data warehouses cannot be upgraded in place either. Data stored in a proprietary format in Gen2 needs to be extracted and stored in the open format of Fabric.

A migration can be done at your own pace when you are ready to leverage these new capabilities, and migration resources are available to you now.

In addition, we have also started working on an in-product Migration Assistant that will automatically detect and convert your Synapse Gen2 code to Fabric Data Warehouse code. It will also redirect your endpoints, so you don’t have to worry about application migration. We anticipate this to be available in CY24.

Synapse Data Engineering

Fabric Data Engineering is our big data analytics workload in Fabric, empowering data engineers to leverage the power of Apache Spark to transform their data at scale and build out a lakehouse architecture. The Fabric Data Engineering experience targets users of Apache Spark pools in the Azure Synapse Analytics world. Here are some of the key takeaways regarding the Fabric Data Engineering experience:

Runtime for big data workloads

Every Fabric workspace comes pre-wired with a ‘starter pool’ (default Spark cluster) with a Fabric Runtime that contains up-to-date versions of Spark, Delta, Java, and Python. Just like in Azure Synapse Analytics, customers can also create their own custom clusters with their own configurations and libraries if they want.

The Apache Spark experience in Fabric also contains many new and exciting enhancements:

  • Starter pools in Fabric are automatically kept live, meaning users can enjoy sessions that start within ~15 seconds
  • High concurrency mode in Fabric means multiple notebooks can be attached to a single session, accelerating the start-up times and reducing costs
  • Spark clusters start all the way from a single node, further reducing the costs of getting started with Spark

Simplified lakehouse architecture

Every Fabric workspace also comes pre-wired with OneLake, our SaaSified data lake for the organization. Users can easily create lakehouse items, which are the perfect container for bringing in all your data into OneLake using Spark, dataflows and pipelines. Existing data can be easily included with no data movement through the use of shortcuts. We will also automatically discover metadata of Delta tables for you, making it super easy to start working with existing data with zero friction. Additionally, we have reduced the price for Spark in Fabric by almost 40% vs. the retail price of Synapse Spark.

Here are some other exciting things to keep in mind about the lakehouse in Fabric:

  • Every lakehouse comes with a built-in SQL endpoint and Power BI dataset. This means that as soon as you transform your data with Spark, you can start querying it using our SQL engine and Power BI, with no data movement necessary
  • Spark (along with every other Fabric engine) automatically writes data into the lakehouse with v-order enabled, optimizing it for BI reporting

First Class Developer Experiences

The Synapse Data Engineering experience brings in familiar authoring tools, including notebooks for interactive querying experiences and Spark Job Definitions for submitting batch jobs. These capabilities come with a variety of new enhancements and users even have some new authoring experiences to look forward to:

  • Notebooks in Fabric include numerous usability improvements, including auto-save, real-time collaboration and commenting, a built-in file system, and native file-format support when checking into git. Users can also make use of lightweight scheduling (in addition to using the pipeline activity).
  • Spark Job Definitions come with retry-policy support, making it easier to continuously run long-running streaming jobs
  • Native VS Code support makes it easy to work with your Data Engineering items (notebooks, Spark Jobs, lakehouse) in your favorite IDE, including full debugging support
  • The newly released environment item streamlines the packaging of your Spark configurations, libraries, cluster settings, and more, and simplifies the reuse of your hardware and software environment across your code artifacts.

To summarize, with Synapse Data Engineering you can start building on top of your existing Azure Synapse Spark investments quickly and incrementally. Start by leveraging shortcuts to existing data in your data lake and bringing in your notebooks using the import capability. We are starting work on an in-product migration assistant; in the meantime, please use our newly published Azure Synapse Spark to Fabric Migration Guidance whitepaper.

Synapse Data Science

Synapse Data Science empowers data scientists to explore their data and to build and operationalize predictive models. Coming from the Azure Synapse Analytics world, you will see many familiar constructs: Python and R baked into the runtime along with many popular ML packages, the ability to install your own third-party and custom libraries, and the availability of SynapseML, our open-source library for creating massively scalable ML pipelines.

Fabric Data Science offers a variety of new capabilities data scientists can look forward to:

Model & Experiment tracking

Data scientists can leverage experiments and models as readily available items in the Fabric workspace. Support for ML models and experiments lets users manage models and track experiment runs using standard MLflow APIs. Comparison experiences make it easy to compare different experiment runs, and autologging captures key metrics automatically as users author model-training code.
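In code, this follows the standard MLflow pattern (`log_param`, `log_metric`) that Fabric experiments accept. Since the real `mlflow` package and a Fabric workspace aren't available here, the sketch below mimics that API shape with a tiny stdlib-only tracker, just to show the pattern a training loop follows — it is not the MLflow implementation itself.

```python
import math

class Run:
    """A tiny stand-in for an MLflow run: collects params and metrics.

    Mimics the shape of the standard MLflow API (mlflow.log_param /
    mlflow.log_metric); in a Fabric notebook you would use the real
    mlflow package and the run would appear in an experiment item."""
    def __init__(self, experiment: str):
        self.experiment = experiment
        self.params: dict = {}
        self.metrics: dict = {}

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        self.metrics.setdefault(key, []).append(value)

def train(run: Run, learning_rate: float, steps: int) -> float:
    """Toy 'training' loop that logs one metric per step."""
    run.log_param("learning_rate", learning_rate)
    loss = 1.0
    for _ in range(steps):
        loss *= math.exp(-learning_rate)  # toy exponential loss decay
        run.log_metric("loss", loss)
    return loss

run = Run("synapse-migration-demo")        # experiment name is invented
final_loss = train(run, learning_rate=0.1, steps=10)
print(f"final loss: {final_loss:.4f}")
```

The comparison experience in Fabric then works across the metric history captured this way (here, the ten logged `loss` values).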

Model batch scoring

To operationalize their ML models, users can leverage the scalable PREDICT function for distributed batch scoring on Spark. This capability exists in Azure Synapse today, so existing Synapse users should feel right at home. The Fabric Data Science experience adds a low-code UI for scoring data and tight integration with the lakehouse, making it easy to enrich data and surface it in Power BI reports with zero friction.
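Conceptually, PREDICT applies a registered model over partitions of data in batches. The stdlib-only sketch below shows that batch-scoring pattern in miniature; note that the real function runs distributed on Spark and takes a registered model rather than a plain Python callable, which is a stand-in here.

```python
from typing import Callable, Iterable, List

def predict_batches(model: Callable[[List[float]], List[float]],
                    rows: Iterable[float],
                    batch_size: int = 4) -> List[float]:
    """Score rows in fixed-size batches — the pattern that distributed
    batch scoring applies across partitions. The `model` callable is a
    stand-in for a registered ML model."""
    rows = list(rows)
    scores: List[float] = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        scores.extend(model(batch))  # one model invocation per batch
    return scores

# A toy "model": doubles each feature value.
double = lambda batch: [2.0 * x for x in batch]
print(predict_batches(double, range(6)))  # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```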

Data Exploration & Enrichments

Fabric Data Science offers many innovative solutions in the space of exploring and transforming your data. These include:

  • Data Wrangler – a low-code UI for carrying out data transformations that automatically generates Python code
  • Semantic Link – a library enabling seamless connectivity to the Power BI semantic model from data science tools like notebooks
  • Pre-built AI models – a newly released public preview capability providing built-in access to Azure AI services such as text analytics and translation
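To make the code-generation idea concrete: Data Wrangler emits pandas code for each UI step. The stdlib-only sketch below is illustrative, not literal Data Wrangler output — it shows the same kind of generated steps (drop duplicate rows, fill a missing value) in a form that runs anywhere.

```python
# Illustrative only: Data Wrangler generates pandas code in the notebook;
# this stdlib sketch mirrors the kind of steps it emits.
rows = [
    {"city": "Oslo", "temp": 3.0},
    {"city": "Oslo", "temp": 3.0},    # exact duplicate
    {"city": "Cairo", "temp": None},  # missing value
]

# Generated step 1: drop exact duplicate rows, preserving order.
seen, deduped = set(), []
for row in rows:
    key = tuple(sorted(row.items()))
    if key not in seen:
        seen.add(key)
        deduped.append(row)

# Generated step 2: fill missing 'temp' values with the column mean.
known = [r["temp"] for r in deduped if r["temp"] is not None]
mean_temp = sum(known) / len(known)
for r in deduped:
    if r["temp"] is None:
        r["temp"] = mean_temp

print(deduped)  # [{'city': 'Oslo', 'temp': 3.0}, {'city': 'Cairo', 'temp': 3.0}]
```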

The migration path for a data scientist in Azure Synapse Analytics is similar to that of a Spark data engineer: they will need to consider their notebooks, Spark pools, and data. We recommend starting with the Azure Synapse Spark to Fabric Migration Guidance whitepaper.

Synapse Real-time Analytics

Synapse Real-time Analytics is a robust platform tailored to deliver real-time data insights and observability analytics for a wide range of data types, including time-based observability data such as logs, events, and telemetry. It’s the true streaming experience in Fabric! Building on the same foundation as Azure Synapse Data Explorer, Synapse Real-time Analytics equips both citizen data scientists and professional data engineers with a suite of features and tools to fully unleash the potential of their data.

Rapid Deployment

Experience unmatched efficiency by creating a database, ingesting data, running queries, and generating Power BI reports, all within a 5-minute timeframe. Real-time Analytics puts speed at the forefront, allowing you to dive into data analysis without delay.

Get Data

For an authentic streaming experience in Fabric, the “Get Data” feature has received a modern facelift with an intuitive design and user-friendly interface. It simplifies data ingestion, accepting any data format or structure from various sources in either streaming or batch mode. Your data becomes queryable within seconds.

Query Versatility

Whether you’re a Kusto Query Language (KQL) enthusiast or prefer traditional SQL, Real-time Analytics accommodates your needs. This service enables you to generate quick KQL or SQL queries, ensuring that you can work in your preferred language and obtain results swiftly. It doesn’t matter if you’re working with a small dataset (a few gigabytes), a medium-sized one (a few terabytes), or even massive datasets (in the petabytes range).
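To make the KQL/SQL duality concrete, here is a toy example that uses Python's built-in `sqlite3` as a stand-in for the KQL database engine (the `Events` table and its schema are invented for illustration); the comment shows the same aggregation written in both languages.

```python
import sqlite3

# Toy stand-in for a KQL database: a few telemetry events in SQLite.
# (Table name and schema are invented for illustration.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (level TEXT, durationMs REAL)")
conn.executemany("INSERT INTO Events VALUES (?, ?)", [
    ("Error", 120.0), ("Info", 15.0), ("Error", 300.0), ("Info", 10.0),
])

# The same aggregation in the two query languages Real-time Analytics accepts:
#   KQL:  Events | where level == "Error" | summarize avg(durationMs)
#   SQL:  SELECT AVG(durationMs) FROM Events WHERE level = 'Error'
(avg_error_ms,) = conn.execute(
    "SELECT AVG(durationMs) FROM Events WHERE level = 'Error'"
).fetchone()
print(avg_error_ms)  # 210.0
```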

Data Exploration

Fabric Real-Time Analytics offers a multitude of innovative solutions for exploring and visualizing your data, including:

KQL Queryset: A workbench for creating, managing, and sharing your queries.

Power BI Report: A one-click option to generate a Power BI report on top of any query or table.

Notebook: Seamlessly connect your Fabric Notebook with the KQL Database for data ingestion and querying.

NL2KQL (Coming Soon): Write your query in natural language, and Fabric will generate and execute the corresponding KQL query for you.

Real-Time Dashboard (Coming Soon): The Fabric Real-Time Dashboard is a collection of tiles that enable native export of Kusto Query Language (KQL) queries as visuals. This allows for easy query modification and visual formatting, enhancing data exploration and delivering superior query and visualization performance.

Fabric Real-Time Analytics is your gateway to real-time insights and a streamlined data analysis experience. Whether you’re pioneering new data horizons or looking to optimize your data analytics solutions, this service is your trusted partner. Stay ahead in the data game and embark on your journey with Fabric Real-Time Analytics today.

For more information on Fabric Real-Time Analytics, visit the general availability blog.

Migration planning

Fabric KQL databases are 100% compatible with Azure Data Explorer (ADX) and Azure Synapse Data Explorer (Preview) and are powered by the same technology. This means that all current applications, SDKs, integrations, and tools that work with ADX will continue to work smoothly with Fabric KQL databases.

There is a broad set of capabilities to support mixed environments and migrations; some are available now and some will light up over the coming months.

  • Available now:
    • Full binary compatibility of APIs, SDKs and tools.
    • Create a database shortcut to host a read-only, in-place, up-to-date instance of the database in Fabric.
  • Coming over the next months:
    • Migrate an Azure Synapse Data Explorer pool from a Synapse workspace and attach it to a Fabric workspace
    • Attach an Azure Data Explorer cluster to a Fabric workspace
    • Sync Azure Data Explorer user queries and dashboards into a Fabric workspace query sets and dashboards

Migration Resources

Azure Data Factory

Azure Synapse DW to Fabric Migration Guidance

Azure Synapse Spark to Fabric Migration Guidance

Azure Synapse Data Explorer and Azure Data Explorer to Fabric Migration Guidance

The post Microsoft Fabric, explained for existing Synapse users appeared first on Microsoft Fabric Blog.
