Data Engineering News and Insights | Microsoft Fabric Blog http://approjects.co.za/?big=en-us/microsoft-fabric/blog/tag/data-engineering/ Sat, 14 Feb 2026 00:04:31 +0000

Sessions you won’t want to miss at FabCon Vienna http://approjects.co.za/?big=en-us/microsoft-fabric/blog/2025/07/28/sessions-you-wont-want-to-miss-at-fabcon-vienna/ Mon, 28 Jul 2025 15:00:00 +0000 From September 15 to 18, FabCon Vienna will feature over 130 sessions, 150 expert speakers, 10 hands-on workshops, and 45 exhibitors.

The post Sessions you won’t want to miss at FabCon Vienna appeared first on Microsoft Fabric Blog.

Following last year’s sold-out debut in Stockholm, the Microsoft Fabric Community Conference is returning to Europe in Vienna, Austria! From September 15 to 18, FabCon Vienna will feature over 130 sessions, 150 expert speakers, 10 hands-on workshops, and 45 exhibitors. FabCon Vienna is your opportunity to dive deep into the latest Microsoft Fabric capabilities, hear directly from Microsoft product leaders and community experts, explore new features, and gain practical insights you can bring back to your organization.

This year’s agenda is packed with sessions tailored to every stage of your Fabric journey. Explore key sessions across Power BI, AI, databases, security and governance, and Microsoft OneLake, and get a first look at the newest features and what’s coming next on the roadmap. Whether you’re looking to sharpen your skills, dive into data stewardship best practices, or get started with Microsoft Copilot in Fabric, you’ll find sessions designed to meet you where you are and help you go further.

To make the most of your time at FabCon Vienna, look through our list of sessions you won’t want to miss. We also highly recommend attending core notes from the teams building Microsoft Fabric. These sessions offer strategic insights into what’s new, what’s coming, and how to maximize your experience at the event.

Fabric core note sessions

Power BI

Chat with your data through AI-powered search and analytics

Session speakers: Lada Hill and Eun Hee Kim

Discover how Microsoft Fabric Copilot is changing the way users explore data in Power BI. This session dives into the Chat with your Data experience, showing how to ask smarter questions, uncover insights faster, and get more value from your reports. Hear from the Power BI product team on how to optimize your prompts and make the most of Copilot’s capabilities. Plus, get a sneak peek at upcoming features that will take AI-powered analytics even further.


Power BI DataViz World Championship – European Edition

Join us for a high-energy, live competition where four standout data creators go head-to-head in a timed Power BI visualization challenge. Using the same dataset, each competitor will build compelling reports that showcase creativity, storytelling, and technical skill. A panel of celebrity judges will evaluate the results and crown the FabCon Viz Champion, with the winner’s work featured across the community. Whether you’re a Power BI pro or just love great data stories, this is your front-row seat to inspiration, innovation, and a little friendly competition.

The latest in AI

Fabric and Azure AI Foundry playing nicely together

Session speaker: Grímur Sæmundsson

Explore how Microsoft Fabric and Azure AI Foundry work together to streamline employee assessments in the public sector. This session walks through a real-world solution in which Fabric handles data processing and Azure OpenAI enhances analysis and feedback generation. Learn how retrieval-augmented generation is used to embed guidelines, and see Notebooks, Semantic Link, and PySpark in action to retrieve and prepare data. You’ll walk away with practical insights into using LLMs and Fabric to automate complex evaluation workflows.

Databases

SQL Server 2025: The AI-ready enterprise Database Connected with Microsoft Fabric

Session speakers: Bob Ward and Uros Milanovic

Discover what’s new in SQL Server 2025—now with built-in AI, enhanced performance, and deep integration with Azure and Microsoft Fabric. Learn how SQL enables AI applications both on-premises and in the cloud, with consistent capabilities from ground to cloud to Fabric. This session covers key features designed for modern database developers, making it easier than ever to build intelligent, connected apps.

Real-Time Intelligence

Unlock the power of Digital Twin solutions with Real-Time Intelligence

Session speakers: Chafia Aouissi and Jomit Vaghela

Explore how Microsoft Fabric’s Digital Twin Builder helps you design AI-ready digital twin solutions using real-time data, ontology management, and contextualization. Learn how to map, model, and analyze real-world systems for deeper insights, predictive maintenance, and smarter decision-making. Whether you’re just getting started or looking to scale, this session offers practical guidance on building and optimizing digital twins with Fabric Real-Time Intelligence.

Data warehouse and data engineering

Accelerating Fabric Migration: New Assistant Tools for Data Engineering and Warehousing

Session speakers: Jenny Jiang and Ancy Philip

Learn how Microsoft’s new migration assistants simplify moving from Synapse to Microsoft Fabric. This session covers tools for Spark and Data Warehouse migrations, highlighting key features, feature parity, and differences to guide your strategy. See live demos, explore upcoming capabilities, and leave with practical tips to ensure a smooth and efficient migration to Fabric.


Mastering Microsoft Fabric Data Warehousing: Tips & Tricks You Need to Know

Session speaker: Kristyna Ferris

Learn practical tips to optimize performance and manage your Microsoft Fabric data warehouse more effectively. This session covers creating case-insensitive warehouses, monitoring and tuning query performance, and stopping rogue queries that threaten capacity. Packed with real-world examples and actionable guidance, you’ll leave with strategies you can apply immediately to keep your data warehouse stable and efficient.


Revolutionizing external data access in Fabric Data Warehouse

Session speakers: Jovan Popovic and Twinkle Cyril

Discover how Microsoft Fabric Data Warehouse transforms external data access with new capabilities for reading and integrating data without ingestion. Learn to use external tables and OPENROWSET to query Delta Lake, parquet, and CSV files directly from OneLake, Lakehouse, and real-time analytics sources. This session highlights key enhancements to external tables, COPY INTO, and virtualization techniques—showcasing how Fabric unifies warehouse and lakehouse concepts into an open, modern platform.


Workspace strategy for Data Engineering in Microsoft Fabric

Session speaker: Ásgeir Gunnarsson

Choosing the right workspace strategy is critical to building scalable data engineering solutions in Microsoft Fabric. This session examines different approaches—single workspace, per stage, or per workload—and how factors like team size, DevOps practices, and security requirements influence your decision. Using the Medallion architecture as a guide, we’ll explore common challenges, practical workarounds, and key considerations to help you start strong and avoid a costly rework later.

Security and governance

Govern, manage, and protect your data in Microsoft Fabric

Session speakers: Yaron Canari and Adi Regev

Learn how Microsoft Fabric helps organizations govern, manage, and protect their analytics data with built-in compliance and security features. This session covers local governance tools within Fabric and how they integrate with Microsoft Purview for broader, enterprise-wide control. Gain practical insights into securing your data estate while staying compliant and in control.


Fabric security: Everything you need to know!

Session speakers: Kasper de Jonge and Anton Fritz

Microsoft Fabric offers a SaaS-first approach to data that includes powerful security features out of the box—but do you know what you’re getting? This session explores how Fabric handles authentication, inbound access, data storage, and user-level permissions. Learn how to secure your data estate, control access, and integrate governance with Microsoft Purview. Walk away ready to engage your security team with confidence.

Microsoft OneLake

Deep dive into Delta (Parquet) and OneLake: Unpacking the storage behind Microsoft Fabric

Session speaker: Steve Campbell

Explore the core storage technologies that power Microsoft Fabric—OneLake, Delta, and Parquet—and learn how they work together to enable scalable, lake-centric analytics. This session breaks down Delta’s key features like ACID transactions, schema evolution, and time travel, without diving into heavy code or jargon. With real-world examples and visual aids, you’ll gain the foundational knowledge to make smart architectural decisions and optimize storage performance in your Fabric solutions. Perfect for data engineers, analysts, and IT pros familiar with Fabric but new to its storage underpinnings.
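To give a feel for how time travel can work on top of immutable Parquet files, the toy model below keeps a list of commits, each adding or removing data files, and reconstructs the table's file set at any version by replaying the log. The commit structure, function name, and file names are simplified and hypothetical; Delta's actual log format carries considerably more metadata.

```python
def files_at_version(log, version):
    """Replay commits 0..version and return the set of live data files."""
    live = set()
    for commit in log[: version + 1]:
        live -= set(commit.get("remove", []))
        live |= set(commit.get("add", []))
    return live


# Each commit is treated as atomic: a reader sees all of its changes or none.
log = [
    {"add": ["part-000.parquet"]},  # version 0: initial load
    {"add": ["part-001.parquet"]},  # version 1: append
    {"remove": ["part-000.parquet"], "add": ["part-002.parquet"]},  # version 2: rewrite
]

print(sorted(files_at_version(log, 2)))  # files in the current table
print(sorted(files_at_version(log, 0)))  # "time travel" back to version 0
```

Because old files are only marked as removed in the log rather than edited in place, earlier versions stay readable for as long as those files are retained.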

Additional can’t-miss sessions

Git good: Best practices for CI/CD and collaboration in Microsoft Fabric

Session speaker: Peer Grønnerup

Take your Fabric projects to the next level with practical strategies for CI/CD, Git integration, and team collaboration. Learn how to structure repos, automate deployments with Fabric CLI and fabric-cicd, and build pipelines using Azure DevOps or GitHub Actions. Peer, a Fabric expert with over 15 years of experience in data and BI, will share real-world tips, branching strategies, and ready-to-use templates to help you scale workflows and maintain quality.


We’re at capacity—now what?

Session speaker: Frederik Declerck

Fabric capacities simplify data operations and cost control—but hitting limits can still catch teams off guard. In this session, we’ll demystify bursting, smoothing, and how background activity can unexpectedly max out your capacity. Learn how to diagnose issues using tools like the Capacity Metrics app and Monitoring Hub, and explore real-world strategies for short and long-term capacity management. We’ll also cover workload optimization, capacity planning, and new features like Autoscale Billing and surge protection to help you stay ahead of demand.
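As a rough illustration of the smoothing idea this session covers, the sketch below spreads each job's compute cost evenly across a window of timepoints and then checks which timepoints exceed the available capacity. The function names, window length, and numbers are hypothetical, and Fabric's actual accounting is more involved; the point is only to show how overlapping background work can push utilization over the limit after the jobs themselves were submitted.

```python
from collections import defaultdict


def smooth_usage(jobs, window):
    """Spread each job's total cost evenly over `window` timepoints.

    jobs: list of (start_timepoint, total_cost) tuples in hypothetical units.
    Returns a dict mapping timepoint -> smoothed cost charged at that timepoint.
    """
    usage = defaultdict(float)
    for start, total_cost in jobs:
        per_point = total_cost / window
        for t in range(start, start + window):
            usage[t] += per_point
    return dict(usage)


def overloaded_timepoints(usage, capacity_per_point):
    """Return the sorted timepoints whose smoothed cost exceeds capacity."""
    return sorted(t for t, cost in usage.items() if cost > capacity_per_point)


# Two hypothetical background jobs whose smoothed costs overlap.
jobs = [(0, 40.0), (2, 40.0)]
usage = smooth_usage(jobs, window=4)
print(usage)
print(overloaded_timepoints(usage, capacity_per_point=15.0))
```

Neither job exceeds the limit on its own in this toy setup; the overload appears only where their smoothed windows overlap.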

Explore more sessions and save your spot at FabCon Vienna

If you’re looking to see even more sessions and explore the full program, check out the complete schedule. You’ll find deep dives, hands-on workshops, and keynotes covering every corner of Microsoft Fabric and the future of AI-powered analytics.

A reminder that the European Microsoft Fabric Community Conference 2025 is an in-person only event. Don’t miss the opportunity to learn about Fabric and see firsthand how Microsoft can help your organization prepare for the era of AI. Sign up for the FabCon Vienna conference using the code MSCUST to save €200 on your registration. We’ll see you in Vienna!

Chart your course as a Microsoft Fabric Data Engineer with curated skilling and certifications http://approjects.co.za/?big=en-us/microsoft-fabric/blog/2025/06/04/chart-your-course-as-a-microsoft-fabric-data-engineer-with-curated-skilling-and-certifications/ Wed, 04 Jun 2025 15:00:00 +0000 With Microsoft Fabric's unified platform and integrated AI capabilities, professionals are equipped to design and manage cutting-edge data solutions that drive business success.

The post Chart your course as a Microsoft Fabric Data Engineer with curated skilling and certifications appeared first on Microsoft Fabric Blog.

Organizations are constantly seeking more efficient ways to manage, analyze, and derive insights from their ever-growing data assets. With a unified analytics platform like Microsoft Fabric, they’re able to streamline their data processes—from collection to analysis—to make their data AI-ready and maintain a competitive edge. Central to this ecosystem are Fabric Data Engineers who design and manage advanced data solutions, ensuring that businesses can leverage their data effectively. In a recent episode of the Azure Essentials Show, we shed light on this pivotal role, complementing a suite of skilling resources available on Microsoft Learn, including the career-boosting Fabric Data Engineer Associate certification. 

Leverage insights and analysis as a Fabric Data Engineer 

Microsoft Fabric unifies data tools, streamlining collection, storage, processing, and analysis of structured and unstructured data. Its AI capabilities enable advanced analytics, intelligent applications, and predictive insights, helping businesses stay competitive. 

Becoming a Fabric Data Engineer offers strong career prospects, as these professionals build data pipelines that transform raw data into valuable insights. Fabric simplifies complex workflows, enhancing business intelligence and AI applications. With demand for data engineers rising, expertise in Fabric provides a competitive edge, making it easier to implement advanced solutions and drive innovation in a data-driven world. 

Start your journey with the latest Azure Essentials Show episode 

From guided learning paths to interactive labs and detailed documentation, Microsoft Learn offers a structured approach to mastering the skills needed to excel as a Fabric Data Engineer. As the demand for skilled data engineers continues to rise, a recent installment of the Azure Essentials Show explores this market trend and introduces viewers to the wealth of learning resources we have available.

The show’s hosts walk through these resources, demonstrating how they cater to learners at different skill levels—from beginners just starting their data journey to experienced professionals looking to upskill in Microsoft Fabric. Here’s a rundown of what’s included: 

Elevate your Microsoft Fabric data engineering skills: Prepare for Exam DP-700 

Our official Plan on Microsoft Learn, Elevate your Microsoft Fabric data engineering skills: Prepare for Exam DP-700, is designed to prepare you for the DP-700 Fabric Data Engineer Associate certification exam. It offers a learning outcome and milestone-based approach that encourages continuous learning, and includes essential, curated Microsoft Learn modules. After completing it, you should be able to: 

  • Describe the core features and capabilities of lakehouses in Microsoft Fabric. 
  • Use Apache Spark DataFrames to analyze and transform data. 
  • Use Real-Time Intelligence to ingest, query, and process streams of data. 
  • Request your DP-700 exam voucher. 

Microsoft Fabric Data Engineer certification 

To validate your newfound expertise, you can take the Microsoft Certified: Fabric Data Engineer Associate certification exam; passing it earns you an industry-recognized credential. This certification attests to your proficiency in data loading patterns, data architectures, and orchestration processes within Microsoft Fabric. Earning this certification not only enhances your credibility but also opens up advanced career opportunities in the data engineering field.

Enhance your Microsoft Fabric analytics engineering skills: Prepare for Exam DP-600 

In another Plan on Microsoft Learn, Enhance your Microsoft Fabric analytics engineering skills: Prepare for Exam DP-600, you’ll learn about Fabric through the lens of data analytics. From mastering the basics to advanced data processing and management, this Plan covers everything you need to ace the DP-600 Certification exam. After completing the Plan, you should be able to: 

  • Understand end-to-end analytics, including real-time intelligence. 
  • Gain proficiency in using Apache Spark and in ingesting data with Dataflows Gen2. 
  • Learn how to create, manage, and optimize data warehouses. 

We’re also excited to offer a limited number of free Microsoft certification exam vouchers for the DP-600 exam! To submit your request form, click here for the full eligibility rules.

Implement operational databases in Microsoft Fabric

Excited to see what else you can discover about Fabric on Microsoft Learn? We also have a learning path called ‘Implement operational databases in Microsoft Fabric’ that will guide you through the comprehensive process of creating and managing SQL databases within the Fabric environment. This course covers a range of important topics, including data modeling, query optimization, and performance tuning, all tailored to Fabric’s unique SQL capabilities. 

You’ll learn how to provision an SQL database, configure security settings, and perform essential database operations. Additionally, the course delves into advanced techniques for optimizing database performance and ensuring efficient data management. By the end of the learning path, you will have gained the expertise needed to effectively manage SQL databases within Microsoft Fabric, enabling you to leverage its powerful features for your organization’s data needs. 

Join the future of data analysis with Fabric 

Embarking on a career as a Fabric Data Engineer offers a pathway to be at the forefront of data innovation. With Microsoft Fabric’s unified platform and integrated AI capabilities, professionals in this role are equipped to design and manage cutting-edge data solutions that drive business success. To delve deeper into this exciting field, explore the featured episode of the Azure Essentials Show and consider pursuing the Fabric Data Engineer certification to validate and enhance your expertise. 


Don’t miss the FabCon Europe Super Early Bird discount 

And learning doesn’t stop there. Join us at FabCon Vienna this September to keep up the momentum!


FabCon Vienna brings to Austria the smashing success of last year’s Stockholm conference, with a wealth of cutting-edge learning opportunities from the world of data, analytics, and AI. Both Microsoft product team members and community experts will be leading sessions. You’ll get endless chances all week to engage with the Fabric and data communities through thoughtful discussions, attendee mixers, and interactive experiences. The lowest early-bird pricing expires at the end of May, so register for the FabCon conference today. Use the code MSCUST to save an additional €200! 

European Fabric Community Conference 2024: Building an AI-powered data platform http://approjects.co.za/?big=en-us/microsoft-fabric/blog/2024/09/25/european-fabric-community-conference-2024-building-an-ai-powered-data-platform/ Wed, 25 Sep 2024 07:00:00 +0000 Get a firsthand look at the latest capabilities we are bringing to the Microsoft Fabric platform.

The post European Fabric Community Conference 2024: Building an AI-powered data platform appeared first on Microsoft Fabric Blog.


Thank you to everyone joining us at the first annual European Microsoft Fabric Community Conference this week in Stockholm, Sweden! Besides seeing the beautiful views of Old Town, attendees are getting an immersive analytics and AI experience across 120 sessions, 3 keynotes, 10 workshops, an expo hall, community lounge, and so much more. They are seeing firsthand the latest capabilities we are bringing to the Fabric platform. For those unable to attend, this blog will highlight the most significant announcements that are already changing the way our customers interact with Fabric. 


Over 14,000 customers have invested in the promise of Microsoft Fabric to accelerate their analytics, including industry leaders like KPMG, Chanel, and Grupo Casas Bahia. For example, Chalhoub Group, a regional luxury retailer with over 750 experiential retail stores, used Microsoft Fabric to modernize its analytics and streamline its data sources into one platform, significantly speeding up its processes.

“It’s about what the technology enables us to achieve—a smarter, faster, and more connected operational environment.”

—Mark Hourany, Director of People Analytics, Chalhoub Group

Check out the myriad ways customers are using Microsoft Fabric to unlock more value from their data:

New capabilities coming to Microsoft Fabric

Since launching Fabric, we’ve released thousands of product updates to create a more complete data platform for our customers. And we aren’t slowing down anytime soon. We’re thrilled to share a new slate of announcements that are applying the power of AI to help you accelerate your data projects and get more done.

Specifically, these updates are focused on making sure Fabric can provide you with: 

  1. AI-powered development: Fabric can give teams the AI-powered tools needed for any data project in a pre-integrated and optimized SaaS environment.
  2. An AI-powered data estate: Fabric can help you access your entire multi-cloud data estate from a single, open data lake, work from the same copy of data across analytics engines, and use that data to power AI innovation.
  3. AI-powered insights: Fabric can empower everyone to better understand their data with AI-powered visuals and Q&A experiences embedded in the Microsoft 365 apps they use every day. 

Let’s look at the latest features and integrations we are announcing in each of these areas. 

AI-powered development

With Microsoft Fabric, you have a single platform that can handle all of your data projects with role-specific tools for data integration, data warehousing, data engineering, data science, real-time intelligence, and business intelligence. All of your data teams can work together in the same pre-integrated, optimized experience, and get started immediately with an intuitive UI and low code tools. All the workloads access the same unified data lake, OneLake, and work from a single pool of capacity to simplify the experience and ease collaboration. With built-in security and governance, you can secure your data from any intrusion and ensure only the right people have access to the right data. And as we continue to infuse Copilot and other AI experiences across Fabric, you can not only use Fabric for any application, but also accelerate time to production. In the video below, check out how users can take advantage of Copilot to create end-to-end solutions in Fabric: 

Today, I’m thrilled to share several new enhancements and capabilities coming to the platform and each workload in Fabric.

Fabric platform

We’re building platform-wide capabilities to help you more seamlessly manage DevOps and tackle projects of any scale and complexity. First, we’re updating the UI for deployment pipelines, now in preview, to be more focused, easier to navigate, and smoother in flow. Next, we’re introducing the Terraform provider for Fabric, in preview, to help customers ensure deployments and management tasks are executed accurately and consistently. The Terraform provider enables users to automate and streamline deployment and management processes using a declarative configuration language. We are also adding support for Azure service principal in Microsoft Fabric REST APIs to help customers automate the deployment and management of Fabric environments. You can manage principal permissions for Fabric workspaces, as well as the creation and management of Fabric artifacts like eventhouses and lakehouses.
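To make the service principal scenario a little more concrete, here is a minimal sketch that prepares (but does not send) a REST request to create a lakehouse in a workspace. It assumes an access token for the service principal has already been obtained. The base URL, endpoint path, and payload shape are assumptions to verify against the current Fabric REST API reference, and the helper name, workspace ID, and display name are placeholders.

```python
import json
import urllib.request

FABRIC_API_BASE = "https://api.fabric.microsoft.com/v1"  # assumed base URL


def build_create_lakehouse_request(workspace_id, display_name, access_token):
    """Build a POST request that would create a lakehouse in a workspace.

    The request is only constructed here; pass it to
    urllib.request.urlopen to actually send it.
    """
    url = f"{FABRIC_API_BASE}/workspaces/{workspace_id}/lakehouses"
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_create_lakehouse_request(
    workspace_id="00000000-0000-0000-0000-000000000000",  # placeholder
    display_name="sales_lakehouse",  # placeholder
    access_token="<token-for-service-principal>",  # placeholder
)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the automation easy to inspect and unit test before it touches a real environment.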

We’re excited to announce the general availability of Fabric Git integration. Sync Fabric workspaces with Git repositories, leverage version control, and collaborate seamlessly using Azure DevOps or GitHub. We are also extending our integration with Visual Studio Code (VS Code). You can now debug Fabric notebooks with the web version of VS Code and integrate Fabric environments as artifacts with the Synapse VS Code extension—allowing you to explore and manage Fabric environments from within VS Code. To learn more about these updates, read the Fabric September 2024 Update blog.

Security and governance

To help organizations govern the massive volumes of data across their data estate, we’re adding more granular data management capabilities, including item tagging and enhancements to domains, both of which are now in preview. We’re introducing the ability to apply tags to Fabric items, helping users more easily find and use the right data. Once tags are applied, data consumers can view, search, and filter by them across various experiences. We’re also enhancing domains and subdomains with more controls for admins, including the ability to define a default sensitivity label, domain-level export and sharing settings, and insights for admins on tenant domains. Finally, for data owners, we’re adding the ability to search for data by domain, to filter workspaces by domain, and to view domain details in a data item’s location.

Over the past year, we’ve launched a myriad of security features designed to secure your data at every step of the analytics journey. Two of our network security features, trusted workspace access and managed private endpoints, were previously only available in F64 or higher capacities. We’re excited to share that, based on your feedback, we are making these features available in all Fabric capacities. We’re also making managed private endpoints available in trial capacities as part of this release.

We’re also announcing deeper integration with Microsoft Purview, Microsoft’s unified data security, data governance, and compliance solution. Coming soon, security admins will be able to use Microsoft Purview Information Protection sensitivity labels to manage who has access to Fabric items with certain labels, similar to Microsoft 365. Also coming soon, we are extending support for Microsoft Purview Data Loss Prevention (DLP) policies, so security admins can apply DLP policies to detect the upload of sensitive data, like social security numbers, to a lakehouse in Fabric. If sensitive data is detected, the policy will trigger an automatic audit activity, can alert the security admin, and can even show a custom policy tip to data owners so they can remediate the issue themselves. These capabilities will be available at no additional cost during preview in the near term, but will be part of a new Purview pay-as-you-go consumptive model, with pricing details to follow in the future. Learn more about how to secure your Fabric data with Microsoft Purview by watching the following video: 

You can also complement and extend the built-in governance in Fabric by seamlessly connecting your Fabric data to the newly reimagined Purview Data Governance solution, which is now generally available. It offers an AI-powered, business-friendly, and unified experience that connects to data sources within Fabric and across your data estate to streamline and accelerate the activation of your modern data governance practice. Purview integrations enable Fabric customers to discover, secure, govern, and manage Fabric items from a single pane of glass within Purview for an end-to-end approach to their data estate. Learn more about these Microsoft Purview innovations.

Workload enhancements and updates

We’re also making significant updates across the six core workloads in Fabric: Data Factory, Data Engineering, Data Warehouse, Data Science, Real-Time Intelligence, and Microsoft Power BI.

Data Factory

In the Data Factory workload, built to help you solve some of the most complex data integration scenarios, we are simplifying the data ingestion experience with copy job, transforming the dataflow capability, and releasing enhancements for data pipelines. With copy job, now in preview, you can ingest data at petabyte scale, without creating a dataflow or data pipeline. Copy job supports full, batch, and incremental copy from any data sources to any data destinations. Next, we are releasing the Copilot in Fabric experience for Dataflows Gen2 into general availability—empowering everyone to design dataflows with the help of an AI-powered expert. We’re also releasing Fast Copy in Dataflows Gen2 into general availability, enabling you to ingest large amounts of data using the same high-performance backend for data movement used in Data Factory (e.g., “copy” activity in data pipelines, or copy job). Lastly for Dataflows Gen2, we are introducing incremental refresh into preview, allowing you to limit refreshes to just new or updated data to reduce refresh times.
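The incremental idea behind both incremental copy and incremental refresh can be sketched with a simple watermark: remember the latest modification time already copied, and move only rows newer than that. This is a generic illustration with hypothetical row shapes and function names, not a description of how copy job or Dataflows Gen2 are implemented.

```python
def incremental_copy(source_rows, destination, watermark):
    """Copy rows modified after `watermark` into `destination`.

    source_rows: list of dicts with "id" and "modified" keys (hypothetical).
    destination: dict keyed by row id, updated in place (upsert).
    Returns the new watermark to store for the next run.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["modified"] > watermark:
            destination[row["id"]] = row
            new_watermark = max(new_watermark, row["modified"])
    return new_watermark


source = [
    {"id": 1, "modified": 10, "amount": 5},
    {"id": 2, "modified": 20, "amount": 7},
]
destination = {}

# First run copies everything because the watermark starts at 0.
watermark = incremental_copy(source, destination, watermark=0)

# Only row 2 changes; the second run touches just that row.
source[1] = {"id": 2, "modified": 30, "amount": 9}
watermark = incremental_copy(source, destination, watermark)
print(watermark, destination[2]["amount"])
```

The trade-off in this simple form is that it relies on a trustworthy modification marker in the source and does not detect deleted rows.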

Along with the dataflow announcements, we’re announcing an array of enhancements for data pipelines in Fabric, including the general availability of the on-premises data gateway integration, the preview of Fabric user data functions in data pipelines, the preview of invoke remote pipeline to call Azure Data Factory (ADF) and Synapse pipelines from Fabric, and a new session tag parameter for Fabric Spark notebook activity to enable high-concurrency Notebook runs. Additionally, we’ve made it easier to bring ADF pipelines into Fabric by linking your existing pipelines to your Fabric workspace. You’ll be able to fully manage your ADF factories directly from the Fabric workspace UI and convert your ADF pipelines into native Fabric pipelines with an open-source GitHub project. 

Data Engineering

For the Data Engineering workload, we’re updating the native execution engine for Fabric Spark and releasing upgraded Fabric Runtime 1.3 into general availability. The native execution engine enhances Spark job performance by running queries directly on lakehouse infrastructure, achieving up to four times faster performance compared to traditional Spark based on the TPC-DS 1TB benchmark. The native execution engine can now, in preview, support Fabric Runtime 1.3, which together can further enhance the performance of Spark jobs and queries for both data engineering and data science projects. This engine has been completely rewritten to offer superior query performance across data processing; extract, transform, load (ETL); data science; and interactive queries. We are also excited to announce a new acceleration tab and UI enablement for the native execution engine.

Additionally, we are announcing an extension of support in Spark to mirrored databases, providing a consistent and convenient way to access and explore databases seamlessly with the Spark engine. You can easily add data sources, explore data, perform transformations, and join your data with other lakehouses and mirrored databases. Finally, we are excited to launch T-SQL notebooks into public preview. The T-SQL notebook enables SQL developers to author and run T-SQL code with a connected Fabric data warehouse or SQL analytics endpoint, allowing them to execute complex T-SQL queries, visualize results in real time, and document their analytical process within a single, cohesive interface. 

Data Warehouse

We are excited to announce the Copilot in Fabric experience for Data Warehouse is now in preview. This AI assistant experience can help developers generate T-SQL queries for data analysis, explain and add in-line code comments for existing T-SQL queries, fix broken T-SQL code, and answer questions about general data warehousing tasks and operations. Learn more about the Copilot experience for Data Warehouse here. And as mentioned above, we are announcing T-SQL notebooks—allowing you to create a notebook item directly from the data warehouse editor in Fabric and use the rich capabilities of notebooks to run T-SQL queries.

Real-Time Intelligence

In May 2024, we launched a new workload called Real-Time Intelligence that combined Synapse Real-Time Analytics and Data Activator with a range of additional new features, currently in preview, to help organizations make better decisions with up-to-the-minute insights. We are excited to share new capabilities, all in preview, to help you better ingest, analyze, and visualize your real-time data.

First, we’re announcing the launch of the new Real-Time hub user experience; a redesigned and enhanced experience with a new left navigation, a new page called “My Streams” to create and access custom streams, and four new eventstream connectors: Azure SQL Managed Instance – change data capture (MI CDC), SQL Server on Virtual Machine – change data capture (VM CDC), Apache Kafka, and Amazon MSK Kafka. These new sources empower you to build richer, more dynamic eventstreams in Fabric. We’re also enhancing eventstream capabilities by supporting eventhouse as a new destination for your data streams. Eventhouses, equipped with KQL databases, are designed to analyze large volumes of data, particularly in scenarios that demand real-time insight and exploration.

Screenshot of the user interface of the Real-Time hub in Microsoft Fabric. The Real-Time hub is a single place for all data in motion in Fabric and this image shows numerous real-time data sources with filters to help you find specific data sources.

We’re also pleased to announce an upgrade to the Copilot in Fabric experience in Real-Time Intelligence, which translates natural language into KQL, helping you better understand and explore your data stored in Eventhouse. Now, the assistant supports a conversational mode, allowing you to ask follow-up questions that build on previous queries within the chat. With the addition of multi-variate anomaly detection, it’s even easier to discover the unknowns in your high-volume, high-granularity data. You can also have Copilot create a real-time dashboard instantly based on the data in your table, providing immediate insights you can share in your organization.

Finally, we are upgrading the Data Activator experience to make it easier to define a variety of rules that act in response to changes in your data over time. The richness of our rules has improved to include more complex time window calculations and the ability to respond to every event in a stream. You can set up alerts from all your streaming data, Power BI visuals, and real-time dashboards, and now even set up alerts directly on your KQL queries. With these new enhancements, you can make sure action is taken the moment something important happens.
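To make the idea of a time-window rule concrete, here is a minimal Python sketch. It is not the Data Activator API: the class `WindowedThresholdRule`, its parameters, and the sample readings are all hypothetical. It only illustrates a rule that is evaluated on every event in a stream and fires when the average over a sliding time window exceeds a threshold.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WindowedThresholdRule:
    """Hypothetical rule: fire when the windowed average exceeds a threshold."""
    window_seconds: float
    threshold: float
    _events: deque = field(default_factory=deque)

    def on_event(self, timestamp: float, value: float) -> bool:
        """Evaluate the rule for one incoming event; return True if it fires."""
        self._events.append((timestamp, value))
        # Drop events that have fallen out of the time window.
        while self._events[0][0] <= timestamp - self.window_seconds:
            self._events.popleft()
        average = sum(v for _, v in self._events) / len(self._events)
        return average > self.threshold


# Hypothetical (timestamp in seconds, temperature) readings.
rule = WindowedThresholdRule(window_seconds=60, threshold=30.0)
readings = [(0, 20.0), (30, 28.0), (70, 40.0), (80, 45.0)]
fired = [rule.on_event(ts, temp) for ts, temp in readings]
print(fired)
```

With these sample readings the rule stays quiet for the first two events and fires on the last two, once the older, cooler readings have left the 60-second window.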

Learn more about all of these workload enhancements in the Fabric September 2024 Update blog.

Power BI

We’re thrilled to announce new capabilities across Power BI that will make it easier to track and use the KPIs that matter most to you, create organizational apps, and work with Direct Lake semantic models. 

First, we are announcing the preview of Metric sets which will allow users to promote consistent and reliable metrics in large organizations across Fabric, making it easier for end users to discover and use standardized metrics from corporate models. With Metric sets, trusted creators within an organization can develop standardized metrics, which incorporate essential business logic from Power BI. These creators can organize the metrics into collections, promote and certify them, and make them easily discoverable for end users and other creators. These endorsed and promoted metrics can then be used to build Power BI reports, improving data quality across the organization, and can also be reused in other Fabric solutions, such as notebooks.

A screenshot that shows the new Metric sets experience in Power BI. The image highlights an example metric called Sales Excellence and specifically shows the Revenue Won total and figures associated with the metric.

We’re improving organizational apps in Power BI, a key tool for packaging and securely distributing Power BI reports to your organization. Now in preview, you can create multiple organizational apps in each workspace, and they can contain other Fabric items like notebooks and real-time dashboards. The app interface can even be customized, giving you more control over the color, navigation style, and landing experience.

We’re also making it easier to work with Direct Lake semantic models with new version history for semantic models, similar to the experience found across the Microsoft 365 apps. Power BI users can also now live edit Direct Lake semantic models right from Power BI Desktop. And we’re excited to announce a capability widely asked for by Power BI users: a dark mode in Power BI Desktop. 

A screenshot that shows the dark mode in Power BI desktop. The Power BI Desktop has a blank canvas with a dark background.

Finally, we’re announcing the general availability of OneLake integration for semantic models in Import mode. OneLake integration automatically writes data imported into your semantic models to Delta Lake tables in OneLake so that you can enjoy the benefits of Fabric without any migration effort. Once added to a lakehouse in OneLake, you can use T-SQL, Python, Scala, PySpark, Spark SQL, or R on these Delta tables to consume this data and add business value. All of this value comes at no additional cost as data stored in OneLake for Power BI import semantic models is included in the price of your Power BI licensing.

Learn more about the Power BI announcements in the Power BI September 2024 Feature blog. Also see the AI-powered insights section below for new Copilot experiences for Power BI creators and consumers.

AI-powered data estate

With OneLake, Fabric’s unified data lake, you can create a truly AI-powered data estate to fuel your AI innovation and data culture. OneLake’s shortcuts and mirroring capabilities enable you to access your entire multi-cloud data estate from a single, intuitively organized data lake. With your data in OneLake, you can then work from a single copy across analytics engines, whether you are using Spark, T-SQL, KQL, or Analysis Services and even access that data from other apps like Microsoft Excel or Teams. Today, we are thrilled to share even more capabilities and enhancements coming to OneLake that can help you better connect to and manage your data estate.

One of the biggest benefits of OneLake is the ability to create shortcuts to your data sources, which virtualizes data in OneLake without moving or duplicating it. We are pleased to announce that shortcuts for Google Cloud Storage (GCS) and S3-compatible sources are now generally available. These shortcuts also support the on-premises data gateway, which you can use to connect to your on-premises S3-compatible sources as well as GCS buckets that are protected by a virtual private cloud. We’ve also made enhancements to the REST APIs for OneLake shortcuts, including adding support for all current shortcut types and introducing a new list operation. With these improvements, you can programmatically create and manage your OneLake shortcuts.
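As a rough illustration of what creating and listing shortcuts programmatically could look like, the Python sketch below only builds the requests and sends nothing. The base URL, the route, and the body field names (`googleCloudStorage`, `location`, `subpath`, `connectionId`) are assumptions to verify against the Fabric REST API reference. The IDs, bucket URL, and shortcut name are placeholders.

```python
import json

# Assumed base URL of the Fabric REST API; confirm in the API reference.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def gcs_shortcut_request(workspace_id, item_id, name, path,
                         bucket_url, subpath, connection_id):
    """Return (method, url, body) for a create-shortcut call. Sends nothing."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{item_id}/shortcuts"
    body = {
        "name": name,   # shortcut name as it appears in the lakehouse
        "path": path,   # folder inside the item where the shortcut is created
        "target": {
            # Assumed target key for a Google Cloud Storage shortcut.
            "googleCloudStorage": {
                "location": bucket_url,
                "subpath": subpath,
                "connectionId": connection_id,
            }
        },
    }
    return "POST", url, body


def list_shortcuts_request(workspace_id, item_id):
    """Return (method, url) for the list operation on the same route."""
    return "GET", f"{FABRIC_API}/workspaces/{workspace_id}/items/{item_id}/shortcuts"


method, url, body = gcs_shortcut_request(
    workspace_id="<workspace-id>", item_id="<lakehouse-id>",
    name="sales_raw", path="Files/external",
    bucket_url="https://example-bucket.storage.googleapis.com",
    subpath="/sales", connection_id="<connection-id>",
)
print(method, url)
print(json.dumps(body, indent=2))
```

To send the request, pass the method, URL, and JSON body to any HTTP client together with a bearer token for the Fabric API.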

We’re also excited to announce further integration with Azure Databricks with the ability to access Databricks Unity Catalog tables directly from OneLake—now in preview. Users can just provide the Azure Databricks workspace URL and select the catalog, and Fabric creates a shortcut for every table in the selected catalog, keeping the data in sync in near real-time. Once your Azure Databricks Catalog item is created, it behaves the same as any other item in Fabric, so you can access the table through SQL endpoints, notebooks, or Direct Lake mode for Power BI reports. Learn more about the OneLake shortcut and Azure Databricks announcements in the Fabric September 2024 Updates blog.

At Microsoft Build last May, we announced an expanded partnership with Snowflake that gives our customers the flexibility to easily connect and work across our tools. Today, I’m excited to share progress on this partnership with the upcoming preview of shortcuts to Iceberg tables. In the coming weeks, Microsoft Fabric engines will be able to consume Iceberg data with no movement or duplication using OneLake shortcuts. Simply point to an Iceberg dataset from Snowflake or another Iceberg-compatible service, and OneLake virtualizes the table as a Delta Lake table for broad compatibility across Fabric engines. This means you can work with a single copy of your data across Snowflake and Fabric. With the ability to write Iceberg data to OneLake from Snowflake, Snowflake customers will have the flexibility to store Iceberg data in OneLake and use it across Fabric.

Finally, we’ve released mirroring support for Snowflake databases into general availability—providing a seamless, no-ETL experience for integrating existing Snowflake data with the rest of your data in Microsoft Fabric. With this capability, you can continuously replicate Snowflake data directly into Fabric OneLake in near real-time, while maintaining strong performance on your transactional workloads. Learn more about Snowflake mirroring in Fabric.

AI-powered insights

With your data teams using the AI-enhanced tools in Fabric to accelerate development of insights across your data estate, you then need to ensure these insights reach those who can use them to inform decisions. With easy-to-understand Power BI reports and AI-powered Q&A experiences, Fabric bridges the gap between data and business results to help you foster a culture that empowers everyone to find data-driven answers.

We’re announcing a richer Copilot experience in Power BI to help create reports in a clearer, more transparent way. This new experience, now in preview, includes improved conversational abilities between you and Copilot that makes it easier to provide more context to Copilot initially so you can get the report you need on the first try. Copilot will even provide report outlines to improve transparency on data fields being used. We are also releasing the ability to auto-generate descriptions for measures into general availability. Lastly, report viewers can now use Copilot to summarize a report or page right from the Power BI mobile app, now in preview.

We’re also enhancing email subscriptions for reports by extending dynamic per recipient subscriptions to include both paginated and Power BI reports. With dynamic subscriptions, you can set up a single email subscription that delivers customized reports to each recipient based on the data in the semantic model. For reports that are too large for email format, we are also giving you the ability to deliver Power BI and paginated report subscriptions to a OneDrive or SharePoint location for easy access. Finally, you can now create print-ready, parameterized paginated reports using the Get Data experience in Power BI Report Builder—accessing over 100 data sources.

Learn more about all of the Power BI announcements in the Power BI September 2024 Feature blog.

Start building your Fabric skills

We are grateful so many of you have decided to grow your skills with Microsoft Fabric. In the past six months alone, more than 17,000 individuals have earned the Fabric Analytics Engineer Associate certification, making it the fastest growing certification in Microsoft’s history. Today, we’re excited to announce a brand-new certification for data engineers coming in late October. The new Microsoft Certified: Fabric Data Engineer Associate certification will help you prove your skills with data ingestion, transformation, administration, monitoring, and performance optimization in Fabric. 

Our portfolio of Microsoft Credentials for Fabric also includes four Microsoft Applied Skills, which are a complement to Microsoft certifications and free of cost. Applied Skills test your ability to complete a real-world scenario in a lab environment and provide you with formal credentials that showcase your technical skills to employers. For Fabric, we have Applied Skills credentials covering implementing lakehouses, data warehouses, data science and real-time intelligence solutions. 

Visit the Fabric Career Hub to get the best free resources to help you get certified and the latest certification exam discounts. Don’t forget to also join the vibrant Fabric community to connect with like-minded data professionals, get all your Fabric technical questions answered, and stay current on the latest product updates, training programs, events, and more. 

And if you want to test your skills, explore Fabric, and win prizes, you can also register for the Microsoft Fabric and AI Learning Hackathon. To learn more, you can join our Ask Me Anything event on October 8. 

Join us at Microsoft Ignite

We are excited to bring even more innovation to the Microsoft Fabric platform at Microsoft Ignite this year. Join us from November 19 through November 21, 2024 either in person in Chicago or online. You will see firsthand the latest solutions and capabilities across all of Microsoft and connect with experts, community leaders, and partners who can help you modernize and manage your own intelligent apps, safeguard your business and data, accelerate productivity, and so much more. 

Explore additional resources for Microsoft Fabric

If you want to learn more about Microsoft Fabric: 

The post European Fabric Community Conference 2024: Building an AI-powered data platform appeared first on Microsoft Fabric Blog.

Microsoft Fabric, explained for existing Synapse users https://blog.fabric.microsoft.com/en-us/blog/microsoft-fabric-explained-for-existing-synapse-users?ft=All Mon, 20 Nov 2023 16:31:04 +0000 Today, we are announcing the General Availability of Microsoft Fabric.

Earlier this year, at Microsoft Build, we introduced, in Public Preview, Microsoft Fabric, “the biggest data product announcement since SQL Server”. Today, we are announcing the General Availability of Microsoft Fabric.

Arun explains in detail why we all believe Microsoft Fabric will redefine the current analytics landscape. I will focus here on what it means for customers that are using the current Platform-as-a-Service (PaaS) version of Synapse, explaining what it means for your current investments (spoiler: we fully support them), but also how to think about the future.

What happens with PaaS Azure Synapse Analytics

The PaaS offering of Azure Synapse Analytics is an enterprise analytics service designed to accelerate time to insight across data warehouses and big data systems. It brings together the SQL technologies used in enterprise data warehousing, Azure Data Factory pipelines, Apache Spark technologies for big data, and Azure Data Explorer for log and time series analytics.

Microsoft has no current plans to retire Azure Synapse Analytics. Customers can continue to deploy, operate, and expand the PaaS offering of Azure Synapse Analytics. Rest assured, should these plans change, Microsoft will provide you with advance notice and will adhere to the support commitments in our Modern Lifecycle Policy in order to ensure our customers’ needs are met.

The evolution of Microsoft’s big data analytics products

The next versions of our big data analytics products are now a core part of Microsoft Fabric.

Fabric opens new architectural horizons for our analytical engines. Fabric offers a unified storage abstraction for all your data, OneLake, organized into a logical data mesh, with federated governance and granular control and an intuitive, personalized data hub. All Fabric engines separate storage from compute, and store data in OneLake using a single, open data format.

On this new foundation, we can invent new, unprecedented ways of deploying pipelines, data warehousing, data engineering, data science, observability and real-time analytics technologies, to ultimately simplify and increase the efficiency of our customers’ solutions. Fabric allows us, and our customers, to do more. This is why most of our innovation efforts will be focused on Fabric.

How to think about your current Azure PaaS Synapse Analytics solutions

As mentioned above, there is no immediate need to change anything, as the current platform is fully supported by Microsoft. Your existing solutions will keep working. Your in-progress deployments can continue, all with our full support.

However, you probably have already started thinking about a Microsoft Fabric future for your analytics solutions. The following steps may help you with this thought process.

Understand Microsoft Fabric

Microsoft Fabric represents a significant upgrade to all our analytics engines. All of them are improved, faster, and more scalable. And there is a lot to learn about the new engines and how to best use them. Fabric reimagines collaboration and empowers the business users in an unprecedented way. But it is much more than just better engines or just better integration.

The unified, open-source data format means that there is no need to copy data from one engine to another. You can shape data using the technology of your choice, then query it with any other technology.

Fabric introduces completely new ways to make your data part of your analytics landscape. Shortcuts (within Azure, or cross cloud), database mirroring, seamless access to Dataverse and M365 data, all these solutions are designed to remove friction and costs.

Understanding these technologies will enable you to make the best out of Fabric, in terms of efficiency, agility and costs.

Our teams have worked hard to produce detailed documentation for all the Fabric concepts, and the best complement for the documentation is hands-on experience. The easiest way to understand Fabric in depth is to try the product: Microsoft Fabric free trial. Arun’s blog spells out clearly how to learn more about Microsoft Fabric.

Understand what it means for your solution

Your analytics solution may use different technologies and engines. Fabric is a complete analytics platform, so you will find, inside Microsoft Fabric, new and enhanced analytics capabilities of the products with which you are familiar today.

Fabric brings new capabilities that have no parallel in the current PaaS Synapse Analytics offering. The Fabric SQL Engine can operate, with equal performance, scale, and security, over any OneLake artifact (warehouses, lakehouses, mirrored databases). It also supports cross-artifact operations, removing the need for extra copies, while Power BI in DirectLake mode, for example, can now analyze real-time streaming data or Spark output.

All these changes enable simpler, more efficient solutions, removing the need for intermediate steps and multiple data copies. Your solution can get significantly simpler and cheaper.

Below, I use one example of common PaaS Azure Synapse Analytics architectures, together with a possibly more efficient solution in Fabric, to demonstrate such potential simplifications.

Example 1: Data Lake, from Synapse to Fabric

Today, you may prepare your data in an Azure Data Lake Storage Gen2 (ADLSg2) lakehouse (typically using Spark, Synapse or Azure Databricks), then use a pipeline to load data into a Synapse SQL Dedicated Pool, then use Power BI or some other BI tool for your report.

[Image: thumbnail from the blog post “Building the Lakehouse - Implementing a Data Lake Strategy with Azure Synapse”]

You can keep your current solution intact, and upgrade to Fabric engines.

In Fabric, however, this solution can be simplified:

  • A Data Engineering Lakehouse, in Microsoft Fabric, allows you to use your current ADLSg2 data, as prepared with Synapse Spark or Azure Databricks (via shortcuts).
  • The SQL Analytics Endpoint allows you to apply the security rules from the Dedicated Pool directly over the Lakehouse. There is no need for a dedicated capacity, nor for the pipeline copying from the lake to your warehouse.
  • Using the new DirectLake mode, Power BI can now operate directly over the Lakehouse, with performance similar to Import. Your other BI tools can continue to operate over the SQL Analytics Endpoint.
  • By migrating your Notebooks and Spark Jobs to Fabric Spark, your Lakehouse data will be automatically optimized for all the other Fabric engines (while also being stored in an open format).

To learn more about the Lakehouse pattern in Microsoft Fabric, please visit Lakehouse end-to-end scenario: overview and architecture – Microsoft Fabric | Microsoft Learn
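Because every engine works over the same copy of the data, a lakehouse table can be addressed by a single storage path. The small Python helper below sketches how such a path might be assembled. The `abfss` URI layout and the `onelake.dfs.fabric.microsoft.com` host are assumptions to confirm in the OneLake documentation, and the workspace, lakehouse, and table names are hypothetical.

```python
# Assumed OneLake endpoint; confirm in the OneLake documentation.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Build an abfss URI for a Delta table in a lakehouse (assumed layout)."""
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Tables/{table}"


# Hypothetical names, for illustration only.
uri = onelake_table_uri("SalesWorkspace", "SalesLakehouse", "orders")
print(uri)
```

A Spark notebook could pass a path like this to its Delta reader, while the SQL Analytics Endpoint and Power BI reach the same table without a copy.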

Assess our migration tools and processes

We are investing significant development efforts in migration processes and tooling. And our migration efforts are prioritizing current PaaS Synapse Analytics customers.

The processes and tools we are designing are intended to minimize the friction, disruption and cost for our existing customers.

As you will see in the section on Migration Resources, we are developing tools to:

  • Use your data in-place whenever possible
  • Reuse code investments (pipelines, notebooks) when possible
  • Migrate code (stored procedures, views, notebooks)

These investments are not complete. We will keep posting updates to our migration tools. Join the fast-growing Fabric community, and our specialists as well as external experts will be ready to work with you. The Fabric Ideas forum, on the community site, is the best way to suggest new features, and it is closely monitored by the Microsoft Fabric product teams.

Develop, then plan to deploy a migration strategy

After having learned about Fabric and evaluating the product, you will have developed enough confidence in the new Fabric engines and the migration technology to move your solution to Fabric. For some of you this may happen soon, for others it may take years.

There is no rush – we will keep supporting your existing solutions – but we are ready for you to migrate whenever the time is right.

When you are ready to move your solution to Fabric, you will be able to exchange your existing 1- or 3-year Synapse Reserved Instance (RI) purchases for 1-year Fabric RI purchases to continue to apply your reservation discounts in Fabric. Additionally, if you want to increase your RI commitment for your Fabric portfolio, you will have access to discounts of more than 40% over the Fabric pay-as-you-go pricing.
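As a quick worked example of what such a discount means, the Python sketch below applies a 40% reservation discount to a pay-as-you-go bill. The monthly figure is hypothetical and is not a published Fabric price. The 40% rate is the lower bound mentioned above.

```python
def reserved_monthly_cost(payg_monthly: float, discount: float) -> float:
    """Cost after applying a reservation discount given as a fraction."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be a fraction in [0, 1)")
    return round(payg_monthly * (1 - discount), 2)


payg = 10_000.00  # hypothetical pay-as-you-go spend per month
reserved = reserved_monthly_cost(payg, 0.40)
savings = round(payg - reserved, 2)
print(f"reserved: {reserved:.2f}, monthly savings: {savings:.2f}")
```

Under these assumed numbers, a 10,000 pay-as-you-go bill becomes 6,000 with the reservation, saving 4,000 per month.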

In the next sections, the product leaders explain how to think about Fabric from the perspective of different PaaS Synapse Analytics workloads.

Data Factory Pipelines

Data Factory in Microsoft Fabric brings Power Query and Azure Data Factory together into a modern, trusted data integration experience that empowers data and business professionals to extract, load, and transform data for their organization. In addition, powerful data orchestration capabilities enable you to build simple to complex data workflows that orchestrate the steps needed for your data integration needs.

Key concepts in Data Factory in Microsoft Fabric include:

  • Get Data and Transformation with Dataflow Generation 2 is an evolution of Dataflow in Power BI. Dataflow Generation 2 is re-architected to leverage Fabric compute engines for data processing and transformation. This enables Dataflow Generation 2 to ingest and transform data at any scale.
  • Data Orchestration with Data Pipelines – For customers familiar with Azure Data Factory (ADF), data pipelines in Microsoft Fabric use the same technology that powers Azure Data Factory. As part of the GA of Fabric, data pipelines in Microsoft Fabric will have most of the activities available in ADF. See here for a list of activities that will be part of data pipelines in Fabric. SSIS activity will be added to data pipelines by Q2 CY2024.
  • Enterprise-ready Data Movement – Whether it is petabyte-scale data to small data, Data Factory provides a serverless and intelligent data movement platform that enables you to move data between diverse data sources and data destinations reliably. With support for 170+ connectors, Data Factory in Fabric enables you to move data between multi-clouds, data sources on-premises, and within virtual networks (VNet). Intelligent throughput optimization enables the data movement platform to automatically detect the size of the compute needed for data movement.

To enable customers to upgrade to Microsoft Fabric from Azure Data Factory (ADF), we will be supporting the following:

  • Data pipelines activities – For many of the activities that you use in ADF, we have added these into Data Factory in Fabric. In addition, we have added new activities (e.g. Teams, Outlook) for notifications. See here for a list of activities that are available in Data Factory in Fabric.
  • OneLake/Lakehouse connector in Azure Data Factory – Many ADF customers can now integrate with Microsoft Fabric and bring data into Fabric OneLake.
  • Azure Data Factory Mapping Dataflow to Fabric – We have put together a guide for ADF customers who are looking at building new data transformations in Fabric. Find out more at https://aka.ms/datafactoryfabric/docs/guideformappingdataflowusers

In addition, customers looking at migrating their ADF mapping dataflows to Fabric can leverage sample code from the Fabric Customer Advisory Team (Fabric CAT) to convert mapping dataflows to Spark code. Find out more at https://github.com/sethiaarun/mapping-data-flow-to-spark

As part of Data Factory in Fabric roadmap, we will be working towards the preview of the following by Q2 CY2024:

  • Mounting of Azure Data Factory in Fabric – This enables customers to mount their existing Azure Data Factory in Microsoft Fabric. All ADF pipelines will work as is and continue running on Azure, while you explore Fabric and work out an upgrade plan.
  • Upgrade from Azure Data Factory pipelines to Fabric – We will be working with customers and the community on learning how we can best support upgrades of data pipelines from ADF to Fabric. As part of this, we will deliver an upgrade experience that empowers you to test your existing data pipelines in Fabric using mounting and upgrading the data pipelines.

Learn more about how you can upgrade to Data Factory in Fabric – https://aka.ms/datafactoryfabric/upgradetofabric

Synapse Data Warehouse

Fabric Data Warehouse is the next generation of data warehousing in Microsoft Fabric. It is the first transactional data warehouse to natively support an open data format enabling data engineers and business users to collaborate seamlessly without compromising security or governance. Just like the previous data warehouse generation, SQL provides multi-table ACID transactional guarantees. It is built on the well-established SQL Server Query Optimizer and Distributed Query Processing engine but comes with major improvements that address many of the challenges customers face in enabling workloads associated with modern analytics. These improvements were driven by rearchitecting the data warehouse by leveraging IP from both Dedicated and Serverless SQL Pools along with:

  • Separation of storage and compute: data is stored in OneLake and is clearly separated from the compute used by the SQL engine. There is an elastic allocation of compute resources based on demand, as well as use of distinct compute resources for different workload types on top of the same data.
  • Leveraging the infinite compute capabilities of Azure Cloud: giving us the capability of going beyond a limited topology offered by the Synapse Gen2 architecture.
  • Support for open data format: allowing a single copy of the data to be used by all the Fabric workloads such as Data Science, Data Engineering, and Power BI.

With this new architecture, the new engine enables numerous capabilities that were not possible in either Dedicated or Serverless SQL Pools, such as:

  1. Cross database querying without any ETL or data movement.
  2. Cloning without creating copies of the data.
  3. Autoscaling enabling elastic scale up and down of the compute nodes with dynamic resource allocation tailored to data volume, usage, or query complexity.
  4. Enabling a pay for what you use pricing model.
  5. No knobs performance via automated query optimizations, statistics, and data distributions.
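The first two capabilities in the list above can be pictured with a couple of T-SQL statements. The Python sketch below only assembles those statements as strings so they can be submitted through any T-SQL client. The `AS CLONE OF` form and the three-part names reflect our reading of the Fabric Warehouse T-SQL surface and should be checked against the documentation. All warehouse, lakehouse, table, and column names are hypothetical.

```python
def clone_table_sql(schema: str, source: str, target: str) -> str:
    """Statement that clones a table without copying its data (assumed syntax)."""
    return f"CREATE TABLE [{schema}].[{target}] AS CLONE OF [{schema}].[{source}];"


def cross_database_join_sql(warehouse: str, lakehouse: str) -> str:
    """Query joining a warehouse table to a lakehouse table via three-part names."""
    return (
        "SELECT o.OrderId, c.CustomerName\n"
        f"FROM [{warehouse}].[dbo].[Orders] AS o\n"
        f"JOIN [{lakehouse}].[dbo].[Customers] AS c\n"
        "  ON o.CustomerId = c.CustomerId;"
    )


# Hypothetical object names, for illustration only.
clone_stmt = clone_table_sql("dbo", "Orders", "Orders_Test")
join_stmt = cross_database_join_sql("SalesWarehouse", "SalesLakehouse")
print(clone_stmt)
print(join_stmt)
```

The helpers do no escaping, so they are suitable only for trusted, hard-coded identifiers.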

All of this with the concepts familiar to SQL users such as Views, Stored Procedures, SQL security (row-level security, column-level security, dynamic data masking) and full benefits of the T-SQL tooling ecosystem.

These architectural changes cannot be backported to either one of the old engines. Because of the open format, your data warehouses cannot be upgraded in place either. Data stored in a proprietary format in Gen2 needs to be extracted and stored in the open format of Fabric.

A migration can be done at your own pace when you are ready to leverage these new capabilities. To enable this, we have made the following available to you now:

In addition, we have also started working on an in-product Migration Assistant that will automatically detect and convert your Synapse Gen2 code to Fabric Data Warehouse code. It will also redirect your endpoints, so you don’t have to worry about application migration. We anticipate this to be available in CY24.

Synapse Data Engineering

Fabric Data Engineering is our big data analytics workload in Fabric, empowering data engineers to leverage the power of Apache Spark to transform their data at scale and build out a lakehouse architecture. The Fabric Data Engineering experience targets users of Apache Spark pools in the Azure Synapse Analytics world. Here are some of the key takeaways regarding the Fabric Data Engineering experience:

Runtime for big data workloads

Every Fabric workspace comes pre-wired with a ‘starter pool’ (default Spark cluster) with a Fabric Runtime that contains up to date versions of Spark, Delta, Java and Python. Just like in Azure Synapse Analytics, customers can also create their own custom clusters with their own configurations and libraries if they want.

The Apache Spark experience in Fabric also contains many new and exciting enhancements:

  • Starter pools in Fabric are automatically kept live meaning users can enjoy sessions that start within ~15 seconds
  • High concurrency mode in Fabric means multiple notebooks can be attached to a single session, accelerating the start-up times and reducing costs
  • Spark clusters start all the way from a single node, further reducing the costs of getting started with Spark

Simplified lakehouse architecture

Every Fabric workspace also comes pre-wired with OneLake, our SaaSified data lake for the organization. Users can easily create lakehouse items, which are the perfect container for bringing in all your data into OneLake using Spark, dataflows and pipelines. Existing data can be easily included with no data movement through the use of shortcuts. We will also automatically discover metadata of Delta tables for you, making it super easy to start working with existing data with zero friction. Additionally, we have reduced the price for Spark in Fabric by almost 40% vs. the retail price of Synapse Spark.

Here are some other exciting things to keep in mind about the lakehouse in Fabric:

  • Every lakehouse comes with a built-in SQL endpoint and Power BI dataset. This means that as soon as you transform your data with Spark, you can start querying it using our SQL engine and Power BI, with no data movement necessary.
  • Spark (along with every other Fabric engine) will automatically write the data into the lakehouse with V-Order enabled, automatically optimizing it for BI reporting.

First Class Developer Experiences

The Synapse Data Engineering experience brings in familiar authoring tools, including notebooks for interactive querying experiences and Spark Job Definitions for submitting batch jobs. These capabilities come with a variety of new enhancements and users even have some new authoring experiences to look forward to:

  • Notebooks in Fabric include numerous usability improvements, including auto-save, real-time collaboration and commenting, a built-in file system, and native file format support when checking into Git. Users can also make use of lightweight scheduling (in addition to using the pipeline activity).
  • Spark Job Definitions come with retry policy support, making it easier to continuously run long-running streaming jobs
  • Native VS Code support makes it easy to work with your Data Engineering items (notebooks, Spark Jobs, lakehouse) all in your favorite IDE, including full debugging support
  • The newly released environment item streamlines the packaging of all of your Spark configurations, libraries, cluster settings and more, and simplifies the re-usability of your hardware and software environment across your code artifacts.
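
To make the retry policy idea from the list above concrete, here is a minimal, generic sketch of what such a policy does: rerun a failed job up to a maximum number of times, waiting a fixed interval between attempts. It is an illustration of the concept only, not Fabric's implementation. `RetryPolicy`, `max_retries`, `interval_seconds` and `run_with_retries` are hypothetical names chosen for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetryPolicy:
    """Hypothetical policy: how often to retry and how long to wait."""
    max_retries: int = 3
    interval_seconds: float = 30.0


def run_with_retries(
    job: Callable[[], None],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `job` until it succeeds or the retries are exhausted.

    Returns the number of attempts made. The last exception is re-raised
    once `max_retries` retries have failed.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            job()
            return attempt
        except Exception:
            # One initial attempt plus `max_retries` retries are allowed.
            if attempt > policy.max_retries:
                raise
            sleep(policy.interval_seconds)
```

For a long-running streaming job, the same loop shape applies: a crash counts as a failed attempt, and the job is resubmitted after the configured interval instead of staying down until someone notices.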

To summarize, with Synapse Data Engineering, you can start building on top of your existing Azure Synapse Spark investments quickly and incrementally. Start by leveraging shortcuts to existing data in your data lake and bringing in your notebooks using the import capability. We are starting work on an in-product migration assistant, but in the meantime, please use our newly published Azure Synapse Spark to Fabric Migration Guidance whitepaper.

Synapse Data Science

Synapse Data Science empowers data scientists to explore their data and to build and operationalize their predictive models. Coming from the Azure Synapse Analytics world, you will see many familiar constructs, such as Python and R baked into the runtime along with many popular ML packages, the ability to install your own third-party and custom libraries, and the availability of SynapseML, our open-source library for creating massively scalable ML pipelines.

Fabric Data Science offers a variety of new capabilities data scientists can look forward to:

Model & Experiment tracking

Data scientists can leverage experiments and models as readily available items in the Fabric workspace. Support for ML models and experiments allows users to manage models and track experiment runs using standard MLflow APIs. Comparison experiences make it easy to compare different experiment runs, and auto-logging helps capture key metrics automatically as users author code to train models.
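
As a rough illustration of tracking a run with the standard MLflow APIs, the sketch below computes a metric and logs it together with the training parameters. It assumes the `mlflow` package is installed in the notebook runtime; the experiment name, parameter names and the `accuracy` and `log_run` helpers are hypothetical and chosen only for this example.

```python
from typing import Mapping, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match their labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def log_run(
    experiment_name: str,
    params: Mapping[str, object],
    metrics: Mapping[str, float],
) -> str:
    """Record one experiment run with standard MLflow calls; return its run ID."""
    # Imported here so the helper above stays usable without mlflow installed.
    import mlflow

    mlflow.set_experiment(experiment_name)  # creates the experiment if it is missing
    with mlflow.start_run() as run:
        mlflow.log_params(dict(params))
        mlflow.log_metrics(dict(metrics))
        return run.info.run_id
```

A call might look like `log_run("churn-baseline", {"max_depth": 4}, {"accuracy": accuracy(preds, labels)})`. With auto-logging, calling `mlflow.autolog()` before training lets MLflow capture parameters and metrics from supported ML libraries without the explicit `log_params` and `log_metrics` calls.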

Model batch scoring

To operationalize their ML models, users can leverage the scalable PREDICT function for distributed batch scoring on Spark. This capability exists in Azure Synapse today, so existing Synapse users should feel right at home. The Fabric Data Science experience provides a low-code UI for scoring data and tight integration with the lakehouse, making it easy to enrich data and surface it in Power BI reports with zero friction.

Data Exploration & Enrichments

Fabric Data Science offers many innovative solutions in the space of exploring and transforming your data. These include:

  • Data Wrangler – a low-code UI for carrying out data transformations that automatically generates Python code
  • Semantic Link – a library enabling seamless connectivity to the Power BI semantic model through data science tools like notebooks
  • Pre-built AI models – a newly released public preview capability providing built-in access to Azure AI services like text analytics and translation services

The migration path for a data scientist in Azure Synapse Analytics is like that of a Spark data engineer – they will need to consider their notebooks, Spark pools and data. We recommend starting with the Azure Synapse Spark to Fabric Migration Guidance whitepaper.

Synapse Real-time Analytics

Synapse Real-time Analytics is a robust platform tailored to deliver real-time data insights and observability analytics capabilities for a wide range of data types. This includes observability time-based data like logs, events, and telemetry data. It’s the true streaming experience in Fabric! Building on the same foundation as Azure Synapse Data Explorer, Synapse Real-time Analytics equips both citizen data scientists and professional data engineers with a suite of features and tools to fully unleash the potential of their data.

Rapid Deployment

Experience unmatched efficiency by creating a database, ingesting data, running queries, and generating Power BI reports, all within a 5-minute timeframe. Real-time Analytics puts speed at the forefront, allowing you to dive into data analysis without delay.

Get Data

For an authentic streaming experience in Fabric, the “Get Data” feature has received a modern facelift with an intuitive design and user-friendly interface. It simplifies data ingestion, accepting any data format or structure from various sources in either streaming or batch mode. Your data becomes queryable within seconds.

Query Versatility

Whether you’re a Kusto Query Language (KQL) enthusiast or prefer traditional SQL, Real-time Analytics accommodates your needs. This service enables you to generate quick KQL or SQL queries, ensuring that you can work in your preferred language and obtain results swiftly. It doesn’t matter if you’re working with a small dataset (a few gigabytes), a medium-sized one (a few terabytes), or even massive datasets (in the petabytes range).
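
To show how the two languages line up, the sketch below renders the same filter-then-count query in KQL and in SQL. The `count_by` helper and the table and column names used with it are hypothetical, and the strings are built by plain interpolation for readability, so this is not a safe way to handle untrusted input.

```python
def count_by(table: str, filter_col: str, filter_val: str, group_col: str) -> dict[str, str]:
    """Render one filter-then-count query in both KQL and SQL."""
    # KQL reads top to bottom as a pipeline of operators.
    kql = (
        f"{table}\n"
        f"| where {filter_col} == '{filter_val}'\n"
        f"| summarize Events = count() by {group_col}"
    )
    # SQL expresses the same steps as clauses of a single statement.
    sql = (
        f"SELECT {group_col}, COUNT(*) AS Events\n"
        f"FROM {table}\n"
        f"WHERE {filter_col} = '{filter_val}'\n"
        f"GROUP BY {group_col}"
    )
    return {"kql": kql, "sql": sql}
```

For example, `count_by("StormEvents", "State", "TEXAS", "EventType")` yields a KQL pipeline and a SQL statement that both count events per event type for one state. One difference to keep in mind: the KQL `==` operator is case-sensitive, while SQL `=` follows the collation in effect.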

Data Exploration

Fabric Real-Time Analytics offers a multitude of innovative solutions for exploring and visualizing your data, including:

  • KQL Queryset: A workbench for creating, managing, and sharing your queries.
  • Power BI Report: A one-click option to generate a Power BI report on top of any query or table.
  • Notebook: Seamlessly connect your Fabric Notebook with the KQL Database for data ingestion and querying.
  • NL2KQL (Coming Soon): Write your query in natural language, and Fabric will generate and execute the corresponding KQL query for you.
  • Real-Time Dashboard (Coming Soon): The Fabric Real-Time Dashboard is a collection of tiles that enable native export of Kusto Query Language (KQL) queries as visuals. This allows for easy query modification and visual formatting, enhancing data exploration and delivering superior query and visualization performance.

Fabric Real-Time Analytics is your gateway to real-time insights and a streamlined data analysis experience. Whether you’re pioneering new data horizons or looking to optimize your data analytics solutions, this service is your trusted partner. Stay ahead in the data game and embark on your journey with Fabric Real-Time Analytics today.

For more information on Fabric Real-Time Analytics, visit the general availability blog.

Migration planning

Fabric KQL databases are 100% compliant with Azure Data Explorer (ADX) and Azure Synapse Data Explorer (Preview) and are powered by the same technology. This means that all current applications, SDKs, integrations, and tools that work with ADX will continue to work smoothly with Fabric KQL databases.

There is a broad set of capabilities to support mixed environments and migrations. Some are available now, and some will light up in the coming months.

  • Available now:
    • Full binary compatibility of APIs, SDKs and tools.
    • Create a database shortcut to host a read-only, in-place, up-to-date instance of the database in Fabric.
  • Coming over the next months:
    • Migrate an Azure Synapse Data Explorer pool from a Synapse workspace and attach it to a Fabric workspace
    • Attach an Azure Data Explorer cluster to a Fabric workspace
    • Sync Azure Data Explorer user queries and dashboards into Fabric workspace querysets and dashboards

Migration Resources

Azure Data Factory

Azure Synapse DW to Fabric Migration Guidance

Azure Synapse Spark to Fabric Migration Guidance

Azure Synapse Data Explorer and Azure Data Explorer to Fabric Migration Guidance

The post Microsoft Fabric, explained for existing Synapse users appeared first on Microsoft Fabric Blog.

]]>