Microsoft Fabric Updates Blog

Recap of Data Factory Announcements at Fabric Community Conference Europe

Last week was such an exciting week for Fabric during the Fabric Community Conference Europe, filled with several product announcements and sneak previews of upcoming new features.

Thanks to all of you who participated in the conference, either in person or by being part of the many virtual conversations through blogs, Community forums, social media and other channels. Thank you also for all your product feedback and Ideas forum suggestions, which help us define the next wave of product enhancements.

To make sure you don’t miss any of the Data Factory in Fabric announcements, here is a recap of all the new features.

General Availability announcements

Public Preview announcements

You can continue reading below for more information about each of these features.

Copilot for Dataflow Gen2

Copilot for Data Factory with Dataflow Gen2 is now generally available. This experience is part of Copilot in Fabric: you can use natural language to transform data, and you can generate code explanations that help you better understand previously generated queries and tasks.

Copilot in Fabric helps you enhance productivity, unlock profound insights, and facilitate the creation of custom AI experiences tailored to your data. As a component of the Copilot in Fabric experience, Copilot for Data Factory empowers you to use natural language to articulate your requirements for creating data integration solutions using Dataflow Gen2. Essentially, Copilot for Data Factory operates like a subject-matter expert (SME) collaborating with you to design your dataflows.

Video showcasing the Copilot for Dataflow Gen2 experience

Learn more: Announcing the General Availability of Copilot for Data Factory in Microsoft Fabric | Microsoft Fabric Blog | Microsoft Fabric

Fast Copy for Dataflow Gen2

Fast Copy in Dataflow Gen2 is now Generally Available! This powerful feature enables rapid and efficient ingestion of large data volumes, leveraging the same robust backend as the Copy Activity in Data pipelines.

With Fast Copy, you can experience significantly shorter data processing times and improved cost efficiency for your Dataflow Gen2. Additionally, it boosts end-to-end performance by allowing you to use Fast Copy to ingest data into staging, and then seamlessly transform it at scale using SQL DW compute.

Fast Copy supports numerous source connectors, including ADLS Gen2, Azure Blob Storage, Azure SQL DB, On-Premises SQL Server, Oracle database, Fabric Lakehouse, Fabric Warehouse, PostgreSQL, and Snowflake.

You can learn more about Dataflows Fast Copy here: Announcing the General Availability of Fast Copy in Dataflows Gen2 | Microsoft Fabric Blog | Microsoft Fabric

Data Pipelines accessing on-premises data using the On-premises data gateway

We are thrilled to announce the General Availability of on-premises connectivity for Data pipelines in Microsoft Fabric.

Using the On-premises Data Gateway, customers can connect to on-premises data sources using data pipelines with Data Factory in Microsoft Fabric. This enhancement significantly broadens the scope of data integration capabilities. In essence, by using an on-premises Data Gateway, organizations can keep databases and other data sources on their on-premises networks while securely integrating them and orchestrating them using data pipelines in Microsoft Fabric.

Learn more: Announcing the General Availability of Fabric Data Pipeline Support in the On-Premises Data Gateway | Microsoft Fabric Blog | Microsoft Fabric

Mirrored Database support for Snowflake

Mirroring for Snowflake is now generally available. It offers a frictionless way to bring your entire Snowflake databases into your OneLake data estate. Setting up Mirroring is simple. Once Mirroring starts the replication process, the mirrored data is automatically kept up to date in near real time in OneLake. With your Snowflake data landed in OneLake, the data is available everywhere in Fabric and ready to accelerate your data potential.

Diagram of the mirroring mechanism for Snowflake as a source database

Learn more: Mirroring – Microsoft Fabric | Microsoft Learn

Incremental Refresh support for Dataflow Gen2

Incremental Refresh in Dataflow Gen2 is now in public preview! This powerful feature is designed to optimize your data processing by ensuring that only the source data that has changed since the last dataflow refresh is updated. This means faster dataflow refreshes and more efficient resource usage.

Incremental refresh settings are configured per query, via the contextual menu (right-click) of the query in the Queries pane.

Screenshot of the incremental refresh dialog showcasing all available settings like bucket size

Key benefits of leveraging Incremental Refresh for your dataflows include:

  • Efficiency: Only the source data that has changed since the last refresh is processed, saving time and resources.
  • Performance: Faster dataflows, because less data is processed and the work can run in parallel.
  • Scalability: Handle large datasets more effectively by processing data in smaller, more manageable chunks.
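
To make the idea of processing only changed data in chunks more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how Dataflow Gen2 is implemented: the row layout, the `order_date` and `modified` column names, the one-bucket-per-day choice, and both function names are hypothetical. Each bucket gets a fingerprint (the latest `modified` value seen in it), and only buckets whose fingerprint changed since the previous refresh are selected for reprocessing.

```python
from datetime import date, datetime


def bucket_fingerprints(rows):
    """Map each day bucket to the latest `modified` timestamp seen in it."""
    latest = {}
    for row in rows:
        bucket = row["order_date"]  # assumption: one bucket per calendar day
        stamp = row["modified"]
        if bucket not in latest or stamp > latest[bucket]:
            latest[bucket] = stamp
    return latest


def buckets_to_refresh(rows, previous):
    """Return the buckets whose fingerprint differs from the previous refresh,
    together with the new fingerprints to store for the next run."""
    current = bucket_fingerprints(rows)
    changed = sorted(b for b, stamp in current.items() if previous.get(b) != stamp)
    return changed, current


rows = [
    {"order_date": date(2024, 9, 1), "modified": datetime(2024, 9, 1, 10, 0)},
    {"order_date": date(2024, 9, 2), "modified": datetime(2024, 9, 2, 9, 0)},
]

# First refresh: no stored state, so every bucket is processed.
changed, state = buckets_to_refresh(rows, previous={})

# A late change lands in the 2 September bucket; only that bucket is selected.
rows.append({"order_date": date(2024, 9, 2), "modified": datetime(2024, 9, 3, 8, 0)})
changed_again, state = buckets_to_refresh(rows, previous=state)
```

In this sketch the first call selects both buckets and the second call selects only the bucket that received the late change, which is where the time and resource savings come from.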

Learn more: Incremental refresh in Dataflow Gen2 – Microsoft Fabric | Microsoft Learn

New Azure Data Factory item in Fabric

Bring your existing Azure Data Factory (ADF) to your Fabric workspace!

We are introducing a new preview feature that allows you to connect to your existing ADF factories from your Fabric workspace. You will now be able to fully manage your ADF factories directly from the Fabric workspace UI! Once your ADF is linked to your Fabric workspace, you’ll be able to trigger, execute, and monitor your pipelines as you do in ADF but directly inside of Fabric.

Watch the feature in action in the following video:

Learn more: Bring Azure Data Factory to Fabric – Microsoft Fabric | Microsoft Learn

Copy Job item in Fabric

We are happy to announce the preview of Copy Job in Data Factory, elevating the data ingestion experience to a more streamlined and user-friendly process from any source to any destination. Now, copying your data is easier than ever before. Copy Job supports various data delivery styles, including both batch copy and incremental copy, offering the flexibility to meet your specific needs.

With Copy Job, you can enjoy the following benefits:

  • Simplicity: A seamless data copying experience with no compromises, making it easier than ever.
  • Efficiency: Enable incremental copying effortlessly, reducing manual intervention.
  • Flexibility: Take full control of your data copying.
  • Highly performant and scalable: Move petabyte-scale data.
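
The difference between batch copy and incremental copy can be sketched in a few lines of Python. This is a conceptual illustration only, not the Copy Job implementation: the function `copy_job_run`, the `updated_at` watermark column, and the in-memory lists standing in for source and destination are all assumptions made for the example. The first run copies everything, and later runs copy only the rows whose watermark value is newer than the one recorded by the previous run.

```python
def copy_job_run(source, destination, last_watermark=None, watermark_column="updated_at"):
    """Copy rows from source to destination and return (rows_copied, new_watermark).

    With no watermark, this behaves like a full batch copy. With a watermark,
    only rows newer than it are copied (a simple incremental copy).
    """
    if last_watermark is None:
        batch = list(source)
    else:
        batch = [row for row in source if row[watermark_column] > last_watermark]
    destination.extend(batch)
    # Keep the old watermark when nothing new was copied.
    new_watermark = max((row[watermark_column] for row in batch), default=last_watermark)
    return len(batch), new_watermark


source = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
destination = []

# Run 1: full copy of the two existing rows.
copied_first, watermark = copy_job_run(source, destination)

# A new row arrives; run 2 copies only that row.
source.append({"id": 3, "updated_at": 30})
copied_second, watermark = copy_job_run(source, destination, watermark)

# Run 3: nothing changed, so nothing is copied and the watermark is unchanged.
copied_third, watermark = copy_job_run(source, destination, watermark)
```

Note that this sketch only appends: it does not handle updates to rows that were already copied, or deletions at the source.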

Learn more: What is Copy job (preview) in Data Factory – Microsoft Fabric | Microsoft Learn

Fabric User Data Functions support in Data Pipeline

Data pipelines in Fabric provide a simple interface to create and manage large data processing tasks by using Activities. Activities are the fundamental object that represents each step of a data processing task. Users can leverage several interconnected Activities to create large, elaborate data processing solutions.

Through a private preview, User Data Functions is now available as an Activity, allowing users to create custom code processing steps for their Data pipelines. Within the private preview, you can find it by going to Activities and selecting the Functions activity. After the Functions Activity is inserted in the Data pipeline, you will see the option to use Fabric User Data Functions in the settings tab.

Screenshot of the experience found within Fabric Data Pipelines to invoke Fabric User Data Functions
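
As an illustration of the kind of custom processing step this enables, here is a small plain-Python function that validates a record and enriches it with derived fields. It is a hypothetical sketch: the function name, the record fields, and the threshold are invented for the example, and the code needed to register the function as a Fabric User Data Function and wire it to the Functions activity is deliberately not shown.

```python
def validate_and_enrich(order: dict) -> dict:
    """Hypothetical custom step: reject bad input, then add derived fields."""
    if order.get("quantity", 0) <= 0:
        raise ValueError("quantity must be positive")
    total = round(order["quantity"] * order["unit_price"], 2)
    # Return a new record so the input is left untouched.
    return {**order, "total": total, "is_large_order": total >= 1000}


result = validate_and_enrich({"order_id": "A-1", "quantity": 3, "unit_price": 19.99})
```

A pipeline step of this shape would receive its input from an upstream Activity and pass the enriched record to the next one.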

Learn more: Transform, Validate and Enrich Data with Python User Data Functions in Your Data Pipelines | Microsoft Fabric Blog | Microsoft Fabric

Invoke remote pipeline from ADF & Synapse

We’ve been working diligently to make the very popular Data pipeline activity known as “Invoke Pipeline” better and more powerful. Based on customer feedback, we continue to iterate on the possibilities and have now added the exciting ability to call pipelines from Azure Data Factory (ADF) or Synapse Analytics pipelines as a public preview!

This creates countless possibilities to utilize your existing ADF or Synapse pipelines inside of a Fabric pipeline by calling them inline through this new Invoke Pipeline activity. Use cases that include calling Mapping Data Flows or SSIS pipelines from your Fabric Data pipeline will now be possible as well.

Learn more: Exciting Enhancements Announced for Fabric Data Factory Pipelines! | Microsoft Fabric Blog | Microsoft Fabric

Spark Job environment parameters support for Data Pipeline

One of the most popular use cases in Fabric Data Factory today is automating and orchestrating Fabric Spark Notebook executions from your Data pipelines. A common request has been to reuse existing Spark sessions to avoid any session cold-start delays. We’ve delivered on that requirement by enabling “Session tags” as an optional parameter under “Advanced settings” in the Fabric Spark Notebook activity! Now you can tag your Spark session and then reuse that existing session by specifying the same tag, which greatly reduces the overall processing time of your Data pipelines.

Screenshot of the new Spark job environment parameters inside of Fabric Data Pipelines
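
Why a shared tag avoids cold starts can be shown with a toy model. The Python sketch below is not the Fabric implementation: the `SessionPool` class, its `acquire` method, and the session ID format are hypothetical and exist only to illustrate the lookup-by-tag idea. A request with a tag that is already known gets the existing session back, and any other request pays for a new session.

```python
import itertools


class SessionPool:
    """Toy model of tag-based session reuse; not the Fabric implementation."""

    def __init__(self):
        self._by_tag = {}
        self._ids = itertools.count(1)
        self.cold_starts = 0

    def acquire(self, tag=None):
        # Reuse the session already registered under this tag, if any.
        if tag is not None and tag in self._by_tag:
            return self._by_tag[tag]
        # Otherwise start a new session, which is the slow path.
        self.cold_starts += 1
        session_id = f"session-{next(self._ids)}"
        if tag is not None:
            self._by_tag[tag] = session_id
        return session_id


pool = SessionPool()
first = pool.acquire(tag="nightly-etl")   # cold start
second = pool.acquire(tag="nightly-etl")  # reuses the tagged session
untagged = pool.acquire()                 # no tag, so a new session is started
```

In this model, two notebook runs that share the `nightly-etl` tag cause a single cold start, while the untagged run causes a second one.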

Learn more: Exciting Enhancements Announced for Fabric Data Factory Pipelines! | Microsoft Fabric Blog | Microsoft Fabric

Thank You for your feedback, keep it coming!

We wanted to thank you for your support, usage, excitement, and feedback around Data Factory in Fabric. We’re very excited to continue learning from you regarding your Data Integration needs and how Data Factory in Fabric can be enhanced to empower you to achieve more with data.

Please continue to share your feedback and feature ideas with us via our official Community channels, and stay tuned to our public roadmap page for updates on what will come next.
