
Recap of Data Factory Announcements at Fabric Conference US 2025

We had such an exciting week for Fabric during the Fabric Conference US, filled with several product announcements and sneak previews of upcoming new features.

Thanks to all of you who participated in the conference, either in person or by being part of the many virtual conversations through blogs, Community forums, social media and other channels. Thank you also for all your product feedback and Ideas forum suggestions that help us define the next wave of product enhancements.

To make sure you didn’t miss any of the Data Factory in Fabric announcements, we are providing this recap of all the new features.

You can continue reading below for more information about each of these capabilities.

VNet gateway support for Data pipelines

Support for data pipeline functionality on the VNet data gateway is now available in preview. The VNet data gateway facilitates connections to data sources that are either behind firewalls or accessible within your virtual network. This feature enables the execution of data pipeline activities on the VNet data gateway, ensuring secure connections to data sources within the VNet. Unlike on-premises data gateways, VNet data gateways are managed by Microsoft, consistently updated, support auto-scaling, and deactivate when not in use, making them cost-effective.

To learn more, refer to the documentation: What is a virtual network (VNet) data gateway?

Best-in-class connectivity and enterprise data movement

New and updated Connectors

New and updated Certified Connectors for Power BI and Dataflows

As a developer and data source owner, you can create connectors using the Power Query SDK and have them certified through the Data Factory Connector Certification Program. Certifying a Data Factory connector makes the connector publicly available, out of the box, in Microsoft Fabric Data Factory and Microsoft Power BI experiences.

This month we are happy to list the newly updated certified connectors that are part of the Microsoft Data Factory Connector Certification Program. Be sure to check the documentation for each of these connectors so you can see what’s new with each of them.

New connectors

Updated connectors

Performance improvement in Salesforce connector in data pipelines

Salesforce is a critical data source for many organizations, housing valuable customer and business data. To enhance data movement efficiency, Data Factory has introduced a performance optimization in the Salesforce connector for pipelines. The optimization fetches data from Salesforce concurrently by leveraging its parallelism capability, significantly reducing extraction times for large datasets.

Lakehouse connector now supports deletion vectors and column mapping for Delta tables in data pipelines

The Lakehouse connector in Data Factory has been upgraded to provide deeper integration with Delta tables. Two major new capabilities enhance data processing workflows:

1. Support for deletion vectors

Delta tables use deletion vectors to track deleted records efficiently without physically removing them from storage. With this new feature in the Lakehouse connector, users can:

  • Read Delta tables while respecting deletion vectors, ensuring that deleted records are automatically excluded from queries.
  • Improve performance by leveraging soft deletions instead of physical file modifications, making data updates and maintenance more efficient.
  • Enable compliance with data retention policies by retaining historical data for auditability while ensuring deleted records are filtered out from active queries.

2. Column mapping support for Delta tables

Delta’s column mapping capability allows for more flexible schema evolution, ensuring that changes in table structure do not disrupt data workflows. With column mapping support in the Lakehouse connector, users can:

  • Read from an existing Delta Lake table with column mapping name/id mode enabled.
  • Write to an existing Delta Lake table with column mapping name/id mode enabled.
  • Auto-create a table with column mapping name mode enabled when the sink table does not exist and source dataset columns contain special characters or whitespace.
  • Auto-create a table with column mapping name mode enabled when the table action is set to overwrite the schema and source dataset columns contain special characters or whitespace.

These enhancements ensure that data engineers can work with Delta tables more efficiently, improving data governance, performance, and maintainability.
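
If you are curious what these Delta features look like on the table side, here is a minimal Spark SQL sketch, assuming a Fabric notebook with the Lakehouse attached and the notebook-provided spark session; the table name is a placeholder:

    # Minimal sketch: enable deletion vectors and name-based column mapping on an
    # existing Delta table. "dbo_customers" is a placeholder table name. Recent Delta
    # runtimes upgrade the table protocol automatically when these properties are set;
    # on older runtimes you may need to upgrade the protocol explicitly.
    spark.sql("""
        ALTER TABLE dbo_customers SET TBLPROPERTIES (
            'delta.enableDeletionVectors' = 'true',  -- deletes become soft deletes tracked in deletion vectors
            'delta.columnMapping.mode'    = 'name'   -- decouple logical column names from physical column names
        )
    """)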

To learn more about how to configure Lakehouse in a copy activity, refer to our documentation.

Simplifying Data Ingestion with Copy Job

Copy Job is making data ingestion simpler, faster, and more intuitive than ever and is now generally available. Whether you need batch or incremental data movement, Copy Job provides the flexibility to meet your needs while ensuring a seamless experience.

Since its preview last September, Copy Job has rapidly evolved with several powerful enhancements. Let’s dive into what’s new!

Public API & CI/CD support

Fabric Data Factory now offers a robust Public API to automate and manage Copy Job efficiently. Plus, with Git Integration and Deployment pipelines, you can leverage your own Git repositories in Azure DevOps or GitHub and seamlessly deploy Copy Job with Fabric’s built-in CI/CD workflows.
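
As a quick illustration, here is a minimal Python sketch that queues a Copy Job run through the Fabric REST job scheduler API. The workspace and item IDs are placeholders, and the jobType value used below is an assumption; confirm it against the Copy Job API documentation:

    # Minimal sketch: trigger an on-demand run of a Copy Job item via the Fabric REST API.
    import requests
    from azure.identity import DefaultAzureCredential  # pip install azure-identity requests

    WORKSPACE_ID = "<workspace-guid>"      # placeholder
    COPY_JOB_ID = "<copy-job-item-guid>"   # placeholder

    # Acquire a token for the Fabric API (works for user or service principal sign-in).
    token = DefaultAzureCredential().get_token("https://api.fabric.microsoft.com/.default").token

    # Queue the job through the generic item job scheduler endpoint.
    response = requests.post(
        f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
        f"/items/{COPY_JOB_ID}/jobs/instances",
        params={"jobType": "CopyJob"},  # assumed job type name for Copy Job
        headers={"Authorization": f"Bearer {token}"},
    )
    response.raise_for_status()
    # A 202 response is expected; the Location header points at the job instance for status polling.
    print(response.status_code, response.headers.get("Location"))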

VNet gateway support

Copy Job now supports the VNet data gateway in Preview! The VNet data gateway enables secure connections to data sources within your virtual network or behind firewalls. With this new capability, you can now execute Copy Job directly on the VNet data gateway, ensuring seamless and secure data movement.

Upsert to Azure SQL Database & overwrite to Fabric Lakehouse

By default, Copy Job appends data to ensure no changed data is lost. But now, you can also choose to upsert data directly into Azure SQL DB or SQL Server and overwrite data in Fabric Lakehouse tables. These options give you greater flexibility to tailor data ingestion to your specific needs.

Enhanced usability & monitoring

We’ve made Copy Job even more intuitive based on your feedback, with the following enhancements:

  • Column mapping for simple data transformations when writing to a storage destination.
  • Data preview to help select the right incremental column.
  • Search functionality to quickly find tables or columns.
  • Real-time monitoring with an in-progress view of running Copy Jobs.
  • Customizable update methods & schedules before job creation.

More connectors, more possibilities!

More source connections are now available, giving you greater flexibility for data ingestion with Copy Job. And we’re not stopping here—even more connectors are coming soon!

What’s next?

We’re committed to continuously improving Copy Job to make data ingestion simpler, smarter, and faster. Stay tuned for even more enhancements!

Learn more about Copy Job in: What is Copy job in Data Factory

Mirroring for Azure SQL Database protected by a firewall

You can now mirror Azure SQL Databases protected by a firewall, using either the VNet data gateway or the on-premises data gateway. The data gateway facilitates secure connections to your source databases through a private endpoint or from a specific private network.

Learn more about Mirroring for Azure SQL Database in the documentation: Microsoft Fabric Mirrored Databases from Azure SQL Database.

Mirroring for Azure Database for PostgreSQL Flexible Server

Database Mirroring now supports replication of your Azure Database for PostgreSQL Flexible Server into Fabric! Now you can continuously replicate data in near real-time from your Flexible Server instance to Fabric OneLake. This enables seamless data integration, allowing you to leverage Fabric’s analytics capabilities while ensuring your PostgreSQL data remains up to date. By mirroring your PostgreSQL data into Fabric, you can enhance reporting, analytics, and machine learning workflows without disrupting your operational database.

To learn more, please reference the PostgreSQL mirroring preview blog.

Open Mirroring UX improvements

We’ve made improvements to our end-to-end in-product experience for Open Mirroring. With these changes, you can now create a mirrored database and start uploading, or dragging and dropping, Parquet and CSV files. It’s now easier than ever to get started with building your own Open Mirroring source and to test our replication technology before productionizing with APIs. Once your files are uploaded, you can also upload changes and updates to the data with the __rowMarker__ field specified, leveraging our change data capabilities.
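
To illustrate the idea, here is a minimal pandas sketch of a change file you could drop into an Open Mirroring landing zone. The columns other than __rowMarker__ are made up for illustration, and the marker values shown (0 for insert, 1 for update, 2 for delete) should be verified against the Open Mirroring landing zone documentation:

    # Minimal sketch: write a change batch as a parquet file for Open Mirroring.
    import pandas as pd  # pip install pandas pyarrow

    changes = pd.DataFrame(
        {
            "CustomerId": [101, 102, 103],     # illustrative key column
            "Name": ["Alice", "Bob", "Carol"],
            "__rowMarker__": [0, 1, 2],        # insert 101, update 102, delete 103 (verify values in the docs)
        }
    )

    # Upload (or drag and drop) the resulting file into the mirrored database's
    # landing zone folder for the target table.
    changes.to_parquet("00000000000000000002.parquet", index=False)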

Save a new Dataflow Gen2 with CI/CD support from a Dataflow Gen1, Gen2, or Gen2 (CI/CD)

Customers often want to recreate an existing dataflow as a new Dataflow Gen2 (CI/CD), getting all the benefits of the new Git and CI/CD integration capabilities. Until now, to accomplish this, they had to create the new Dataflow Gen2 (CI/CD) item from scratch and copy-paste their existing queries, or leverage the Export/Import Power Query template capabilities. This is not only inconvenient due to the unnecessary steps, but it also does not carry over additional dataflow settings.

Dataflows in Microsoft Fabric now include a ‘Save as’ feature in preview that, in a single click, lets you save an existing Dataflow Gen1, Gen2, or Gen2 (CI/CD) as a new Dataflow Gen2 (CI/CD) item.

Incremental Refresh for Dataflow Gen2

Incremental Refresh for Dataflow Gen2 is now generally available!

Incremental Refresh for Dataflow Gen2 allows you to refresh only the buckets of data that have changed, rather than reloading the entire dataset on every dataflow refresh. This not only saves time but also reduces resource consumption, making your data operations more efficient and cost-effective.

These new capabilities are designed to help you succeed with your data integration needs as efficiently as possible. Try it out today in your Fabric workspace!

Learn more about Incremental Refresh in Dataflow Gen2: Incremental refresh in Dataflow Gen2.

Check ongoing validation status of a Dataflow Gen2 with CI/CD support

When you click Save & run in Dataflow Gen2 with CI/CD support, the process that gets triggered is two-fold:

  1. Validation: a background process in which your Dataflow is validated against a set of rules. If it passes all validations and no errors are returned, it is saved successfully.
  2. Run: Using the latest published version of the Dataflow, a refresh job gets triggered to run the Dataflow.

If you only wish to trigger the validation process, simply click the ‘Save’ button.

What if you want to check the status of the validation? There is now a new entry point in the Home tab of the ribbon called Check validation, which you can click at any time to get information about the ongoing validation or the result of a previous validation run.

Be sure to give this a try whenever you want to check the results of a save validation.

Apache Airflow Job

The Apache Airflow job in Microsoft Fabric is now generally available, providing a fully integrated Apache Airflow runtime for developing, scheduling, and monitoring Python-based data workflows using Directed Acyclic Graphs (DAGs).

What’s new:

  • Introducing Fabric runtime versioning for Apache Airflow job – This includes Fabric runtime version 1.0, which comes with Apache Airflow 2.10.4 and Python 3 as the default runtime.
  • Public API – APIs are now available to interact with Apache Airflow jobs for seamless management.
  • Git Integration & Deployment pipeline support – Users can utilize their Git repositories (Azure DevOps/GitHub) and deploy with Fabric’s built-in CI/CD workflows.
  • Diagnostic logs – Users can access Apache Airflow generated logs through the Apache Airflow job UI for enhanced observability.

Learn more about the Apache Airflow job in Microsoft Fabric: What is Apache Airflow job?
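
For a feel of the developer experience, here is a minimal DAG sketch targeting the Fabric runtime’s Apache Airflow 2.10.4 and Python 3; the DAG id and task logic are placeholders:

    # Minimal sketch of a DAG for the Fabric Apache Airflow job; save it in the job's dags folder.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_and_log():
        """Placeholder task body: replace with your own extraction logic."""
        print("Running a scheduled task inside the Fabric Apache Airflow job.")


    with DAG(
        dag_id="fabric_airflow_hello",
        start_date=datetime(2025, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="extract_and_log", python_callable=extract_and_log)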

OneLake file triggers for pipelines

The Fabric Data Factory team is thrilled to announce that the pipeline trigger experience is now generally available (GA) and now includes access to files in OneLake!

This exciting new improvement to pipeline triggers in Fabric Data Factory means that you can now automatically invoke your pipeline when files arrive in, are deleted from, or are renamed in the folders you monitor!

We’ve previously supported Azure Blob file events in Fabric Data Factory, just like ADF and Synapse, but now that Fabric users are leveraging OneLake as their primary data hub, we’re excited to see the pipeline patterns you’ll build using OneLake file triggers!

Variable libraries for pipelines

One of the most requested features in Fabric Data Factory has been support for modifying values when deploying workspace changes between environments using Fabric CI/CD. To accommodate this ask, we have integrated pipelines with the new Fabric platform feature called Variable Libraries.

With Variable Libraries, you can assign variables unique values for different environments, i.e. dev, test, and prod. Then, when you promote your factory to higher environments, the pipelines pick up the values defined in the library for that environment, giving you the ability to change values as pipelines are promoted.

This new preview feature will be super useful not just for CI/CD: it also lets you replace hardcoded values with variables anywhere in your pipelines, achieving the same functionality as global parameters in Azure Data Factory.

Spark Job Definition pipeline activity parameter support

The Spark Job Definition (SJD) activity in Data Factory allows you to create connections to your Spark Job Definitions and run them from your data pipeline.

And we are excited to announce that parameterization is now supported in this activity!

You will find this update in the Advanced settings where you can configure your SJD parameters and run your Spark Job Definitions with the parameter values that you set, allowing you to override your SJD artifact configurations.

Azure Databricks jobs activity now supports parameters

Parameterizing data pipelines to support generic, reusable pipeline models is extremely common in the big data analytics world. Fabric Data Factory provides end-to-end support for these patterns and is now extending this capability to the Azure Databricks pipeline orchestration activity. Now, when you select ‘Jobs’ as the source of your Azure Databricks activity, you can send parameters to your Azure Databricks job, giving you maximum flexibility and power in your orchestration jobs.

User data functions in Data pipelines

User Data Functions are now available in preview within Data pipeline’s Functions activity. This new feature is designed to enhance your data processing capabilities by allowing you to create and manage custom functions tailored to your specific needs.

Key highlights

  • Custom functionality: User Data Functions enable you to define custom logic and calculations that can be reused across multiple Data pipelines. This allows for more flexible and efficient data processing.
  • Integration in data pipelines: You can add User Data Functions as activities within your Data pipelines. This is done by selecting the Functions activity in the pipeline editor, choosing your User Data Functions as the type, and providing any necessary input parameters.

Check out our documentation to learn more about how to use User Data Functions in your data pipelines.
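
To give a flavor of what such a function looks like, here is a minimal sketch of a Python User Data Function that a Functions activity could call with a single input parameter. The fabric.functions programming model shown below follows the public User Data Functions samples, so treat the exact names as assumptions and verify them in the User Data Functions editor:

    # Minimal sketch of a reusable User Data Function for data cleanup.
    import fabric.functions as fn

    udf = fn.UserDataFunctions()


    @udf.function()
    def normalize_country_code(country: str) -> str:
        """Map free-text country values to short codes; reusable across pipelines."""
        mapping = {"united states": "US", "united kingdom": "GB", "germany": "DE"}
        return mapping.get(country.strip().lower(), country.strip().upper())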

Data Factory pipelines now support up to 120 activities

We’ve increased the default activity limit from 80 activities to 120 activities!

You can now utilize an additional 40 activities to build more complex pipelines for better error handling, branching, and other control flow capabilities.

Ability to orchestrate a Dataflow Gen2 with CI/CD capabilities

You can now add Dataflow refresh activities to your pipelines in Fabric Data Factory that reference the new version of Dataflows, Dataflow Gen2 with CI/CD support.

Check out our documentation on Dataflow Gen2 with CI/CD and Git integration to learn more.

Azure Data Factory Item – CI/CD and Public APIs support

CI/CD and REST API support is now available for the Azure Data Factory item (Mounting ADF).

Mirrored databases

CI/CD support for mirrored databases is now generally available.

Learn more from CI/CD for mirrored databases.

REST API support is also now generally available, including SPN (service principal) support. Check out our documentation on Mirroring Public REST APIs.

Copy Job

The Copy Job item’s CI/CD and API support is now generally available, including SPN support for Copy Job.

Check out our documentation on CI/CD for Copy Job to learn more.

Parameterized connections in Data pipelines

Parameterization of data connections in Data pipelines allows you to specify values for connection placeholders dynamically. This means you can pre-create data connections for various sources, such as Azure Blob Storage, SQL Server or any other data source supported by data pipelines, and reference them through data pipeline’s dynamic expressions at runtime. This feature empowers you to create more flexible and adaptable data pipelines, capable of connecting to different instances of data connections of the same type, such as SQL Server, without altering the pipeline definition.

Key benefits:

  • Flexibility: Use the same data pipeline definition to dynamically connect to various instances of data connections.
  • Efficiency: Minimize the need for multiple pipeline definitions, reducing complexity and maintenance effort.
  • Scalability: Easily manage and scale your data integration processes by leveraging dynamic expressions to handle connection values.

How it works:

During the pipeline run, dynamic expressions within the data pipelines specify values for the connection placeholders, enabling seamless integration with pre-created data connections. This innovation ensures that your data pipelines are not only more efficient but also highly customizable to meet your specific requirements.
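
As an illustration only, and not the exact pipeline definition schema, the sketch below shows the idea: a pipeline parameter holds a connection identifier, and the activity references it through the standard @pipeline().parameters expression so the value is resolved at runtime. The property names are simplified placeholders:

    # Illustrative sketch (simplified, hypothetical property names) of a parameterized connection.
    pipeline_sketch = {
        "parameters": {
            "SourceConnectionId": {"type": "string", "defaultValue": "<dev-connection-guid>"}
        },
        "activities": [
            {
                "name": "CopyFromSql",
                "type": "Copy",
                "source": {
                    # Resolved at runtime, so the same pipeline definition can point at
                    # different SQL Server connections per run or per environment.
                    "connection": "@pipeline().parameters.SourceConnectionId"
                },
            }
        ],
    }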

We believe this new feature will significantly enhance your data processing capabilities and streamline your workflows. We can’t wait for you to experience the benefits of parameterized connections in your data integration projects.

Table Name parameter support for data destinations

In Dataflow Gen2, you can create parameters, which serve as an easy way to store and manage a value that can be reused throughout your Dataflow.

A major piece of feedback we’ve heard from our users is the lack of this capability in the data destination experience for Dataflow Gen2. Thanks to that feedback, we’re now introducing the first support for parameters in the data destination experience: you can set a parameter to be used as the table name of your destination.

This is available for all destinations that support this field, and we’re working on extending this support to other areas of the data destination experience. Try out this new capability and let us know what you think.

Enhanced capabilities for Copilot in Data Pipelines

In November 2024, we announced the preview of 3 innovative capabilities in Copilot for Data Factory (Data pipeline). Today, we are excited to make these features generally available, with enhancements to make your data integration even more efficient and effortless.

Check out the blog post on Efficiently build and maintain your Data pipelines with Copilot for Data Factory: new capabilities and experiences to learn more.

Effortlessly generate your data pipelines: Copilot understands your business intent and effortlessly translates it into data pipeline activities to build your data integration solutions. With the enhanced capabilities, Copilot can build more complex Data pipeline activities, e.g. the Switch activity and metadata-driven pipelines. You can also update your pipeline settings and configurations in batches across multiple activities!

Efficiently troubleshoot error messages in your data pipeline: Copilot helps you diagnose and resolve pipeline errors more intuitively by providing a clear and actionable summary.

Easily understand your complex data pipelines: get a clear and intuitive summary of complex pipeline configurations from Copilot.

Thank you for your feedback, keep it coming!

We want to thank you for your support, usage, excitement, and feedback around Data Factory in Fabric. We look forward to continuing to learn from you about your data integration needs and how Data Factory in Fabric can be enhanced to empower you to achieve more with data.

Please continue to share your feedback and feature ideas with us via our official Community channels, and stay tuned to our public roadmap page for updates on what will come next.
