Microsoft Fabric Updates Blog

Announcing Public Preview: Incremental Refresh in Dataflow Gen2

Incremental Refresh in Dataflow Gen2 is now in public preview! This powerful feature is designed to optimize your data processing by ensuring that only the data that has changed since the last refresh is updated. This means faster dataflows and more efficient resource usage.

A drop-down menu with the Incremental refresh option highlighted.

Key Features of Incremental Refresh

Incremental refresh in Dataflow Gen2 works a bit differently from incremental refresh in Power BI dataflows. Because Dataflow Gen2 supports data destinations, you do not need to set up a period of dates to retain when configuring incremental refresh.

A configuration menu that lets users set up incremental refresh based on the settings they provide.

  1. Easy Setup: Right-click the query and select Incremental refresh to get started.
  2. Customizable Settings: Configure the necessary settings for incremental refresh.
    • Choose a Date/DateTime/DateTimeZone column to filter by.
    • Specify the data extraction period that is checked for updates on every refresh.
    • Define the bucket size. A bigger bucket holds more data, but you give up some parallelism.
    • Provide a column that is checked for new or updated records in each bucket.
  3. Advanced Configuration: If needed, configure advanced settings to allow publishing even if the query does not fully fold.
  4. Data Destination Setup: Optionally, set up a data destination before the first incremental refresh so the data lands in the place you want to consume it from.
  5. Publish and Automate: Publish your Dataflow Gen2 and let it refresh incrementally, triggered by a pipeline or a schedule, based on your settings.
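
To make the bucket-size trade-off in the settings above more concrete, here is a minimal Python sketch. It is not how Dataflow Gen2 is implemented; the function name `make_buckets` and its parameters are hypothetical, and it simply assumes the extraction period is split into equally sized, day-aligned ranges. Smaller buckets give more units that could be processed in parallel, while bigger buckets give fewer, larger units.

```python
from datetime import date, timedelta


def make_buckets(end: date, period_days: int, bucket_days: int) -> list[tuple[date, date]]:
    """Split the window [end - period_days, end) into half-open date ranges.

    Hypothetical helper for illustration only; not a Dataflow Gen2 API.
    """
    start = end - timedelta(days=period_days)
    buckets = []
    lo = start
    while lo < end:
        # The last bucket is clipped so it never extends past the window.
        hi = min(lo + timedelta(days=bucket_days), end)
        buckets.append((lo, hi))
        lo = hi
    return buckets


# Same 30-day extraction period, two different bucket sizes.
daily = make_buckets(date(2025, 1, 31), period_days=30, bucket_days=1)
weekly = make_buckets(date(2025, 1, 31), period_days=30, bucket_days=7)
print(len(daily), len(weekly))  # 30 5
```

With daily buckets there are 30 small units to evaluate; with weekly buckets there are only 5, each covering more data.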

How Incremental Refresh Works

Incremental refresh divides data into buckets based on the Date, DateTime, or DateTimeZone column you chose to filter by.

Here’s a high-level overview of the process:

  1. Evaluate Changes: For each bucket, the dataflow compares the maximum value in the change detection column with the maximum value recorded at the previous refresh. If the value has changed, the bucket is marked for processing.
  2. Retrieve Data: The dataflow retrieves data for the changed buckets in parallel, loading it into the staging area.
  3. Replace Data: The dataflow replaces the data in the destination with the new data, ensuring only the updated buckets are affected. Historical data and any data outside the range of buckets marked for processing are not touched or changed. This way you can retain long-term history in your destination.
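
The three steps above can be imitated with a small in-memory Python sketch. This is only an illustration under stated assumptions, not the service's implementation: the column names `order_date` and `modified_at`, the `previous_max` dictionary, and the function `incremental_refresh` are all hypothetical, buckets are assumed to be one day wide, and the extraction period is ignored. A real refresh would query the source for each bucket's maximum rather than hold all rows in memory.

```python
from collections import defaultdict
from datetime import date, datetime


def incremental_refresh(source_rows, destination, previous_max):
    """Refresh only the daily buckets whose change-detection maximum moved.

    Hypothetical illustration; all names are assumptions, not a Fabric API.
    """
    # Group source rows into daily buckets keyed by the date column.
    by_bucket = defaultdict(list)
    for row in source_rows:
        by_bucket[row["order_date"]].append(row)

    changed = []
    for bucket, rows in by_bucket.items():
        # 1. Evaluate changes: compare this bucket's maximum with the stored one.
        current_max = max(r["modified_at"] for r in rows)
        if previous_max.get(bucket) == current_max:
            continue
        # 2. Retrieve data: here the rows are already in memory.
        # 3. Replace data: overwrite only this bucket in the destination.
        destination[bucket] = list(rows)
        previous_max[bucket] = current_max
        changed.append(bucket)
    return sorted(changed)


source = [
    {"order_date": date(2025, 1, 1), "modified_at": datetime(2025, 1, 1, 9), "amount": 10},
    {"order_date": date(2025, 1, 2), "modified_at": datetime(2025, 1, 2, 9), "amount": 20},
]
# A historical bucket that is not among the evaluated source rows.
destination = {date(2024, 12, 31): [{"order_date": date(2024, 12, 31), "amount": 5}]}
previous_max = {}

first = incremental_refresh(source, destination, previous_max)   # both buckets load
second = incremental_refresh(source, destination, previous_max)  # nothing changed

# Update one record: only its bucket should be reprocessed.
source[1] = {**source[1], "modified_at": datetime(2025, 1, 3, 8), "amount": 25}
third = incremental_refresh(source, destination, previous_max)
print(first, second, third)
```

In this sketch the second run processes no buckets, the third run replaces only the bucket for 2025-01-02, and the historical 2024-12-31 bucket in the destination is never touched.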

Benefits of Incremental Refresh

  • Efficiency: Only the data that has changed since the last refresh is processed, saving time and resources.
  • Performance: Faster dataflows, because less data is processed and changed buckets are retrieved in parallel.
  • Scalability: Handle large datasets more effectively by processing data in smaller, manageable chunks.

Get Started Today!

We invite you to try out the public preview of Incremental Refresh in Dataflow Gen2. Follow the steps outlined above to set up incremental refresh and experience the benefits firsthand. Your feedback is invaluable to us as we continue to improve and enhance this feature.

Stay tuned for more updates and enhancements as we move towards general availability. Happy dataflowing!

Resources

Docs: https://aka.ms/DFgen2-IncrementalRefresh-DOCS  
