Microsoft Fabric Updates Blog

Introducing Mirroring in Microsoft Fabric

As mentioned in Arun’s blog post, we are thrilled to announce that Mirroring in Microsoft Fabric is coming soon.

In today’s AI-driven world, analytics platforms are only as good as their data. As enterprises collect ever more data across applications, databases, and data warehouses, managing and ingesting it into a central platform for analytics and AI is cumbersome and costly. Databases and data warehouses use proprietary storage formats, which makes it impossible to create shortcuts to their data. Instead, data must be extracted, transformed, normalized, and made available in a central place for analytics. Even then, the data is not real-time, so insights go stale quickly and users end up querying the data in the source.

Mirroring provides a modern way of accessing and ingesting data continuously and seamlessly from any database or data warehouse into the Data Warehousing experience in Microsoft Fabric. This all happens in near real-time, giving users immediate access to changes in the source!

Let’s dive deeper into its capabilities.

Access and manage any database or data warehouse

Any database can be accessed and managed centrally from within Fabric without having to switch database clients. By just providing connection details, your database is instantly available in Fabric as a Mirrored database. Familiar database editors provide the ability to manage the database. Here is an example of the Azure Cosmos DB editor available within the Mirrored database in Fabric. Any edits go directly against the Azure Cosmos DB source.

Real-time data replication

There is no complex setup or ETL for data replication. With the same connection details, data is replicated reliably in near real-time. An initial snapshot is created, after which data is kept in sync with every transaction, whether a new table is created or data is inserted, updated, or deleted. Replication uses the database’s Change Data Capture (CDC) technology, transforms the changes into appropriate Delta tables, and lands them in OneLake. Intelligent logic determines when the source has changed before replicating the data, ensuring compute isn’t used unnecessarily. Granular controls enable configuring what is mirrored into Fabric. Detailed monitoring is also available to gain insights into mirroring operations and when the replica in Fabric OneLake was last refreshed. From here on, the data is ready for consumption immediately in any Fabric workload.
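
The snapshot-then-sync pattern described above can be sketched in a few lines of Python. This is only a conceptual illustration, not how Mirroring is implemented: the ChangeEvent type, the take_snapshot and apply_changes helpers, and the sample rows are all hypothetical, and real CDC feeds and Delta table writes are considerably more involved.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional

Row = Dict[str, Any]


@dataclass(frozen=True)
class ChangeEvent:
    """One hypothetical change record; real CDC formats differ per database."""
    op: str                     # "insert", "update", or "delete"
    key: Any                    # primary key of the affected row
    row: Optional[Row] = None   # new row image; None for deletes


def take_snapshot(source: Dict[Any, Row]) -> Dict[Any, Row]:
    """Copy every row once; this stands in for the initial snapshot."""
    return {key: dict(row) for key, row in source.items()}


def apply_changes(replica: Dict[Any, Row], events: Iterable[ChangeEvent]) -> int:
    """Apply change events in order and return how many were applied."""
    applied = 0
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key!r} has no row image")
            replica[event.key] = dict(event.row)
        elif event.op == "delete":
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        applied += 1
    return applied


# Hypothetical source table, keyed by primary key.
source = {1: {"id": 1, "city": "Oslo"}, 2: {"id": 2, "city": "Lima"}}
replica = take_snapshot(source)

# Changes that happened in the source after the snapshot was taken.
changes = [
    ChangeEvent("insert", 3, {"id": 3, "city": "Pune"}),
    ChangeEvent("update", 1, {"id": 1, "city": "Bergen"}),
    ChangeEvent("delete", 2),
]
applied = apply_changes(replica, changes)
```

When the change feed is empty, apply_changes does no work and returns 0, which loosely mirrors the idea of not spending compute when the source has not changed.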

Data warehousing experiences simplified!

Every Mirrored database comes with default data warehousing experiences via a SQL Analytics Endpoint, which houses the metadata of the Delta tables and points to the data in OneLake. SQL developers and citizen developers alike can query the data using the T-SQL editor, which comes with full IntelliSense, or the visual query editor.

Cross-joining Mirrored databases, Warehouses, Lakehouses

All data in Fabric is already in Delta format, whether it is added as a shortcut to a lake in a Lakehouse or ingested through Data Factory, Warehouse, or Spark, so it can be cross joined. Now with Mirroring, data from any database can be cross joined as well, enabling queries across any database, warehouse, or lakehouse, whether the data lives in Azure Cosmos DB, Azure SQL DB, Snowflake, MongoDB, etc.
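
To make the idea of one query spanning several databases concrete, here is a small stand-in using Python's built-in sqlite3 module. It is not Fabric and not T-SQL: SQLite's ATTACH DATABASE simply lets a single connection join tables that live in separate databases. The database names (mirrored_orders, warehouse), the tables, and the rows are all made up for illustration.

```python
import sqlite3

# One connection with two separate in-memory databases attached.
# "mirrored_orders" stands in for a Mirrored database, "warehouse" for a Warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS mirrored_orders")
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

conn.execute(
    "CREATE TABLE mirrored_orders.orders "
    "(order_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE TABLE warehouse.customers (customer_id INTEGER, name TEXT)")

conn.executemany(
    "INSERT INTO mirrored_orders.orders VALUES (?, ?, ?)",
    [(1, 10, 25.0), (2, 11, 40.0), (3, 10, 15.0)],
)
conn.executemany(
    "INSERT INTO warehouse.customers VALUES (?, ?)",
    [(10, "Contoso"), (11, "Fabrikam")],
)

# A single query that joins tables from both databases.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM mirrored_orders.orders AS o
    JOIN warehouse.customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

In Fabric the equivalent query would be written in T-SQL against the SQL Analytics Endpoint; the exact naming syntax for referencing other items is not covered in this post, so treat the schema-qualified names above as a SQLite convention only.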

Data Science experiences unlocked!

Any data scientist can create a Lakehouse with a shortcut to the Mirrored database and use Notebooks to analyze and create models with the data.

Power BI Direct Lake mode

Power BI reports and semantic models can be built in Direct Lake mode, getting the blazing fast performance of import mode without duplicating the data. Because Direct Lake mode reads Delta tables right from OneLake, the Mirrored database is Power BI ready.

Mirroring Availability

Azure Cosmos DB, Azure SQL DB, and Snowflake customers will be able to use Mirroring to mirror their data in OneLake and unlock all the capabilities of Fabric Warehouse, Direct Lake mode, Notebooks, and much more. SQL Server, Azure PostgreSQL, Azure MySQL, MongoDB, and other databases and data warehouses will be coming in calendar year 2024 (CY24). For participation in our early adopter program, submit your application here.

In the meantime, to try out Fabric, sign up for a free trial.
