Fabric Changing the Game: Mirroring Azure SQL DB into Microsoft Fabric
I previously showcased exporting Azure SQL Database data and importing it into the Lakehouse via Parquet format (Microsoft Fabric changing the game: Exporting data and building the Lakehouse | Microsoft Fabric Blog | Microsoft Fabric). The recent launch of the Azure SQL DB mirroring option for Microsoft Fabric has piqued customer interest, so this post will delve into this fresh capability.
Step by Step:
Ref. for SQL DB Docs: Azure SQL Database documentation – Azure SQL | Microsoft Learn
Ref. for SQLDB Fabric Mirror: Microsoft Fabric mirrored databases from Azure SQL Database (Preview) – Microsoft Fabric | Microsoft Learn
1. Open the Azure portal and look for the Azure SQL DB that you want to connect to Microsoft Fabric. Copy the server name into Notepad, as we will use it in the next step; for example, Fig 1 – Config shows sqldbfta.database.windows.net:
2. From Microsoft Fabric, open your workspace and look for the Mirror SQL Database option among the items available to be added to the workspace – Fig. 2 Mirror:
3. Choose a name for the mirror to be added; for example, Fig 3 – Name shows that the name SQLAdventure was used:
4. Next, let's add the connection details obtained in the first step. This information can be seen in Fig. 4 – Mirror Config:
The example for the server is: sqldbfta.database.windows.net
The Database example name is: Adventureworks2017
As for the authentication, I am using my own account to connect, though if preferred the connection can also be configured with a contained database user. The steps are here: Database Principal – Tutorial: Configure Microsoft Fabric mirrored databases from Azure SQL Database (Preview) – Microsoft Fabric | Microsoft Learn.
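If you prefer the contained database user route, the setup on the Azure SQL DB side looks roughly like the sketch below. This is a minimal, hypothetical example: the user name and password are placeholders, and db_owner is simply the easiest choice, so follow the linked tutorial for the exact permissions the mirroring connection needs.
-- Run this against the Azure SQL database that will be mirrored.
-- The user name and password below are placeholders.
CREATE USER [fabric_mirror_user] WITH PASSWORD = 'ReplaceWithA$trongPassword1';
GO
-- Give the user enough permission to read the data being mirrored.
-- db_owner is the simplest option; a narrower permission set may also be enough,
-- so check the tutorial linked above for the minimum requirement.
ALTER ROLE [db_owner] ADD MEMBER [fabric_mirror_user];
GO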
5. The subsequent step is to determine the type of mirror to be executed:
- Mirror all data: any new tables created after mirroring is started will also be mirrored.
- Optionally, choose only certain objects to mirror: disable the Mirror all data option, then select individual tables from your database.
Fig. 5 – Mirror Fails shows that I missed something during the configuration steps:
6. Looking at the red error message, it seems I forgot to check the prerequisites before configuring the mirror. According to the docs, there are some considerations for Azure SQL DB before mirroring; let's review them:
- Enable the System Assigned Managed Identity (SAMI) of your Azure SQL logical server. That is:
- To configure or verify that the SAMI is enabled, go to your logical SQL server in the Azure portal. Under Security in the resource menu, select Identity.
- Under System assigned managed identity, set Status to On.
- Mirroring does not support Azure SQL Database logical servers behind an Azure Virtual Network or using private networking. This means you need to update your Azure SQL logical server firewall rules to allow public network access.
Following the advice from the docs, Fig. 6 – Identity shows the Azure SQL DB identity configuration that was missing:
- Prior to initiating the replication process, and since the example uses AdventureWorks, I wanted to run some queries against it to see how things behave. First, some background I gathered from the documentation (Zero ETL, Near-real time replication, Mirroring Azure SQL Databases in Fabric (microsoft.com)):
Mirroring for Azure SQL Database is built on the SQL Change Data Capture (CDC) stack, optimized for a lake-centric architecture. CDC stores changes locally in the database, whereas Mirroring reads data from the harvested database transaction log and publishes the change data to OneLake storage. This change data is transformed into the appropriate delta tables landing in Fabric OneLake. Moreover, DDL operations like add/drop column are also supported on actively mirrored tables.
First, based on the schema of the table “Person.Person” from AdventureWorks, the following table was created without foreign keys (FK) and with IGNORE_DUP_KEY = ON, so the same key values can be inserted repeatedly without errors:
CREATE TABLE [Person].[Person_noFK](
[BusinessEntityID] [int] NOT NULL,
[PersonType] [nchar](2) NOT NULL,
[NameStyle] [dbo].[NameStyle] NOT NULL,
[Title] [nvarchar](8) NULL,
[FirstName] [dbo].[Name] NOT NULL,
[MiddleName] [dbo].[Name] NULL,
[LastName] [dbo].[Name] NOT NULL,
[Suffix] [nvarchar](10) NULL,
[EmailPromotion] [int] NOT NULL,
[AdditionalContactInfo] [xml](CONTENT [Person].[AdditionalContactInfoSchemaCollection]) NULL,
[Demographics] [xml](CONTENT [Person].[IndividualSurveySchemaCollection]) NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_Person_BusinessEntityID1] PRIMARY KEY NONCLUSTERED
(
[BusinessEntityID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
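As a quick sanity check of that IGNORE_DUP_KEY = ON setting, the snippet below (with illustrative values) shows that re-inserting the same BusinessEntityID does not fail the batch; the duplicate row is simply discarded with a “Duplicate key was ignored.” warning:
-- Both statements use the same BusinessEntityID. Because the primary key was
-- created WITH (IGNORE_DUP_KEY = ON), the second insert is skipped instead of
-- raising a constraint violation.
INSERT INTO [Person].[Person_noFK]
    ([BusinessEntityID], [PersonType], [NameStyle], [FirstName], [LastName],
     [EmailPromotion], [rowguid], [ModifiedDate])
VALUES (1, N'EM', 0, N'Ken', N'Sanchez', 0, NEWID(), GETDATE());

INSERT INTO [Person].[Person_noFK]
    ([BusinessEntityID], [PersonType], [NameStyle], [FirstName], [LastName],
     [EmailPromotion], [rowguid], [ModifiedDate])
VALUES (1, N'EM', 0, N'Ken', N'Sanchez', 0, NEWID(), GETDATE());
GO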
I found a very useful script to generate INSERT statements, so with that I generated the data to be loaded from “Person.Person” and inserted it into my new table that has the same structure. It’s important to note that this script is from a third-party source, so please exercise caution when using it: https://github.com/drumsta/sql-generate-insert
I added part of those inserts inside a procedure that I intended to execute over and over again, with some other queries running at the same time. For that I used SQLQueryStress from Adam Machanic, available at the Microsoft Store; this tool will execute the inserts and selects repeatedly while the data is being replicated. A simplified sketch of the procedure is shown after the tool link below.
Tool: SqlQueryStress – Microsoft Apps
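For reference, the wrapper procedure I gave to SQLQueryStress looked roughly like the sketch below. The procedure name and batch size are hypothetical, and instead of the generated INSERT statements it just copies random rows from Person.Person, which has the same effect for this test: every execution produces new changes for the mirror to pick up.
-- Hypothetical workload procedure: each call copies a small batch of rows from
-- Person.Person into Person.Person_noFK so there is always something to replicate.
CREATE OR ALTER PROCEDURE [dbo].[usp_LoadPerson_noFK]
    @BatchSize int = 100
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO [Person].[Person_noFK]
        ([BusinessEntityID], [PersonType], [NameStyle], [Title], [FirstName],
         [MiddleName], [LastName], [Suffix], [EmailPromotion],
         [AdditionalContactInfo], [Demographics], [rowguid], [ModifiedDate])
    SELECT TOP (@BatchSize)
        p.[BusinessEntityID], p.[PersonType], p.[NameStyle], p.[Title], p.[FirstName],
        p.[MiddleName], p.[LastName], p.[Suffix], p.[EmailPromotion],
        p.[AdditionalContactInfo], p.[Demographics], NEWID(), GETDATE()
    FROM [Person].[Person] AS p
    ORDER BY NEWID();   -- random batch; duplicate keys are ignored by the PK setting
END;
GO

-- Statements given to SQLQueryStress to run over and over while mirroring is active:
EXEC [dbo].[usp_LoadPerson_noFK] @BatchSize = 100;
SELECT COUNT(*) FROM [Person].[Person_noFK];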
- Now we can start the replication – Fig 7 – Replicating:
Monitoring:
From SSMS, I connected to the Azure SQL DB and ran the following queries that I found in the docs:
exec sp_help_change_feed; sys.sp_help_change_feed (Transact-SQL) – SQL Server | Microsoft Learn
SELECT * FROM sys.dm_change_feed_log_scan_sessions; sys.dm_change_feed_log_scan_sessions (Transact-SQL) – SQL Server | Microsoft Learn
SELECT * FROM sys.dm_change_feed_errors; sys.dm_change_feed_errors (Transact-SQL) – SQL Server | Microsoft Learn
Fig 8 – Ongoing shows the replication monitoring queries:
Everything seems to have worked as expected, as Fig 9 – Queries shows:
It took a few minutes to start, but once it did I was able to compare the Monitoring option available in the UI with the data inserted into the table, Fig 10 – Rowcount.
Once replication starts, you will see the Synapse Link background service inside sys.dm_exec_requests, Fig 11 – Requests:
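If you want to find those background requests yourself, a hedged filter like the one below works; the LIKE patterns are assumptions about the command text, so inspect the unfiltered DMV output if nothing matches:
-- Spot the change feed / Synapse Link background activity in the current requests.
-- The LIKE patterns are illustrative; widen or remove them if they return nothing.
SELECT r.session_id, r.status, r.command, r.start_time
FROM sys.dm_exec_requests AS r
WHERE r.command LIKE '%CHANGE%FEED%'
   OR r.command LIKE '%SYNAPSE%';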
The Monitoring UI option inside Microsoft Fabric can be seen in Fig 12 – Replication Rowcount, which shows the same number of replicated rows that we got from the monitoring queries on the SQL side:
Some tests I did:
After some time, I decided to delete every row in the table (as TRUNCATE is not supported because of the replication) and check what information the Monitoring UI would hold. As it holds the total replicated, once my table was emptied it continued to reflect the 380 rows.
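In other words, clearing the table looked like this (a plain DELETE instead of TRUNCATE, since TRUNCATE TABLE is blocked on a mirrored table):
-- TRUNCATE TABLE [Person].[Person_noFK];   -- blocked while the table is mirrored
DELETE FROM [Person].[Person_noFK];         -- works, and the deletes are replicated
GO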
So I decided to drop and recreate the table. To accomplish this, I paused the replication within the Microsoft Fabric mirror interface, as it logically cannot be done while the table is actively replicating. Subsequently, I recreated the table within SQL. Upon restarting the replication, it took a few minutes for the information to synchronize with Microsoft Fabric. However, I could observe the moment when the table ceased to exist in Fabric and reappeared once again.
Once the tables are replicated, you will see the data available, and you can use the SQL endpoint to build your queries on top of it, as Fig 13 – Mirror shows.
Here is the data replicated, seen from the SQL endpoint as Fig 14 – SQL Endpoint shows. Please note you can stop and restart the mirror at any time:
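For example, from the SQL analytics endpoint a query like the one below runs against the mirrored copy. The three-part name is an assumption based on my mirror name (SQLAdventure) and the source schema being preserved; adjust it to whatever you see in the object explorer:
-- Query the mirrored table through the Fabric SQL analytics endpoint.
SELECT TOP (10)
    [BusinessEntityID], [FirstName], [LastName], [ModifiedDate]
FROM [SQLAdventure].[Person].[Person_noFK]
ORDER BY [ModifiedDate] DESC;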
Since Microsoft Fabric stores its data in OneLake within this tenant, I added this mirrored database as a shortcut in a Lakehouse of a different workspace. As Fig 15 – Shortcut and Fig 16 – Tables show, SQLAdventure becomes available inside the Lakehouse DataflowStagging, which lives in another workspace:
Details about how to create a shortcut: Create a OneLake shortcut – Microsoft Fabric | Microsoft Learn
So, from the other workspace, I could query the data that was mirrored into Microsoft Fabric.
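As a hedged example, from the SQL analytics endpoint of that Lakehouse the shortcut can be queried as below; the table name is hypothetical (shortcut tables usually surface under the dbo schema, possibly with a flattened name), so confirm the exact name in the Lakehouse explorer first:
-- Query the shortcut exposed by the DataflowStagging Lakehouse in the other workspace.
-- The table name is illustrative; check the Lakehouse object explorer for the real one.
SELECT COUNT(*) AS MirroredRows
FROM [DataflowStagging].[dbo].[Person_Person_noFK];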
Summary:
In this blog post, I discussed the process of mirroring my Azure SQL Database into Microsoft Fabric. Mirroring enables data replication on demand, with the flexibility to stop and restart replication as needed. Additionally, the mirroring process includes the replication of Data Definition Language (DDL) operations, as evidenced by my demonstration of dropping and recreating a table. Furthermore, I successfully queried the mirrored table from a different workspace by using OneLake shortcuts.