Microsoft Fabric Updates Blog

Medallion Architecture in Fabric Real-Time Intelligence

Introduction

Building a multi-layer, medallion architecture using Fabric Real-Time Intelligence (RTI) requires a different approach compared to traditional data warehousing techniques. But even transactional source systems can be effectively processed in RTI. To demonstrate, we’ll look at how sales orders (created in a relational database) can be continuously ingested and transformed through a RTI bronze, silver, and gold layer.

Pre-Requisites

Code samples and scripts are available in the GitHub repository Medallion Architecture in Fabric Real-Time Intelligence (RTI). To implement this example in your own environment, get started by creating the AdventureWorksLT sample database in Azure. Refer to the Deploy to Azure SQL Database  documentation for information on how to install the sample database. Then, enable Change Data Capture (CDC). The 00_CDCSetup_AdWorksLT.sql script (located in the repository CDCSetup folder) can be used to enable and configure CDC.

You’ll also need access to a workspace assigned to a Fabric-enabled capacity. Depending on your environment, you may be able to use a trial if an existing capacity is not available.

Medallion Architecture Primer

As big data and data lakes became popular, architectural patterns arose to help organize, process, and improve the quality of the data. The Medallion architecture defines three layers (bronze, silver, gold) where data can land and progress to serve different needs. The bronze (or raw) layer is typically used for initial ingestion of the data; data cleansing, validation, and transformation occurs in the silver layer – and additional modeling/aggregation takes place in a gold layer.

In Fabric RTI, a medallion architecture also contains these three layers. Data, whether it arrives in near real-time or in batches, is continuously processed and transformed along the bronze, silver, and gold layers – as shown in the following figure.

Diagram of a Medallion Architecture in Fabric Real-Time Intelligence, including bronze, silver, and gold layers.

Figure – Medallion Architecture in Fabric Real-Time Intelligence

With RTI, data isn’t typically updated or deleted (as often occurs in a traditional data warehouse). Instead, all events are inserted. Different processes are used to reflect updates and/or filter out deleted records – but without overwriting the data.

To demonstrate, we start with a data source – the AdventureWorksLT database. Change Data Capture (CDC) is enabled for two tables – SalesOrderHeader and SalesOrderDetail. In Fabric, an eventstream ingests and pushes these CDC events into a bronze eventhouse table. Update policies populate (and maintain) silver tables – and materialized views remove any duplicates. Finally, a ‘latest value’ view joins header and details into a single, denormalized gold layer.

Create the Fabric Eventhouse

Prior to connecting to the source database, we create an eventhouse in our Fabric workspace. An eventhouse contains one or more KQL databases; similar to SQL Server, a KQL database can contain multiple tables, views, and functions. It supports its own query language (also referred to as KQL). For those unfamiliar with KQL, SQL queries are also supported – and can be converted to KQL as needed.

Bonus tip: a KQL Database in Fabric has an entity diagram view, which shows the relationships among tables, functions, policies, views, etc. Peeking ahead, the entity diagram in our example will eventually look like the following:

A Fabric KQL Database entity diagram view

Figure – KQL Database entity diagram view

Implement the Bronze Layer: Data Ingestion with an Eventstream

An Eventstream can connect to and process a variety of data sources (it can also expose an EventHub compatible endpoint so applications can push events directly to the stream). In this example, we connect to a CDC-enabled Azure SQL DB.

While an eventstream can also filter, join, split, or aggregate incoming events, our eventstream simply pushes the event data into a single landing table in the eventhouse – rawCDCEvents.

Screenshot of the Eventstream designer in Fabric, showing an Azure SQL DB as a source - and a single KQL database table as the destination.

Figure – Eventstream designer

When configuring the Eventstream (notes and tips)–

  • When creating the source, and entering the initial list of Azure tables, do not include spaces between tables names (as of April 2025, there is a UI glitch that won’t let you save if spaces are present).
  • The designer can create the rawCDCEvents table on your behalf. Alternatively, you can create it ahead of time with a simple KQL Statement:

.create table rawCDCEvents (schema:dynamic,payload:dynamic);

The Silver Layer: Table Update Policies and Materialized Views in the Eventhouse

The following query shows an example of records that are pushed into the rawCDCEvents table (the KQL folder in the GitHub repository contains all KQL queries and commands).

A view of records in the rawCDCEvents table.  Shows two columns (schema and payload) which are JSON representations of CDC events from the eventstream.

Figure – A view of records in the rawCDCEvents table

The payload column is what we’re interested in – a JSON representation of a CDC event. KQL has good JSON support, so we can parse each event into something suitable for light transformation. To do this, we define a table update policy. Specifically, a query is encapsulated into a function; this function is then referenced in an update policy. The policy also specifies a target table.

Think of an update policy like a trigger in a database table; the policy runs every time data is ingested; there isn’t anything needed in terms of scheduling the update. And, though it is called an update policy, we are only appending new rows to the target table; there are no updates to existing rows. The following figures show the function and update policy used to populate the silverSalesOrderHeader table.

KQL Function definition, used in a Table Update Function to query for Sales Orders from the rawCDCEvents table.

Figure – Function used in Table Update Function.

KQL Table Update Policy Definition

Figure – Table Update Function

Note that the silver tables can contain multiple rows for a given Order Header or Order Detail. For example, when a new Order Header is created, a row with an op (i.e. operation) value of c is ingested. Later, if the Order Header is updated, another row with an op value of u is ingested.

A Materialized View, using the summarize operation and arg_max() aggregation function, provides a persisted, deduplicated query that makes it easier for downstream users to analyze orders. Similar to a materialized view in a data warehouse, the view is maintained automatically (as new data arrives).

Materialized View Definition; references the silverSalesOrderHeader table.

Figure – Materialized View Definition

Gold Layer: The latest (and denormalized) view of orders

Depending on the use case, creation of a gold layer may be done outside the eventhouse; for example, additional transformations may be done in a Power BI semantic model. For this scenario, we make use of a view to do a bit of final clean-up. Specifically, we join the two materialized views together – and filter our any deleted records.

Latest Value View definition; joins together the Sales Orders Header and Sales Order Detail materialized views.

Figure – Latest Value View

Generate Test Data

In addition to the scripts needed to configure CDC, the SQL script 01_NewOrders.sql (located in the CDCSetup folder) can be used to create a new Sales Order. Run this script (from a tool like SSMS or Azure Data Studio) to create new rows in the database – and then inspect the events being ingested into the eventhouse by running KQL queries. The SQL script itself is simple, but it covers all the basic operations – inserts, updates, and deletes.

New Sales Order Header and Sales Order Detail being generated in the Azure SQL Database via a T-SQL script.

Figure – New Sales Order generated (and then modified) with T-SQL script

Summary and Next Steps

Fabric RTI can continuously process transactional events – including updated and/or deleted records. While the approach differs from traditional data warehousing techniques, RTI can handle large volumes of data – while providing a full audit trail of changes for analysis and monitoring. I invite you to head over to the GitHub repository – and try recreating this sample in your environment. Submit your feedback on Fabric Ideas and join the conversation on the Fabric Community.

Related blog posts

Medallion Architecture in Fabric Real-Time Intelligence

April 14, 2025 by Yael Biss

In today’s fast-paced data-driven world, enterprises are building more sophisticated data platforms to gain insights and drive innovation. Microsoft Fabric Lakehouses combine the scale of a data lake with the management finesse of a data warehouse – delivering unified analytics in an ever-evolving business landscape. But with great data comes great responsibility. Protecting sensitive information … Continue reading “Purview DLP Policies with Restrict Access for Fabric Lakehouses (Preview)”

April 14, 2025 by Yael Biss

Microsoft Purview’s Data Loss Prevention (DLP) policies for Fabric now supports Fabric KQL and Mirrored DBs! Purview DLP policies help organizations to improve their data security posture and comply with governmental and industry regulations. Security teams use DLP policies to automatically detect upload of sensitive information to Microsoft 365 applications like SharePoint and Exchange, and … Continue reading “Microsoft Purview Data Loss Prevention policies for Fabric have been extended to KQL and Mirrored Databases (Preview)”