Microsoft Fabric Updates Blog

Announcing Delta Lake support in Real-Time Analytics KQL Database

As part of the One logical copy effort, we’re excited to announce that you can now make the data in your KQL Database available in OneLake in Delta Lake format.

Delta Lake is the unified data lake table format chosen to achieve seamless data access across all compute engines in Microsoft Fabric.

The data streamed into KQL Database is stored in an optimized columnar storage format with full text indexing and supports complex analytical queries at low latency on structured, semi-structured, and free text data.

Enabling data availability of KQL Database in OneLake means that customers can enjoy the best of both worlds: they can query the data with high performance and low latency in their KQL database and query the same data in Delta Lake format via any other Fabric engines such as Power BI Direct Lake mode, Warehouse, Lakehouse, Notebooks, and more.

KQL Database offers a robust mechanism to batch the incoming streams of data into one or more Parquet files suitable for analysis. The Delta Lake representation is provided to keep the data open and reusable. This logical copy is managed once and paid for once, and users should treat it as a single data set.

Users will only be charged once for the data storage after enabling the KQL Database availability in OneLake.

Enable OneLake availability

  1. To enable data availability in OneLake, browse to the details page of your KQL database or table.
  2. Next to OneLake folder in the Database details pane, select the Edit (pencil) icon.

Screenshot of the Database details pane in Real-Time Analytics showing an overview of the database with the edit OneLake folder option highlighted.

  3. Enable the feature by toggling the button to Active, then select Done.

Screenshot of the OneLake folder details window in Real-Time Analytics in Microsoft Fabric. The option to expose data to OneLake is turned on.

You can enable data availability at a KQL database or table level.

Once you enable data availability, all new data added to your database is accessible at the given OneLake path in Delta Lake format, stored as Parquet files.

You can also create a OneLake shortcut from a Lakehouse or Data Warehouse, or query the data directly via Power BI Direct Lake mode.
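As a rough illustration of what "the given OneLake path" might look like from a notebook, the sketch below assembles an `abfss` URI for a table folder. The helper name `onelake_table_uri`, the placeholder workspace, item, and table names, and the `Tables/<table>` folder layout are assumptions for illustration only; copy the actual path from the OneLake folder shown in the Database details pane.

```python
# Minimal sketch: assemble an abfss URI for a table exposed in OneLake.
# The "Tables/<table>" layout and all names used here are assumptions;
# the authoritative path is the one shown in the Database details pane.

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, item: str, table: str) -> str:
    """Return an abfss URI pointing at a table folder in OneLake."""
    for name, value in (("workspace", workspace), ("item", item), ("table", table)):
        # Reject empty names and path separators so the URI stays well-formed.
        if not value or "/" in value:
            raise ValueError(f"{name} must be a non-empty name without '/'")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}/Tables/{table}"


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(onelake_table_uri("my-workspace", "my-kql-database", "Events"))
```

In a Fabric notebook, a URI of this shape could then be passed to a Delta reader, for example `spark.read.format("delta").load(uri)`, assuming the Spark session has access to the workspace.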

End-to-end streaming architecture in Fabric

Customers can now use data availability in OneLake to build more efficient and performant systems for handling high-volume, low-latency streaming data in Microsoft Fabric.

  1. Eventstream can capture streaming data from multiple sources at scale.
  2. Eventstream allows pushing raw data into KQL Database seamlessly.
  3. KQL Database can be used to build a medallion architecture pattern with the help of built-in transformation features such as update policies and materialized views. The medallion structure is as follows:
    1. Bronze layer: raw data as received from Eventstream.
    2. Silver layer: deduplicated and enriched data.
    3. Gold layer: aggregated data suitable for reporting.
  4. You can then choose to make all three layers available in OneLake, or only the aggregated data, for building your reports directly with Power BI.
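To make the three layers concrete, here is a small plain-Python sketch of what each one holds. It is not KQL and does not use update policies or materialized views; it only mimics the kind of transformation each layer represents. The event fields (`event_id`, `device`, `temperature`), the `is_hot` enrichment, and the function names are all hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

Event = Dict[str, Any]


def to_silver(bronze: List[Event]) -> List[Event]:
    """Silver: drop duplicate event_ids (first occurrence wins) and enrich each row."""
    seen = set()
    silver = []
    for event in bronze:
        if event["event_id"] in seen:
            continue
        seen.add(event["event_id"])
        # Hypothetical enrichment: flag readings above 30 degrees.
        silver.append({**event, "is_hot": event["temperature"] > 30.0})
    return silver


def to_gold(silver: List[Event]) -> Dict[str, float]:
    """Gold: aggregate to the average temperature per device."""
    totals = defaultdict(lambda: [0.0, 0])  # device -> [sum, count]
    for event in silver:
        acc = totals[event["device"]]
        acc[0] += event["temperature"]
        acc[1] += 1
    return {device: total / count for device, (total, count) in totals.items()}


if __name__ == "__main__":
    # Bronze: raw events as received, including one duplicate.
    bronze = [
        {"event_id": 1, "device": "a", "temperature": 20.0},
        {"event_id": 1, "device": "a", "temperature": 20.0},
        {"event_id": 2, "device": "a", "temperature": 40.0},
        {"event_id": 3, "device": "b", "temperature": 10.0},
    ]
    silver = to_silver(bronze)
    print(len(silver), to_gold(silver))
```

In a KQL Database, the bronze-to-silver step would typically be an update policy on the silver table and the gold layer a materialized view, so both are maintained as data arrives rather than computed in batch as in this sketch.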

This data design pattern allows you to scale efficiently for large incoming streams. KQL Database serves real-time analysis with low latency while making the data available in Delta Lake.

For more information on enabling data availability in OneLake, see One logical copy.
