Microsoft Fabric Updates Blog

Eventhouse OneLake Availability is Generally Available

As part of the One logical copy promise, we’re excited to announce that OneLake availability of Eventhouse in Delta Lake format is Generally Available. 

Eventhouse is a database workspace built to manage and store event-based data. Designed for data in motion, it builds indexing and partitioning into the storage process and accommodates structured, semi-structured, and free-text data. This design supports high-performance, low-latency analysis, with ingestion and querying completing within seconds.

Delta Lake is the unified data lake table format chosen to achieve seamless data access across all compute engines in Microsoft Fabric. 

Data streamed into Eventhouse is stored in an optimized columnar format with full-text indexing, and it supports complex analytical queries at low latency on structured, semi-structured, and free-text data.

Enabling data availability of Eventhouse in OneLake gives customers the best of both worlds: they can query the data with high performance and low latency in their Eventhouse, and query the same data in Delta Lake format from any other Fabric engine, such as Power BI Direct Lake mode, Warehouse, Lakehouse, Notebooks, and more.

The Delta Lake representation is provided to keep the data open and reusable. This logical copy is managed once and paid for once, and users should consider it a single data set.

Users will not be charged for additional data storage for enabling Eventhouse availability in OneLake.

Adaptive behaviour

Eventhouse batches incoming data streams into one or more Parquet files that are structured for analysis. This matters most for trickling data, where writing a large number of small Parquet files into the lake is inefficient and leads to a higher cost of goods sold (COGS) and poor query performance.

Eventhouse addresses this challenge with an adaptive mechanism that can delay write operations for up to a few hours if there isn’t sufficient data to create optimal Parquet files. This ensures that the files are efficient in size and adhere to Delta best practices. In this way, Eventhouse keeps the Parquet files ready for analysis while balancing timely data availability against cost and performance.
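The trade-off described above can be illustrated with a small sketch. This is not Eventhouse's implementation; it is a toy model that assumes a flush happens either when enough rows are buffered or when the oldest buffered row has waited past a maximum delay. The class name, the `target_rows` threshold, and the `max_delay_s` value are all hypothetical and chosen only for illustration.

```python
from typing import Optional


class AdaptiveBatcher:
    """Toy model of adaptive batching (hypothetical, not the Eventhouse internals)."""

    def __init__(self, target_rows: int = 1000, max_delay_s: float = 3 * 3600):
        self.target_rows = target_rows    # assumed size at which a file is "optimal"
        self.max_delay_s = max_delay_s    # assumed upper bound on the write delay
        self._buffer: list = []
        self._oldest_ts: Optional[float] = None  # arrival time of oldest buffered row

    def add(self, row: dict, now: float) -> Optional[list]:
        """Buffer one row; return a batch if a flush condition is met, else None."""
        if self._oldest_ts is None:
            self._oldest_ts = now
        self._buffer.append(row)
        return self.maybe_flush(now)

    def maybe_flush(self, now: float) -> Optional[list]:
        """Flush when the buffer is large enough or the oldest row waited too long."""
        if not self._buffer:
            return None
        big_enough = len(self._buffer) >= self.target_rows
        waited_too_long = now - self._oldest_ts >= self.max_delay_s
        if big_enough or waited_too_long:
            batch = self._buffer
            self._buffer = []
            self._oldest_ts = None
            return batch  # in a real system this batch would become one Parquet file
        return None


# A trickle of rows: nothing is written until one of the two conditions holds.
batcher = AdaptiveBatcher(target_rows=3, max_delay_s=100)
batcher.add({"v": 1}, now=0)
batcher.add({"v": 2}, now=10)
first_file = batcher.add({"v": 3}, now=20)   # size threshold reached
batcher.add({"v": 4}, now=30)
second_file = batcher.maybe_flush(now=130)   # delay threshold reached
```

With a high-volume stream the size condition fires almost immediately, so data lands in the lake quickly; with a trickle the delay condition bounds how stale the Delta Lake copy can become.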

Partitioning

By default, Eventhouse doesn’t partition Delta tables made available in OneLake, though users can optionally configure partitioning.

For more information, refer to the partitioning section.

End-to-end streaming architecture in Fabric

Customers can now use data availability in OneLake to build more efficient and performant systems for handling high-volume, low-latency streaming data in Microsoft Fabric.

  1. Eventstream can capture streaming data from multiple sources at scale.
  2. Eventstream allows pushing raw data into KQL Database seamlessly.
  3. Eventhouse can be used to build a medallion architecture with built-in transformation features such as update policies and materialized views. The medallion layers are:
    • Bronze layer: raw data as received from Eventstream.
    • Silver layer: deduplicated and enriched data.
    • Gold layer: aggregated data suitable for reporting.
  4. You can then choose either to make all three layers available in OneLake or to make only the aggregated data available for building reports directly with Power BI.
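The bronze, silver, and gold layers listed above can be sketched conceptually in plain Python. In Eventhouse these transformations would be expressed with update policies and materialized views rather than Python; the code below only models the data flow. The field names (`event_id`, `device`, `reading`), the `device_sites` lookup, and the choice of an average as the aggregate are assumptions made for the example.

```python
from collections import defaultdict


def to_silver(bronze: list, device_sites: dict) -> list:
    """Silver layer: drop events with a repeated event_id and add a site name."""
    seen = set()
    silver = []
    for event in bronze:
        if event["event_id"] in seen:
            continue  # duplicate delivery of the same event
        seen.add(event["event_id"])
        site = device_sites.get(event["device"], "unknown")
        silver.append({**event, "site": site})
    return silver


def to_gold(silver: list) -> dict:
    """Gold layer: average reading per site, suitable for a report."""
    totals = defaultdict(lambda: [0.0, 0])  # site -> [sum of readings, count]
    for event in silver:
        totals[event["site"]][0] += event["reading"]
        totals[event["site"]][1] += 1
    return {site: total / count for site, (total, count) in totals.items()}


# Bronze layer: raw events as received, including one duplicate.
bronze = [
    {"event_id": 1, "device": "d1", "reading": 10.0},
    {"event_id": 1, "device": "d1", "reading": 10.0},
    {"event_id": 2, "device": "d1", "reading": 20.0},
    {"event_id": 3, "device": "d2", "reading": 5.0},
]
silver = to_silver(bronze, device_sites={"d1": "north"})
gold = to_gold(silver)
```

If only the gold layer is made available in OneLake, downstream engines see just the small aggregated table, while the bronze and silver layers remain queryable inside Eventhouse.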

This data design pattern allows you to scale efficiently for large incoming streams. Eventhouse serves real-time analysis with low latency while making the data available in Delta Lake.

For more information on enabling data availability in OneLake, see Eventhouse OneLake Availability.

Learn more, and help us with your feedback

To find out more about Real-Time Intelligence, read Yitzhak Kesselman’s announcement. As we launch our preview, we’d love to hear what you think and how you’re using the product. The best way to get in touch with us is through our community forum or by submitting an idea. For detailed how-tos, tutorials, and other resources, check out the documentation.

This is part of a series of blog posts that dive into all the capabilities of Real-Time Intelligence. Stay tuned for more!
