
Efficient log management with Microsoft Fabric

Introduction

In the era of digital transformation, managing and analyzing log files in real time is essential for maintaining application health, security, and performance. There are many third-party solutions in this area that collect, process, store, and analyze this data source and act upon it. As your systems scale, those solutions can become very costly, because their cost model grows with the amount of data they ingest rather than with the value delivered to the customer.

This blog post explores a robust architecture leveraging the Microsoft Fabric SaaS platform, focused on its Real-Time Intelligence capabilities, for efficient log file collection, processing, and analysis.

The use cases vary from simple application error troubleshooting to more advanced scenarios such as application trend detection: for example, detecting slowly degrading performance issues (the average user session for a specific activity in the app lasting longer than expected), or more proactive monitoring by defining log-based KPIs and monitoring those KPIs to generate alerts.

Because Fabric provides a complete separation between compute and storage, you can grow your data without necessarily growing your compute costs.

Architecture Overview

The proposed architecture integrates Microsoft Fabric’s Real-Time Intelligence (Real-Time Hub) with your source log files to create a seamless, near real-time log collection solution.

It is based on Microsoft Fabric, a Microsoft SaaS solution that offers a unified suite of analytical experiences. Fabric is a modern data/AI platform built on unified, open data formats (Parquet/Delta), enabling both classic data management experiences like lakehouse/warehouse at scale as well as real-time intelligence, all on a lake-centric SaaS platform for simplified analytics. Fabric's open foundation with built-in governance enables you to connect to various clouds and tools while maintaining data trust.

This is a very high-level overview of Real-Time Intelligence within Fabric.

Log events – Fabric-based architecture

General notes

Since Fabric is a SaaS solution, all the components can be used without deploying any infrastructure in advance; with a click of a button and some very simple configuration, you can customize the relevant components for this solution.

The main components used in this solution are Data pipeline, OneLake, and Eventhouse.

Our data source for the example is taken from this git repo:

https://github.com/logpai/loghub/tree/master/Spark

The files were taken and stored inside an S3 bucket to demonstrate how easily the flow works regardless of your data source's location.

16/07/26 12:00:30 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 59219.
16/07/26 12:00:30 INFO spark.SparkEnv: Registering MapOutputTracker
16/07/26 12:00:30 INFO spark.SparkEnv: Registering BlockManagerMaster
16/07/26 12:00:30 INFO storage.DiskBlockManager: Created local directory at /opt/hdfs/nodemanager/usercache/curi/appcache/application_1460011102909_0176/blockmgr-5ea750cb-dd00-4593-8b55-4fec98723714
16/07/26 12:00:30 INFO storage.MemoryStore: MemoryStore started with capacity 2.4 GB
A typical log event

Components

Data pipeline

The first challenge to solve is how to bring the log files from your system into Fabric; this is the log collection phase. Many solutions exist for this phase, each with its pros and cons.

In Fabric, the standard approach for bringing data in is the Copy activity from Azure Data Factory (ADF), whose Fabric SaaS version is now called Data pipeline. Data pipeline is a low-code/no-code tool for managing and automating the process of moving and transforming data within Microsoft Fabric: a serverless ETL tool with more than 100 connectors, enabling integration with a wide variety of data sources, including databases, cloud services, file systems, and more.


In addition, it supports an on-premises agent called the self-hosted integration runtime. This agent, which you install on a VM, acts as a bridge that allows your pipeline to run on a local VM and secures the connection from your on-premises network to the cloud.

Let’s describe our solution's data pipeline in more detail.

Bear in mind that ADF is very flexible and supports reading files at scale from a wide range of data sources, including blob storage from all major cloud vendors (S3, GCS, Oracle Cloud), file systems, FTP/SFTP, etc., so even if your files are generated outside Azure, this is not an issue at all.

Visualization of Fabric Data Pipeline

  1. Log Collection
  • ADF Copy activity – Inside the Data pipeline we create a Copy activity with the following basic configuration.
  • Source – Mapped to your data sources. It can be Azure Blob Storage with a container holding the log files, or other cloud object storage like S3 or GCS. Log files are generally retrieved from a specific container/folder and fetched based on a prefix/suffix in the file name. To support an incremental load process, we can configure the activity to delete the source files it reads, so that once the files are successfully transferred to their target they are automatically deleted from their source; the next pipeline iteration will not have to process the same files again.
  • Sink – A OneLake/Lakehouse folder. We create a Lakehouse ahead of time, which is an abstract data container that lets you hold and manage your data at scale, structured or unstructured. We then select it from the list of connectors (look for OneLake/Lakehouse).
  • Log shippers – This is an optional component. Sometimes the ETL is not allowed to access your on-premises VNet; in this case, tools like Fluentd, Filebeat, or the OpenTelemetry Collector can be used to forward your application logs to the main entry point of the system, which is Azure Blob Storage.
  • AzCopy CLI – If you don't wish to invest in tools and all you need is to copy your data to Azure Storage at scale and securely, you might consider creating your own log shipping solution based on the free AzCopy tool, together with some basic scripting around it for scheduling. AzCopy is a command-line utility designed for high-performance uploading, downloading, and copying of data to and from Microsoft Azure Blob and File storage.

Visualization of Fabric first Activity: Copy from Source Bucket to Lakehouse

  2. Log Preparation

Upon the log files landing in Azure Blob Storage, an Eventstream can be used to trigger the Data pipeline that handles the data preparation and loading phase.

What is the data preparation phase's main purpose?

After the log files land in storage and before they are loaded into the real-time logs database (the KQL database), it might be necessary to transform the data with some basic manipulations; the reasons for this can vary.

Examples

  • Bad data formats – Log files can contain problematic characters like new lines inside a row (a stack trace error message with new lines as part of the message field of the record).
  • Metadata enrichment – Log file names can contain meaningful data: for example, the file name may describe the originating process or server name, and this metadata would be lost once the file content is loaded into the database.
  • Regulation restrictions – Logs can contain private data like names, credit card numbers, social security numbers, etc. (PII) that must be removed, hashed, or encrypted before being loaded into the database.

In our case we run a PySpark notebook that reads the files from a OneLake folder, fixes the new-lines-inside-a-row issue, and creates new files in another OneLake folder. We call this notebook with a base parameter called log_path that defines the log files' location on OneLake to read from.

Visualization of Fabric second Activity – Running the Notebook

  3. Log Loading – Inside the Data pipeline, as the last step after the transformation phase, we call the Copy data activity again, but this time the source and sink are different.
  • Source – Lakehouse folder (the previous notebook's output).
  • Sink – A specific Eventhouse table (created ahead of time); it is basically an empty table (lograw).

Visualization of Fabric last Activity – Loading to Eventhouse.

To summarize, we broke the log collection and preparation step into three Data pipeline activities.

  1. Copy activity – Reads the log files from the source; this is the first step of the log ingestion pipeline, running inside our orchestrator Data pipeline.
  2. Notebook activity – Transforms the log files; this is the execution of a single notebook or a chain of notebooks.
  3. Copy activity – Loads the log files into the destination database, the KQL database inside Eventhouse. The logs land in a table called lograw, a specific table created ahead of time inside the Eventhouse database.

Inside the Eventhouse

First, we needed to create a KQL database with a table to hold the log records.

A KQL database is a scalable and efficient storage solution for log files, optimized for high-volume data ingestion and retrieval. Eventhouses and KQL databases operate on a fully managed Kusto engine. With an Eventhouse or KQL database, you can expect available compute for your analytics within 5 to 10 seconds. The compute resources grow with your data analytic needs.
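The KQL database itself is typically created from the Fabric UI, while the table can be created with a KQL management command. As a minimal sketch (the single string column is an assumption for illustration; the lograw name matches the raw table used later in this post):

// Raw landing table: each log line is ingested as a single string column (assumed schema).
.create table lograw (line: string)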

Log Ingestion to KQL Database with Update Policy

We can separate the ETL transformation logic into what happens to the data before it reaches the Eventhouse KQL database and what happens after. Before the data reaches the database, the only transformation we did was calling a notebook during the data pipeline to handle the new-line merge logic. This cannot easily be done as part of the database ingestion logic, simply because when we try to load files that contain new lines inside a record's field, it breaks the format convention, and the ingestion process creates a separate table row for each line of the exception's stack trace.

On the other hand, we might need to define basic transformation rules, such as date formatting, type conversion (string to number), parsing and extracting interesting values from a string based on a regular expression, or creating JSON (dynamic type) from a hierarchical string (an XML/JSON string, etc.). For all these transformations we can work with what is called an update policy, which lets us define simple ETL logic inside the KQL database, as explained in our documentation: Update policy overview.

During this step we create a new table, called logparsed, from the lograw table; it will be our base table for the log queries.
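As a minimal sketch of what such an update policy could look like, assuming the raw lines follow the Spark sample format shown earlier (a yy/MM/dd date, a time, a level, a module, and the message) and assuming an illustrative schema for logparsed (formattedDatetime, log_module, and message match the queries later in this post; the other names and the parsing logic are assumptions, not the exact implementation):

// Target table for parsed log records (illustrative schema).
.create table logparsed (formattedDatetime: datetime, logLevel: string, log_module: string, message: string)

// Parsing function applied by the update policy at ingestion time (illustrative logic).
.create-or-alter function ParseLogRaw() {
    lograw
    | extend parts = split(line, " ")                  // e.g. "16/07/26 12:00:30 INFO storage.MemoryStore: ..."
    | extend d = split(tostring(parts[0]), "/")        // yy/MM/dd date parts
    | extend formattedDatetime = todatetime(strcat("20", tostring(d[0]), "-", tostring(d[1]), "-", tostring(d[2]), "T", tostring(parts[1])))
    | extend logLevel = tostring(parts[2])
    | extend log_module = trim_end(":", tostring(parts[3]))
    | extend message = strcat_array(array_slice(parts, 4, -1), " ")
    | project formattedDatetime, logLevel, log_module, message
}

// Every batch ingested into lograw is automatically transformed into logparsed.
.alter table logparsed policy update
@'[{"IsEnabled": true, "Source": "lograw", "Query": "ParseLogRaw()", "IsTransactional": false}]'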

Query log files

After the data is ingested and transformed, it lands in a basic, schematized logs table: logparsed. In general, some common fields are mapped to their own columns, like log level (INFO/ERROR/DEBUG), log category, log timestamp (a datetime-typed column), and log message, which can be either a simple error string or a complex JSON-formatted string. In the latter case it is usually preferable to convert the message to a dynamic type, which brings additional benefits like simplified query logic and reduced data processing (for example, avoiding expensive joins).
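For example, if some log messages carry a JSON payload, converting the string once into a dynamic column keeps downstream queries simple. The durationMs field below is purely hypothetical, for illustration only:

logparsed
| extend messageJson = parse_json(message)                            // string -> dynamic
| where isnotnull(messageJson.durationMs)                             // hypothetical JSON field
| summarize avgDurationMs = avg(tolong(messageJson.durationMs)) by log_module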

Examples of Typical Log Queries

The following queries are simple queries on the logparsed table.

Category: Troubleshooting
Purpose: Looking for an error in a specific datetime range
KQL query:

logparsed
| where message contains "Exception" and formattedDatetime between (datetime(2016-07-26T12:10:00) .. datetime(2016-07-26T12:20:00))

Category: Statistics
Purpose: Basic statistics – min/max timestamp of log events
KQL query:

logparsed
| summarize minTimestamp=min(formattedDatetime), maxTimestamp=max(formattedDatetime)

Category: Exceptions Stats
Purpose: Check the distribution of exceptions
KQL query:

logparsed
| extend exceptionType = case(message contains "java.io.IOException", "IOException",
    message contains "java.lang.IllegalStateException", "IllegalStateException",
    message contains "org.apache.spark.rpc.RpcTimeoutException", "RpcTimeoutException",
    message contains "org.apache.spark.SparkException", "SparkException",
    message contains "Exception", "Other Exceptions",
    "No Exception")
| where exceptionType != "No Exception"
| summarize count() by exceptionType

Category: Log Module Stats
Purpose: Check the distribution of log modules
KQL query:

logparsed
| summarize count() by log_module
| order by count_ desc
| take 10

Real-Time Dashboards

After querying the logs, it is possible to visualize the query results in Real-Time dashboards.

  • Select the query.
  • Click on Pin to Dashboard.

After adding the queries to tiles inside the dashboard, this is a typical dashboard we can easily build.

Real-Time Dashboards can be configured to refresh automatically: you can very easily configure how often to refresh the queries and visualizations, and in the extreme case the refresh can be set as low as continuous.

There are many more capabilities implemented in Real-Time Dashboards, like data exploration, alerting using Data Activator, and conditional formatting (changing item colors based on KPI thresholds), and the list keeps growing.
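For example, a log-based KPI query that could back such a tile or alert might look like the sketch below; the 5-minute window, the error-rate formula, and the logLevel column (from the illustrative schema above) are assumptions, and the query presumes logs are streaming in continuously:

logparsed
| where formattedDatetime > ago(5m)                                   // most recent 5 minutes
| summarize errorCount = countif(logLevel == "ERROR"), total = count()
| extend errorRatePct = round(100.0 * errorCount / total, 2)          // KPI value to alert on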

What about AI Integration?

Machine Learning Models

Kusto supports time series analysis out of the box, allowing, for example, anomaly detection and clustering, and you can explore the data directly in Real-Time Dashboard tiles.
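As a minimal sketch of native anomaly detection over the parsed logs (the time range matches the sample data set; the 10-minute bin, the 1.5 threshold, and the logLevel column are illustrative assumptions):

logparsed
| where logLevel == "ERROR"
| make-series errorCount = count() on formattedDatetime
    from datetime(2016-07-26) to datetime(2016-07-27) step 10m
| extend (anomalyFlags, score, baseline) = series_decompose_anomalies(errorCount, 1.5)
| render anomalychart with (anomalycolumns=anomalyFlags)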

If that isn't enough, you can always mirror the data of your KQL tables into OneLake delta parquet format by selecting OneLake availability. This configuration creates another copy of your data in the open delta parquet format, available to any Spark ML workload for whatever machine learning exploration and modeling you wish. There is no additional storage cost to turn on OneLake availability.

Conclusion

A well-designed Real-Time Intelligence solution for log file management using Microsoft Fabric and Eventhouse can significantly enhance an organization’s ability to monitor, analyze, and respond to log events. By leveraging modern technologies and best practices, organizations can gain valuable insights and maintain robust system performance and security.
