Efficient JSON loading to Eventhouse in Fabric Real-Time Intelligence
In the era of big data, efficiently parsing and analyzing JSON data is critical for gaining actionable insights. Kusto, the powerful query engine developed by Microsoft, makes handling JSON data simpler and faster, and its ability to easily parse both simple and nested JSON makes it easier than ever to extract meaningful insights from this data. The purpose of this blog post is to walk through how JSON data can be loaded into Eventhouse in Microsoft Fabric, where you can then leverage Kusto’s powerful capabilities to parse it. I’ve tried this a few different ways, and the approach below is the fastest, most efficient low-code way to ingest the data into the Eventhouse. Because JSON inherently supports multiple schemas in a single file, the expectation here is that we have a JSON file containing varying schemas, and we would like to load it into our Eventhouse for efficient parsing with KQL.
NOTE: I tested a few other options; however, both the Fabric Data Pipelines Copy Data activity and the Python SDK require the ingestion mapping to be defined as a CSV mapping, which causes truncation and issues with loading the file. Even if they allowed a JSON mapping to be defined, the performance would most likely be significantly lower than with the approach outlined below.
- The test was performed on a sample 20 MB JSON file that simulated events being output by the Microsoft Graph.
- It is assumed that the only goal is to load the JSON. Schema inference is handled via update policies in Eventhouse, so the only thing the loader needs to worry about is loading the JSON file into the Eventhouse.
- Some rows that will be processed will exceed 1 MB in length due to the size of the payload.
- The assumption for this post is that you have a 100-level understanding of what Eventhouse is.
Here is a sample structure of what my JSON looks like, for reference:
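The snippet below is a trimmed, illustrative event showing only the fields we will parse later in this post; real Microsoft Graph event payloads contain many more properties, and the values shown here are placeholders.
{
  "id": "AAMkAGI1AAAt9AHjAAA=",
  "subject": "Quarterly planning review",
  "start": { "dateTime": "2025-05-01T14:00:00.0000000", "timeZone": "UTC" },
  "end": { "dateTime": "2025-05-01T15:00:00.0000000", "timeZone": "UTC" },
  "location": { "displayName": "Conference Room 2" }
}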

Eventhouse needs to be flexible enough to handle JSON in a variety of structures, and we will have Update Policies run in the Eventhouse to parse it automatically into a tabular structure. An Eventhouse has already been created, and a simple one-column table is being used to ingest the data:
//create a generic table in eventhouse to hold json data
.create table ['OnelakeJsonParser'] (['Data']:dynamic)
Additionally, a JSON ingestion mapping is created to map the JSON data to a single dynamic column.
.create-or-alter table ['OnelakeJsonParser'] ingestion json mapping 'OneLakeJson_mapping' '[{"column":"Data", "Properties":{"Path":"$"}}]'
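As an optional sanity check, you can list the table’s JSON mappings to confirm that OneLakeJson_mapping was registered:
//optional: confirm the mapping exists on the table
.show table OnelakeJsonParser ingestion json mappings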
The default size limit for a column value is 1 MB, which means that any value larger than 1 MB will be rejected at ingestion. Because in this scenario we want to ingest rows that could potentially be larger than 1 MB, we are going to change the column encoding type to accept larger payloads:
//alter the column limit to ingest objects larger than 1 MB (but smaller than 32 MB)
.alter column OnelakeJsonParser.Data policy encoding type='BigObject32'
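If you want to verify that the policy was applied, the column’s encoding policy can be inspected (again optional):
//optional: inspect the encoding policy on the Data column
.show column OnelakeJsonParser.Data policy encoding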
Afterwards, create the KQL command to ingest the data from OneLake directly into the table, leveraging the mapping we created earlier.
//the base command we need for our Data pipeline
.ingest async into table ['OnelakeJsonParser'] ('<abfss file path from OneLake>;impersonate') with (format='multijson',ignoreSizeLimit=true,ingestionMappingReference='OneLakeJson_mapping',ingestionMappingType='json')
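Because the command uses async, it returns an OperationId instead of waiting for the ingestion to complete. That OperationId can be used to check on the status of the ingestion:
//check the status of the async ingestion using the OperationId returned above
.show operations <OperationId returned by the ingest command>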
An important point to note: because the Eventhouse table has only a single column of a dynamic data type, any valid JSON can be ingested here. Update policies can then be written to route the different schemas to different destination tables.
Now that we have the base KQL command working, it needs to be operationalized. The simplest no-code way to do this is to leverage Data pipelines in Fabric to execute the command. By parameterizing the pipeline and setting it to be triggered whenever a new file lands in OneLake, we can build an automated data processing framework that processes JSON in a low-code way and prepares it for reporting and analytics. Moving over to Fabric Data pipelines, we have a simple pipeline that executes a KQL script. Note that the KQL script activity is being leveraged here, not a Copy activity: the KQL script leverages the built-in ingestion capabilities of Kusto and loads the data much faster than the Copy activity. Below is a redacted screenshot showing what the KQL script configuration looks like:

A quick note: if you run into failures during the ingestion process, you can leverage the following Kusto command to see the reason for the most recent failure:
.show ingestion failures
| order by FailedOn desc
| limit 1
It is outside the scope of this post, but the pipeline could be configured to parameterize the file being ingested, the target table, and the ingestion properties.
Now that the data is loaded into the Eventhouse table, update policies can be created to route it to different destination tables in the database.

This can be broken out into a parsed structure using Kusto’s powerful abilities:
OnelakeJsonParser
| extend ParsedJson= parse_json(Data)
| project-away Data
| extend id = ParsedJson.id, Subject = ParsedJson.subject, StartTime=todatetime(ParsedJson.start.dateTime), EndTime=todatetime(ParsedJson.end.dateTime), location=tostring(ParsedJson.location.displayName)
| where isnotempty(id)
| project-away ParsedJson
With just a few lines of code, KQL transformed our JSON file into structured data! Notice that we also referenced nested JSON, where the start and end date times were nested deeper in the structure. By wrapping this logic in an update policy, we can have the engine automatically parse and load our JSON data for querying, as shown in the sketch below. A truly low-code way to handle complex JSON files.
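Here is a rough sketch of what that update policy could look like. The CalendarEvents destination table and the ParseCalendarEvents function are names chosen purely for illustration; adjust the schema and routing to your own data.
//destination table for the parsed calendar events (illustrative schema)
.create table CalendarEvents (id:string, Subject:string, StartTime:datetime, EndTime:datetime, Location:string)
//function that shapes the raw JSON into the destination schema
.create-or-alter function ParseCalendarEvents() {
    OnelakeJsonParser
    | extend ParsedJson = parse_json(Data)
    | where isnotempty(ParsedJson.id)
    | project id = tostring(ParsedJson.id), Subject = tostring(ParsedJson.subject), StartTime = todatetime(ParsedJson.start.dateTime), EndTime = todatetime(ParsedJson.end.dateTime), Location = tostring(ParsedJson.location.displayName)
}
//update policy that runs the function whenever new data lands in OnelakeJsonParser
.alter table CalendarEvents policy update @'[{"IsEnabled": true, "Source": "OnelakeJsonParser", "Query": "ParseCalendarEvents()", "IsTransactional": false}]'
With this in place, every new file ingested into OnelakeJsonParser is parsed and routed into CalendarEvents automatically, and additional functions and policies can route other schemas to their own destination tables.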
Next Steps
To learn more, head over to the Eventhouse documentation. Submit your feedback on Fabric Ideas and join the conversation on the Fabric Community.