Microsoft Fabric Updates Blog

Using Data pipelines for copying data to/from KQL Databases and crafting workflows with the Lookup activity

AUTHOR: Guy Reginiano, Program Manager

In today’s data-driven landscape, the ability to capture, analyze, and visualize vast amounts of real-time data from diverse sources is crucial for making informed decisions and gaining a competitive edge. Synapse Real-Time Analytics in Microsoft Fabric offers a comprehensive solution to this challenge. Its seamless integration with Data factory in Microsoft Fabric allows you to use multiple sources and destinations to build data movement and data transformation solutions. In this blog post, we’ll delve into the integration of Data factory’s Data pipeline and KQL Database within the Fabric ecosystem.

Data Factory: Your Gateway to Connectivity

Data factory in Fabric empowers you with seamless connectivity to more than 170 data stores, including on-premises data sources, cloud databases, analytical platforms, line-of-business applications, and more.

A pivotal feature within the Data factory experience is the Data pipeline—an evolution of Azure Data Factory pipelines. This enhanced version equips you with a rich set of integration capabilities, allowing you to seamlessly move and transform data across different sources and destinations. One of the standout features of Data pipelines is the Copy Assistant, a tool that accelerates your data copying process by providing a guided and streamlined experience.

Copy Assistant: Simplifying Data Copying

The Copy Assistant within Data pipelines is designed to jumpstart your data copying journey through a user-friendly, guided process. Whether you’re a seasoned data professional or just starting out, the Copy Assistant simplifies the process of connecting to various data sources and destinations.

When you complete the steps of the Copy Assistant, a Copy activity will be created in your Data pipeline and shown in the canvas.

KQL DB as a data source in Data pipeline’s Copy Assistant
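Behind the guided experience, the generated Copy activity is an activity definition in the pipeline. As a rough sketch of the source direction—property names below follow the Azure Data Factory JSON model that Fabric Data pipelines build on, and the table name and query are purely illustrative—a Copy activity reading from a KQL DB might look like:

```json
{
  "name": "CopyFromKqlDb",
  "type": "Copy",
  "typeProperties": {
    "source": {
      "type": "AzureDataExplorerSource",
      "query": "StormEvents | where StartTime > ago(1d) | project StartTime, EventType, State",
      "queryTimeout": "00:10:00"
    },
    "sink": {
      "type": "DelimitedTextSink"
    }
  }
}
```

Because the source accepts a full KQL query, filtering and projection happen server-side in the KQL DB before any data is copied.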

Numerous Data Sources and Destinations

The Copy Assistant doesn’t just offer simplicity—it also presents an extensive range of data sources and destinations to choose from. These include sample data sources, such as the NYC taxi data; Fabric data sources like KQL DB, Data Warehouse, and Lakehouse; and Azure data sources like Azure Blob Storage, Azure Data Explorer, and Cosmos DB. Additionally, the assistant allows integration with external data products like Amazon Redshift, PostgreSQL, and Spark, along with files from Amazon S3 and Google Cloud Storage. Services like Snowflake and many others also find a place within this comprehensive spectrum.

KQL DB as a destination in Data pipeline’s Copy Assistant

KQL DB and Azure Data Explorer Integration

Real-Time Analytics’ KQL DB is supported as both a destination and a source, allowing you to build and manage various extract, transform, and load (ETL) activities, leveraging the power and capabilities of KQL DBs with the rich connections ecosystem of Data pipelines. Additionally, the Azure version of KQL DBs, Azure Data Explorer clusters, is also supported, enabling you to integrate Fabric capabilities with the Azure-based offering.

KQL DB as a destination in Data pipeline’s ‘Copy data’ activity
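In the destination direction, the sink side of the Copy activity targets the KQL DB. A minimal sketch, again using the Azure Data Factory JSON model that Fabric pipelines build on—the ingestion mapping name here is a hypothetical example:

```json
{
  "name": "CopyToKqlDb",
  "type": "Copy",
  "typeProperties": {
    "source": {
      "type": "DelimitedTextSource"
    },
    "sink": {
      "type": "AzureDataExplorerSink",
      "ingestionMappingName": "CsvMapping",
      "flushImmediately": true
    }
  }
}
```

The sink hands rows to the KQL DB’s ingestion path, with an optional pre-created ingestion mapping controlling how incoming columns map onto the target table’s schema.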

Lookup Activity

In addition to the Copy activity, there’s another valuable activity at your disposal when working with KQL DBs: the Lookup activity. This activity is specifically designed for executing queries and fetching datasets from KQL DBs, Azure Data Explorer, or any other data source supported by Data pipeline’s Lookup activity. The result of the query is returned as the output of the Lookup activity and can be used in the next activity in the pipeline—for example, to dynamically determine which objects to operate on. The output can be consumed by subsequent copy, transformation, or control flow activities, such as a ForEach activity.

Lookup activity with KQL DB
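To make the Lookup-to-ForEach pattern concrete, here is a hedged sketch of two chained activities. The control table `TablesToProcess` and the activity names are hypothetical; the `@activity(...).output.value` syntax is the standard pipeline expression language:

```json
[
  {
    "name": "LookupTableList",
    "type": "Lookup",
    "typeProperties": {
      "source": {
        "type": "AzureDataExplorerSource",
        "query": "TablesToProcess | project TableName"
      },
      "firstRowOnly": false
    }
  },
  {
    "name": "ForEachTable",
    "type": "ForEach",
    "dependsOn": [
      { "activity": "LookupTableList", "dependencyConditions": [ "Succeeded" ] }
    ],
    "typeProperties": {
      "items": {
        "value": "@activity('LookupTableList').output.value",
        "type": "Expression"
      }
    }
  }
]
```

Inside the ForEach, each item can then be referenced as `@item().TableName`—for example, to parameterize the source query of a Copy activity run once per table.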


The integration of KQL DBs and Data pipelines offers a streamlined approach to collecting, processing, storing, and analyzing data from various sources. This integration breaks down data silos and ensures consistency across sources. KQL DB’s strong querying capabilities enable users to extract insights effectively, promoting in-depth data exploration. This integration enhances operational flexibility and empowers Fabric users to reveal concealed patterns, trends, and relationships within their data. By combining KQL DBs with a Data pipeline, Fabric users can tap into their data’s full potential, fostering innovation and gaining a competitive advantage in an increasingly data-centric landscape.

For more information, visit the documentation.
