Microsoft Fabric Updates Blog

Integrating Fabric with Databricks using private network

Microsoft Fabric and Azure Databricks are widely used data platforms. This article addresses a common requirement: customers with large data estates in Azure Databricks who want to unlock additional use cases in Microsoft Fabric across different business teams. When integrating the two platforms, a crucial security requirement is that all data movement stays within private network perimeters.

Fabric has several options to integrate with Databricks today:

  • Virtualize Databricks Delta tables in Fabric by creating shortcuts.
  • Create a mirrored database in Fabric on top of selected Databricks Unity Catalog (UC) tables.
  • Read OneLake data in Databricks notebooks using the ABFS driver.
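
To make the third option concrete: OneLake exposes an ADLS Gen2-compatible endpoint, so a Databricks notebook can address it with a standard ABFS URI. The sketch below builds such a URI; the workspace, lakehouse, and table names are placeholders for illustration, not values from this article.

```python
def onelake_abfss_path(workspace: str, lakehouse: str, relative_path: str) -> str:
    """Build an ABFS URI for a OneLake path.

    OneLake exposes an ADLS Gen2-compatible endpoint
    (onelake.dfs.fabric.microsoft.com), with the Fabric workspace acting as
    the filesystem (container) and the lakehouse item as the top-level folder.
    """
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/{relative_path}"
    )

# Hypothetical names, for illustration only.
path = onelake_abfss_path("MyWorkspace", "MyLakehouse", "Tables/sales")

# In a Databricks notebook, this path could then be read with Spark, e.g.:
# df = spark.read.format("delta").load(path)
```

Note that this read path, like the others listed above, traverses the public OneLake endpoint unless private connectivity is configured, which is exactly the gap this article discusses.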

However, none of these integrations supports end-to-end private network connectivity today (as of May 2025). This article explores Fabric-native custom solutions that enable copying data from Databricks into OneLake over a private network in a scalable way. These are possible workarounds until network security features are available out of the box for the integration patterns above.

Functional Requirements

For this use case, the functional requirements are:

  • Serve Databricks curated data via OneLake to different Fabric workloads
  • Private network connectivity
  • High throughput copy performance for large volumes of data, typical for enterprise customers

Current Limitations

The ideal solution for this use case would be native private network connectivity support in shortcuts or Databricks mirroring. This would eliminate the custom development needed to physically move data between the two platforms. Another possible solution would be Managed Private Endpoint (MPE) support in Fabric pipelines; however, MPE is currently supported only in the Fabric Spark and Eventstream workloads.

Possible Solutions

To physically copy data from Azure Databricks (ADB) to Fabric, there are two native integration options available in Fabric Data Factory today: pipelines and Dataflow Gen2. The following table summarizes the capabilities of both workloads in the context of this requirement.

GA features as of May 25, 2025 (OPDG and VNet Gateway are the private network options):

| Workload       | ADB Connector | OPDG | VNet Gateway | Fast Copy           | Fit for purpose |
|----------------|---------------|------|--------------|---------------------|-----------------|
| Dataflow Gen2* | Yes           | Yes  | Yes          | No (no ADB support) | No              |
| Pipeline       | Yes           | Yes  | Yes          | Yes                 | Yes             |

Any complete solution must include an ADB connector, support for at least one private network feature (on-premises data gateway (OPDG) or VNet gateway), and support for the Fast Copy feature (for scalability).

  • Option 1 – ADB > VNet Gateway > pipeline > OneLake

The pipeline connector for Databricks is available today. As of May 25, 2025, VNet gateway support for Fabric pipelines is in preview. Fast Copy is supported by the pipeline copy activity out of the box.
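
Conceptually, the copy activity at the heart of either option reads from a Databricks Delta table through the ADB connection (bound to the gateway) and lands the data as Delta tables in a Fabric lakehouse. The sketch below outlines that shape as a Python dict; the property names follow Azure Data Factory copy-activity conventions, which Fabric pipelines largely share, and are illustrative assumptions rather than the exact Fabric pipeline JSON.

```python
# Illustrative outline of a pipeline copy activity, expressed as a Python dict.
# Property names are assumptions based on the ADF copy-activity schema; the
# exact Fabric pipeline definition may differ.
copy_activity = {
    "name": "CopyFromDatabricksToOneLake",
    "type": "Copy",
    "typeProperties": {
        "source": {
            # Reads a Delta table via the Azure Databricks connector; traffic
            # flows through the VNet gateway or OPDG bound to the connection.
            "type": "AzureDatabricksDeltaLakeSource",
        },
        "sink": {
            # Lands the data as a Delta table in a Fabric lakehouse (OneLake).
            "type": "LakehouseTableSink",
        },
    },
}
```

In practice this activity is authored in the Fabric pipeline designer; the private network hop is determined by the gateway selected on the Databricks connection, not by the activity definition itself.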

  • Option 2 – ADB > OPDG > pipeline > OneLake

The pipeline connector for Databricks is available today, and OPDG supports Databricks connections. Fast Copy is supported by the pipeline copy activity out of the box.

* Dataflow Gen2 currently isn’t an option because its Databricks connector doesn’t support Fast Copy. This may change in the future, making it a potential alternative to pipeline-based integration.

The chosen solution must be tested to ensure it meets the expected performance requirements, especially when the VNet gateway is used: the VNet gateway is a managed solution and currently offers no scale-up flexibility, whereas OPDG allows scaling up the VM compute SKUs if needed.

Next Steps

These are areas of active investment from the Fabric engineering team. Readers are welcome to provide feedback or submit ideas on this integration pattern.

This article has been written in collaboration with my colleague Harold Park.
