Integrating Fabric with Databricks over a private network
Microsoft Fabric and Azure Databricks are widely used data platforms. This article addresses the needs of customers who have large data estates in Azure Databricks and want to unlock additional use cases in Microsoft Fabric for different business teams. When integrating the two platforms, a crucial security requirement is to ensure that all data movement happens within private network perimeters.
Fabric has several options to integrate with Databricks today:
- Virtualize Databricks Delta tables in Fabric by creating shortcuts.
- Create a mirrored database in Fabric on top of selected Databricks Unity Catalog (UC) tables.
- Read OneLake data in Databricks notebooks using the ABFS driver.
However, none of these integrations supports end-to-end private network connectivity today (as of May 2025). This article explores Fabric-native custom solutions that enable copying data from Databricks into OneLake over a private network in a scalable way. These are possible solutions until network security features are available out of the box for the above integration patterns.
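To make the ABFS access pattern from the last bullet concrete, the sketch below builds the OneLake `abfss://` path for a lakehouse table. The workspace, lakehouse, and table names are placeholder assumptions, not names from this article; only the `onelake.dfs.fabric.microsoft.com` endpoint and path layout follow OneLake's documented URI format.

```python
# Sketch: constructing the OneLake ABFS URI used when reading OneLake data
# from a Databricks notebook. All names below are illustrative placeholders.

ONELAKE_DFS_HOST = "onelake.dfs.fabric.microsoft.com"

def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Return the abfss:// path for a Delta table in a Fabric lakehouse."""
    return (
        f"abfss://{workspace}@{ONELAKE_DFS_HOST}/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )

uri = onelake_table_uri("SalesWorkspace", "CuratedLakehouse", "orders")
print(uri)
# In a Databricks notebook with OneLake authentication configured, this path
# could then be read with: spark.read.format("delta").load(uri)
```

Note that with today's GA features this read happens over public endpoints, which is exactly the gap the rest of this article works around.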
Functional Requirements
For this use case, the functional requirements are:
- Serve Databricks-curated data via OneLake to different Fabric workloads
- Private network connectivity
- High-throughput copy performance for the large data volumes typical of enterprise customers
Current Limitations
The ideal solution for this use case would be native private network connectivity support in shortcuts or Databricks mirroring. This would eliminate custom development to physically move data between the two platforms. Another possible solution would be Managed Private Endpoint (MPE) support in Fabric pipelines; however, MPE is currently supported only in Fabric Spark and Eventstream workloads.
Possible Solutions
To physically copy data from ADB to Fabric, there are two native integration options available in Fabric Data Factory today: Pipelines and Dataflow Gen2. The following table illustrates the capabilities of both workloads in the context of this requirement.
GA Features as of May 25, 2025

| Workload | ADB Connector | OPDG (Private Network) | VNet Gateway (Private Network) | Fast Copy | Fit for purpose |
| --- | --- | --- | --- | --- | --- |
| Dataflow Gen2* | Yes | Yes | Yes | No (no ADB support) | No |
| Pipeline | Yes | Yes | Yes | Yes | Yes |
Any complete solution must have an ADB connector, support at least one private network option (on-premises data gateway (OPDG) or VNet gateway), and support the Fast Copy feature for scalability.
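The selection criteria above can be sketched as a simple predicate over the feature flags from the table; the function name and flag names are illustrative, not part of any Fabric API.

```python
# Sketch of the "fit for purpose" logic from the table: a workload qualifies
# only if it has an ADB connector, at least one private network gateway
# option (OPDG or VNet gateway), and Fast Copy support.

def fit_for_purpose(adb_connector: bool, opdg: bool,
                    vnet_gateway: bool, fast_copy: bool) -> bool:
    return adb_connector and (opdg or vnet_gateway) and fast_copy

# Dataflow Gen2: no Fast Copy for the ADB connector -> not fit for purpose
print(fit_for_purpose(True, True, True, False))  # False
# Pipeline: all requirements met -> fit for purpose
print(fit_for_purpose(True, True, True, True))   # True
```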
- Option 1 – ADB > VNet Gateway > pipeline > OneLake
The pipeline connector for Databricks is available today. Beginning May 25, 2025, VNet gateway support for Fabric pipelines is in preview. Fast Copy is supported by the pipeline copy activity out of the box.
- Option 2 – ADB > OPDG > pipeline > OneLake
The pipeline connector for Databricks is available today, and OPDG supports Databricks connections. Fast Copy is supported by the pipeline copy activity out of the box.
* Dataflow Gen2 currently isn't an option because its Databricks connector doesn't support Fast Copy. This may change in the future, making it a potential alternative to pipeline-based integration.
The chosen solution must be tested to ensure it meets the expected performance requirements, especially when the VNet gateway is used: the VNet gateway is a managed solution and currently has no scale-up flexibility, whereas OPDG allows scaling the VM compute SKU if needed.
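When planning such performance tests, a useful first step is a back-of-the-envelope check of the sustained throughput the gateway must deliver. The data volume and batch window below are illustrative assumptions, not measurements from this article.

```python
# Sketch: sustained throughput (MB/s) a gateway must deliver to copy a given
# data volume within a batch window. Inputs are illustrative assumptions.

def required_throughput_mbps(volume_gb: float, window_hours: float) -> float:
    """MB/s needed to move volume_gb of data within window_hours."""
    return (volume_gb * 1024) / (window_hours * 3600)

# e.g. a hypothetical 2 TB nightly load with a 4-hour copy window
print(round(required_throughput_mbps(2048, 4), 1))  # 145.6 MB/s
```

If the measured gateway throughput falls short of such a target, OPDG offers the escape hatch of a larger VM SKU; with the VNet gateway, the copy window or data volume would have to change instead.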
Next Steps
These are areas of active investment from the Fabric engineering team. Readers are welcome to provide feedback or submit ideas on this integration pattern.
This article has been written in collaboration with my colleague Harold Park.