Spark Connector for Fabric Data Warehouse (DW) – Preview
We are pleased to announce the availability of the Fabric Spark connector for Fabric Data Warehouse (DW) in the Fabric Spark runtime. This connector enables Spark developers and data scientists to access and work with data from Fabric DW and the SQL analytics endpoint of the lakehouse, either within the same workspace or across different workspaces, using a simplified Spark API. The connector is included as a default library in the Fabric runtime, so no separate installation is required.
Read Support
The connector supports reading data from tables or views in both the Data Warehouse and the SQL analytics endpoint. It is designed with security in mind, requiring only minimal permissions to work with the Fabric SQL engines, and it honors the security models defined at the SQL engine level, such as Object Level Security (OLS), Row Level Security (RLS), and Column Level Security (CLS).
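As a minimal PySpark sketch of what a read might look like from a Fabric notebook (where a `spark` session is pre-defined), assuming the connector's `synapsesql` reader and placeholder warehouse, schema, and table names:

```python
# Read a table (or view) from a Fabric Data Warehouse in the same workspace.
# "MyWarehouse", "dbo", and "Customers" are placeholder names.
customers_df = spark.read.synapsesql("MyWarehouse.dbo.Customers")

# Security defined at the SQL engine level (OLS, RLS, CLS) governs what the
# caller can see; the result is an ordinary Spark DataFrame.
customers_df.filter("id > 100").show()
```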
Write Support
The connector now supports writing data from a Spark dataframe to a Fabric DW table. It employs a two-phase write process: the dataframe is first staged in intermediate storage, then the COPY INTO command ingests the staged data into the Fabric DW table. This approach scales with increasing data volumes and supports multiple modes for writing data to a DW table.
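A hedged PySpark sketch of a write, again from a Fabric notebook with a pre-defined `spark` session; the `synapsesql` writer and the standard Spark save modes are assumptions here, and the warehouse, schema, and table names are placeholders:

```python
from pyspark.sql import Row

# Build a small DataFrame to write; in practice this would be the result of a
# Spark transformation or a read from another source.
new_orders = spark.createDataFrame([
    Row(order_id=1001, customer="Contoso", amount=250.0),
    Row(order_id=1002, customer="Fabrikam", amount=125.5),
])

# Spark save modes ("errorifexists", "ignore", "append", "overwrite") control
# how an existing target table is treated. Behind the scenes the data is
# staged in intermediate storage and then loaded with COPY INTO.
new_orders.write.mode("append").synapsesql("MyWarehouse.dbo.Orders")
```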
PySpark Support
We are also excited to announce PySpark support for this connector, in addition to Scala. You no longer need a workaround to use the connector from PySpark; it is now a native capability, as the snippets above illustrate.
To learn more about the Spark connector for Fabric Data Warehouse (DW), refer to the documentation: Spark connector for Fabric Data Warehouse