How Spark Supports OneLake Security with Row and Column Level Policies
Recently, we announced a significant milestone: public support for Row and Column Level Security within OneLake. This universal security framework applies consistently across all data engines, regardless of how data is accessed. Spark, however, has traditionally offered no granular security features and assumes that a job has unrestricted access to the datasets its queries require.
To address this limitation, our Spark engineering team has developed a customized solution that enables secure data access in Spark without compromising performance. At the engine level, when a job needs to read from a table protected by Row or Column Level Security policies, the process is divided into two isolated environments. One environment executes all user code, while the other securely accesses and prepares data for consumption by the user code. During preparation, Row and Column Level Security policies are applied to ensure that only authorized data is exposed.
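As a conceptual illustration only, the preparation step can be pictured as a row filter followed by a column projection. The sketch below is plain Python, not the engine's actual implementation; the Policy type, the prepare_for_user_code function, and the sample data are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a row predicate plus the columns a user may see."""
    row_filter: Callable[[Row], bool]
    allowed_columns: List[str]


def prepare_for_user_code(rows: List[Row], policy: Policy) -> List[Row]:
    """Stand-in for the secure environment: filter rows, then drop hidden columns."""
    visible_rows = [row for row in rows if policy.row_filter(row)]
    return [
        {col: row[col] for col in policy.allowed_columns if col in row}
        for row in visible_rows
    ]


# Illustrative data; in this model, user code only ever receives `prepared`.
sales = [
    {"region": "EU", "amount": 120, "customer_email": "a@example.com"},
    {"region": "US", "amount": 300, "customer_email": "b@example.com"},
]
eu_analyst = Policy(
    row_filter=lambda row: row["region"] == "EU",
    allowed_columns=["region", "amount"],
)
prepared = prepare_for_user_code(sales, eu_analyst)
print(prepared)
```

The point of the sketch is the ordering: policies are applied before any data crosses into the environment that runs user code, so user code never holds the unfiltered rows or hidden columns.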

This separation occurs seamlessly and requires no additional configuration or manual job execution from users. The Spark engine automatically starts the secure environment for every job run within the workspace and scales its resources dynamically with query demand. When no queries are active, the secure environment stays available for five minutes before it is terminated, which reduces startup latency for queries that arrive shortly afterward. Users can track secure environment jobs in the monitoring hub, where these activities are identified by the ‘SparkSecurityControl’ prefix.
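If you export or copy a list of activity names from the monitoring view, picking out the secure environment jobs is a simple prefix match. The following is a minimal sketch; the activity names in the sample list are invented for illustration, and only the SparkSecurityControl prefix comes from the description above.

```python
from typing import Iterable, List

SECURE_PREFIX = "SparkSecurityControl"  # prefix used by secure environment jobs


def secure_environment_activities(activity_names: Iterable[str]) -> List[str]:
    """Return only the activities whose name starts with the secure prefix."""
    return [name for name in activity_names if name.startswith(SECURE_PREFIX)]


# Hypothetical activity names, for illustration only.
sample_activities = [
    "SparkSecurityControl_example_01",
    "Notebook_DailyLoad",
    "SparkSecurityControl_example_02",
]
secure_jobs = secure_environment_activities(sample_activities)
print(secure_jobs)
```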

OneLake enforces universal security rigorously, ensuring there are no unauthorized entry points. Direct file-level access to tables with Row and Column Level Security policies is strictly prohibited. Likewise, Spark code cannot read these tables by specifying their file path; it must reference them by namespace in Spark SQL, for example lakehouse.schema.table.
Spark supports both schema-enabled and non-schema lakehouses configured with Row and Column Level Security policies. However, for Spark to use the secure cluster, the user must have a schema-enabled lakehouse pinned as the default, regardless of whether the queried lakehouse is itself schema-enabled.
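To make the namespace requirement concrete, here is a small Python sketch that builds a lakehouse.schema.table reference and passes it to Spark SQL. The helper names (qualified_table_name, read_secured_table) and the example lakehouse, schema, and table names are hypothetical, and the identifier check is a simplifying assumption that accepts only plain alphanumeric names. The spark argument is assumed to be an existing SparkSession, such as the one a notebook provides.

```python
import re

# Simplifying assumption: identifiers are plain letters, digits, and underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def qualified_table_name(lakehouse: str, schema: str, table: str) -> str:
    """Build a lakehouse.schema.table reference, rejecting anything path-like."""
    parts = [lakehouse, schema, table]
    for part in parts:
        if not _IDENTIFIER.match(part):
            raise ValueError(f"not a plain identifier (file paths are not allowed): {part!r}")
    return ".".join(parts)


def read_secured_table(spark, lakehouse: str, schema: str, table: str):
    """Query the table by namespace through Spark SQL instead of loading a file path."""
    name = qualified_table_name(lakehouse, schema, table)
    return spark.sql(f"SELECT * FROM {name}")


# Hypothetical names, for illustration only.
print(qualified_table_name("sales_lakehouse", "dbo", "orders"))
```

In a notebook this would be called as read_secured_table(spark, "sales_lakehouse", "dbo", "orders"). A path-style read of the same table's files is the pattern that the security model blocks.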

The preview of OneLake security is now available to all users. Check out the feature now in your workspace, review our updated documentation, or sign up for a free Microsoft Fabric trial to see OneLake security in action for yourself!