Microsoft Fabric Updates Blog

Legacy Timestamp Support in Native Execution Engine for Fabric Runtime 1.3

The recent update to Native Execution Engine on Fabric Runtime 1.3 adds support for legacy timestamp handling, so that timestamp data written by different Spark versions can be processed consistently. This feature addresses compatibility issues introduced when Spark 3.0 switched to the Java 8 date/time API, which uses the Proleptic Gregorian calendar (the calendar of the ISO SQL standard), whereas earlier versions relied on a hybrid Julian-Gregorian calendar.

Why Legacy Timestamp Support Matters 

Spark writes dates and timestamps in Parquet files as integers or longs (representing days, or milliseconds or microseconds, since the UNIX epoch, January 1, 1970). Due to the calendar switch, the same date or timestamp value written by Spark 2.x and Spark 3.x might have different values in Parquet files. The two calendars agree for dates on or after October 15, 1582, the start of the Gregorian calendar, so a date of birth such as January 1, 1955, is written as -5,479 (5,479 days before the epoch) by both versions. Earlier dates differ. For example, Spark 2.x writes October 4, 1582, as -141,428, the day immediately before the cutover in the hybrid calendar, while Spark 3.x writes it as -141,438 under the Proleptic Gregorian calendar. This results in a ten-day discrepancy when reading the same Parquet file across Spark versions if no rebasing is applied.
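The calendar discrepancy can be reproduced with plain date arithmetic. The following is a minimal sketch, not Spark's implementation: the function names are hypothetical, Python's `datetime.date` stands in for the Proleptic Gregorian calendar, and the Julian side uses the standard Julian Day Number formula.

```python
from datetime import date

EPOCH = date(1970, 1, 1)
EPOCH_JDN = 2440588  # Julian Day Number of 1970-01-01


def proleptic_epoch_days(year, month, day):
    """Days since the epoch, reading the label in the Proleptic Gregorian calendar."""
    return (date(year, month, day) - EPOCH).days


def julian_calendar_jdn(year, month, day):
    """Julian Day Number of a date expressed in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def hybrid_epoch_days(year, month, day):
    """Days since the epoch, reading the label in the hybrid calendar.

    Labels on or after 1582-10-15 are Gregorian; earlier labels are Julian.
    The skipped labels 1582-10-05 to 1582-10-14 do not exist in the hybrid
    calendar and are not validated in this sketch.
    """
    if (year, month, day) >= (1582, 10, 15):
        return proleptic_epoch_days(year, month, day)
    return julian_calendar_jdn(year, month, day) - EPOCH_JDN


print(hybrid_epoch_days(1582, 10, 4))      # -141428
print(proleptic_epoch_days(1582, 10, 4))   # -141438
print(hybrid_epoch_days(1582, 10, 15))     # -141427
print(proleptic_epoch_days(1582, 10, 15))  # -141427
```

The same label, October 4, 1582, maps to two different stored integers, while labels from the cutover onward map to the same one.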

Legacy timestamp support addresses this by rebasing timestamp values between these two calendar systems during read and write operations, so data remains accurate across Spark versions. 

Configuration Parameters for Legacy Timestamp Handling

Several configurations in Spark control rebasing behavior for dates and timestamps stored in Parquet files. In Fabric Runtime 1.3, these settings let you specify how Spark should read and write date and timestamp data:

For INT96 Timestamps

  • spark.sql.parquet.int96RebaseModeInWrite 
  • spark.sql.parquet.int96RebaseModeInRead 

For Date and INT64 Timestamps (in milliseconds or microseconds)

  • spark.sql.parquet.datetimeRebaseModeInWrite 
  • spark.sql.parquet.datetimeRebaseModeInRead 

Each of these can be set to: 

  • LEGACY: Rebase dates and timestamps between the legacy hybrid Julian-Gregorian calendar and the Proleptic Gregorian calendar. On read, values are rebased from the hybrid calendar to the Proleptic Gregorian calendar; on write, from the Proleptic Gregorian calendar to the hybrid calendar. This applies to INT96 timestamps as well as to DATE and INT64 timestamps (in days, milliseconds, or microseconds).
  • CORRECTED: No rebasing; use the Proleptic Gregorian calendar consistently for all dates and timestamps. 
  • EXCEPTION (default): Fail the read or write if Spark encounters ancient dates or timestamps that are ambiguous between the two calendars.

For example: 

SET spark.sql.parquet.datetimeRebaseModeInRead = LEGACY;
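The same settings can also be applied programmatically. The sketch below builds all four read/write keys from one pair of modes and validates the mode names first. The helper names `rebase_settings` and `apply_settings` are hypothetical, and the `spark` argument is assumed to be an active SparkSession exposing `spark.conf.set`.

```python
# Configuration keys as listed above; helper names are illustrative only.
READ_KEYS = (
    "spark.sql.parquet.int96RebaseModeInRead",
    "spark.sql.parquet.datetimeRebaseModeInRead",
)
WRITE_KEYS = (
    "spark.sql.parquet.int96RebaseModeInWrite",
    "spark.sql.parquet.datetimeRebaseModeInWrite",
)
VALID_MODES = ("LEGACY", "CORRECTED", "EXCEPTION")


def rebase_settings(read_mode, write_mode):
    """Return a dict mapping each rebase key to the requested mode."""
    for mode in (read_mode, write_mode):
        if mode not in VALID_MODES:
            raise ValueError(f"unknown rebase mode: {mode!r}")
    settings = {key: read_mode for key in READ_KEYS}
    settings.update({key: write_mode for key in WRITE_KEYS})
    return settings


def apply_settings(spark, settings):
    """Apply each key/value pair to the session (assumes spark.conf.set)."""
    for key, value in settings.items():
        spark.conf.set(key, value)
```

For instance, `apply_settings(spark, rebase_settings("LEGACY", "CORRECTED"))` would read legacy files with rebasing while writing new files in the Proleptic Gregorian calendar.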

Native Execution Engine’s Behavior and Solution 

Native Execution Engine on Fabric Runtime 1.3 now includes legacy timestamp support to handle date and timestamp values written by earlier versions of Spark.

Solution 

The legacy timestamp support feature allows the native execution engine to handle legacy timestamp values within Parquet files and Delta tables. Once enabled through the configuration parameter spark.gluten.legacy.timestamp.rebase.enabled, the engine automatically adjusts for calendar differences across Spark versions, with no further settings required. For dates that might be impacted by calendar discrepancies, the native execution engine applies predefined offsets so that they are handled consistently. Dates after the UNIX epoch (1970-01-01) are processed as-is, without any rebase adjustment, because both calendars represent them identically.
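To illustrate what rebasing with predefined offsets can look like, here is a minimal sketch of a table lookup for DATE values being read from a legacy file. It does not show the engine's actual tables, which this post does not describe. The function name is hypothetical, and the table is deliberately truncated to cover only hybrid-calendar dates from March 1, 1400 (Julian) onward.

```python
from bisect import bisect_right

# Truncated, illustrative offset table (an assumption, not the engine's data).
# SWITCH_DAYS[i] is the first hybrid-calendar epoch day at which DIFFS[i] applies.
SWITCH_DAYS = [-208120, -171595, -141427]  # 1400-03-01 and 1500-03-01 (Julian), 1582-10-15
DIFFS = [-9, -10, 0]                       # days to add to reach the Proleptic Gregorian value


def rebase_hybrid_to_proleptic_days(days):
    """Rebase a stored hybrid-calendar day count to the Proleptic Gregorian calendar."""
    if days >= 0:
        # On or after the UNIX epoch: both calendars agree, pass through as-is.
        return days
    if days < SWITCH_DAYS[0]:
        raise ValueError("date is outside the range of this truncated table")
    index = bisect_right(SWITCH_DAYS, days) - 1
    return days + DIFFS[index]


print(rebase_hybrid_to_proleptic_days(-141428))  # -141438
print(rebase_hybrid_to_proleptic_days(-141427))  # -141427
print(rebase_hybrid_to_proleptic_days(19000))    # 19000
```

A lookup of this kind keeps the per-value cost to a binary search over a small array, and values past the last switch day fall through with an offset of zero.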

How to Enable and Use Legacy Timestamp Support in Native Execution Engine on Fabric Runtime 1.3 

To enable legacy timestamp support, simply set the following configuration in your Spark session or environment: 

SET spark.gluten.legacy.timestamp.rebase.enabled = true; 

This configuration enables Fabric’s native execution engine to rebase dates and timestamps as needed to ensure compatibility with legacy Spark-written data, including data from both Spark 2.x and Spark 3.x. 

This feature will be available in production across all regions by November 14, 2024, though it will not be enabled by default. Users can activate it through configuration settings. Default enablement is planned for a future update. 
