Microsoft Fabric Updates Blog

Working with large data types in Fabric Data Warehouse

Traditionally, warehouses are designed for smaller data types (numbers, dates, short strings) that are suitable for efficient analytics. Currently, Fabric Data Warehouse limits string and binary data to 8KB per cell. Increasing this limit has been one of the top requests for Fabric DW.

We are excited to share that we are raising this limit so that you can store large string and binary values in Fabric DW. We are adding support for the VARCHAR(MAX) and VARBINARY(MAX) types, which allow you to store up to 1MB of data per cell. Declare a column as VARCHAR(MAX) or VARBINARY(MAX) when it needs to hold more than 8KB of data.

CREATE TABLE Product (
    id int,
    title VARCHAR(200),
    description VARCHAR(MAX)
)

With these types, you won’t have to worry about truncation of strings that might represent descriptions, comments, notes, and other potentially larger textual values that might exceed 8KB.
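As a rough illustration, the query below lists the rows of the Product table whose description would not have fit under the old limit. It is a sketch rather than an official recipe: it assumes the standard T-SQL DATALENGTH function, which returns the size of a value in bytes, and it uses 8000 bytes (the largest N accepted by VARCHAR(N) in T-SQL) as the threshold.

```sql
-- Sketch only: find descriptions larger than the VARCHAR(N) maximum.
-- Uses the Product table from the example above; the 8000-byte
-- threshold is an assumption chosen to match VARCHAR(8000).
SELECT id,
       title,
       DATALENGTH(description) AS description_bytes  -- size in bytes, not characters
FROM Product
WHERE DATALENGTH(description) > 8000
ORDER BY description_bytes DESC;
```

Rows with a NULL description are skipped, because DATALENGTH returns NULL for them and the comparison does not evaluate to true.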

The VARCHAR(MAX) type will open new scenarios by enabling you to store semi-structured data formatted as JSON without worrying about parsing errors caused by truncation. JSON-formatted text commonly exceeds 8KB in length, and with the new 1MB storage size, most JSON documents will fit into warehouse columns. The introduction of VARCHAR(MAX) and VARBINARY(MAX) also makes it possible to enhance the SQL endpoints for Lakehouse and mirrored databases, as string or binary data will no longer be truncated to 8KB.
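A minimal sketch of the JSON scenario might look like the following. The OrderEvent table, its columns, and the sample document are hypothetical and are not part of the original post. The example also assumes that the T-SQL functions ISJSON and JSON_VALUE are available in your warehouse.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE OrderEvent (
    id int,
    payload VARCHAR(MAX)   -- JSON document stored as plain text
);

-- A small sample document; real payloads could be much larger than 8KB.
INSERT INTO OrderEvent (id, payload)
VALUES (1, '{"customer": {"name": "Contoso"}, "total": 42.50}');

-- Extract scalar values from rows that contain well-formed JSON.
SELECT id,
       JSON_VALUE(payload, '$.customer.name') AS customer_name,
       JSON_VALUE(payload, '$.total') AS total
FROM OrderEvent
WHERE ISJSON(payload) = 1;
```

Filtering on ISJSON first keeps malformed text out of the result instead of letting JSON_VALUE fail on it.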

Performance enhancements for new data types

One of the primary concerns when introducing large data types is performance. The usual guidance for warehouses is to size each column's data type to the largest value the column will actually hold. With VARCHAR(MAX), this concern is heightened because it is one of the most demanding types.

To address these concerns, we are introducing several performance improvements in Fabric DW. These enhancements will speed up various operations on string and binary columns, such as the LIKE operator, filtering by string columns, and batch mode execution on large text. As a result, you should experience minimal overhead when using the VARCHAR(MAX) type compared to the VARCHAR(N) type, assuming you are working with similar data sizes. Although we still recommend optimizing your data types, we are ensuring that sub-optimal and large data types introduce minimal overhead.
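The two queries below are a hedged sketch of how this might look in practice, reusing the Product table from earlier. The search term is made up, and the second query is only one possible way to check whether a column could be declared with a smaller VARCHAR(N) type. Neither query is taken from the original post.

```sql
-- Filter a VARCHAR(MAX) column with LIKE, one of the operations
-- mentioned above. The search term is illustrative.
SELECT id, title
FROM Product
WHERE description LIKE '%waterproof%';

-- Check the largest value actually stored, in bytes, to judge whether
-- a bounded VARCHAR(N) column would be enough for this data.
SELECT MAX(DATALENGTH(description)) AS max_description_bytes
FROM Product;
```

If the largest stored value is comfortably below 8000 bytes, a bounded VARCHAR(N) column remains the recommended choice.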

Conclusion

By introducing VARCHAR(MAX) and VARBINARY(MAX) in the Fabric warehouse, we are removing one of the key obstacles to the adoption of Fabric DW and enabling numerous scenarios to enhance your warehousing solutions.

This feature is currently in private preview with a limited number of customers, and it will be publicly available in October 2024.
