Microsoft Fabric Updates Blog

Boost performance effortlessly with Automated Table Statistics in Microsoft Fabric

We’re thrilled to introduce Automated Table Statistics in Microsoft Fabric Data Engineering — a major upgrade that helps you get blazing-fast query performance with zero manual effort.

Whether you’re running complex joins, large aggregations, or heavy filtering workloads, Fabric’s new automated statistics will help Spark make smarter decisions, saving you time, compute, and money.

What are Automated Table Statistics?

In simple terms, table statistics are summary metrics about your data that Spark uses to optimize queries which include:

  • Total row counts.
  • Null counts per column.
  • Minimum and maximum values per column.
  • Distinct value counts per column.
  • Average and maximum column lengths.

Until now, generating these statistics required running manual commands (ANALYZE TABLE) or setting up custom pipelines. With this release, Fabric collects them automatically when you create a new Delta table — no setup or tuning required.

Why does it matter?

Spark’s Cost-Based Optimizer (CBO) relies on good statistics to:

  • Choose the best join strategy (e.g., broadcast vs. shuffle joins).
  • Prune partitions effectively.
  • Reduce unnecessary data shuffles.
  • Optimize aggregations.

Without accurate stats, Spark can only guess, often leading to slow or expensive query plans. With automated stats, Spark becomes much smarter — and in our internal benchmarks, we’ve seen up to ~45% performance gains on complex workloads.


How does it work?

  • Enabled by default on all new Delta tables.
  • Tracks first 32 columns (including nested columns).
  • Stores stats separately in Parquet format to avoid bloating data files.
  • Fully integrated with Fabric Spark 3.5 and later.

You can also fine-tune behavior with configurations:

Enable/disable stats collection:

spark.conf.set("spark.microsoft.delta.stats.collect.extended", "true")

Enable/disable optimizer injection:

spark.conf.set("spark.microsoft.delta.stats.injection.enabled", "true")

How to Check or Recompute Statistics

Want to see what’s under the hood? Fabric lets you inspect collected stats and recompute them if needed:

Check current stats (Scala):

println(spark.read.table("tableName").queryExecution.optimizedPlan.stats)

Recompute after schema changes:

StatisticsStore.recomputeStatisticsWithCompaction(spark, "tableName")

For full control, you can also use the familiar ANALYZE TABLE command.


Important limitations to know

As with any advanced feature, it’s important to understand the current boundaries:

  • Stats are collected only at write time.
  • Updates from other engines won’t be included.
  • Only the first 32 columns are tracked.
  • Recomputing stats may require rewriting the table.
  • No fallback → in rare cases, stats may sometimes impact performance.

We’re actively working on expanding support — including improvements for existing tables and better recompute workflows.

Learn More

To learn more about automated statistics for tables please visit the Configure and manage Automated Table Statistics in Fabric Spark documentation.

Related blog posts

Boost performance effortlessly with Automated Table Statistics in Microsoft Fabric

November 18, 2025 by Bogdan Crivat

Two years ago, we introduced Microsoft Fabric to Azure Synapse users as a bold new direction for analytics—one that promised unification, simplicity, scale. Today, that promise has matured into a complete platform that’s not just ready for enterprises, it’s built for them. To be very clear: Synapse continues to be supported. We will fix bugs, … Continue reading “Two Years in: How Fabric Redefines the Modernization Path for Synapse Users”

November 4, 2025 by Misha Desai

We’re introducing a set of new enhancements for Data Agent creators — designed to make it easier to debug, improve, and express your agent’s logic. Whether you’re tuning example queries, refining instructions, or validating performance, these updates make it faster to iterate and deliver high-quality experiences to your users. New Debugging Tools View referenced example … Continue reading “Creator Improvements in the Data Agent”