Microsoft Fabric Updates Blog

Boost performance effortlessly with Automated Table Statistics in Microsoft Fabric

We’re thrilled to introduce Automated Table Statistics in Microsoft Fabric Data Engineering — a major upgrade that helps you get blazing-fast query performance with zero manual effort.

Whether you’re running complex joins, large aggregations, or heavy filtering workloads, Fabric’s new automated statistics will help Spark make smarter decisions, saving you time, compute, and money.

What are Automated Table Statistics?

In simple terms, table statistics are summary metrics about your data that Spark uses to optimize queries which include:

  • Total row counts.
  • Null counts per column.
  • Minimum and maximum values per column.
  • Distinct value counts per column.
  • Average and maximum column lengths.

Until now, generating these statistics required running manual commands (ANALYZE TABLE) or setting up custom pipelines. With this release, Fabric collects them automatically when you create a new Delta table — no setup or tuning required.

Why does it matter?

Spark’s Cost-Based Optimizer (CBO) relies on good statistics to:

  • Choose the best join strategy (e.g., broadcast vs. shuffle joins).
  • Prune partitions effectively.
  • Reduce unnecessary data shuffles.
  • Optimize aggregations.

Without accurate stats, Spark can only guess, often leading to slow or expensive query plans. With automated stats, Spark becomes much smarter — and in our internal benchmarks, we’ve seen up to ~45% performance gains on complex workloads.


How does it work?

  • Enabled by default on all new Delta tables.
  • Tracks first 32 columns (including nested columns).
  • Stores stats separately in Parquet format to avoid bloating data files.
  • Fully integrated with Fabric Spark 3.5 and later.

You can also fine-tune behavior with configurations:

Enable/disable stats collection:

spark.conf.set("spark.microsoft.delta.stats.collect.extended", "true")

Enable/disable optimizer injection:

spark.conf.set("spark.microsoft.delta.stats.injection.enabled", "true")

How to Check or Recompute Statistics

Want to see what’s under the hood? Fabric lets you inspect collected stats and recompute them if needed:

Check current stats (Scala):

println(spark.read.table("tableName").queryExecution.optimizedPlan.stats)

Recompute after schema changes:

StatisticsStore.recomputeStatisticsWithCompaction(spark, "tableName")

For full control, you can also use the familiar ANALYZE TABLE command.


Important limitations to know

As with any advanced feature, it’s important to understand the current boundaries:

  • Stats are collected only at write time.
  • Updates from other engines won’t be included.
  • Only the first 32 columns are tracked.
  • Recomputing stats may require rewriting the table.
  • No fallback → in rare cases, stats may sometimes impact performance.

We’re actively working on expanding support — including improvements for existing tables and better recompute workflows.

Learn More

To learn more about automated statistics for tables please visit the Configure and manage Automated Table Statistics in Fabric Spark documentation.

Related blog posts

Boost performance effortlessly with Automated Table Statistics in Microsoft Fabric

November 4, 2025 by Misha Desai

We’re introducing a set of new enhancements for Data Agent creators — designed to make it easier to debug, improve, and express your agent’s logic. Whether you’re tuning example queries, refining instructions, or validating performance, these updates make it faster to iterate and deliver high-quality experiences to your users. New Debugging Tools View referenced example … Continue reading “Creator Improvements in the Data Agent”

November 3, 2025 by Arshad Ali

Additional authors – Madhu Bhowal, Ashit Gosalia, Aniket Adnaik, Kevin Cheung, Sarah Battersby, Michael Park Esri is recognized as the global market leader in geographic information system (GIS) technology, location intelligence, and mapping, primarily through its flagship software, ArcGIS. Esri empowers businesses, governments, and communities to tackle the world’s most pressing challenges through spatial analysis. … Continue reading “ArcGIS GeoAnalytics for Microsoft Fabric Spark (Generally Available)”