Introduction
Welcome to What’s New in Fabric Warehouse, where we’ll spotlight our work improving quality, delivering major performance enhancements, boosting developer productivity, and our continuous investments in security. Whether you’re migrating from Synapse, optimizing your workloads, writing SQL in VS Code, or exploring new APIs, this roundup has something for every data professional. With quality …
Continue reading “What’s new in Fabric Warehouse – July 2025 Recap”
We’re thrilled to announce the general availability (GA) of Autoscale Billing for Apache Spark in Microsoft Fabric — a serverless billing model designed to offer greater flexibility, transparency, and cost efficiency for running Spark workloads at scale. With this model now fully supported, Spark Jobs can run independently of your Fabric capacity and are billed …
Continue reading “Autoscale Billing for Spark in Microsoft Fabric (Generally Available)”
Microsoft Fabric offers native Git integration and deployment pipelines to facilitate version control, collaboration, and automated releases for workspace items like user data functions. This guide explains how to set up and manage Git integration for user data functions within a Fabric workspace.
• Workspace preparation and Git linking: Users start by selecting or creating a Fabric workspace containing user data functions, then enable Git integration via workspace settings by connecting to a Git provider and repository branch, optionally specifying a folder for organization.
• Branching strategy configuration: Teams are advised to adopt branching strategies such as main/develop, feature, and release branches, along with pull request and code review policies to maintain code quality and collaboration.
• Managing user data functions in Git: Each data function is stored in a function_app.py file; users clone the repository locally, edit or add functions, and update the definition.json file to reflect new functions and required libraries like numpy.
• Committing, syncing, and publishing changes: After committing changes in VS Code, users sync with the Fabric portal, update the function via source control, and publish to deploy the new or updated functions, making them available for invocation.
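As an illustration of the "edit or add functions" step above, the following is a minimal sketch of the kind of function body that might be added to function_app.py and that would require numpy to be listed among the function's libraries. The function name `normalize_scores` and its behavior are hypothetical, and the registration boilerplate that Fabric generates in function_app.py is deliberately left out so that the sketch runs as plain Python.

```python
import numpy as np


def normalize_scores(scores: list[float]) -> list[float]:
    """Scale scores to zero mean and unit variance (hypothetical example)."""
    values = np.asarray(scores, dtype=float)
    spread = values.std()
    # Identical scores have zero spread; return zeros instead of dividing by zero.
    if spread == 0:
        return [0.0] * len(scores)
    return ((values - values.mean()) / spread).tolist()


result = normalize_scores([1.0, 2.0, 3.0])
```

Because the function imports numpy, the library would also need to be declared for the function item, as the summary above describes, before the change is committed and published.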
Microsoft Fabric has introduced new features for its User Data Functions (UDFs), enhancing Python-based data processing capabilities within the platform. These updates include support for asynchronous functions and the use of pandas DataFrame and Series types for input and output, enabling more efficient handling of large-scale data.
• Async function support: Developers can now write async functions in Fabric UDFs to improve responsiveness and efficiency, especially for managing high volumes of I/O-bound operations, such as reading files asynchronously from a Lakehouse.
• Pandas DataFrame and Series integration: UDFs can accept and return pandas DataFrames and Series, allowing batch processing of rows with improved speed and performance in data analysis tasks. An example function calculates total revenue by driver using pandas groupby operations.
• Usage in notebooks: These functions can be invoked directly from notebooks using pandas objects, facilitating efficient aggregation and analysis of large datasets interactively within Microsoft Fabric.
• Getting started and benefits: Users can enable these features by updating the fabric-user-data-functions library to version 1.0.0. The enhancements reduce I/O operations, enable concurrent task handling, and improve performance on datasets with millions of rows.
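The two patterns summarized above can be sketched in plain Python. This is not the example from the post. It is a minimal illustration under stated assumptions: the column names `driver_id` and `fare`, both function names, and the temporary text files (standing in for Lakehouse files) are all hypothetical. The Fabric UDF registration code is omitted so the sketch runs anywhere that pandas is installed.

```python
import asyncio
import tempfile
from pathlib import Path

import pandas as pd


def total_revenue_by_driver(trips: pd.DataFrame) -> pd.DataFrame:
    """Sum fare revenue per driver for a whole batch of rows in one call."""
    # Column names "driver_id" and "fare" are assumptions for this sketch.
    return (
        trips.groupby("driver_id", as_index=False)["fare"]
        .sum()
        .rename(columns={"fare": "total_revenue"})
    )


async def read_text_files(paths: list[Path]) -> list[str]:
    """Read several files concurrently instead of one after another."""
    # Path.read_text blocks, so each read runs in a worker thread while the
    # event loop waits on all of them together.
    reads = [asyncio.to_thread(p.read_text, encoding="utf-8") for p in paths]
    return await asyncio.gather(*reads)


# Batch aggregation over a small in-memory DataFrame.
trips = pd.DataFrame(
    {"driver_id": ["d1", "d2", "d1", "d3"], "fare": [10.0, 7.5, 4.5, 20.0]}
)
revenue = total_revenue_by_driver(trips)

# Concurrent reads over temporary files that stand in for Lakehouse files.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        path = Path(tmp) / f"part-{i}.txt"
        path.write_text(f"rows for part {i}", encoding="utf-8")
        paths.append(path)
    contents = asyncio.run(read_text_files(paths))
```

Passing a whole DataFrame lets the aggregation run once over the batch rather than once per row, and awaiting the reads together lets their I/O waits overlap. These are the two sources of the performance benefit the summary describes.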
Effortlessly read Delta Lake tables using Apache Iceberg readers
Microsoft Fabric is a unified, SaaS data and analytics platform designed for the era of AI. All workloads in Microsoft Fabric use Delta Lake as the standard, open-source table format. With Microsoft OneLake, Fabric’s unified SaaS data lake, customers can unify their data estate across multiple …
Continue reading “New in OneLake: Access your Delta Lake tables as Iceberg automatically (Preview)”