Microsoft Fabric Updates Blog

User Data Functions now support async functions and pandas DataFrame, Series types

Fabric user data functions empower developers to process and analyze data at scale directly within Microsoft Fabric by writing custom logic in Python. There are now two new features in the User data functions programming model in Microsoft Fabric:

  1. Support for writing async functions: Async functions improve responsiveness and efficiency by handling multiple tasks at once. They are ideal for managing high volumes of I/O-bound operations.
  2. Support for pandas DataFrames and Series as input/output types: Fabric user data functions (UDFs) let developers define functions that process batches of input rows as pandas DataFrames and return results as pandas DataFrames or Series, improving speed and performance in large-scale data analysis.

How to enable these features

To use the new features in Microsoft Fabric’s User data functions, update the library as follows:

  1. Create a new user data function or open an existing one.
  2. Select Library management.
  3. Update the fabric-user-data-functions library to 1.0.0.
Update fabric-user-data-functions library

How to write an async function

Add the async keyword to your function definition, as shown in the example below. This example function reads a CSV file from a Lakehouse using pandas. The function takes the file name as an input parameter.

from io import StringIO

import fabric.functions as fn
import pandas as pd

# Assumed to match the default function_app.py template; skip if already present.
udf = fn.UserDataFunctions()

# Replace the alias "<My Lakehouse alias>" with your connection alias.
@udf.connection(argName="myLakehouse", alias="<My Lakehouse alias>")
@udf.function()
async def read_csv_from_lakehouse(myLakehouse: fn.FabricLakehouseClient, csvFileName: str) -> str:

    # Connect to the Lakehouse
    connection = myLakehouse.connectToFilesAsync()

    # Download the CSV file from the Lakehouse
    csvFile = connection.get_file_client(csvFileName)

    downloadFile = await csvFile.download_file()
    csvData = await downloadFile.readall()

    # Read the CSV data into a pandas DataFrame
    df = pd.read_csv(StringIO(csvData.decode('utf-8')))

    # Format each row as a bracketed, comma-separated string
    result = ""
    for index, row in df.iterrows():
        result = result + "[" + ",".join(str(item) for item in row) + "]"

    # Close the connection
    csvFile.close()
    connection.close()

    return f"CSV file read successfully.{result}"
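
The function above awaits a single download. The benefit of async is clearer when several I/O-bound operations overlap. The following is a minimal sketch using only the Python standard library. The names fetch_file, fetch_all, and run_demo are hypothetical, and asyncio.sleep stands in for a real network call; it is not part of the Fabric SDK.

```python
import asyncio
import time


async def fetch_file(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound call such as a file download."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"{name}: done"


async def fetch_all(names: list[str], delay: float = 0.2) -> list[str]:
    """Start every simulated download at once and wait for all of them."""
    return await asyncio.gather(*(fetch_file(name, delay) for name in names))


def run_demo() -> tuple[list[str], float]:
    """Run the three simulated downloads and report the wall-clock time."""
    names = ["a.csv", "b.csv", "c.csv"]
    start = time.perf_counter()
    results = asyncio.run(fetch_all(names))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the three waits overlap, the total time is close to one delay rather than three. Inside an async user data function you would await the gathered calls directly instead of calling asyncio.run.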

How to use pandas DataFrames and Series types

Fabric user data functions integrate Apache Arrow to improve performance when handling pandas data structures. Without Arrow, pandas data passed to or from a UDF is serialized and deserialized as JSON, which is flexible but introduces overhead, particularly with large datasets. The new Arrow-optimized approach uses Apache Arrow’s efficient columnar memory format and zero-copy mechanisms: pandas DataFrames and Series are now represented using Arrow both when transmitted on the wire and when held in memory during UDF execution. This native integration bypasses the costly translation to and from JSON.

Add the pandas library to the user data functions item:

Add pandas library to user data functions item in Fabric

Remember to include import pandas as pd in your function_app.py file before getting started. Let’s dive into an example to show how to use pandas DataFrames and Series as input/output types in Fabric user data functions. We’ll create a function that calculates the total revenue earned by each driver.

import pandas as pd 

@udf.function()
def total_revenue_by_driver(df: pd.DataFrame) -> pd.Series:
    """
    Description: Calculate total revenue earned by each driver. This function sums all trip fares for each driver to determine their total earnings, which is useful for driver performance analysis.

    Args:
    df : pd.DataFrame
        Input DataFrame containing trip data with columns:
        - 'driver_id': str or int, unique driver identifier
        - 'trip_fare': float, fare amount for each trip

    Example: Use this example as input to test the function
        {
            "driver_id": ["D001", "D002", "D001", "D003", "D002"],
            "trip_fare": [25.50, 30.00, 22.75, 45.00, 28.50]
        }

    Returns: pd.Series
        Series with driver IDs as index and total revenue as values.
    """

    result_series = df.groupby("driver_id")["trip_fare"].sum()
    result_series.name = "total_revenue"
    return result_series
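
Before publishing, the aggregation logic can be checked locally with plain pandas. The sketch below repeats the same groupby without the Fabric decorator, so it runs anywhere pandas is installed. The variable names sample and totals are chosen for illustration only.

```python
import pandas as pd


def total_revenue_by_driver(df: pd.DataFrame) -> pd.Series:
    """Same aggregation as the UDF, without the Fabric decorator."""
    result_series = df.groupby("driver_id")["trip_fare"].sum()
    result_series.name = "total_revenue"
    return result_series


# Sample input taken from the function's docstring
sample = pd.DataFrame({
    "driver_id": ["D001", "D002", "D001", "D003", "D002"],
    "trip_fare": [25.50, 30.00, 22.75, 45.00, 28.50],
})

totals = total_revenue_by_driver(sample)
print(totals.to_dict())
```

The result is a Series indexed by driver ID, with one summed fare per driver.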

Publish the changes and then test the function. You should receive the following output:

{"D001":48.25,"D002":58.5,"D003":45}

Invoke functions from a Notebook

With notebooks, you can effortlessly call data processing functions on datasets containing millions of rows. This approach allows you to efficiently aggregate and analyze large volumes of data, such as calculating total revenue by driver, directly within your notebook. Using the same function as shown above, you can test and validate your results on massive datasets, making notebooks a powerful tool for both development and production scenarios.

import pandas as pd

data = {
    'driver_id': ['D001', 'D002', 'D001', 'D003', 'D002'],
    'trip_fare': [25.50, 30.00, 22.75, 45.00, 28.50],
}
df = pd.DataFrame(data)

# Get functions. Replace 'my-user-data-function-name' with the name of your user data functions item.
myFunctions = notebookutils.udf.getFunctions('my-user-data-function-name')

# Invoke the function
result = myFunctions.total_revenue_by_driver(df)

# Returns a Series object
print(result)

Conclusion

These features improve efficiency and performance in two ways: support for pandas DataFrame and Series types speeds up work with large datasets containing millions of rows, and async functions let you handle I/O-bound tasks concurrently instead of waiting on each one in turn. Check out the fabric-user-data-functions library on PyPI for the latest version updates.

To learn more, refer to the how-to documentation for the Fabric user data functions SDK.

Get started with a free trial today and unlock the full potential of your data with Microsoft Fabric User data functions. Submit your feedback on Fabric Ideas and join the conversation on the Fabric Community.
