Microsoft Fabric Updates Blog

Semantic link in Microsoft Fabric: Bridging BI and Data Science

We are pleased to introduce the Public Preview of semantic link, an innovative feature that seamlessly connects Power BI datasets with Synapse Data Science within Microsoft Fabric. As the gold layer in a medallion architecture, Power BI datasets contain the most refined and valuable data in your organization. With semantic link, we unlock this data’s potential beyond traditional business intelligence by making it accessible to notebooks and Python in Microsoft Fabric.

Python has emerged as the go-to language for state-of-the-art machine learning and boasts a vast ecosystem of libraries for a wide range of tasks, including rich visualizations, statistical analysis, and data validation. By bridging the gap between Power BI and this ecosystem, we aim to empower business analysts to use modern data tools with their data, enable Power BI developers to streamline automation tasks, and facilitate collaboration with data scientists.

Semantic link supports the popular pandas and Spark APIs, making it easy to join existing data and apply common libraries. You can compute Power BI measures, read tables, and execute DAX queries. Semantic link goes beyond plain data connectivity by propagating semantic information from Power BI to power new capabilities of Microsoft Fabric for data augmentation, validation and exploration, as well as an extendable set of semantic functions.

Empowering Insights: Mapping the Journey from Power BI to OneLake with semantic link

In this blog post, we’ll showcase semantic link’s capabilities for accessing Power BI datasets.

Use semantic link to bring your Power BI data to pandas

Semantic link offers easy-to-use Python methods for pandas users to discover and read data.
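As a rough sketch of that discovery flow, the snippet below lists the available datasets, lists the tables of one dataset, and reads a table into a FabricDataFrame. It assumes a Fabric notebook with the semantic-link package installed and access to the “Customer Profitability Sample” dataset used in this post. The helper name explore_dataset and the choice of the Customer table are illustrative only.

```python
def explore_dataset(dataset="Customer Profitability Sample", table="Customer"):
    """Discover datasets and tables, then read one table (illustrative helper)."""
    # Imported lazily: sempy is only available where semantic-link is installed.
    import sempy.fabric as fabric

    datasets = fabric.list_datasets()          # datasets visible in the workspace
    tables = fabric.list_tables(dataset)       # tables defined in the chosen dataset
    rows = fabric.read_table(dataset, table)   # table contents as a FabricDataFrame
    return datasets, tables, rows
```

Calling explore_dataset() in a Fabric notebook returns three dataframes that can be inspected like any other pandas object.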

The following code snippets show how to install the Python library in Microsoft Fabric and evaluate Power BI measures. The resulting FabricDataFrame is a semantically aware pandas DataFrame: it keeps all pandas functionality and adds features such as semantic propagation and semantic functions. Note that this sample assumes that the Power BI dataset “Customer Profitability Sample” is accessible in the Fabric workspace.

Package installation and sample code for evaluate_measure.
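A minimal sketch of such a call, assuming a Fabric notebook and the sample dataset named above. The measure and group-by columns are the ones mentioned later in this post; the wrapper function name is hypothetical.

```python
# In a Fabric notebook, install the package first in its own cell
# (a notebook magic, so it is shown here as a comment):
#   %pip install semantic-link

DATASET = "Customer Profitability Sample"
GROUPBY_COLUMNS = ["Customer[Country/Region]", "Industry[Industry]"]


def total_revenue_by_country_and_industry():
    """Evaluate the Total Revenue measure per country/region and industry."""
    # Imported lazily: sempy is only available where semantic-link is installed.
    import sempy.fabric as fabric

    return fabric.evaluate_measure(
        DATASET,
        "Total Revenue",
        groupby_columns=GROUPBY_COLUMNS,
    )
```

The result is a FabricDataFrame, so the usual pandas operations (sorting, filtering, plotting) apply to it directly.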

To make working in notebooks even easier, you can use the %%dax cell magic to execute DAX. The sample below queries a Dynamic Management View (DMV), and its output is available in the _ variable for further analysis using Python (see output caching). All underlying requests run at low priority, so your production workload is not impacted.

%%dax cell magic loading and usage.
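The sketch below shows the general shape of the cell magic as comments, followed by a plain-Python equivalent built on evaluate_dax. It uses an ordinary DAX query rather than a DMV query, and the query text and the helper name run_dax are illustrative assumptions.

```python
# Notebook usage (two separate cells, shown as comments because magics
# are not plain Python):
#   %load_ext sempy
#
#   %%dax "Customer Profitability Sample"
#   EVALUATE SUMMARIZECOLUMNS('Customer'[Country/Region], "Total Revenue", [Total Revenue])

# Illustrative DAX query: Total Revenue per country/region.
DAX_QUERY = """
EVALUATE
SUMMARIZECOLUMNS(
    'Customer'[Country/Region],
    "Total Revenue", [Total Revenue]
)
"""


def run_dax(dataset="Customer Profitability Sample", query=DAX_QUERY):
    """Run a DAX query from Python and return the result as a dataframe."""
    # Imported lazily: sempy is only available where semantic-link is installed.
    import sempy.fabric as fabric

    return fabric.evaluate_dax(dataset, query)
```

The magic is convenient for interactive exploration, while the function form is easier to reuse in scheduled or parameterized notebooks.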

Use semantic link to bring your Power BI data to Spark

Spark users can access Power BI data from all languages supported in Fabric (Python, R, and Spark SQL) using the semantic link Spark native connector. Configure the Power BI catalog to gain access to all your datasets. In this example, we evaluate a measure using the special _Metrics table. All other tables are accessible by name, for example pbi.`Customer Profitability Sample`.Customer, and are ready to be combined with other Spark data sources.

Configuration and usage of the Spark native connector for Power BI datasets.
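A hedged sketch of that configuration in PySpark follows. The catalog class name is given as recalled from the semantic link documentation and should be verified against the current docs. The helper names and the AVG aggregation over _Metrics are assumptions made for illustration.

```python
# Catalog implementation class (assumption: verify against the current docs).
PBI_CATALOG_CLASS = "com.microsoft.azure.synapse.ml.powerbi.PowerBICatalog"


def configure_pbi_catalog(spark, catalog_name="pbi"):
    """Register the Power BI catalog on an existing Spark session."""
    spark.conf.set(f"spark.sql.catalog.{catalog_name}", PBI_CATALOG_CLASS)


def build_metrics_query(dataset, measure, group_cols, catalog="pbi"):
    """Build a Spark SQL query that evaluates a measure via the _Metrics table."""
    cols = ", ".join(f"`{c}`" for c in group_cols)
    return (
        f"SELECT {cols}, AVG(`{measure}`) AS `{measure}` "
        f"FROM {catalog}.`{dataset}`.`_Metrics` "
        f"GROUP BY {cols}"
    )


# Usage in a Fabric notebook, where `spark` is the predefined session:
#   configure_pbi_catalog(spark)
#   query = build_metrics_query(
#       "Customer Profitability Sample",
#       "Total Revenue",
#       ["Customer[Country/Region]", "Industry[Industry]"],
#   )
#   spark.sql(query).show()
```

Backticks are needed around the dataset, column, and measure names because they contain spaces and brackets.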

Use semantic propagation for data augmentation

Semantic link’s Python API returns a FabricDataFrame when accessing Power BI data to enable data augmentation and semantic functions. Here’s a brief example of how you can augment an existing dataframe with Power BI data. Instead of computing the measure for a set of dimensions, joining the dataframes, and filtering the result, you can use the add_measure function. It matches the dataframe’s columns to the Power BI dataset (here Customer[Country/Region] and Industry[Industry]), computes the measures Total Revenue and Total COGS at these levels, and adds them automatically.

Data augmentation using add_measure.
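A minimal sketch of this augmentation, assuming a Fabric notebook and the sample dataset. The sales-agent rows are made-up illustration data, the helper name is hypothetical, and the exact way measure names are passed to add_measure may differ between library versions.

```python
# Made-up rows whose column names match Power BI columns in the dataset.
AGENT_ROWS = {
    "Sales Agent": ["Agent 1", "Agent 1", "Agent 2"],
    "Customer[Country/Region]": ["US", "GB", "US"],
    "Industry[Industry]": ["Services", "CPG", "Manufacturing"],
}


def augment_with_measures(rows=AGENT_ROWS, dataset="Customer Profitability Sample"):
    """Add Total Revenue and Total COGS to the rows, matched on the column names."""
    # Imported lazily: sempy is only available where semantic-link is installed.
    from sempy.fabric import FabricDataFrame

    df = FabricDataFrame(rows)
    return df.add_measure("Total Revenue", "Total COGS", dataset=dataset)
```

The column names carry the table[column] form so that they can be resolved against the dataset without an explicit join.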

Discover semantic functions with intelligent code auto-completion

Semantic functions enable intelligent auto-complete by matching function parameters with column metadata. For example, the to_geopandas function provides suggestions to bind the lat_col and long_col parameters to the latitude and longitude columns based on Power BI data categories.

Semantic function parameter auto-complete.

A semantic function is a regular Python function, exposed on FabricDataFrames and accompanied by metadata to enable intelligent auto-completion. While semantic link provides a few semantic functions available on GitHub, you can define your own semantic functions using Python decorators. The @semantic_function decorator applied to the _is_capital function makes it available for intelligent code auto-completion.
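The following sketch shows what such a definition can look like. The module paths, the @semantic_parameters decorator, and the CountryMatcher and CityMatcher names are given as recalled from the semantic link documentation and should be treated as assumptions to verify. The capitals table and the register_is_capital wrapper are illustrative.

```python
# Illustrative lookup table: country code -> capital cities.
CAPITALS = {
    "US": ["Washington"],
    "AT": ["Vienna"],
    "GB": ["London"],
}


def is_capital_values(countries, cities):
    """Plain-Python core: True where the city is a capital of the country."""
    return [
        city in CAPITALS.get(country, [])
        for country, city in zip(countries, cities)
    ]


def register_is_capital():
    """Define the semantic function (requires semantic-link to be installed)."""
    from sempy.fabric.matcher import CountryMatcher, CityMatcher
    from sempy.functions import semantic_function, semantic_parameters

    @semantic_function("is_capital")
    @semantic_parameters(col_country=CountryMatcher, col_city=CityMatcher)
    def _is_capital(df, col_country, col_city):
        # Evaluate row by row; the matchers drive the auto-complete suggestions.
        return df.apply(
            lambda row: row[col_city] in CAPITALS.get(row[col_country], []),
            axis=1,
        )

    return _is_capital
```

Keeping the lookup logic in a plain function makes it easy to test outside Fabric, while the decorated wrapper adds the metadata used for auto-completion.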

Explore and validate data in Power BI from Python

Ensuring data quality is a crucial task and semantic link provides tools to support this. In this example we visualize existing relationships defined in your Power BI dataset.

Visualizing Power BI dataset relationships using list_relationships.
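As a small sketch, assuming a Fabric notebook and the sample dataset, the relationships can be listed and then plotted as a graph. The wrapper function name is hypothetical.

```python
def plot_dataset_relationships(dataset="Customer Profitability Sample"):
    """List the relationships defined in a dataset and plot them as a graph."""
    # Imported lazily: sempy is only available where semantic-link is installed.
    import sempy.fabric as fabric
    from sempy.relationships import plot_relationship_metadata

    # One row per relationship, with the tables and columns on each side.
    relationships = fabric.list_relationships(dataset)
    return plot_relationship_metadata(relationships)
```

In a notebook, the returned graph renders inline when it is the last expression in a cell.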

To understand the data in even more detail, the find_dependencies and plot_dependencies_metadata methods help you understand and visualize functional dependencies present in your data:

Functional dependencies detected using find_dependencies function.
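To make the idea of a functional dependency concrete, here is a small plain-Python illustration: column A determines column B if every value of A always appears with the same value of B. This is a conceptual sketch on made-up rows, not the algorithm semantic link uses. In Fabric, the equivalent check runs through the find_dependencies method on a FabricDataFrame.

```python
from itertools import permutations


def determines(rows, lhs, rhs):
    """True if each value of column lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def find_simple_dependencies(rows):
    """Return all single-column pairs (a, b) where a determines b."""
    columns = list(rows[0]) if rows else []
    return [(a, b) for a, b in permutations(columns, 2) if determines(rows, a, b)]


# Made-up sample rows for illustration.
rows = [
    {"City": "Seattle", "State": "WA", "Country": "US"},
    {"City": "Spokane", "State": "WA", "Country": "US"},
    {"City": "Portland", "State": "OR", "Country": "US"},
]

print(find_simple_dependencies(rows))
# [('City', 'State'), ('City', 'Country'), ('State', 'Country')]
```

A dependency that is expected from the business model but missing in the data (or the reverse) is often a sign of a data quality problem worth investigating.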

To learn even more about data validation and exploration visit our docs.

Get coding!

In summary, semantic link is a powerful tool that enables business analysts and data scientists to use data effectively in a comprehensive data science environment. By using semantic link, you can:

  • Eliminate duplicated business logic by empowering data scientists to directly access your semantic model in Power BI datasets.
  • Do even more with the semantic information present in Power BI datasets using semantic functions, data augmentation, validation, and exploration.

We hope you find semantic link useful. To try it, follow our how-to guides. We welcome your feedback and suggestions in the comments and on Fabric Ideas!
