Data Factory Announcements at Build 2024 Recap
Last week was such an exciting week for Fabric during the Microsoft Build Conference, filled with several product announcements and sneak previews of upcoming new features.
Thanks to all of you who participated in the conference, either in person or by being part of the many virtual conversations through blogs, Community forums, social media and other channels. Thank you also for all your product feedback and Ideas forum suggestions, which help us define the next wave of product enhancements.
We wanted to make sure you didn’t miss any of the Data Factory in Fabric announcements, by providing you with this recap of all new features:
- Introducing Data workflows in Microsoft Fabric [Announcement]
- Introducing Blob Storage Event Triggers for Data Pipelines [Announcement]
- Modern Get Data experience in Pipelines [Announcement]
- General Availability of the Power Query Connectors SDK for VS Code [Announcement]
- Introducing Trusted Workspace Access in Fabric Data Pipelines [Announcement]
- Dataflows Gen2 Refresh History Enhancement – Refresh the Refresh History [Announcement]
- New Pipeline activity: HDInsight [Announcement]
- New Pipeline activity: Fabric Spark Job Definition [Announcement]
- Parent/child pipeline pattern monitoring improvements [Announcement]
You can continue reading below for more information about each of these capabilities.
Introducing Data workflows in Microsoft Fabric
We are thrilled to announce the preview of Data workflows, a transformative capability within Microsoft Fabric that redefines your approach to constructing and managing data pipelines. Data workflows in Microsoft Fabric is powered by the Apache Airflow runtime and provides an integrated, cloud-based platform for developing, scheduling, and monitoring Python-based data workflows, expressed as Directed Acyclic Graphs (DAGs). This innovation delivers a Software-as-a-Service (SaaS) experience for data pipeline development and management using Apache Airflow, making the Apache Airflow runtime readily accessible for the development and operationalization of your data workflows.
Learn more about this new Data Factory in Fabric capability at Introducing Data workflows in Microsoft Fabric on the Microsoft Fabric Blog.
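To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use the Apache Airflow API or anything specific to Data workflows; it only illustrates what it means for a workflow to be a Directed Acyclic Graph, using the standard library `graphlib` module. The task names and the `execution_order` and `is_acyclic` helpers are hypothetical, chosen for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency graph has no cycles, i.e. it is a valid DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

A scheduler can run "transform" and "validate" in either order (or in parallel) because neither depends on the other, but "load" always waits for both. A graph with a cycle has no valid order at all, which is why workflows are required to be acyclic.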
Introducing Blob Storage Event Triggers for Data Pipelines
A very common use case among data pipeline users in a cloud analytics solution is triggering a pipeline when a file arrives or is deleted. We have introduced Azure Blob storage event triggers as a public preview feature in Fabric Data Factory Data Pipelines. This feature uses the Fabric Reflex alerts capability, which also leverages Event Streams in Fabric to create event subscriptions to your Azure storage accounts.
You can learn more about Blob Storage event triggers at Data pipelines storage event triggers in Data Factory (Preview) on Microsoft Learn.
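As a rough illustration of the "file arrives or is deleted" decision, the sketch below filters a storage event by type, container, and file suffix. In Fabric the event subscription and trigger are configured for you, so this is not how the feature is implemented. The event shape assumed here (an `eventType` such as `Microsoft.Storage.BlobCreated` and a `subject` of the form `/blobServices/default/containers/<container>/blobs/<blob>`) follows the Azure Event Grid schema for Blob storage events; the `blob_path` and `should_trigger` helpers and the sample values are hypothetical.

```python
from typing import Optional, Tuple

# Event type names assumed from the Azure Event Grid schema for Blob storage.
BLOB_CREATED = "Microsoft.Storage.BlobCreated"
BLOB_DELETED = "Microsoft.Storage.BlobDeleted"

SUBJECT_PREFIX = "/blobServices/default/containers/"


def blob_path(subject: str) -> Optional[Tuple[str, str]]:
    """Split an event subject into (container, blob name); None if it does not match."""
    if not subject.startswith(SUBJECT_PREFIX):
        return None
    container, sep, blob = subject[len(SUBJECT_PREFIX):].partition("/blobs/")
    if not sep or not container or not blob:
        return None
    return container, blob


def should_trigger(event: dict, container: str, suffix: str = "",
                   event_types=(BLOB_CREATED,)) -> bool:
    """Decide whether a (hypothetical) pipeline run should start for this event."""
    if event.get("eventType") not in event_types:
        return False
    parsed = blob_path(event.get("subject", ""))
    if parsed is None:
        return False
    event_container, blob = parsed
    return event_container == container and blob.endswith(suffix)


# Hypothetical sample event for a CSV file landing in a "landing" container.
sample_event = {
    "eventType": BLOB_CREATED,
    "subject": "/blobServices/default/containers/landing/blobs/sales/2024-05-21.csv",
}
print(should_trigger(sample_event, "landing", ".csv"))
```

Passing `event_types=(BLOB_CREATED, BLOB_DELETED)` would make the same check react to deletions as well as arrivals.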
Modern Get Data experience in Pipelines
We are thrilled to share the new Modern Get Data experience in Data Pipelines, which empowers users to intuitively and efficiently discover the right data, the right connection info, and the right credentials.
Users can start the Modern Get Data experience by clicking Copy data assistant on the Pipeline landing page or in the Copy data drop-down. With the new Modern Get Data experience, users can easily connect to recently used Fabric items (e.g. Lakehouse, Data Warehouse) in the OneLake data hub. It also provides a highly intuitive way to read sources from sample data and new connections.
Learn more about this new feature at Easily connect your data with the new modern get data experience for data pipeline on the Microsoft Fabric Blog.
General Availability of the Power Query Connectors SDK for VS Code
We are thrilled to announce that the Power Query SDK is now generally available in Visual Studio Code! This marks a significant milestone in our commitment to providing developers with powerful tools to enhance data connectivity and transformation.
The Power Query SDK is a set of tools that allow you as the developer to create new connectors for Power Query experiences available in products such as Power BI Desktop, Semantic Models, Power BI Datamarts, Power BI Dataflows, Fabric Dataflow Gen2 and more.
This new SDK has been in public preview since November of 2022, and we’ve been hard at work improving this experience, which goes beyond what the previous Power Query SDK in Visual Studio had to offer.
The most recent major improvement was the introduction of the Test Framework in March of 2024, which solidifies the developer experience of creating a Power Query connector with Visual Studio Code and the Power Query SDK.
The Power Query SDK extension for Visual Studio will be deprecated by June 30, 2024, so we encourage you to try this new Power Query SDK in Visual Studio Code today if you haven’t already.
To get started with the Power Query SDK in Visual Studio Code, simply install it from the Visual Studio Code Marketplace. Our comprehensive documentation and tutorials are available to help you harness the full potential of your data.
Join our vibrant community of developers to share insights, ask questions, and collaborate on exciting projects. Our dedicated support team is always ready to assist you with any queries.
We look forward to seeing the innovative solutions you’ll create with the Power Query SDK in Visual Studio Code. Happy coding!
Dataflows Gen2 Refresh History Enhancement – Refresh the Refresh History
Introducing a convenient enhancement to the Dataflows Gen2 Refresh History experience! Now, alongside the familiar “X” button in the Refresh History screen, you’ll find a shiny new Refresh button. This small but mighty addition lets users update their dataflow’s refresh history status without the hassle of exiting the refresh history and reopening it. Simply click the Refresh button, and voilà! Your dataflow’s refresh history screen is updated, keeping you in the loop with minimal effort. Say goodbye to unnecessary clicks and hello to streamlined monitoring!
Introducing Trusted Workspace Access in Fabric Data Pipelines
We are excited to announce a new feature in Fabric that enables you to create data pipelines to access your firewall-enabled Azure Data Lake Storage Gen2 (ADLS Gen2) accounts. This feature leverages the workspace identity to establish a secure and seamless connection between Fabric and your storage accounts.
With trusted workspace access, you can create data pipelines that connect to your storage accounts with just a few clicks. Then you can copy data into Fabric Lakehouse and start analyzing your data with Spark, SQL, and Power BI. Trusted workspace access is available for workspaces in Fabric capacities (F64 or higher). It supports organizational account or service principal authentication for storage accounts.
You can learn more about this new capability at How to use trusted workspace access in data pipelines.
New Pipeline activity: HDInsight
We are excited to announce the availability of the Azure HDInsight activity for data pipelines. The Azure HDInsight activity allows you to execute Hive queries, invoke a MapReduce program, execute Pig queries, execute a Spark program, or run a Hadoop Streaming program. Any of these five job types can be invoked from a single Azure HDInsight activity, and you can run the activity on your own HDInsight cluster or on an on-demand one.
To learn more about this activity, read https://aka.ms/HDInsightsActivity
New Pipeline activity: Fabric Spark Job Definition
We are excited to announce the availability of the Fabric Spark Job Definition activity for data pipelines. With this new activity, you will be able to run a Fabric Spark Job Definition directly in your pipeline. Detailed monitoring capabilities for your Spark Job Definition will be coming soon!
To learn more about this activity, read https://aka.ms/SparkJobDefinitionActivity
Parent/child pipeline pattern monitoring improvements
Today, in Fabric Data Factory Data Pipelines, when you call another pipeline using the Invoke Pipeline activity, the child pipeline is not visible in the monitoring view. We have updated the Invoke Pipeline activity so that you can view your child pipeline runs. This requires an upgrade to any pipelines in Fabric that already use the current Invoke Pipeline activity. You will be prompted to upgrade when you edit your pipeline, and you will then need to provide a connection to your workspace to authenticate. Another new feature that lights up with this Invoke Pipeline activity update is the ability to invoke pipelines across workspaces in Fabric.
Thank You for your feedback, keep it coming!
We wanted to thank you for your support, usage, excitement, and feedback around Data Factory in Fabric. We’re very excited to continue learning from you regarding your Data Integration needs and how Data Factory in Fabric can be enhanced to empower you to achieve more with data.
Please continue to share your feedback and feature ideas with us via our official Community channels, and stay tuned to our public roadmap page for updates on what will come next: