Announcing Fabric Copilot pricing
In January 2024, we announced the worldwide public preview of Copilot in Microsoft Fabric. This preview includes Copilot for Power BI, Data Factory, and Data Science & Data Engineering. With Copilot in preview, Microsoft Fabric offers an improved way to transform, enrich, and analyze data, and shortens the time to insights.
Today, we are announcing that Copilot in Fabric begins billing on March 1st, 2024, as part of your existing Power BI Premium or Fabric capacity, at a rate of 400 Capacity Unit (CU) seconds per 1,000 input tokens and 1,200 CU seconds per 1,000 output tokens. Further details on the cost of Copilot in Fabric from March 1st, 2024, are provided below.
Requests to Copilot consume Fabric Capacity Units (CU). Copilot usage is measured by the number of tokens processed. Tokens can be thought of as pieces of words. As a reference, 1,000 tokens represent approximately 750 words. The Fabric Copilot cost is calculated per 1,000 tokens, and input and output tokens are consumed at different rates. The following table defines how many CU seconds are consumed as part of Copilot usage.
| Operation in Metrics App | Description | Operation Unit of Measure | Consumption rate |
|---|---|---|---|
| Copilot in Fabric | The input prompt | Per 1,000 Tokens | 400 CU seconds |
| Copilot in Fabric | The output completion | Per 1,000 Tokens | 1,200 CU seconds |
For example, if you are using Copilot for Power BI and your request involves 500 input tokens and 100 output tokens, you will be charged a total of (500 × 400 + 100 × 1,200) / 1,000 = 320 CU seconds in Fabric.
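The same arithmetic can be written as a small helper. This is a minimal sketch for illustration only: the function and constant names are our own (not part of any Fabric API), and the rates are simply the ones stated in this post.

```python
# Rates stated in this post; the names below are illustrative, not a Fabric API.
INPUT_RATE_CU_SECONDS = 400     # CU seconds per 1,000 input tokens
OUTPUT_RATE_CU_SECONDS = 1_200  # CU seconds per 1,000 output tokens


def copilot_cu_seconds(input_tokens: int, output_tokens: int) -> float:
    """Return the CU seconds consumed by one Copilot request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * INPUT_RATE_CU_SECONDS
            + output_tokens * OUTPUT_RATE_CU_SECONDS) / 1_000


def approx_tokens_from_words(words: int) -> int:
    """Rough token estimate using the 1,000 tokens ~ 750 words rule of thumb."""
    return round(words * 1_000 / 750)


# The worked example from the text: 500 input tokens, 100 output tokens.
print(copilot_cu_seconds(500, 100))  # 320.0
```

Because output tokens are consumed at three times the input rate, long completions tend to dominate the cost of a request.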
Monitoring the Usage
Starting in February 2024, you can view the total capacity usage for Copilot under the operation name “Copilot in Fabric” in your Fabric Capacity Metrics App.
At the end of each billing cycle, your Finance department can review the Copilot usage charges by referring to the “Copilot in Fabric” item in the Fabric invoicing records.
Capacity Utilization Type
The capacity utilization type of Fabric Copilot is classified as a “background job”, which allows a capacity to support more Copilot requests during busy hours.
Fabric is designed to deliver lightning-fast performance by allowing operations to access more CU (Capacity Unit) resources than are allocated to the capacity. Fabric smooths, or averages, the CU usage of an “interactive job” over a minimum of 5 minutes and that of a “background job” over a 24-hour period. According to the Fabric throttling policy, the first phase of throttling begins when a capacity has consumed all of its available CU resources for the next 10 minutes.
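To get a feel for why background classification helps, the sketch below spreads a job's CU seconds evenly over its smoothing window. It assumes a uniform spread, which simplifies how Fabric actually accounts for usage, and the function names and the 64-CU capacity are hypothetical values chosen for illustration.

```python
INTERACTIVE_WINDOW_SECONDS = 5 * 60       # minimum window for interactive jobs
BACKGROUND_WINDOW_SECONDS = 24 * 60 * 60  # 24-hour window for background jobs


def smoothed_cu_load(cu_seconds: float, window_seconds: int) -> float:
    """Average CU load of a job whose usage is spread evenly over the window."""
    return cu_seconds / window_seconds


def share_of_capacity(cu_seconds: float, capacity_cus: float,
                      window_seconds: int) -> float:
    """Fraction of the capacity's CU budget over the window used by the job."""
    return cu_seconds / (capacity_cus * window_seconds)


# A 320 CU-second Copilot request on a hypothetical 64-CU capacity.
as_background = share_of_capacity(320, 64, BACKGROUND_WINDOW_SECONDS)
as_interactive = share_of_capacity(320, 64, INTERACTIVE_WINDOW_SECONDS)
print(f"background: {as_background:.6%}, interactive: {as_interactive:.6%}")
```

Under this simplified model, the same request weighs 288 times less on any given moment when smoothed over 24 hours than over 5 minutes, which is why more Copilot requests fit into busy periods.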
Fabric Copilot is powered by Azure OpenAI large language models, which are currently deployed to a limited number of data centers. However, customers can enable the cross-geo processing tenant setting to use Copilot by having their data processed in another region where the Azure OpenAI service is available. This region could be outside of the user’s geographic region, compliance boundary, or national cloud instance. When performing region mapping, we prioritize data residency as the foremost consideration and attempt to map to a region within the same geographic area whenever feasible.
The cost of Fabric Capacity Units can vary depending on the region. Regardless of the consumption region where GPU capacity is utilized, customers are billed based on the Fabric Capacity Units pricing in their billing region.
For example, if a customer’s requests are mapped from region 1 to region 2, with region 1 being the billing region and region 2 being the consumption region, the customer is charged based on the pricing in region 1.
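The billing rule can be sketched as a lookup keyed only on the billing region. Everything here is hypothetical: the region names, the price figures, and the function name are placeholders, and treating the price as a per-CU-hour figure is a simplifying assumption made for illustration.

```python
# Placeholder prices per CU hour, for illustration only (not real prices).
PRICE_PER_CU_HOUR = {"region-1": 0.20, "region-2": 0.25}


def copilot_charge(cu_seconds: float, billing_region: str,
                   consumption_region: str,
                   prices: dict[str, float] = PRICE_PER_CU_HOUR) -> float:
    """Price the usage at the billing region's rate.

    consumption_region is accepted but deliberately ignored: where the
    request is processed does not affect the price.
    """
    cu_hours = cu_seconds / 3600
    return cu_hours * prices[billing_region]


# Requests mapped from region 1 (billing) to region 2 (consumption)
# are charged at region 1 pricing.
print(copilot_charge(3600, "region-1", "region-2"))  # 0.2
```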