
AI Ready Apps: From RAG to Chat – Interacting with SQL Database in Microsoft Fabric using GraphQL and MCP

In the current digital environment, applications are expected to offer more than basic functionality: they must demonstrate intelligence, adapt to their users, and provide seamless experiences. To remain competitive, organizations increasingly recognize that effective data utilization is fundamental to driving innovation. Data powers real-time insights and supports advanced agentic AI systems capable of reasoning, acting, and learning. With technologies such as intelligent search, personalized recommendation engines, and natural language interfaces, organizations can now extract significant value from both structured and unstructured data, making this capability more attainable and more critical than ever before.

In our previous AI Ready Apps blog post, we explored how to build a Retrieval Augmented Generation (RAG) pipeline using Microsoft Fabric’s SQL Database to prepare data for AI applications. In this follow-up, we’ll dive deeper into the Contoso Recruiting Agency use case to see how they extended that foundation. By integrating Microsoft Fabric API for GraphQL and the Model Context Protocol (MCP) server with SQL Database in Microsoft Fabric, Contoso unlocked the ability to converse with their data—transforming static resume repositories into dynamic, intelligent systems that support real-time candidate discovery and interaction.

End to End RAG pipeline in Microsoft Fabric

Agentic AI Application Workflow

When a recruiter uploads a resume to their designated folder in Azure Blob Storage, the data pipeline is automatically triggered. The file is ingested into Microsoft Fabric’s Lakehouse, from where it undergoes a series of transformations: text is extracted, segmented into chunks, and the key candidate attributes such as name, phone number, email, and city are extracted using an entity extraction step. The chunks are then converted into vector embeddings using Azure OpenAI and all this information is securely stored in a SQL Database in Fabric.

This structured and searchable vector store becomes the foundation for intelligent interaction. Using GraphQL and the Model Context Protocol (MCP) server, recruiters can now query the database conversationally asking natural language questions and receiving context-aware responses powered by similarity search and structured querying.

In the upcoming sections, we’ll walk you through the complete end-to-end setup that brings this intelligent experience to life.

Prerequisite

This blog post requires users to bring their own key (BYOK) for AI services, which also means creating these services outside of the Microsoft Fabric platform.

Dataset

Considering the file formats supported by the Document Intelligence service, we will use the PDF files from the Resume dataset.

All resumes in this dataset are AI-generated and contain only fictional information for demonstration purposes.

Note: In a previous blog post, we recommended using a Kaggle resume dataset. However, that dataset’s inconsistent formatting often leads to errors or difficulties when extracting key entities such as candidate name, location, phone number, and email. We now recommend using the curated dataset provided here; it is optimized for compatibility with the entity extraction service and ensures a smoother experience.

Steps

  1. Create a workspace named ‘IntelligentApp’.
  2. Create a Lakehouse named ‘blob_filestorage’.
  3. Create SQL Database in Fabric named ‘datamart’.

Important: If you followed the previous blog and already created the dbo.documents table, please drop it before proceeding. The schema has been updated in this blog to include candidate-specific fields such as name, location, phone, and email. You can run the following SQL command to drop the table:

DROP TABLE IF EXISTS dbo.documents;

This ensures a clean start with the updated structure.

4. Navigate to the workspace IntelligentApp, select Import > Notebook > From this computer, and then import the notebook cp_azblob_lakehouse.ipynb from the cloned repository’s notebook folder.

Import Notebook

  5. Attach the Lakehouse blob_filestorage to the cp_azblob_lakehouse notebook:

    • Open the notebook; on the Explorer pane, click Add sources.
    • Select Existing data sources.
    • Select blob_filestorage from OneLake catalog and then Connect.

Attach Lakehouse to Notebook

 6. Create a User Data Function

    • Navigate to the IntelligentApp workspace and select +New item.
    • Search for Function and select ‘User data function(s)’.
    • Provide the name file_processor.
    • Select ‘New Function’.
    • Add Lakehouse and SQL Database as managed connection(s).
      1. In the Home menu, select Manage Connections, then ‘+ Add data connection’.
      2. From the OneLake catalog select datamart (SQL Database), then ‘Connect’.
      3. Repeat the previous step to add blob_filestorage (Lakehouse) as a managed connection.
    • Select ‘Library management’ and add the following dependencies (select ‘+ Add from PyPI’ to add each dependency). The dependencies are also listed in the /functions/requirements.txt file of the cloned repository. Ensure you are using fabric-user-data-functions version 0.2.28rc0 or higher.

Add User Data Function Dependencies

    • In the function editor, replace existing content with the contents of function\function_app_v2.py from the cloned repository.
    • Select ‘Publish’ to deploy the functions; once they are deployed, select ‘Refresh’.

7. Create a Data pipeline by navigating to the workspace and then clicking on ‘+ New Item’.

      • Search and select ‘Data pipeline’.
      • Provide the name blob_ingest_pipeline.

8. Create a Data pipeline storage trigger by selecting the ‘Add trigger’ button and providing the following configuration:

        • Source: Select Azure Blob Storage events.
        • Storage account: Connect to existing Azure Blob Storage account.
        • Subscription: Select your Azure subscription.
        • Azure Blob Storage account: Select the blob storage account under your subscription.
        • Eventstream name: blob_ingest_stream.

Create Data pipeline Storage Trigger

9. Select ‘Next’ to configure the event type and source.

    • Event Type(s): Select only the Microsoft.Storage.BlobCreated event. This will ensure that an event is generated each time a new blob object is uploaded.

Select ‘Next’ to review the configuration, then ‘Connect’ to connect to the blob storage. A successful connection is indicated by the status ‘Successful’. Finally, select ‘Save’.

On the Set alert screen, under Save location, configure the following settings:

    • Select Create a new item.
    • New item name: blob_activator

Select ‘Create’ to create and save the alert.

Save alert

Configure Event Source

Now that we have set up the stream, it’s time to define the blob_ingest_pipeline.

Pipeline Definition

Templates offer a quick way to begin building data pipelines. Importing a template brings in all required activities for orchestrating a pipeline.

To import a template:

Import Data Pipeline Template

The imported data pipeline is preloaded with all the activities, variables, and connectors required for end-to-end orchestration. Consequently, there is no need to manually add variables or activities. Instead, you can proceed directly to configuring values for the variables and for each activity’s parameters, as detailed below.

1. Configure pipeline variables

Select the pipeline canvas, select the ‘Variables’ menu, and then configure values for the following variables:

Pipeline variable values

| Name | Type | Value | Comment |
| --- | --- | --- | --- |
| fileName | String | @pipeline()?.TriggerEvent?.FileName | |
| container | String | @pipeline()?.TriggerEvent?.FolderPath | |
| source | String | @pipeline()?.TriggerEvent?.Source | |
| cognitiveServiceEndpoint | String | <<YOUR-MULTI-SERVICE-ACCOUNT-ENDPOINT>> | Replace <<YOUR-MULTI-SERVICE-ACCOUNT-ENDPOINT>> with your multi-service account endpoint, for example https://myservices.cognitiveservices.azure.com/ |
| apiKey | String | <<YOUR-MULTI-SERVICE-ACCOUNT-APIKEY>> | Replace <<YOUR-MULTI-SERVICE-ACCOUNT-APIKEY>> with the API key of your multi-service account |
| openAIEndpoint | String | <<YOUR-OPENAI-SERVICE-ENDPOINT>> | Replace <<YOUR-OPENAI-SERVICE-ENDPOINT>> with the endpoint of your Azure OpenAI account, for example https://myopenaiservice.openai.azure.com/ |
| openAIKey | String | <<YOUR-OPENAI-APIKEY>> | Replace <<YOUR-OPENAI-APIKEY>> with the API key of your Azure OpenAI account |
| embeddingModel | String | text-embedding-3-small | |
| recepientEmailAddress | String | <<to-email-address>> | Recipient’s email address |
| senderEmailAddress | String | <<from-email-address>> | Sender’s email address |

Get multi-services account endpoint

2. Configure ‘Notebook’ activity

The notebook associated with this activity uses NotebookUtils to manage the Lakehouse file system. During execution, a folder matching the container name is created if it does not already exist, and the file is then copied from Azure Blob Storage into that Lakehouse folder.
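For reference, the following is a minimal, hypothetical sketch of that copy logic; the actual cp_azblob_lakehouse.ipynb in the repository may construct paths and authenticate differently.

```python
# Hypothetical sketch of the copy step in cp_azblob_lakehouse. notebookutils is built into
# Fabric notebooks; parameter values are injected through the pipeline's Base parameters.
fileName = "chef_resume.pdf"                                                   # @variables('fileName')
container = "resume"                                                           # @variables('container')
source = ".../providers/Microsoft.Storage/storageAccounts/mystorageaccount"    # @variables('source')

# The storage account name is the last segment of the trigger's Source value
# (the same derivation the pipeline later uses to build documentLocation).
storage_account = source.split("/")[-1]

# Assumed source path; the real notebook may use a wasbs/https path or different credentials.
src = f"abfss://{container}@{storage_account}.dfs.core.windows.net/{fileName}"
dst_folder = f"Files/{container}"
dst = f"{dst_folder}/{fileName}"

# Create the Lakehouse folder if it does not exist, then copy the blob into it.
if not notebookutils.fs.exists(dst_folder):
    notebookutils.fs.mkdirs(dst_folder)
notebookutils.fs.cp(src, dst)

# Return the Lakehouse file path so the Extract Text activity can read it via result.exitValue.
notebookutils.notebook.exit(dst)
```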

Configure this activity as outlined:

  • General Tab – Name: azureblob_to_lakehouse
  • Settings Tab – Notebook: cp_azblob_lakehouse
  • Base parameters

Configure the following parameters and their values:

  • Name: fileName
      • Type: String
      • Value:
@variables('fileName')
  • Name: container
      • Type: String
      • Value:
@variables('container')
  • Name: source
      • Type: String
      • Value:
@variables('source')

3. Configure ‘Extract Text’ activity

The function extract_text associated with this activity uses the Azure AI Document Intelligence service to extract the text content from the file copied into the Lakehouse by the previous activity.
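As a rough illustration, here is a hypothetical sketch of what such a function can look like, shown with the azure-ai-formrecognizer SDK for the Document Intelligence service; the repository’s function_app_v2.py is the authoritative implementation and may use the newer azure-ai-documentintelligence package and read the file through the Lakehouse managed connection instead.

```python
# Hypothetical sketch of an extract_text-style function (not the repository's exact code).
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

def extract_text(filePath: str, cognitiveServicesEndpoint: str, apiKey: str) -> str:
    client = DocumentAnalysisClient(cognitiveServicesEndpoint, AzureKeyCredential(apiKey))
    # filePath is the Lakehouse path returned by the azureblob_to_lakehouse notebook.
    # Here we assume it is locally readable; the real function reads it via the
    # blob_filestorage managed connection.
    with open(filePath, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-read", document=f)
    result = poller.result()
    # result.content holds the full extracted text of the document.
    return result.content
```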

Configure this activity as outlined:

  • General Tab – Name: Extract Text
  • Settings Tab
    • Type: Fabric user data functions
    • Connection: Sign-in (if not already) using your workspace credentials.
    • Workspace: IntelligentApp (default selected)
    • User data functions: file_processor
    • Function: extract_text
Parameters:
  • Name: filePath
      • Type: str
      • Value:
@activity('azureblob_to_lakehouse').output.result.exitValue
  • Name: cognitiveServicesEndpoint
      • Type: str
      • Value:
@variables('cognitiveServiceEndpoint')
  • Name: apiKey
      • Type: str
      • Value:
@variables('apiKey')

4. Configure ‘If Conditions’ activity

The ‘If Conditions’ activity verifies that the text extraction in the previous step succeeded. If the extraction was unsuccessful, an email is sent to the configured recipient and the pipeline is terminated.

Configure this activity as outlined:

  • General Tab – Name: Text Extraction Results
  • Activities Tab – Expression:
@empty(activity('Extract Text').error)
    • Case: False, edit the ‘false’ condition using the edit (pencil) icon, and add the following activities:

‘Office 365 Outlook’ activity: to send alert emails.

  • General Tab – Name: Text Extraction Failure Email Alert
  • Settings Tab – Signed in as: Sign in (if not already) using your workspace credentials.
    • To:
@variables('recepientEmailAddress')
    • Subject: Text Extraction Error
    • Body:
<p>@{replace(string(activity('Extract Text').error.message), '\','')}</p>

Advanced:

  • From:
@variables('senderEmailAddress')
  • Importance: High

‘Fail’ activity: To terminate the pipeline

  • General Tab – Name: Text Extraction Process Failure
  • Settings Tab – Fail message:
@{replace(string(activity('Extract Text').error), '\','')}
    • Error code:
@{activity('Extract Text').statuscode}

5. Configure ‘Extract Entities’ activity

The function ‘extract_entities’ associated with this activity uses the Azure AI Language service to extract entities such as name, email, phone number, and location from the text extracted by the previous activity.
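As a point of reference, here is a minimal, hypothetical sketch of such a function using the azure-ai-textanalytics SDK; the category mapping and batching logic in the repository’s implementation may differ.

```python
# Hypothetical sketch of an extract_entities-style function (not the repository's exact code).
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

def extract_entities(text: str, cognitiveServicesEndpoint: str, apiKey: str) -> dict:
    client = TextAnalyticsClient(cognitiveServicesEndpoint, AzureKeyCredential(apiKey))
    # Named Entity Recognition over a single document; long resumes may need batching
    # because the service limits the characters analyzed per document.
    result = client.recognize_entities([text[:5120]])[0]
    entities = {"Name": "", "PhoneNumber": "", "Email": "", "Location": ""}
    for entity in result.entities:
        if entity.category == "Person" and not entities["Name"]:
            entities["Name"] = entity.text
        elif entity.category == "PhoneNumber" and not entities["PhoneNumber"]:
            entities["PhoneNumber"] = entity.text
        elif entity.category == "Email" and not entities["Email"]:
            entities["Email"] = entity.text
        elif entity.category == "Location" and not entities["Location"]:
            entities["Location"] = entity.text
    # Keys match what the Save Data activity reads, e.g. output.output.Name.
    return entities
```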

Configure this activity as outlined:

  • General Tab – Name: Extract Entities
  • Settings Tab
    • Type: Fabric user data functions
    • Connection: Sign-in (if not already) using your workspace credentials.
    • Workspace: IntelligentApp (default selected)
    • User data functions: file_processor
    • Function: extract_entities
Parameters:
  • Name: text
      • Type: str
      • Value:
@activity('Extract Text').output.output
  • Name: cognitiveServicesEndpoint
      • Type: str
      • Value:
@variables('cognitiveServiceEndpoint')
  • Name: apiKey
      • Type: str
      • Value:
@variables('apiKey')

6. Configure ‘Generate Chunks’ activity

The function ‘chunk_text’ associated with this activity uses the tiktoken tokenizer to split the text produced by the Extract Text activity into chunks.
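For illustration, a minimal sketch of token-bounded chunking with tiktoken is shown below; the repository’s implementation may use overlapping windows or different defaults.

```python
# Hypothetical sketch of a chunk_text-style function (not the repository's exact code).
import tiktoken

def chunk_text(text: str, maxToken: int = 500, encoding: str = "cl100k_base") -> list[str]:
    enc = tiktoken.get_encoding(encoding)
    tokens = enc.encode(text)
    # Slice the token stream into windows of at most maxToken tokens,
    # then decode each window back into text.
    return [enc.decode(tokens[i:i + maxToken]) for i in range(0, len(tokens), maxToken)]
```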

Configure this activity as outlined:

  • General Tab Name: Generate Chunks
  • Settings Tab
    • Type: Fabric user data functions
    • Connection: Sign-in (if not already) using the credentials that have complete access to your workspace.
    • Workspace: IntelligentApp (default selected)
    • User data functions: file_processor
    • Function: chunk_text

Parameters:

  • Name: text
      • Type: str
      • Value:
@activity('Extract Text').output.output
  • Name: maxToken
      • Type: int
      • Value: 500
  • Name: encoding
      • Type: str
      • Value: cl100k_base

7. Configure ‘Generate Embeddings’ activity

The function ‘generate_embeddings’ associated with this activity uses an Azure OpenAI embedding model to convert the chunks generated by the previous activity into embeddings.
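A minimal, hypothetical sketch using the openai Python SDK is shown below; the API version and the returned structure are assumptions, and the repository’s implementation may differ.

```python
# Hypothetical sketch of a generate_embeddings-style function (not the repository's exact code).
from openai import AzureOpenAI

def generate_embeddings(text: list, openAIServiceEndpoint: str,
                        embeddingModel: str, openAIKey: str) -> list:
    client = AzureOpenAI(azure_endpoint=openAIServiceEndpoint,
                         api_key=openAIKey,
                         api_version="2024-02-01")  # assumed API version
    response = client.embeddings.create(model=embeddingModel, input=text)
    # Pair each chunk with its embedding so the Save Data activity can store both.
    return [{"chunk": chunk, "embedding": item.embedding}
            for chunk, item in zip(text, response.data)]
```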

Configure this activity as outlined:

  • General Tab Name: Generate Embeddings
  • Settings Tab
    • Type: Fabric user data functions
    • Connection: Sign-in (if not already) using the credentials that have complete access to your workspace.
    • Workspace: IntelligentApp (default selected)
    • User data functions: file_processor
    • Function: generate_embeddings
Parameters:
  • Name: text
      • Type: list
      • Value:
@activity('Generate Chunks').output.output
  • Name: openAIServiceEndpoint
      • Type: str
      • Value:
@variables('openAIEndpoint')
  • Name: embeddingModel
      • Type: str
      • Value:
@variables('embeddingModel')
  • Name: openAIKey
      • Type: str
      • Value:
@variables('openAIKey')

8. Configure ‘If Conditions’ activity

The ‘If Conditions’ activity verifies that embedding generation in the previous step succeeded. If it was unsuccessful, an email is sent to the configured recipient and the pipeline is terminated.

Configure this activity as outlined:

  • General Tab Name: Generate Embeddings Results
  • Activities Tab Expression:
@empty(activity('Generate Embeddings').error)
    • Case: False, edit the ‘false‘ condition using the pencil icon, and add the following activities:
      • ‘Office 365 Outlook’ activity to send alert emails.
  • General Tab Name: Generate Embeddings Failure Email Alert
  • Settings Tab:
    • Signed in as: Sign in (if not already) using your workspace credentials.
    • To:
@variables('recepientEmailAddress')
    • Subject: Generate Embeddings Error
    • Body:
<p>@{replace(string(activity('Generate Embeddings').error.message), '\','')}</p>

Advanced:

  • From:
@variables('senderEmailAddress')
  • Importance: High

‘Fail’ activity: To terminate the pipeline

  • General Tab Name: Generate Embeddings Processing Failure
  • Settings Tab – Fail message:
@{replace(string(activity('Generate Embeddings').error), '\','')}
    • Error code:
@{activity('Generate Embeddings').statuscode}

9. Configure ‘Create Database Objects’ activity

The function ‘create_table’ associated with this activity executes a SQL command to create a documents table within the previously created ‘datamart’ SQL database (a hedged sketch of a possible table definition follows the configuration list below). Configure this activity as outlined below:

  • General Tab – Name: Create Database Objects
  • Settings Tab:
    • Type: Fabric user data functions
    • Connection: Sign-in (if not already) using the credentials that have complete access to your workspace.
    • Workspace: IntelligentApp (default selected)
    • User data functions: file_processor
    • Function: create_table
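
The sketch below illustrates what a create_table-style function might do; the column names and types are assumptions for illustration, and in the actual user data function the connection is obtained from the ‘datamart’ managed connection rather than built manually.

```python
# Hypothetical sketch of a create_table-style function (not the repository's exact code).
import pyodbc

DDL = """
IF OBJECT_ID('dbo.documents', 'U') IS NULL
CREATE TABLE dbo.documents (
    id INT IDENTITY(1,1) PRIMARY KEY,
    chunk NVARCHAR(MAX),
    embedding VECTOR(1536),          -- assumed vector type and dimension (text-embedding-3-small)
    document_location NVARCHAR(1000),
    candidate_name NVARCHAR(200),
    phone_number NVARCHAR(50),
    email NVARCHAR(200),
    candidate_location NVARCHAR(200)
);
"""

def create_table(connection: pyodbc.Connection) -> None:
    # In the real UDF the pyodbc-style connection comes from the managed connection.
    with connection.cursor() as cursor:
        cursor.execute(DDL)
    connection.commit()
```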

10. Configure ‘Save Data’ activity

The function ‘insert_data’ associated with this activity executes a SQL command to bulk-insert rows into the documents table created by the previous activity.
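Continuing the hedged sketch above, the bulk insert could look roughly like this; the column names mirror the assumed schema, and the shape of data follows the chunk/embedding pairs produced by Generate Embeddings.

```python
# Hypothetical sketch of an insert_data-style function (not the repository's exact code).
import json
import pyodbc

def insert_data(connection: pyodbc.Connection, data: list, documentLocation: str,
                candidateName: str, phoneNumber: str, email: str,
                candidateLocation: str) -> int:
    # One row per chunk/embedding pair; candidate-level fields repeat on every row.
    rows = [(item["chunk"], json.dumps(item["embedding"]), documentLocation,
             candidateName, phoneNumber, email, candidateLocation) for item in data]
    sql = """INSERT INTO dbo.documents
             (chunk, embedding, document_location, candidate_name,
              phone_number, email, candidate_location)
             VALUES (?, CAST(? AS VECTOR(1536)), ?, ?, ?, ?, ?)"""
    with connection.cursor() as cursor:
        cursor.fast_executemany = True
        cursor.executemany(sql, rows)
    connection.commit()
    return len(rows)
```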

Configure this activity as outlined:

  • General Tab Name: Save Data
  • Settings Tab:
    • Type: Fabric user data functions
    • Connection: Sign-in (if not already) using the credentials that have complete access to your workspace.
    • Workspace: IntelligentApp (default selected)
    • User data functions: file_processor
    • Function: insert_data

Parameters:

  • Name: data
      • Type: list
      • Value:
@activity('Generate Embeddings').output.output
  • Name: documentLocation
      • Type: str
      • Value:
@concat( concat( concat( concat( concat('https://' , last(split(variables('source'),'/')) ), '.blob.core.windows.net/'), variables('container')), '/'), variables('fileName'))
  • Name: candidateName
      • Type: str
      • Value:
@activity('Extract Entities').output.output.Name
  • Name: phoneNumber
      • Type: str
      • Value:
@activity('Extract Entities').output.output.PhoneNumber
  • Name: email
      • Type: str
      • Value:
@activity('Extract Entities').output.output.Email
  • Name: candidateLocation
      • Type: str
      • Value:
@activity('Extract Entities').output.output.Location

Execute pipeline (pipeline in action)

Let’s consolidate our progress to date and observe the pipeline execution in real-time.

  • Upload a PDF file:
    • Use Azure Storage Explorer or the Azure portal to create a blob container named resume.
    • Upload a PDF file from the resume dataset.

Create a Container and Upload a PDF File

Pipeline Execution Review

  • Pipeline execution review:
    • From the pipeline’s ‘Run’ menu, select View run history and select the recent pipeline run.
    • In the details view, check to see if the status is Succeeded.
    • In case of a Failure, try to Rerun the pipeline using the rerun option.
  • Review Lakehouse:
    • A folder with the same name as that of the container (resume) is created.
    • The PDF file is copied from Azure Blob Storage to the Lakehouse files.

Lakehouse Review

  • Review Database
    • The documents table is automatically created by the pipeline.
    • Chunk data, embeddings, and other entity (candidate in this case) information (such as name, phone, email, and location) are stored in the documents table.

Database Review

GraphQL API

So far, we’ve explored how the data pipeline ingests documents (such as resumes from our sample dataset), extracts their content, and stores the results in a SQL database within Microsoft Fabric. This stored data includes chunked text, vector embeddings, and associated entity metadata.

In this section, we’ll shift our focus to making that enriched data accessible by exposing it through a GraphQL API. By the end of this section, you’ll have a fully functional GraphQL endpoint that lets you query documents using natural language, enabling flexible and efficient access to your AI-ready data.

Navigate to the datamart SQL database in the Fabric portal.

Setting up Database Objects

Execute the following SQL scripts from the cloned repository, in the order listed below:

  1. Create Scoped Credentials sets up the database-scoped credential used to access the Azure OpenAI API key. In the script, replace the following values before executing:
    • <<YOUR_STRONG_PASSWORD>> with a strong password of your choice (for example, N'V3RYStr0NGP@ssw0rd!').
    • <<AZURE_OPEN_AI_ENDPOINT>> with your Azure OpenAI service endpoint (for example, https://resume.openai.azure.com/).
    • <<AZURE_OPEN_AI_API_KEY>> with your Azure OpenAI key.
  2. Create Embeddings is a stored procedure that leverages the configured embedding model to convert natural language into its embedding equivalent. In the script, replace the following values before executing:

Note: Ensure @Credential parameter matches the credentials used in the Create Scoped Credential stored procedure.

    • <<EMBEDDING_MODEL>> with the name of your embedding model (for example, text-embedding-3-small).
    • <<API_VERSION>> with the API Version of the embedding model (for example, 2023-05-15).

You can find this information by logging into Azure AI Foundry and navigating to your model deployment.

3. Similarity Search is a stored procedure that performs a vector search over the stored embeddings to find relevant documents. It also supports predicate-based filtering, such as searching by location. This is the primary stored procedure used by the GraphQL endpoint.

To validate the configuration of your stored procedures, execute the example command. The output itself is not critical; the objective is to confirm successful execution without errors.

EXECUTE dbo.similarity_search 'candidates with networking experience'

Within this Similarity Search stored procedure, there is an internal call to the Create Embeddings stored procedure. This internal call MUST explicitly use OUTPUT WITH RESULT SETS NONE. This is crucial because it suppresses any result sets that the Create Embeddings procedure might otherwise produce.

By doing so, it guarantees that only the Similarity Search stored procedure returns a result set. This singular result set is what the GraphQL API uses to infer and generate its schema definition. If the Create Embeddings procedure were to return a result set, the GraphQL API schema generation process would fail due to unexpected multiple result sets.

Search

Setting up GraphQL API

Once the SQL scripts are successfully executed, it’s time to stand up your GraphQL API. Setting up a GraphQL API in Microsoft Fabric is very straightforward. Simply follow the Steps to Create GraphQL API in SQL Database.

When prompted to enter a name, use smart_search. On the Get Data screen, select all objects (ensure that the similarity search stored procedure is checked), then select the Load button.

GraphQL API Setup

Your GraphQL API should be ready to go! Copy and save the GraphQL endpoint link; we will use it in the upcoming section to set up an MCP server.

GraphQL

To test the endpoint, use the Sample GraphQL Query Script. To execute the script, click the Query tab, paste the script into the new query window, and select the Run button, as demonstrated in the screenshot example:

GraphQL in Action
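
You can also exercise the endpoint programmatically. The sketch below is a hypothetical example using azure-identity and requests: the token scope, the generated query field name (executesimilarity_search), and the result fields are assumptions, so copy the real values from the API editor’s schema explorer or the repository’s sample query script.

```python
# Hypothetical sketch: call the Fabric GraphQL endpoint with an Entra ID token.
import requests
from azure.identity import InteractiveBrowserCredential

GRAPHQL_ENDPOINT = "<<your_fabric_graphql_endpoint>>"

# Assumed scope; your tenant/app registration may require a different one.
token = InteractiveBrowserCredential().get_token("https://api.fabric.microsoft.com/.default").token

# Assumed field names; take the exact generated schema from the API editor.
query = """
query ($text: String!) {
  executesimilarity_search(text: $text) {
    results { candidate_name email document_location chunk }
  }
}
"""

response = requests.post(
    GRAPHQL_ENDPOINT,
    headers={"Authorization": f"Bearer {token}"},
    json={"query": query, "variables": {"text": "candidates with networking experience"}},
)
response.raise_for_status()
print(response.json())
```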

Integrating GraphQL with an MCP Server

If you’ve followed along thus far, we’ve successfully created an ingestion pipeline to process documents (like resumes from our dataset) and built a powerful GraphQL API to retrieve that data. While this provides efficient data access, a simple search often isn’t enough for truly intelligent applications.

To move beyond basic retrieval and enable dynamic, multi-turn interactions, we need to introduce agentic conversation. This section focuses on setting up an MCP (Model Context Protocol) Server, which will expose our GraphQL endpoint as a specialized tool, transforming static searches into fluid, conversational experiences.
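
To make that concrete before walking through the setup, here is a minimal, hypothetical sketch of how a GraphQL-backed tool can be exposed with the Python MCP SDK’s FastMCP helper. The repository’s mcp_graphql.py is the authoritative implementation; the tool name, query fields, scope, and environment variable below are assumptions.

```python
# Hypothetical sketch of an MCP server exposing the GraphQL endpoint as a tool
# (not the repository's mcp_graphql.py).
import os

import requests
from azure.identity import InteractiveBrowserCredential
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("fabric-graphql")
credential = InteractiveBrowserCredential()  # matches the Entra sign-in prompt described below

@mcp.tool()
def get_resumes(question: str) -> str:
    """Run a similarity search over the resume store via the Fabric GraphQL API."""
    token = credential.get_token("https://api.fabric.microsoft.com/.default").token  # assumed scope
    query = """
    query ($text: String!) {
      executesimilarity_search(text: $text) {
        results { candidate_name email document_location chunk }
      }
    }
    """
    response = requests.post(
        os.environ["FABRIC_GRAPHQL_ENDPOINT"],  # assumed variable name, set from the .env file
        headers={"Authorization": f"Bearer {token}"},
        json={"query": query, "variables": {"text": question}},
    )
    response.raise_for_status()
    return response.text

if __name__ == "__main__":
    # MCP clients such as Claude Desktop and VS Code launch this server over stdio.
    mcp.run(transport="stdio")
```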

Follow these steps to stand up the MCP server:

  1. Navigate to the mcp_graphql folder.
  2. Install the dependencies: pip install -r requirements.txt
  3. Rename the file settings.env to .env.
  4. Replace <<your_fabric_graphql_endpoint>> with the GraphQL endpoint link obtained in the previous section.
  5. Configure Your MCP Client:

For Claude:

    • Navigate to File -> Settings -> Developer -> Edit Config
    • Use the template claude_desktop_config.json as reference.
    • Replace <<path_to_mcp_graphql_server.py>> with the absolute path to your mcp_graphql.py file (for example, “C:\\fabric-sqldb-ai-ragpipeline\\mcp-graphql\\mcp_graphql.py”).

For more information on configuring Claude desktop, refer to the documentation.

For VS Code (GitHub Copilot):

    • Follow the Use MCP servers in VS Code steps to configure MCP Server.
    • Use the template settings.json as reference.
    • Replace <<path_to_mcp_graphql_server.py>> with the absolute path to your mcp_graphql.py file (for example, “C:\\fabric-sqldb-ai-ragpipeline\\mcp-graphql\\mcp_graphql.py”).
  6. Restart your MCP client.
  7. A new window will prompt you for Entra authentication. Once logged in, you will be able to view the tools listed in the MCP server.

Note: Make sure the Entra ID identity you sign in with has sufficient read and write access to the database.

  1. Select Claude Sonnet 4 as your model (recommended for its strong reasoning capabilities).
  2. For VS Code, select Agent mode. Then, select the tool icon and select the get_resumes tool.

VS Code

3. For Claude, the tool is selected automatically.

Agent in Action

You are ready to start interacting with your MCP Client (Agent Host) using (but not limited to) the following example prompts.

Prompt 1: ‘Looking for chef’s resume. The chef must have foundational cooking skills but be exceptional in kitchen hygiene and organizational skills. Share the candidate’s profile (link) with a short description, why they are a good fit!’

Agent in Action

Prompt 2: ‘Looking for a full stack developer from Alabama’.

Agent in Action

Troubleshooting

  • When adding a Python library from PyPI to User Data Functions, you might notice an error indicator, such as a squiggly line under the library name (e.g., azure-ai-textanalytics), as if it were a spelling mistake. Ensure the library name is spelled correctly, then ignore the indicator by tabbing to the Version dropdown and selecting the correct version; this transient error resolves itself.
  • The imported pipeline may not preload all parameter values. For each activity in the pipeline, ensure that the parameter values are provided and correct.
  • While testing the GraphQL API, if you get an error like ‘Only those stored procedures whose metadata for the first result set described by sys.dm_exec_describe_first_result_set are supported.’, it may be due to missing or incorrect configuration in your stored procedures, for instance the credentials in Create_Embeddings. Double-check that all required values, such as the endpoint, API key, and model name, are correctly filled in, and ensure the dbo.similarity_search stored procedure includes the statement OUTPUT WITH RESULT SETS NONE.

Conclusion

We hope this post gives you practical tips for building agentic AI applications using the Microsoft Fabric platform and its ecosystem. Your feedback is welcome, and we look forward to seeing how you apply these insights. Special thanks to Frederic Gisbert for supporting and validating this blog series, helping make it accessible for others to build next-generation AI solutions.
