Azure AI Building RAG Application Solution: Using Azure AI Search Service and Creating Data Source, Index, and Indexer

Adding AI capabilities to existing software applications is one of the most requested features today. When an application is deployed on Microsoft Azure, it can use Azure AI services such as Azure AI Search and Azure OpenAI to build a Retrieval-Augmented Generation (RAG) application. For example, if you run an online loan or insurance application, you may want to give end users a Q&A experience so that they can retrieve loan or insurance information such as eligibility, rates of interest, and insurance rules. These rules change frequently, so they are typically maintained in Word, PDF, and similar files, and these files are stored in Azure Blob Storage. A RAG solution for this scenario can be built using Azure OpenAI, Azure AI Search, and Azure Blob Storage: the files are stored in Azure Blob Storage, index-based search using vector queries is implemented with Azure AI Search, and response generation is handled by Azure OpenAI. Figure 1 gives an idea of the application implementation.

Figure 1: RAG Implementation

As shown in Figure 1, the application has the following behavior:

PDF documents are stored in Blob Storage; these documents contain the data used to answer end-user questions.
The Azure OpenAI Service is used to generate embeddings for the PDF documents stored in Blob Storage.
The documents are split into chunks, an embedding is generated for each chunk, and the embeddings are stored in the Azure AI Search service.
The Azure AI Search service has an index that stores the embeddings. The service uses Azure Blob Storage as its data source, so that content from the documents in Blob Storage is stored in the index.
The query/prompt is sent from the client application (e.g. a Blazor, Angular, or React.js app) to the ASP.NET Core API that contains the RAG logic.
The prompt is passed to Azure OpenAI, which generates an embedding from it so that it can be used to query the embedded data.
The embedding generated from the prompt is run against the index created in the Azure AI Search service to find matches.
The query is executed, and the matching results are returned as documents.
The response is generated by the Azure OpenAI Service based on the retrieved documents, and the RAG result is returned to the client.

In this part, we will create the Azure AI Search service and the Azure OpenAI Service using the Azure portal, and then we will create the Azure AI Search data source, index, and indexer.

Open the Azure portal and log in with your subscription. Once you are logged in, create a new resource group. Then create a new Azure AI Search service as shown in Figure 2.

Figure 2: Create an Azure AI Search Service

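Select the pricing tier carefully, because it determines the cost and the capacity limits of the service. Configure scaling (replicas and partitions) and the other details of the Azure AI Search service according to your expected load.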

Once the Azure AI Search service is created, create an Azure Storage account and a Blob container in it. Make sure that the primary service for the storage account is Blob Storage. Create the storage account and the Blob container as shown in Figure 3.

Figure 3: The Azure Blob Container

Note that I have created a container named pdfdata. Upload the PDF documents to this container. I have downloaded two-wheeler policy documents from the websites of insurance providers.

Finally, create an Azure Open AI Service as shown in Figure 4.

Figure 4: Creating Azure Open AI Service

Click on the Next button and keep the default values for the other options. Once the service is created, click on the Azure AI Foundry portal link as shown in Figure 5.

Figure 5: The Navigation to Azure AI Foundry Portal

Once the Azure AI Foundry portal is open, create deployments for the gpt-4.1 and text-embedding-3-large models as shown in Figure 6.

Figure 6: Model Deployment

GPT-4.1 Model

Multimodal and long-context: Supports up to 1 million tokens of context and accepts text and image inputs. This makes it suitable for complex reasoning and agentic tasks.
Coding and instruction following: According to OpenAI's published benchmarks, it outperforms GPT-4o, and in several cases GPT-4.5, in software engineering, format compliance, and multi-step instructions.
Optimized for real-world use: Comes in three variants (GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano) that offer a balance of speed, cost, and intelligence for different deployment needs.

text-embedding-3-large (Embedding Model)

High-dimensional semantic representation: Generates embeddings with up to 3072 dimensions, capturing semantic meaning for tasks like search, clustering, and RAG.
Multilingual and benchmark performance: Scores higher on the MIRACL and MTEB benchmarks than previous models like text-embedding-ada-002, in both English and multilingual tasks.
Flexible and cost-efficient deployment: Embeddings can be shortened, e.g. to 256 dimensions, with limited performance loss, which suits vector databases and low-resource environments.
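
To make the point about shortening concrete: the usual way to get shorter vectors is to request fewer dimensions when the embedding is generated. If a full-size embedding already exists, OpenAI's guidance is that it can also be truncated manually, provided the result is normalized again. The following is a minimal sketch of that manual approach; the EmbeddingShortener class and its Shorten method are hypothetical names for illustration and are not part of any SDK. The index built later in this article assumes the full 3072 dimensions, so a shortened vector would need a matching field definition.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the Azure or OpenAI SDKs.
public static class EmbeddingShortener
{
    // Keeps the first `dimensions` values and rescales them to unit length.
    public static float[] Shorten(float[] embedding, int dimensions)
    {
        if (embedding is null) throw new ArgumentNullException(nameof(embedding));
        if (dimensions <= 0 || dimensions > embedding.Length)
            throw new ArgumentOutOfRangeException(nameof(dimensions));

        float[] truncated = embedding.Take(dimensions).ToArray();

        // Euclidean length of the truncated vector, accumulated in double precision.
        double norm = Math.Sqrt(truncated.Sum(v => (double)v * v));
        if (norm == 0) return truncated; // an all-zero vector cannot be normalized

        return truncated.Select(v => (float)(v / norm)).ToArray();
    }
}
```

For example, Shorten(new float[] { 3f, 4f, 12f }, 2) keeps 3 and 4 and rescales them to 0.6 and 0.8.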

Now that we have created the Azure AI Search service, the Azure OpenAI Service, and the Azure Storage account with a Blob container, we can proceed with the code.

In this article, since we will create a data source, an index, and an indexer in the Azure AI Search service programmatically, we need to understand these concepts first:

What is Azure AI Search Service Data Source?

A data source in Azure AI Search defines the origin of the content, that is, the location where the raw data lives before it is indexed.
It acts as a bridge between the search service and a data storage system like Azure Blob Storage, Cosmos DB, or SQL Database.
We can configure it with connection details, authentication, and extraction settings so that the indexer knows how to access and process the data.
The following storage types are supported:

Azure Blob Storage: Ideal for unstructured data like PDFs, images, and Office docs.
Azure SQL Database: For structured, relational data.
Azure Cosmos DB: Great for NoSQL JSON-based data.
Azure Data Lake Storage Gen2: For hierarchical big data.
SharePoint, OneDrive, MySQL, and more: Available via preview or Logic App connectors.

What is an Azure AI Search Index?

An Azure AI Search index is the core structure that stores and organizes searchable content for your application. Put simply, it is like a specialized database table designed for fast and intelligent search.

The index has the following key concepts:

Documents & Fields:

Each index contains documents, which are units of searchable data, e.g. a product, an article, or a profile.
Documents are made up of fields like title, description, and tags, each with attributes like searchable, filterable, sortable, and facetable.

Searchable: Enables full-text search on the field. Text is tokenized and analyzed.
Filterable: Allows filtering results using expressions like $filter=field eq 'myval'.
Sortable: Lets you sort results using $orderby=field.
Facetable: Supports faceted navigation e.g., count by designation.
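
These attributes map directly to query options. The following sketch, assuming the Azure.Search.Documents package, builds a SearchOptions object that filters, sorts, and facets. The field names category, rating, and title are hypothetical; such a query only works if the index marks them as filterable/facetable, sortable, and retrievable respectively.

```csharp
using Azure.Search.Documents;

// Hypothetical fields: "category" (filterable, facetable), "rating" (sortable), "title" (retrievable).
var options = new SearchOptions
{
    Filter = "category eq 'insurance'", // OData filter, requires a filterable field
    Size = 10                           // return at most 10 results
};
options.OrderBy.Add("rating desc");     // requires a sortable field
options.Facets.Add("category");         // requires a facetable field
options.Select.Add("title");            // limit the fields returned per document

// The options would then be passed along with the search text, for example:
// searchClient.Search<SearchDocument>("two-wheeler policy", options);
```

A field that lacks the corresponding attribute in the index schema cannot be used in that role, which is why the attributes need to be decided when the index is defined.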

Schema Definition:

We need to define the index schema when creating the index by specifying field names, field types (text, number, date, vector), and field behaviors.
The index schema supports complex types and vector fields for advanced search like semantic or hybrid search.

A vector field is a special type of field in the index that is used to store numerical embeddings. These embeddings are high-dimensional representations of content like text, images, or audio. They allow the search engine to perform similarity-based retrieval, which is especially useful for semantic search, recommendation systems, and AI-powered applications.
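
Similarity-based retrieval comes down to comparing vectors. As a minimal illustration (the search service does this internally, so this code is not needed in the application), the following sketch computes the cosine similarity between two embeddings. The VectorMath class is a hypothetical name used only for this example.

```csharp
using System;

// Hypothetical helper for illustration; Azure AI Search performs this comparison itself.
public static class VectorMath
{
    // Returns the cosine of the angle between two vectors of equal length.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a is null) throw new ArgumentNullException(nameof(a));
        if (b is null) throw new ArgumentNullException(nameof(b));
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        // Treat a zero vector as having no similarity to anything.
        if (normA == 0 || normB == 0) return 0;

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Two vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1. A vector query returns the documents whose stored embeddings score highest against the query embedding.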

Indexing Process:

Data is ingested into the index either via push (uploading documents through the API) or via pull (using indexers).
Indexers can apply AI enrichment, e.g. OCR, entity recognition, or sentiment analysis, to transform raw content into searchable information.

Querying & Retrieval:

Once the index is populated, it supports queries like full-text search, vector similarity, filters, facets, autocomplete, and semantic ranking.
Queries are served with low latency, which makes the index suitable for real-time applications.

What is an Indexer?

An indexer is a built-in crawler in Azure AI Search that automates data ingestion from external sources like Azure Blob Storage, SQL Database, or Cosmos DB.
It reads raw content, transforms it if needed, and populates a search index to make the data searchable without manual uploads or custom code.
Indexers operate on a pull model. This means that Azure AI Search pulls data from the source at scheduled intervals or on demand.
This eliminates the need for developers to write custom logic to push data into the index.
Indexing Workflow:

Document Cracking: Extracts text, metadata, and images from files or records.
Field Mapping: Maps source fields to index fields, e.g. title is mapped to pageTitle.
Skillset Execution: This is an optional step that applies AI enrichment like OCR, entity recognition, or sentiment analysis.
Output Field Mapping: Maps enriched content back into the index.
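
The field mapping and the scheduled pull described above can be sketched with the Azure.Search.Documents package as follows. The indexer, data source, and index names are hypothetical placeholders, and the one-hour interval is an arbitrary choice for illustration.

```csharp
using System;
using Azure.Search.Documents.Indexes.Models;

// Hypothetical names; replace them with the names used in your own search service.
var indexer = new SearchIndexer(
    name: "sample-indexer",
    dataSourceName: "sample-datasource",
    targetIndexName: "sample-index")
{
    // Pull model: the service runs this indexer once per hour.
    Schedule = new IndexingSchedule(TimeSpan.FromHours(1))
};

// Field mapping: the source field "title" is written to the index field "pageTitle".
indexer.FieldMappings.Add(new FieldMapping("title") { TargetFieldName = "pageTitle" });
```

This only builds the definition in memory. It takes effect once it is sent to the service through a SearchIndexerClient.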

Step 1: Open Visual Studio and create an ASP.NET Core API project targeting .NET 9. Name this project Core_RAG-Service. In this project, add the following packages:

Azure.AI.OpenAI

Access to Advanced Models: Azure.AI.OpenAI provides enterprise-grade access to OpenAI models like GPT-4, GPT-4o, and DALL·E, integrated directly into the Azure ecosystem for tasks like text generation, image creation, and speech processing.
Enterprise-Ready Security & Compliance: Offers built-in security features such as private networking, regional availability, and responsible AI content filtering, which make it suitable for regulated industries and sensitive data use cases.
Flexible Integration & Deployment: Developers can deploy models via REST APIs or SDKs (Python, C#, JavaScript, etc.), fine-tune them for specific tasks, and embed them into apps using Azure tools like AI Foundry, Azure AI Search, and Cosmos DB.
Scalable AI Innovation: Azure OpenAI enables rapid prototyping and scaling of AI solutions ranging from intelligent chatbots and content generation to workflow automation and data insights, while maintaining control over performance and cost.

Azure.Identity

Unified Authentication for Azure SDKs: Azure.Identity provides a set of TokenCredential classes that enable secure authentication across Azure services using Microsoft Entra ID.
Flexible Credential Options: Supports multiple authentication methods like DefaultAzureCredential, ManagedIdentityCredential, and InteractiveBrowserCredential, which makes it easy to authenticate in local development, cloud-hosted apps, or CI/CD pipelines.
Simplifies Development & Deployment: Developers can use the same credential logic across environments.
Built for .NET and Beyond: While it is a .NET library, Azure.Identity integrates with the other Azure SDK client libraries and tools, offering consistent authentication patterns across languages and platforms.

Azure.Search.Documents

Powerful Search Capabilities: Enables full-text, vector, and hybrid search across structured and unstructured data, which is ideal for building intelligent search experiences in apps and websites.
Index Management & Data Ingestion: Can be used to create, update, and manage search indexes, upload documents, and enrich content during ingestion using skillsets like OCR, entity recognition, and sentiment analysis.
Flexible Querying & Filtering: Supports keyword search, semantic ranking, faceted navigation, geospatial filtering, and autocomplete, plus advanced filters for metadata and document-level access control.
SDKs for Multiple Languages: Available for .NET, Python, Java, and JavaScript, with integration via REST APIs or the Azure SDKs.

Azure.Storage.Blobs

Client Libraries for Storage Services: The Azure.Storage family provides client libraries to interact with Blob, Queue, and File storage. Azure.Storage.Blobs is the Blob Storage client, used to manage unstructured data such as documents in the cloud.

Microsoft.SemanticKernel

AI Orchestration Framework: Semantic Kernel lets developers integrate and orchestrate AI models from providers like OpenAI, Azure OpenAI, and Hugging Face with traditional code, bridging the gap between LLMs and conventional programming.
Plugins & Functions Architecture: Supports both semantic functions (prompt-based AI tasks) and native functions (standard code methods), which can be combined into reusable plugins for modular and scalable AI workflows.
Memory & Planning Capabilities: Includes memory stores and planners to manage context, recall information, and dynamically generate execution plans, which is useful for multi-step reasoning and task automation.
Cross-Platform & Enterprise-Ready: Available for .NET, Python, and Java, Semantic Kernel is designed for enterprise-grade reliability, with support for multimodal inputs, vector databases, and secure deployment across cloud or local environments.

Microsoft.SemanticKernel.Connectors.AzureOpenAI

Dedicated Azure Integration Layer: This connector provides a streamlined interface to Azure-hosted OpenAI models, separating Azure-specific logic from the general OpenAI connector for better clarity and modularity.
Support for Multimodal AI Services: It enables access to Azure OpenAI capabilities like chat completions, text embeddings, audio-to-text, text-to-image, and text-to-audio.
Simplified Configuration & Deployment: Developers can configure deployments by specifying model (deployment) names, endpoints, and API keys, for example through environment variables or configuration files. The SDK supports flexible setup for different modalities and deployment scenarios.

Make sure that you copy the Blob Storage connection string, the Azure AI Search service endpoint and key, and the Azure OpenAI endpoint and key so that we can use them in the code.
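
This article uses API keys throughout. As an alternative, the Azure.Identity package listed above allows keyless authentication with Microsoft Entra ID. The following is a minimal sketch, assuming that role-based access is enabled on the search service and that the signed-in identity has a suitable role assigned; the endpoint is a hypothetical placeholder.

```csharp
using System;
using Azure.Identity;
using Azure.Search.Documents.Indexes;

// Hypothetical endpoint; use the endpoint of your own Azure AI Search service.
var endpoint = new Uri("https://my-search-service.search.windows.net");

// DefaultAzureCredential tries several sources in turn, such as environment
// variables, a managed identity, or the developer's Visual Studio / Azure CLI login.
var indexClient = new SearchIndexClient(endpoint, new DefaultAzureCredential());
```

No token is requested until the client makes its first call to the service, so constructing the client succeeds even when no credential source is available. The rest of this article continues with AzureKeyCredential.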

Step 2: In the API project, modify the appsettings.json file and add keys for the Azure AI Search service endpoint and key, the Azure OpenAI key and endpoint, and the Blob connection string, along with the Azure AI Search data source name, index name, indexer name, and the model deployment names, as shown in Listing 1.

{
    "Logging": {
        "LogLevel": {
            "Default": "Information",
            "Microsoft.AspNetCore": "Warning"
        }
    },
    "AllowedHosts": "*",
    "AzureSettings": {
        "AzureAISearchEndPoint": "[AI-SEARCH-SERVICE-ENDPOINT]",
        "AzureAISearchKey": "[YOUR-AIS-SEARCH-KEY]",
        "AzureAISearchIndexName": "rag-index-for-info",
        "AzureAISearchIndexerName": "rag-indexer-for-info",
        "AzureAISearchDataSourceName": "rag-datasource-for-info",
        "AzureBLOBConnectionString": "[AZURE-STORAGE-CONNECTION-STRING]",
        "AzureBLOBStorageContainerName": "pdfdata",
        "AzureOpenAIServiceEndpoint": "[AZURE-OPEN-AI-ENDPOINT]",
        "AzureOpenAIServiceChatDeploymentName": "gpt-4.1",
        "AzureOpenAIServiceApiKey": "[YOUR-OPENAI-API-KEY]",
        "AzureOpenAIServiceEmbeddingDeploymentName": "textembedding3large"

    }
}

Listing 1: Various Keys in appsettings.json
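
The code later in this article reads these values one by one with string keys. An alternative is to bind the whole AzureSettings section to a class. The sketch below shows the idea with an in-memory configuration so that it runs on its own; the AzureSettings class is a hypothetical type that is not part of the article's project, and only two of the keys are shown.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// In-memory stand-in for appsettings.json so that the sketch is self-contained.
var configuration = new ConfigurationBuilder()
    .AddInMemoryCollection(new Dictionary<string, string?>
    {
        ["AzureSettings:AzureAISearchIndexName"] = "rag-index-for-info",
        ["AzureSettings:AzureBLOBStorageContainerName"] = "pdfdata"
    })
    .Build();

// Bind the section to the class; property names must match the JSON keys.
var settings = configuration.GetSection("AzureSettings").Get<AzureSettings>()
               ?? new AzureSettings();

Console.WriteLine(settings.AzureAISearchIndexName);        // prints "rag-index-for-info"
Console.WriteLine(settings.AzureBLOBStorageContainerName); // prints "pdfdata"

// Hypothetical settings class mirroring part of the AzureSettings section.
public sealed class AzureSettings
{
    public string AzureAISearchIndexName { get; set; } = string.Empty;
    public string AzureBLOBStorageContainerName { get; set; } = string.Empty;
}
```

In the API project the same binding would use builder.Configuration instead of the in-memory collection. Keys and connection strings are best kept out of source control, for example in user secrets during development.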

Step 3: In the project, add a new folder named Models. In this folder, add a new class file named IndexedAISearchDocumentModel.cs. In this file, we will add the IndexedAISearchDocumentModel class with the properties that we will use to create the Azure AI Search index. Listing 2 shows the code for the IndexedAISearchDocumentModel class.

using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

namespace Core_RAG_Service.Models
{
   public class IndexedAISearchDocumentModel
    {

        [SimpleField(IsKey = true)]
        public string? chunk_id { get; set; }

        [SearchableField]
        public string? content { get; set; }


        [SimpleField]
        public string? metadata_storage_name { get; set; }

        [SimpleField]
        public string? metadata_storage_path { get; set; }

        [FieldBuilderIgnore] // Optional: if you're not indexing this directly
        public float[]? text_vector { get; set; }
    }
}

Listing 2: IndexedAISearchDocumentModel class

As shown in Listing 2, the class contains various properties. Since these properties will be mapped to the fields of the Azure AI Search index, we need to apply attributes to them. The attributes are explained as follows:

SimpleField:

It is used in Azure AI Search to define a field in a search index that:

Represents a primitive type or a collection of primitive types
Can be configured to be filterable, sortable, facetable, or hidden
Structures how data is indexed and queried in search applications

SearchableField:

It is a type of field in a search index that allows full-text search capabilities.
It is designed to let users search for terms within the field’s content.
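
One use of these attributes is to let the SDK generate the field list from the class instead of declaring every field by hand. The following sketch assumes the FieldBuilder class from Azure.Search.Documents.Indexes; SampleDocument is a hypothetical, reduced version of the model in Listing 2.

```csharp
using System;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

// Build the index fields from the attributes on the class.
var fields = new FieldBuilder().Build(typeof(SampleDocument));

foreach (SearchField field in fields)
{
    Console.WriteLine($"{field.Name}: {field.Type}, key = {field.IsKey}");
}

// Hypothetical, reduced version of IndexedAISearchDocumentModel.
public class SampleDocument
{
    [SimpleField(IsKey = true, IsFilterable = true)]
    public string? chunk_id { get; set; }

    [SearchableField]
    public string? content { get; set; }
}
```

Listing 3 later in this article declares the fields explicitly instead, which keeps the vector field definition in one visible place.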

Step 3 (continued): In the project, add a new class file named AzureAISearchServiceManager.cs. In this class file, we will write the logic to create the Azure AI Search data source, index, and indexer. We need to use the following classes to implement the logic:

Kernel (Semantic Kernel):

The Kernel is the central orchestrator in the Semantic Kernel framework.
This manages AI model integration, plugins, and memory.
This coordinates workflows and function execution.
This enables tool calling, prompt templating, and agent orchestration.

AzureKeyCredential:

This class is used for authentication when connecting to Azure services.
This holds an API key securely.
This allows key rotation without recreating the client.
This is used across many Azure SDKs e.g., Search, OpenAI, Document Intelligence.

AzureOpenAIClient:

This is the main client for interacting with the Azure OpenAI service.
This connects to Azure-hosted OpenAI models like GPT-4, GPT-3.5, etc.
This supports chat completions, embeddings, image generation, and various other AI features.
This can authenticate via API key or Microsoft Entra ID.

SearchIndexClient:

This is used to manage indexes in Azure AI Search.
This is used to create, update, delete, or list search indexes.
This is used to analyze text using tokenizers and analyzers.
This is used to manage synonym maps, aliases, and knowledge agents.

VectorSearch:

This class is available in the Azure AI Search .NET SDK.
This is the central configuration object that defines how vector search behaves across the index.
It ties together all the components like algorithms, vectorizers, compression methods, and profiles into a unified setup.
This class has the following important properties:

Algorithms: Specifies how vectors are indexed and searched, e.g. HNSW.
Profiles: Named combinations of an algorithm, a vectorizer, and a compression method that are used by vector fields.
Vectorizers: Defines how raw content is converted into embeddings.

VectorSearchField:

This is a class in the Azure SDK for .NET that is used to define a searchable vector field in a search index.
This stores numerical vector representations of content, e.g. embeddings from OpenAI models.
This enables similarity search using metrics like cosine similarity or dot product.
This is typically used in Retrieval-Augmented Generation (RAG) and semantic search scenarios.
This class has the following properties:

Name: Field name in the index.
Type: This is Collection(Edm.Single), a collection of single-precision floating-point values.
VectorSearchDimensions: Number of dimensions in the vector, e.g. 3072 for the text-embedding-3-large model in Azure OpenAI.
VectorSearchProfileName: Name of the profile that defines the search algorithm.
VectorEncodingFormat: Format used to encode the vector data.
IsStored: Whether the field is persisted separately for retrieval.
IsHidden: Whether the field is excluded from search results.

HnswAlgorithmConfiguration:

This is a class in the Azure SDK for .NET that defines how the Hierarchical Navigable Small World (HNSW) algorithm behaves when used for vector similarity search in Azure AI Search.
HNSW is a graph-based algorithm for approximate nearest neighbor (ANN) search.
HNSW is known for:

High recall, that is, accuracy of results
Fast search speeds
Scalability for large datasets

This class has the following properties (in the .NET SDK, the first four are set through its Parameters property):

m: Maximum number of connections per node. Higher = better recall, slower indexing.
efConstruction: Controls indexing quality. Higher = better graph, slower build time.
efSearch: Controls search accuracy. Higher = better recall, slower query time.
metric: Similarity metric, e.g. cosine.
name: Name of the algorithm configuration.
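
In the .NET SDK these settings are exposed through the HnswParameters class. The following sketch sets them explicitly; the values shown are, to my knowledge, the service defaults and are used here only for illustration.

```csharp
using Azure.Search.Documents.Indexes.Models;

// "hnsw-config" is the same configuration name that Listing 3 uses.
var hnsw = new HnswAlgorithmConfiguration("hnsw-config")
{
    Parameters = new HnswParameters
    {
        M = 4,                                      // connections per node
        EfConstruction = 400,                       // candidate list size while building the graph
        EfSearch = 500,                             // candidate list size while querying
        Metric = VectorSearchAlgorithmMetric.Cosine // similarity metric
    }
};
```

When Parameters is left unset, as in Listing 3, the service applies its own defaults, so setting them is only needed when tuning recall against indexing and query cost.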

VectorSearchProfile:

This class in Azure AI Search defines a reusable configuration for how vector search should behave in the index.
It acts as a bridge between vector fields, the search algorithm, and the vectorizer that generates embeddings.
This class has the following properties:

Name: Unique name for the profile.
AlgorithmConfigurationName: Refers to the algorithm setup, e.g. HNSW.
VectorizerName: Specifies the embedding model or service used.
CompressionName: Compression method for storing vectors. This is an optional property.

In the AzureAISearchServiceManager.cs file, add the code for AzureAISearchServiceManager class as shown in Listing 3.

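using Azure;
using Azure.AI.OpenAI;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;

 public class AzureAISearchServiceManager
    {
        // Semantic Kernel instance used for chat completion
        Kernel kernel;
        // API key credential and clients for Azure AI Search and Azure OpenAI
        AzureKeyCredential _credential;
        AzureOpenAIClient _openAIClient;
        SearchIndexClient _indexClient;
        string searchKey, searchEndpoint;
        IConfiguration _configuration;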
        string indexName=string.Empty;
        string indexerName = string.Empty;
        string azureAIDataSourceName = string.Empty;
        string azureBolbContainerName = string.Empty;
        string azureBlobStorageConnectionString = string.Empty;
        string azureOpenAIKey = string.Empty;
        string azureOpenAIEndpoint = string.Empty;
        string azureOpenAIChatDeploymentName = string.Empty;
        string azureOpenAIEmbeddingDeploymentName = string.Empty;

        string extractedText = string.Empty;

        public AzureAISearchServiceManager(IConfiguration configuration)
        {
            _configuration = configuration ?? throw new ArgumentNullException(nameof(configuration));
            searchKey = _configuration["AzureSettings:AzureAISearchKey"] ?? throw new ArgumentNullException("AzureSettings:AzureAISearchKey");
            searchEndpoint = _configuration["AzureSettings:AzureAISearchEndPoint"] ?? throw new ArgumentNullException("AzureSettings:AzureAISearchEndPoint");
            _credential = new AzureKeyCredential(searchKey);

            _indexClient = new SearchIndexClient(new Uri(searchEndpoint), _credential);
            azureOpenAIEndpoint = _configuration["AzureSettings:AzureOpenAIServiceEndpoint"] ?? throw new ArgumentNullException("AzureSettings:AzureOpenAIServiceEndpoint");
            azureOpenAIKey = _configuration["AzureSettings:AzureOpenAIServiceApiKey"] ?? throw new ArgumentNullException("AzureSettings:AzureOpenAIServiceApiKey");
            azureOpenAIChatDeploymentName = _configuration["AzureSettings:AzureOpenAIServiceChatDeploymentName"] ?? throw new ArgumentNullException("AzureSettings:AzureOpenAIServiceChatDeploymentName");
            azureOpenAIEmbeddingDeploymentName = _configuration["AzureSettings:AzureOpenAIServiceEmbeddingDeploymentName"] ?? throw new ArgumentNullException("AzureSettings:AzureOpenAIServiceEmbeddingDeploymentName");
            azureAIDataSourceName = _configuration["AzureSettings:AzureAISearchDataSourceName"] ?? throw new ArgumentNullException("AzureSettings:AzureAISearchDataSourceName");   
            azureBlobStorageConnectionString = _configuration["AzureSettings:AzureBLOBConnectionString"] ?? throw new ArgumentNullException("AzureSettings:AzureBLOBConnectionString");
            azureBolbContainerName = _configuration["AzureSettings:AzureBLOBStorageContainerName"] ?? throw new ArgumentNullException("AzureSettings:AzureBLOBStorageContainerName");
            indexName = _configuration["AzureSettings:AzureAISearchIndexName"] ?? throw new ArgumentNullException("AzureSettings:AzureAISearchIndexName");
            indexerName = _configuration["AzureSettings:AzureAISearchIndexerName"] ?? throw new ArgumentNullException("AzureSettings:AzureAISearchIndexerName");


            var builder = Kernel.CreateBuilder();
            builder.AddAzureOpenAIChatCompletion(
                                 azureOpenAIChatDeploymentName,
                                  azureOpenAIEndpoint,
                                  azureOpenAIKey
                                  );
            kernel = builder.Build();

              CreateIndex() ;
           
        }

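        // Creates the index, the data source, and the indexer.
        // All SDK calls below are synchronous, so the method is not marked async.
        private void CreateIndex()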
        {


            var fields = new List<SearchField>
        {
            new SimpleField("chunk_id", SearchFieldDataType.String)
            {
                IsKey = true,
                IsFilterable = true
            },
            new SearchableField("content")
            {
                IsSortable = false,
                IsFilterable = false,
                IsFacetable = false
            },
            new SimpleField("metadata_storage_name", SearchFieldDataType.String)
            {
                IsFilterable = true,
                IsSortable = true
            },
            new SimpleField("metadata_storage_path", SearchFieldDataType.String)
            {
                IsFilterable = true,
                IsSortable = false
            },
            new VectorSearchField("text_vector", 3072, vectorSearchProfileName: "vector-config")
        };



            var definition = new SearchIndex(indexName) 
            {
                Fields = fields,
                VectorSearch = new VectorSearch
                {
                    Algorithms =
                        {
                            new HnswAlgorithmConfiguration("hnsw-config")
                        },
                    Profiles =
                        {
                            new VectorSearchProfile("vector-config", "hnsw-config")
                        }
                }
            };
            _indexClient.CreateOrUpdateIndex(definition);



            var dataSource = new SearchIndexerDataSourceConnection(
                name: azureAIDataSourceName,
                type: SearchIndexerDataSourceType.AzureBlob,
                connectionString: azureBlobStorageConnectionString,
                container: new SearchIndexerDataContainer(azureBolbContainerName)
            );


            var indexerClient = new SearchIndexerClient(new Uri(searchEndpoint), _credential);
            indexerClient.CreateOrUpdateDataSourceConnection(dataSource);


            
            var indexer = new SearchIndexer(
                name: indexerName,
                dataSourceName: dataSource.Name,
                targetIndexName: indexName 
            );

            indexer.Parameters = new IndexingParameters
            {
                BatchSize = 1,
                MaxFailedItems = -1,
                MaxFailedItemsPerBatch = -1,
                
            };

            indexerClient.CreateOrUpdateIndexer(indexer);
           
        }


        // Runs the indexer on demand, e.g. after new documents are uploaded
        public void RunIndexer()
        {
            var indexerClient = new SearchIndexerClient(new Uri(searchEndpoint), _credential);
            indexerClient.RunIndexer(indexerName);
        }
}

Listing 3: AzureAISearchServiceManager class

As shown in Listing 3, the constructor of the AzureAISearchServiceManager class reads the keys from appsettings.json. These are used to create an instance of SearchIndexClient. Furthermore, the constructor creates an IKernelBuilder instance using the CreateBuilder() method of the Kernel class and calls the AddAzureOpenAIChatCompletion() method on it. This method is a part of the Semantic Kernel framework and is used to integrate Azure OpenAI capabilities into the application. It registers a chat completion service with the kernel so you can interact with models like GPT-4 hosted on Azure. In this code, the method is passed the following 3 parameters:

The deployment name that we created in the Azure AI Foundry portal
The Azure OpenAI Service endpoint to connect to the service
The Azure OpenAI key to authenticate the application against the service

The AzureAISearchServiceManager class has the CreateIndex() method. This method has the following specifications:

The List of SearchField class:

This list defines an index schema for Azure AI Search Service using SimpleField, SearchableField, and VectorSearchField.
We have created index schema as follows:

chunk_id: This is a SimpleField of type string. It is used as the key field in the index schema.
content: This is a SearchableField. It stores the content of the documents that is searched when building the RAG application.
metadata_storage_name: This is a SimpleField. It represents the name of the document that is stored in the index. It is used to retrieve the name of the reference document when the RAG result is returned for the prompt (query).
metadata_storage_path: This is a SimpleField. It is used to store the path of the document stored in the index.
text_vector: This is a VectorSearchField. This is the most important field of the index schema. It stores the vector embeddings of the documents that are used as a source for the query. This field uses the VectorSearchProfile as explained in the initial part of Step 3.

The SearchIndex class is used to define the index:

The constructor accepts the index name, which is defined in the appsettings.json file.
The Fields property is set to the index schema list.
The VectorSearch property is initialized with an algorithm and a profile using the HnswAlgorithmConfiguration and VectorSearchProfile classes respectively.

The CreateOrUpdateIndex() method of the SearchIndexClient class is used to create an Index in Azure AI Search.
The SearchIndexerDataSourceConnection class instance is created:

This class in Azure AI Search defines the connection between the search indexer and the external data source it pulls from, e.g. Azure Blob Storage, SQL Server, etc.
It is a key part of configuring an indexer, which automates data ingestion into the search index.
The following properties are used to create an instance of SearchIndexerDataSourceConnection class:

name: The name of the data source. Currently, we are reading this name from the appsettings.json.
type: The actual data source that will be connected with Azure AI Search. Currently, we are using Azure Blob.
connectionString: The Azure Blob Storage connection string.
container: The Azure Blob container name that contains the documents.

The instance of the SearchIndexerClient class is created using the Azure AI Search endpoint and key credential. The CreateOrUpdateDataSourceConnection() method of the SearchIndexerClient class is used to create a data source in Azure AI Search.
Furthermore, an instance of the SearchIndexer class is created using the indexer name that we have defined in the appsettings.json file, the data source name, and the index name.
The CreateOrUpdateIndexer() method of the SearchIndexerClient class is used to create the indexer in Azure AI Search.

The CreateIndex() method is called from the constructor of the AzureAISearchServiceManager class. The class also contains the RunIndexer() method, which is used to run the indexer created in Azure AI Search.
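
Since Listing 3 sets MaxFailedItems to -1, the indexer keeps going when individual documents fail, so it can be useful to check the outcome of a run. The following is a minimal sketch using the indexer status API of the SDK; the IndexerStatusReporter class is a hypothetical helper that is not part of the article's project.

```csharp
using System;
using System.Threading.Tasks;
using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

// Hypothetical helper for inspecting the most recent indexer run.
public static class IndexerStatusReporter
{
    public static async Task PrintLastRunAsync(SearchIndexerClient client, string indexerName)
    {
        Response<SearchIndexerStatus> response = await client.GetIndexerStatusAsync(indexerName);
        IndexerExecutionResult? last = response.Value.LastResult;

        if (last is null)
        {
            Console.WriteLine("The indexer has not run yet.");
            return;
        }

        Console.WriteLine(
            $"Status: {last.Status}, processed: {last.ItemCount}, failed: {last.FailedItemCount}");

        // Item-level errors, e.g. a PDF that could not be cracked.
        foreach (SearchIndexerError error in last.Errors)
        {
            Console.WriteLine($"  {error.Key}: {error.ErrorMessage}");
        }
    }
}
```

It could be called with the same endpoint and credential used in Listing 3, for example after RunIndexer() has been invoked and the run has had time to complete.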

Step 4: Register the AzureAISearchServiceManager class in the dependency injection container of the ASP.NET Core application as shown in Listing 4.

.........
builder.Services.AddSingleton<AzureAISearchServiceManager>();
.........

Listing 4: The registration of the AzureAISearchServiceManager class in Dependency Container
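
A detail worth knowing about this registration is that a singleton registered by type is constructed lazily, on first resolution, and not when it is registered. The following self-contained sketch demonstrates this with Microsoft.Extensions.DependencyInjection; StartupWork is a hypothetical stand-in for AzureAISearchServiceManager.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddSingleton<StartupWork>();
using var provider = services.BuildServiceProvider();

// Registration alone does not run the constructor.
Console.WriteLine($"Constructed after registration: {StartupWork.Constructed}"); // False

// The first resolution creates the instance.
provider.GetRequiredService<StartupWork>();
Console.WriteLine($"Constructed after resolution: {StartupWork.Constructed}");   // True

// Hypothetical stand-in for AzureAISearchServiceManager.
public sealed class StartupWork
{
    public static bool Constructed { get; private set; }

    public StartupWork() => Constructed = true;
}
```

In the API project, one way to force creation at startup would be to call app.Services.GetRequiredService<AzureAISearchServiceManager>() after builder.Build(). Otherwise the constructor runs the first time a controller or endpoint requests the class.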

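Run the application and invoke the default endpoint that is provided in the ASP.NET Core API project, e.g. weatherforecast. Note that a singleton registered this way is constructed only when it is first resolved from the container, so the endpoint (or the startup code) must request an AzureAISearchServiceManager instance. When the constructor of AzureAISearchServiceManager is invoked, it creates the data source, index, and indexer in Azure AI Search as shown in the following figures: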

Figure 7: The Data Source

Figure 8: The Index

Figure 9: The Indexer

So far, we have created the Azure OpenAI Service, the model deployments, and the Azure AI Search service, and using code we have created the data source, index, and indexer. In the next step, we will implement RAG using code.
