Azure Open AI: Generating SQL Statement from the Prompt and executing it

In this article, we will implement the process of generating SQL Statements from the simple English language prompt.  AI Prompt is an input send to the AI model to guide its response. In a simple language we can define Prompt as a beginning of the conversation so that the appropriate response can be received. The accuracy of the response is depending on the appropriate prompt input to the model. The AI model is capable enough to understand and compile the prompt and generate response. In different context the prompt can be set in different format, e.g. Technical Prompt for expecting technical answers, Creative Prompt for expecting a story or a poem, and the Simple Prompt for simple translation. OfCourse for the accurate response the prompt must be accurate. 

In the Line-of-Business (LOB) applications we have the data mostly stored in the Relational Databases. To extract information from various table from database, we need to write complex queries with joins across tables (if needed). In such cases, the end-user (if non-technical) may expect you to design a simple prompt-based UI so that the end-user can enter a plain simple English prompt and expect the result e.g. Please provide me details of orders placed by Contact Name Maria Anders from Customer Tables and Orders Table       

Now how to implement such application? Well, this is where we can use the AI capabilities. We can implement this type of requirement using AI Chat GPT models. Figure 1 will demonstrate the implementation:


Figure 1: The Prompt-to-SQL Statement Generator

To implement the code for this article, you must satisfy the following prerequisites:

  • Azure Subscription
    • Knowledge of Azure Resource Group
    • Required permissions to create AI resources
  • Visual Studio 2022
  • .NET 9
As shown in Figure 2, you need to create an Open AI Service using Azure Portal.   




Figure 2: The Azure Open AI Service

As shown in Figure 2, copy the Keys and Endpoint. We will be needing it in forthcoming steps in our code. Figure 3 shows Keys and Endpoint.



Figure 3: Keys and Endpoint

We need to deploy the GTP model, as shown in Figure 1, click on the Explore Azure AI Foundry Portal. This is the portal where we can deploy GPT and other models. Once the portal is opened, click on Deployments and create a new deployment as shown in Figure 4.



Figure 4: Deployment of the Model

As shown in Figure 4, click on the Deploy base model. You will be shown with the List of the Models from where you can choose the suitable model, I have used the gpt-4.1 as shown in Figure 5



Figure 5: The GPT4.1 Model.

Once the Confirm button is clicked, we can customize the deployment e.g. Deployment Name, Deployment Type, Tokens, and other details as shown in Figure 6. Make sure that you copy the Deployment name that we will be using in the code.  



Figure 6: The Deployment Customization

We have out deployment ready, now we can use this deployment in the ASP.NET Core 9 API application. 

Step 1: Open Visual Studio 2022, create a new ASP.NET Core 9 API project and name it as Core_APISQLGenerator. In this project add the following NuGet Packages:

  • Azure.AI.OpenAI, I am using version 2.1.0
    • Make sure that you use this version, because the application code is for 2.1.0 version. One more important thing is that older version has method changes so that will be problem while running the application.
  • Microsoft.Data.SqlClient  
Step 2: In the project, add folder named Models. In this folder we will add class files named Query.cs and Response.cs. In the Query.cs add the code as shown in Listing 1.


namespace Core_APISQLGenerator.Model
{
    public class Query
    {
        public string? Prompt { get; set; }
    }
}

Listing 1: Query.cs 

We will use this class to send Prompt input to API. In the Response.cs we will add code for the Response class to receive the response from the API. Listing 2, shows code for the Response class


namespace Core_APISQLGenerator.Model
{
    public class AppResponse
    {
        public string? Prompt { get; set; }
        public string? SqlQuery { get; set; }
        public string? SqlResponse { get; set; }
    }
}
Listing 2: The Response class
This class contains properties that will return the output prompt generated from input prompt along with the SqlQuery and SqlResponse generated based on the input prompt.

Step 3: For this article, I am using the Northwind database, you can download the SQL Script for this database from the following link.   

Step 4: Modify the appsettings.json file where we will add configuration for the Connection string to database, Open API Key and Deployment name as shown in Listing 3.


{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
    "AllowedHosts": "*",
    "ConnectionStrings": {
        "DbConnection": "Server=SERVER;Database=nw;Integrated Security=SSPI;TrustServerCertificate=True;"
    },
    "AzureAISettings": {
        "DeploymentName": "mygpt21devpy",
        "APIVersion": "2024-12-01-preview",
        "Endpoint": "https://[OPEN-API-NAME].openai.azure.com/",
        "APIKey": "[YOUR_NEW_API_KEY]"
    }
}
Listing 3: The appsettings.json
 
Step 5: In the project, add a new folder named Services. In this folder add a new class file named LLMSQLGenerator.cs. This file contains code for the  LLMSQLGenerator class. This class contains code for various method those uses Azure Open AI and AI Chat to generate SQL from the prompt. This class will have following methods:
  • GenerateSQLQueryAsync(): 
    • This method accepts a prompt from the end-user. This prompt is then used to create a User Message using the CreateUserMessage() method of the ChatMessage class. The ChatMessage class has another important method named CreateSystemMessage(). This method is used to define system message so that the User Message and System Message can be used to generate the SQL System. 
    • This method uses the AzureOpenAIClient class. This class is used to connect with the Azure Open AI Service created earlier using it Endpoint and it authenticate the current application to the Azure Open AI Service using the API Key of the Azure Open AI Service that we have copied earlier from the Azure Portal. The Authentication takes place using the ApiKeyCredential class.  
    • The GetChatClient() method of the AzureOpenAIClient class accepts the deployment name as an input parameter. This deployment name is the same that we have used by deploying the GPT-4.1 model using an Azure AI Foundry portal. This method returns an instance of the ChatClient class. 
    • The ChatClient class has the CompleteChatAsync() method. This method accepts the List of ChatMessage class that has the System and User messages as well as the ChatCompletionOptions class as input parameters. The ChatCompletionOptions class is used to define chat options like temperature. The CompleteChatAsync() method returns the ChatResult object. This object contains the result in the ChatCompleion object.  
    •  If the input prompt is as follows "Please provide me sum of Freight by Each Ship Country from the OrderDetailsDynamic Table, I want to see the second maximum sum" , then the resultant prompt will be as follows: 
"Certainly! To get the second maximum sum of Freight by each ShipCountry from the OrderDetailsDynamic table, you can use the following SQL query:\n\n```sql\nSELECT ShipCountry, FreightSum\nFROM (\n    SELECT ShipCountry, SUM(Freight) AS FreightSum,\n           DENSE_RANK() OVER (ORDER BY SUM(Freight) DESC) AS rk\n    FROM OrderDetailsDynamic\n    GROUP BY ShipCountry\n) t\nWHERE rk = 2;\n```\n\nThis query will return the ShipCountry (or countries, in case of a tie) with the second highest total Freight.","
    •  If you see the output prompt, we have SQL statement, but it is in text form; and we need to extract the SQL statement. To do this the GenerateSQLQueryAsync() method contains string processing logic to extract the SQL Statement. If there are multiple Select statements are generated, then to pick the first select statement the GenerateSQLQueryAsync() method uses the helper method named PickSelectStamenents(). This method performs string processing and using the RegEx classit picks the first match of the Select statement so that it can be executed. NOTE: YOU CAN FURTHER FINETUNE THIS LOGIC WHERE, YOU CAN PICK ALL SELECT STATEMENTS THOSE ARE GENERATED.      
  • RunSqlQuery():
    • This method accepts the extracted SQL Statement from the GenerateSQLQueryAsync() method and execute it by connecting to the SQL Server database. This method returns a List of ExpandoObject. This is the dynamically generated object because generated SQL Statements may return any data from any tables from the database.
    • This method uses a helper method named ConvertToDynamicObjects(). This method accepts the string output generated from the select statements and convert it into the ExpandoObject so that the result is generated. The ConvertToDynamicObjects() method uses the string processing logic and result will be generated in the KeyValue pair like COLUMN-NAME:VALUE-OF-THE-COLUMN. 
The code for the LLMSQLGenerator class is shown in Listing 4.


using System.Dynamic;
using System.Text.Json;
using System.Text.RegularExpressions;
using Azure.AI.OpenAI;
using Core_APISQLGenerator.Model;
using Microsoft.Data.SqlClient;
using OpenAI.Chat;
namespace Core_APISQLGenerator.Services
{
    public class LLMSQLGenerator
    {
        AzureOpenAIClient _client;
        string? connectionString = string.Empty;
        string? deploymentName = string.Empty;

        public LLMSQLGenerator(IConfiguration configuration)
        {
            string? apiKey = configuration["AzureAISettings:APIKey"];
             

            string? endpoint = configuration["AzureAISettings:Endpoint"];
             

            _client = new AzureOpenAIClient(new Uri(endpoint), new System.ClientModel.ApiKeyCredential(apiKey));
             

            var connStr = configuration["ConnectionStrings:DbConnection"];
            

            connectionString = connStr;

            deploymentName = configuration["AzureAISettings:DeploymentName"];
        }

        public List<ExpandoObject> RunSqlQuery(string query)
        {
            string result = string.Empty;
            var records = new List<ExpandoObject>();
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();
                using (var command = new SqlCommand(query, connection))
                using (var reader = command.ExecuteReader())
                {
                    if (reader.HasRows)
                    {
                        while (reader.Read())
                        {
                            for (int i = 0; i < reader.FieldCount; i++)
                            {
                                result += $"{reader.GetName(i)}: {reader[i]}\n";
                            }
                            result += "\n";
                        }
                        // Convert the result to JSON format
                        records = ConvertToDynamicObjects(result);

                    }
                    else
                    {
                        result = "No data found.";
                    }
                }
            }
            return records;
        }

        public async Task<AppResponse> GenerateSQLQueryAsync(string prompt)
        {
            string result = string.Empty;
            var response = new AppResponse();
            if (string.IsNullOrEmpty(prompt))
            {
                throw new ArgumentException("Prompt should not be null or empty.");
            }
            try
            {
                var messages = new List<ChatMessage>
                {
                    ChatMessage.CreateSystemMessage("You are an AI that generates SQL queries."),
                    ChatMessage.CreateUserMessage($"Generate a SQL query based on the following prompt:{prompt}"),
                };

                var chatClient = _client.GetChatClient(deploymentName);
                
                
                // Create a chat completion request

                var options =  new ChatCompletionOptions
                {
                    Temperature = 0.5f
                };
         

                var completion = await chatClient.CompleteChatAsync(messages, options);
              

                // Extract the text from the ChatMessageContent object
                result = string.Join("", completion.Value.Content.Select(part => part.Text));
                response.Prompt = result;
                result = result.Replace("```sql", "")
                            .Replace("```", " ")
                            .Replace("\\n", " ")  // if literal \n
                            .Replace("\n", " ")   // if real newline characters
                            .Trim();

                
                // Extract SQL statements from the result
                var actualResult = PickSelectStatements(result);

                if (actualResult.Count > 0)
                {
                    response.SqlQuery = actualResult[0]; // Assuming we want to store the first SQL statement
                }
                else
                {
                    throw new InvalidOperationException("No valid SQL statements found in the response.");
                }
            }
            catch (Exception ex)
            {
                throw new InvalidOperationException("Error generating SQL query.", ex);
            }
            return response;
        }



       


        private   List<ExpandoObject> ConvertToDynamicObjects(string result)
        {
            var records = new List<ExpandoObject>();
            var lines = result.Split(new[] { '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);

            foreach (var line in lines)
            {
                dynamic record = new ExpandoObject();
                var dictionary = (IDictionary<string, object>)record;

                var fields = line.Split(new[] { "  " }, StringSplitOptions.RemoveEmptyEntries);
                foreach (var field in fields)
                {
                    var keyValuePair = field.Split(new[] { ':' }, 2);
                    if (keyValuePair.Length == 2)
                    {
                        string key = keyValuePair[0].Trim();
                        string value = keyValuePair[1].Trim();
                        dictionary[key] = value;
                    }
                }

                records.Add(record);
            }

            return records;
        }

        private List<string> PickSelectStatements(string content)
        {
            var statements = new List<string>();

            // Regex to match SELECT to end of semicolon or next blank line or end of string
            var queryPattern = @"(?is)SELECT\b.*?(?:;|\n\s*\n|$)";
            var matches = Regex.Matches(content, queryPattern);

            foreach (Match match in matches)
            {
                var sql = match.Value.Trim();
                // Optional cleanup: remove markdown fences like ```sql or ```
                sql = Regex.Replace(sql, @"^```sql|```", "", RegexOptions.IgnoreCase).Trim();
                statements.Add(sql);
            }

            return statements;
        }
    }
}
Listing 4: The LLMSQLGenerator class

Step 6: Modify the Program.cs by adding the code for the dependency registration of LLMSQLGenerator class in dependency container and add a new Endpoint to access methods from the LLMSQLGenerator class to send the prompt and received the SQL result. The code for the Program.cs is shown in Listing 5.


using Core_APISQLGenerator.Model;
using Core_APISQLGenerator.Services;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddScoped<LLMSQLGenerator>();
// Add services to the container.
// Learn more about configuring OpenAPI at https://aka.ms/aspnet/openapi
builder.Services.AddOpenApi();

var app = builder.Build();

// Configure the HTTP request pipeline.
if (app.Environment.IsDevelopment())
{
    app.MapOpenApi();
}

app.UseHttpsRedirection();

 
app.MapPost("/getresult", async (LLMSQLGenerator llmSqlGenerator, Query query) =>
{
    AppResponse response = new AppResponse(); 
    response = await llmSqlGenerator.GenerateSQLQueryAsync(query.Prompt);
    var sqlResult = llmSqlGenerator.RunSqlQuery(response.SqlQuery);

    
    return Results.Ok(new { GeneratedStatement = response.Prompt, SqlQuery = response.SqlQuery, SqlResult = sqlResult,  });
});

app.Run();

 
Listing 5: The Program.cs code

Run the application, and try the following prompts:

{ "Prompt":"List all Customer Names from OrderDetailsDynamic Table those are haveing Ship Country as France" }

The Result for the above prompt is as shown in Figure 7:



Figure 7: The Prompt 1 Result



{ "Prompt":"Please provide me sum of Freight by Each Ship Country from the OrderDetailsDynamic Table" }



Figure 8: The Prompt 2 Result


Code for this article can be downloaded from this link.  


Popular posts from this blog

Uploading Excel File to ASP.NET Core 6 application to save data from Excel to SQL Server Database

ASP.NET Core 6: Downloading Files from the Server

ASP.NET Core 6: Using Entity Framework Core with Oracle Database with Code-First Approach