
From LLM to RAG: How Azure OpenAI Powers Smarter Maritime ERP at mBluewave


Introduction

Large Language Models (LLMs) like GPT are changing how we interact with systems. They can generate human-like answers, write summaries, and even reason across documents. But there is a catch: LLMs don’t know your company’s data, so asking a question specific to that data will produce a convincing but wrong answer. That is a serious risk in industries like banking, finance, and maritime, where compliance and safety are non-negotiable.

That is where Retrieval-Augmented Generation (RAG) comes in: you ground the LLM in your own data. At mBluewave, our AI-first Maritime ERP platform, we built a RAG-powered assistant using Azure OpenAI and Azure AI Search to make vessel manuals and procedures accessible through simple, human-like interaction.

Why RAG?

Ask GPT: “How do I reset the fire suppression system?” It might give you a plausible but incorrect procedure. With RAG, the answer is pulled directly from the vessel’s own safety manual, ensuring compliance and accuracy.

Architecture at a glance:

  • Manuals and safety documents in Azure Blob Storage

  • Chunked, embedded, and indexed with Azure AI Search

  • Retrieval with metadata filters (by vessel, department, doc type)

  • Azure OpenAI generates grounded answers with citations
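The ingestion side of that pipeline can be sketched as follows. This is a minimal illustration in Python rather than our production C# code; the overlapping-window chunker is the piece worth noting, since chunk boundaries directly affect what retrieval can find. Sizes here count words as a rough stand-in for tokens.

```python
def chunk_text(text, chunk_size=800, overlap=80):
    """Split a document into overlapping windows for embedding.

    Sizes are in words for simplicity; in practice you would count
    tokens with the embedding model's tokenizer. Overlap keeps a
    procedure step from being cut in half at a chunk boundary.
    """
    words = text.split()
    if not words:
        return []
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk is then embedded and uploaded to the Azure AI Search index together with its metadata (vessel, department, doc type), which is what makes the filtered retrieval above possible at query time.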

mBluewave RAG Example

User query:

“What are the steps to reset the fire suppression system on Vessel A?”

Workflow:

  1. Search service retrieves the most relevant passages from Vessel A’s manuals.

  2. Context is injected into the GPT prompt.

  3. Model answers only from sources, returning step-by-step instructions.

  4. Citations show exactly which manual and page were used.

Result: A precise, compliant answer tailored to that vessel — not a hallucination.

Enterprise GPT using Azure OpenAI and Azure AI Search

(Not our actual implementation, but similar to it; adapted from an open-source GitHub repository)

Sample Code (C# / ASP.NET Minimal API)

// Ask endpoint: retrieve, ground, and answer
app.MapPost("/ask", async (AskRequest req, SearchClient search, OpenAIClient openai) =>
{
    // 1) Retrieve top chunks for this vessel; the filter prevents cross-vessel leakage.
    //    Single quotes are doubled to keep the OData filter well-formed.
    var options = new SearchOptions { Size = 5 };
    options.Filter = $"vessel eq '{req.Vessel.Replace("'", "''")}'";
    options.Select.Add("chunkText");
    options.Select.Add("source");

    var results = await search.SearchAsync<SearchDocument>(req.Question, options);
    var top = results.Value.GetResults()
        .Select(r => $"[{r.Document["source"]}] {r.Document["chunkText"]}")
        .ToList();

    // 2) Build grounded prompt: the system message restricts the model to the sources.
    var sys = "Answer ONLY from the provided sources. If unknown, say so. Cite sources in brackets.";
    var context = string.Join("\n\n", top);
    var user = $"{req.Question}\n\nSources:\n{context}";

    var chat = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-4.1-mini",
        Temperature = 0, // deterministic output for safety procedures
        Messages =
        {
            new ChatRequestSystemMessage(sys),
            new ChatRequestUserMessage(user)
        }
    };

    // 3) Generate; Message.Content is the answer string in the Azure.AI.OpenAI SDK.
    var completion = await openai.GetChatCompletionsAsync(chat);
    return Results.Ok(completion.Value.Choices[0].Message.Content);
});

Handling Wrong Answers

No AI system is perfect. Besides displaying a disclaimer that AI can make mistakes, here is how we handle these scenarios:

  • Guardrails in design: if the answer isn’t in the docs, the AI says “I don’t know, please check the manual”.

  • Citations: every answer points back to the source, building user trust.

  • Feedback loop: crew can flag answers, helping us refine and retrain.
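The first guardrail can be applied as a pre-check before the LLM is ever called. A minimal Python sketch, assuming the search service returns relevance scores per chunk; the threshold value is illustrative:

```python
IDK_MESSAGE = "I don't know, please check the manual."

def guarded_answer(retrieved, min_score=0.5):
    """Apply the 'answer only from sources' guardrail before generation.

    `retrieved` is a list of (score, chunk, source) tuples from the
    search service. If no chunk clears the relevance threshold, we
    refuse up front rather than let the model improvise.
    """
    grounded = [(chunk, source) for score, chunk, source in retrieved
                if score >= min_score]
    if not grounded:
        return IDK_MESSAGE, []
    context = "\n\n".join(chunk for chunk, _ in grounded)
    citations = sorted({source for _, source in grounded})
    # `context` goes into the prompt; `citations` are shown to the user.
    return context, citations
```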

Keeping Costs in Check

It is not cheap to keep this running, but costs are manageable if the right practices are followed:

  • Use smaller models where possible (e.g., GPT-4.1-mini instead of full GPT-4.1 for everyday queries).

  • Embed once, reuse often: embeddings are only re-generated when documents change.

  • Chunk smartly: splitting docs into ~700–900 token sections balances accuracy and token usage.

  • Cache common queries: frequent lookups (e.g., “When is next inspection due?”) can be cached for near-zero cost.

  • Monitor usage: track tokens and queries per vessel, and adjust.


Future Directions

  • Multi-modal RAG: combining text with diagrams, checklists, and images.

  • Offline/on-board sync: ensuring crews have access even without internet.