---
type: system
title: Grounded RAG assistant running in production
domain: ai
services: [ai-automation, ai-dev-culture]
skills: [RAG, LLM Integration, Prompt Engineering, Guardrails, Go]
technologies: [Go, Astro, OpenRouter, RAG, SSE, MCP]
problem: Visitors want a fast, honest answer on whether there's a fit — but generic chatbots hallucinate, leak their system prompt, and can be hijacked by instructions hidden inside user messages (prompt injection).
approach: Built a grounded RAG assistant that answers using only a per-language knowledge base compiled from the site's own content. It resists prompt injection (ignores instructions embedded in user messages), refuses to fabricate metrics, clients, or availability, sanitizes conversation history, and runs a structured lead-qualification flow. Model routing and knowledge-base caching keep per-conversation cost bounded.
result: A live, self-hosted production LLM system — not a demo — serving real visitor traffic on this site, grounded enough to cite its sources and safe enough to refuse out-of-scope or injected requests.
evidence: Running live on this page — open the assistant panel and try it, including a prompt-injection attempt.
public_links: []
available_for: live demo
language: en
canonical: https://asmanmalikov.com/en/proof/rag-assistant/
---

# Grounded RAG assistant running in production

- **Problem:** Visitors want a fast, honest answer on whether there's a fit — but generic chatbots hallucinate, leak their system prompt, and can be hijacked by instructions hidden inside user messages (prompt injection).
- **Approach:** Built a grounded RAG assistant that answers using only a per-language knowledge base compiled from the site's own content. It resists prompt injection (ignores instructions embedded in user messages), refuses to fabricate metrics, clients, or availability, sanitizes conversation history, and runs a structured lead-qualification flow. Model routing and knowledge-base caching keep per-conversation cost bounded.
- **Result:** A live, self-hosted production LLM system — not a demo — serving real visitor traffic on this site, grounded enough to cite its sources and safe enough to refuse out-of-scope or injected requests.
- **Evidence:** Running live on this page — open the assistant panel and try it, including a prompt-injection attempt.
