System · ai
Grounded RAG assistant running in production
AI AutomationAI Development Culture GoAstroOpenRouterRAGSSEMCP
site content ──► knowledge base (per-lang, cached) ──► retrieval ──► guarded LLM (OpenRouter) ──► cited answer
│
prompt-injection defense · anti-fabrication · history sanitization Problem
Visitors want a fast, honest answer on whether there's a fit — but generic chatbots hallucinate, leak their system prompt, and can be hijacked by instructions hidden inside user messages (prompt injection).
Approach
Built a grounded RAG assistant that answers using only a per-language knowledge base compiled from the site's own content. It resists prompt injection (ignores instructions embedded in user messages), refuses to fabricate metrics, clients, or availability, sanitizes conversation history, and runs a structured lead-qualification flow. Model routing and knowledge-base caching keep per-conversation cost bounded.
Result
A live, self-hosted production LLM system — not a demo — serving real visitor traffic on this site, grounded enough to cite its sources and safe enough to refuse out-of-scope or injected requests.
Evidence
Running live on this page — open the assistant panel and try it, including a prompt-injection attempt.
Available for: live demo
The engineering is in the guardrails, not the model call: grounding on retrieved context only, prompt-injection resistance, anti-fabrication rules, and a routed, cached model path that keeps cost predictable. Go’s concurrency makes concurrent and streaming requests natural — the design is SSE- and MCP-ready.