Overview
A retrieval-augmented generation chat interface built on the Vercel AI SDK and OpenAI. The app draws from 100+ internal documentation articles (sourced from the shared mc-docs package) to answer staff questions about WiFi configuration, PaperCut printing, Education Perfect, BYOD setup, and dozens more school systems — with streamed responses, inline source citations, and contextual follow-up suggestions.
Key Features
- Streaming chat responses — Real-time token streaming via the Vercel AI SDK for responsive conversation
- Inline source citations — Each answer includes badge-linked references to the source documents used during retrieval
- Follow-up suggestions — AI-generated follow-up questions appear after each response to guide staff toward related topics
- Clarification prompts — When a query is ambiguous, the model asks for clarification before answering
- Markdown rendering — Responses rendered with GitHub Flavored Markdown for structured formatting
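The citation and clarification behaviours largely come down to prompt construction: retrieved documents are numbered in the context so the model can refer back to them. A minimal sketch of that idea, with hypothetical names and wording (the app's actual prompt is not shown here):

```typescript
// Hypothetical shape for a retrieved mc-docs article (illustrative only).
interface RetrievedDoc {
  id: string;
  title: string;
  content: string;
}

// Build a system prompt that injects retrieved docs as numbered context
// and instructs the model to cite sources and ask for clarification.
function buildSystemPrompt(docs: RetrievedDoc[]): string {
  const context = docs
    .map((doc, i) => `[${i + 1}] ${doc.title}\n${doc.content}`)
    .join("\n\n");
  return [
    "You are a tech-support assistant for school staff.",
    "Answer only from the context below, citing sources as [n].",
    "If the question is ambiguous, ask a clarifying question instead of guessing.",
    "",
    "Context:",
    context,
  ].join("\n");
}

const prompt = buildSystemPrompt([
  {
    id: "papercut",
    title: "PaperCut Configuration Guide",
    content: "Install the PaperCut client to release print jobs...",
  },
]);
console.log(prompt.includes("[1] PaperCut Configuration Guide")); // true
```

Because each document carries a stable index, the client can map `[n]` markers in the streamed response back to badge-linked citations.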
Pipeline
Document ingestion scripts process the mc-docs content into a vector store. At query time, relevant documents are retrieved and injected into the prompt as context, and the model's response is streamed back to the client. An evaluation framework measures answer quality against expected outputs.
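The ingest-then-retrieve shape of the pipeline can be sketched in miniature. This uses a toy term-count embedding as a stand-in for a real embedding model, so it only captures lexical overlap, and all names here are hypothetical rather than taken from the codebase:

```typescript
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z]+/g) ?? [];
}

interface DocChunk {
  id: string;
  text: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

class VectorStore {
  private vocab: string[] = [];
  private chunks: DocChunk[] = [];

  // Ingestion: build a vocabulary from the corpus, then embed each chunk once.
  ingest(docs: { id: string; text: string }[]): void {
    this.vocab = [...new Set(docs.flatMap((d) => tokenize(d.text)))];
    this.chunks = docs.map((d) => ({ ...d, vector: this.embed(d.text) }));
  }

  // Toy "embedding": term counts over the vocabulary. A real pipeline
  // would call an embedding model here instead.
  private embed(text: string): number[] {
    const counts = new Map<string, number>();
    for (const t of tokenize(text)) counts.set(t, (counts.get(t) ?? 0) + 1);
    return this.vocab.map((w) => counts.get(w) ?? 0);
  }

  // Query time: score every stored chunk by cosine similarity, keep top-k.
  retrieve(query: string, k = 2): DocChunk[] {
    const q = this.embed(query);
    return [...this.chunks]
      .sort((a, b) => cosine(q, b.vector) - cosine(q, a.vector))
      .slice(0, k);
  }
}

const store = new VectorStore();
store.ingest([
  { id: "papercut", text: "PaperCut printing: install the client to release print jobs" },
  { id: "wifi", text: "WiFi configuration for staff laptops and BYOD devices" },
]);
console.log(store.retrieve("how do I get printing working", 1)[0].id); // "papercut"
```

The real system adds the steps this sketch omits: chunking long articles, persisting vectors rather than holding them in memory, and handing the retrieved chunks to the prompt builder.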
Design Decisions
- RAG over fine-tuning: Internal documentation changes constantly — new systems get deployed, procedures update, staff change. Fine-tuning would require retraining the model on every content change, which is impractical for a school ICT team. RAG lets you update the retrieval corpus by re-running ingestion without touching the model. It also provides citations for free — staff can see which documents contributed to each answer, building trust in the responses. Keyword search alone wouldn’t handle the natural language gap (e.g. “how do I get printing working” vs. a doc titled “PaperCut Configuration Guide”).
- Evaluation framework: Measures retrieval relevance (are the correct documents surfaced for known queries?) and answer accuracy (does the generated response match the expected output?). This shaped the chunking strategy and prompt design — if evaluation showed the model pulling from too many loosely related docs, retrieval was tightened; if answers were correct but unsourced, the prompt was adjusted to require citations.
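An evaluation harness of this kind can be quite small. A hedged sketch, with hypothetical case shapes and metrics (retrieval recall plus a crude keyword check standing in for whatever answer-grading the real framework uses):

```typescript
// One evaluation case: a known query, the doc IDs that should be
// retrieved for it, and keywords the answer is expected to contain.
interface EvalCase {
  query: string;
  expectedDocIds: string[];
  expectedKeywords: string[];
}

interface EvalResult {
  retrievalRecall: number; // fraction of expected docs surfaced
  answerAccuracy: number;  // fraction of expected keywords present
}

// Score a RAG pipeline given its retrieval and generation steps as
// plain functions, averaged over all cases.
function evaluate(
  cases: EvalCase[],
  retrieve: (query: string) => string[],
  answer: (query: string) => string,
): EvalResult {
  let recallSum = 0;
  let accuracySum = 0;
  for (const c of cases) {
    const retrieved = new Set(retrieve(c.query));
    const hits = c.expectedDocIds.filter((id) => retrieved.has(id)).length;
    recallSum += hits / c.expectedDocIds.length;

    const text = answer(c.query).toLowerCase();
    const found = c.expectedKeywords.filter((k) =>
      text.includes(k.toLowerCase()),
    ).length;
    accuracySum += found / c.expectedKeywords.length;
  }
  return {
    retrievalRecall: recallSum / cases.length,
    answerAccuracy: accuracySum / cases.length,
  };
}

// Example run with stubs standing in for the real pipeline.
const result = evaluate(
  [
    {
      query: "reset PaperCut password",
      expectedDocIds: ["papercut"],
      expectedKeywords: ["PaperCut"],
    },
  ],
  () => ["papercut", "wifi"],
  () => "Open the PaperCut admin portal and reset the password there.",
);
console.log(result); // { retrievalRecall: 1, answerAccuracy: 1 }
```

Running a harness like this after each ingestion or prompt change is what makes the feedback loop described above concrete: a drop in recall points at chunking or retrieval, a drop in accuracy points at the prompt.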
Demo
RAG-powered tech support chat answering staff questions with inline citations.