# Reference Architecture: MCP Server + SSE Chat on FastAPI

Pattern for adding an MCP server and a streaming chat assistant to an existing FastAPI application with any frontend framework.

First built for the [Margaret Hamilton Digital Archive](https://hamilton.warehack.ing) (Starlight + vanilla JS + FastAPI), then adapted for [SpiceBook](https://spicebook.warehack.ing) (Astro SSR + React 19 + FastAPI). Both are in production.

---

## Origin Story

The Hamilton Archive needed a chat assistant that could answer questions about Apollo-era documents using RAG (retrieval-augmented generation). The requirements were:

1. **MCP server**: so Claude Code and other MCP clients could query the archive programmatically
2. **Chat panel**: floating widget on all pages, streaming LLM responses via SSE, aware of whatever the user was currently reading (a Starlight page, a PDF in the viewer, etc.)
3. **RAG pipeline**: semantic search → batch SQL fetch → character-budget truncation → LLM completion

This was built as vanilla TypeScript (no framework) because the Hamilton Archive uses Starlight with static output: there's no React, no Zustand, no build-time component hydration. The chat widget is a single 1,125-line `.ts` file that does manual DOM manipulation, localStorage conversation management, and inline Lucide SVG icon paths.

When the same pattern was needed for SpiceBook, the architecture was adapted:

- **Frontend**: React 19 with Zustand for state, split across `ChatWidget.tsx` + `chat-store.ts` + `chat-api.ts`
- **Context model**: `PageContext(title, path, description)` → `NotebookContext(notebook_id, title, engine)`; the domain changed but the shape is identical
- **RAG function**: `_build_context(query)` → `_build_notebook_context(req)`; this is the main customization point between deployments
- **Caddy routing**: per-route `handle` blocks → single `@api.path` matcher (simpler but less precise)

**What stays identical** across both projects:

| Component | Identical? | Notes |
|-----------|:----------:|-------|
| SSE event protocol | Yes | `status`, `token`, `reasoning`, `error`, `done` |
| SSE client parser | Yes | `parseSSEBlock()` with `\n\n` boundary detection |
| `_sse_event()` helper | Yes | Compact JSON formatting |
| httpx streaming client | Yes | Same timeouts, limits, connection pooling |
| `_chat_completion_stream()` | Yes | Same SSE line parser for OpenAI-compatible endpoints |
| MCP mounting pattern | Yes | `mcp.http_app()` + `combine_lifespans()` + `app.mount("/mcp", ...)` |
| FastMCP tool conventions | Yes | Return `str` (JSON), never raise `HTTPException` |
| Conversation limits | Yes | MAX_CONVERSATIONS=20, MAX_MESSAGES=50 |
| Title derivation | Yes | First user message truncated to ~50-60 chars |

---

## Two Frontend Variants

### Variant A: Vanilla TypeScript (Hamilton Archive)

**Single file**: `chat-widget.ts` (1,125 lines), with no framework, no build-time hydration, no npm state library.

**Entry point**:

```typescript
// ChatWidget.astro (11 lines)
import { initChatWidget } from './chat-widget.ts'

initChatWidget()
document.addEventListener('astro:after-swap', initChatWidget)
```

**State management**: Module-scoped variables + direct localStorage:

```typescript
const STORAGE_KEY_INDEX = 'hamilton-chat-conversations'
const STORAGE_KEY_ACTIVE = 'hamilton-chat-active'
const STORAGE_KEY_PREFIX = 'hamilton-chat-conv-'
const STORAGE_KEY_LEGACY = 'hamilton-chat-history' // flat format, auto-migrated
```

Storage uses a split architecture: an index array (conversation metadata) stored separately from individual conversation message arrays (`STORAGE_KEY_PREFIX + id`). This avoids loading all message content when just rendering the history list.

**DOM manipulation**: `data-open`/`data-view` attributes on the widget root element control CSS visibility. Rendering is imperative: `renderMessages()`, `renderHistoryList()`, etc. Lucide icons are pasted as inline SVG path strings (no icon library dependency).

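
The split storage layout described above can be sketched roughly as follows. This is a minimal illustration, not the widget's actual code: only the two storage keys and the `MAX_CONVERSATIONS`/`MAX_MESSAGES` limits come from the text. The record shapes (`ConversationMeta`, `ChatMessage`), the function names, the `KVStore` interface, the 50-character title cutoff, and the "keep the most recent messages" trimming policy are all assumptions made for the example.

```typescript
// Hypothetical sketch. Names and record shapes below are assumptions,
// except the two storage keys and the two limits quoted from the text.

// Minimal subset of the Web Storage API so the code also runs outside a browser.
interface KVStore {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
  removeItem(key: string): void
}

interface ConversationMeta { id: string; title: string; updatedAt: number }
interface ChatMessage { role: 'user' | 'assistant'; content: string }

const STORAGE_KEY_INDEX = 'hamilton-chat-conversations'
const STORAGE_KEY_PREFIX = 'hamilton-chat-conv-'
const MAX_CONVERSATIONS = 20
const MAX_MESSAGES = 50
const TITLE_MAX_CHARS = 50 // assumed cutoff; the text only says "~50-60 chars"

// In-memory stand-in for localStorage (useful in tests or Node).
function createMemoryStore(): KVStore {
  const data = new Map<string, string>()
  return {
    getItem: (k) => data.get(k) ?? null,
    setItem: (k, v) => { data.set(k, v) },
    removeItem: (k) => { data.delete(k) },
  }
}

// Parse a stored JSON array, falling back to [] on missing or corrupt data.
function readArray<T>(store: KVStore, key: string): T[] {
  try {
    const raw = store.getItem(key)
    const parsed: unknown = raw ? JSON.parse(raw) : []
    return Array.isArray(parsed) ? (parsed as T[]) : []
  } catch {
    return []
  }
}

// Reads only the metadata index: enough to render the history list.
function loadIndex(store: KVStore): ConversationMeta[] {
  return readArray<ConversationMeta>(store, STORAGE_KEY_INDEX)
}

// Reads one conversation's messages, only when that conversation is opened.
function loadMessages(store: KVStore, id: string): ChatMessage[] {
  return readArray<ChatMessage>(store, STORAGE_KEY_PREFIX + id)
}

// Title = first user message, truncated.
function deriveTitle(messages: ChatMessage[]): string {
  const first = messages.find((m) => m.role === 'user')
  const text = first ? first.content.trim() : 'New conversation'
  return text.length > TITLE_MAX_CHARS ? text.slice(0, TITLE_MAX_CHARS) + '...' : text
}

function saveConversation(
  store: KVStore, id: string, messages: ChatMessage[], now: number = Date.now(),
): void {
  // Assumed policy: keep only the most recent MAX_MESSAGES messages.
  store.setItem(STORAGE_KEY_PREFIX + id, JSON.stringify(messages.slice(-MAX_MESSAGES)))

  // Move this conversation to the front of the index (most recent first).
  const meta: ConversationMeta = { id, title: deriveTitle(messages), updatedAt: now }
  const index = [meta, ...loadIndex(store).filter((c) => c.id !== id)]

  // Evict conversations beyond the cap, removing their message arrays too.
  for (const evicted of index.slice(MAX_CONVERSATIONS)) {
    store.removeItem(STORAGE_KEY_PREFIX + evicted.id)
  }
  store.setItem(STORAGE_KEY_INDEX, JSON.stringify(index.slice(0, MAX_CONVERSATIONS)))
}
```

In a browser, `window.localStorage` satisfies the `KVStore` shape, so `loadIndex(localStorage)` would render the history list without touching any message array. Passing the store as a parameter is a design choice for this sketch only; it keeps the functions testable without a DOM.
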
**Key advantage**: Zero JS framework overhead. The widget works in any static site (Starlight, plain HTML, Hugo) because it only needs a `