Astro + Starlight documentation site for mcwaddams MCP server.

Features:
- Diátaxis documentation structure (tutorials, how-to, reference, explanation)
- Office Space theming (Milton Waddams, TPS Reports, red Swingline)
- 29 documentation pages covering all 20 tools
- TPS Reports section for test results
- Flair gamification config (pieces of documentation flair)
- Custom CSS with Office Space color scheme

Structure:
- Getting Started: backstory, installation, quickstart
- Tutorials: first extraction, legacy formats, indexing, resources
- How-To: tables, Excel analysis, markdown, pagination, URLs
- Reference: all tools, universal/word/excel tools, MCP resources, formats
- Explanation: architecture, mixins, fallbacks, resource system
- TPS Reports: dashboard, coverage, torture tests
- Community: credits, feedback, leaderboard

---
title: The Backstory
description: How Milton Waddams became the patron saint of legacy document processing.
---

import { Aside } from '@astrojs/starlight/components';

# The Backstory

> *"I was told I could listen to the radio at a reasonable volume from nine to eleven..."*

## The Relocation

Milton Waddams was relocated to the basement. They took his stapler. But down there, surrounded by boxes of `.doc` files from 1997 and `.xls` spreadsheets that predate Unicode, he became something else entirely.

He became a **document processing expert**.

<Aside type="tip" title="Fun Fact">
When *Office Space* came out, Swingline didn't actually make a red stapler. Demand from fans was so high that the company started manufacturing one. The movie created the product.
</Aside>

## The Problem

Every enterprise has them:

- **The Archive Folder**: 50,000 Word documents from before the cloud existed
- **The Legacy Database Export**: Excel files with formulas referencing cells that no longer exist
- **The Board Presentations**: PowerPoint decks with embedded charts from 2003
- **The Contract Repository**: `.doc` files that crash modern Word

AI agents can read PDFs. They can parse JSON. But Office documents? The binary formats, the OLE containers, the OOXML with custom schemas?

*Nobody wants to deal with that.*

## The Solution

**mcwaddams** handles the documents nobody else wants to touch.

```python
# Extract text from a 1997 .doc file (call from an async context)
result = await extract_text("contract_final_FINAL_v2.doc")

# It just works
print(result["text"])
```

### What We Handle

| Format | Era | Status |
|--------|-----|--------|
| `.docx` | 2007+ | ✅ Full support |
| `.doc` | 1997-2007 | ✅ Works fine |
| `.xlsx` | 2007+ | ✅ Full support |
| `.xls` | 1997-2007 | ✅ Works fine |
| `.pptx` | 2007+ | ✅ Full support |
| `.ppt` | 1997-2007 | ✅ Works fine |

<Aside type="caution" title="Did You Get the Memo?">
If a document is password-protected or encrypted, we'll detect it and tell you. We can't extract content from encrypted files, but we won't silently fail either.
</Aside>

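To make the format split and the encryption caveat a little more concrete, here is a minimal sketch of header sniffing. It is not taken from the mcwaddams source: `sniff_container` and its return labels are hypothetical names chosen for illustration. It relies on two well-known facts: legacy `.doc`/`.xls`/`.ppt` files are OLE compound files, and 2007+ files are ZIP packages. Password-protected OOXML files are normally wrapped in an OLE container, so an OLE header under a 2007+ extension is a useful hint (not proof) of encryption.

```python
from pathlib import Path

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (OOXML packages)
OOXML_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def sniff_container(name: str, header: bytes) -> str:
    """Classify a file from its extension and first bytes (hypothetical helper)."""
    ext = Path(name).suffix.lower()
    if header.startswith(ZIP_MAGIC):
        return "ooxml-zip"
    if header.startswith(OLE_MAGIC):
        # An OLE header under a 2007+ extension usually means an encrypted
        # package, though it could also be a misnamed legacy file.
        return "possibly-encrypted" if ext in OOXML_EXTENSIONS else "legacy-ole"
    return "unknown"


if __name__ == "__main__":
    print(sniff_container("tps_report.xlsx", OLE_MAGIC + b"\x00" * 8))
```

A real detector would go further and look inside the OLE container for the encryption streams; this sketch only shows the cheap first check that lets a tool report "encrypted" instead of failing silently.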
## The Philosophy

### 1. No Silent Failures

When python-docx can't handle a file, mammoth tries. When openpyxl fails, pandas steps in. You'll always get either content or a clear error message explaining why extraction failed.

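The fallback idea can be sketched as a simple chain: try each extractor in order, return the first result, and raise one descriptive error if they all fail. This is an illustrative sketch, not the mcwaddams implementation. `extract_with_fallbacks`, `ExtractionError`, and the stand-in extractor functions are hypothetical names, and the real extractors would wrap libraries such as python-docx and mammoth.

```python
from typing import Callable, Sequence, Tuple

Extractor = Callable[[str], str]


class ExtractionError(Exception):
    """Raised when every extractor in the chain has failed."""


def extract_with_fallbacks(
    path: str, extractors: Sequence[Tuple[str, Extractor]]
) -> str:
    """Return text from the first extractor that succeeds (hypothetical helper)."""
    errors = []
    for name, extractor in extractors:
        try:
            return extractor(path)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    # Nothing worked: report every attempt instead of failing silently.
    raise ExtractionError(f"Could not extract {path!r}. Tried " + "; ".join(errors))


# Stand-in extractors for the demo; they do not call any real library.
def primary_stub(path: str) -> str:
    raise ValueError("not a valid OOXML package")


def fallback_stub(path: str) -> str:
    return f"text recovered from {path}"


if __name__ == "__main__":
    chain = [("primary", primary_stub), ("fallback", fallback_stub)]
    print(extract_with_fallbacks("memo.docx", chain))
```

Collecting the per-extractor errors is what turns a failure into a clear message: the caller sees which tools were tried and why each one gave up.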
### 2. Legacy is Not Abandoned

Those `.doc` files from 2003? They're still business-critical for someone. We don't treat legacy formats as second-class citizens.

### 3. Context-Aware Extraction

Large documents get paginated automatically. The MCP resource system lets you fetch chapters on demand. Your context window stays manageable.

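As a rough illustration of cursor-based pagination, the sketch below returns one page of text plus the position to resume from. It is an assumption-laden example rather than the actual mcwaddams behavior: the `paginate` function, its `page_size` and `cursor` parameters, and the `next_cursor` key are all hypothetical.

```python
from typing import Optional


def paginate(text: str, page_size: int = 2000, cursor: int = 0) -> dict:
    """Return one page of text and the cursor for the next page (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    end = min(cursor + page_size, len(text))
    if end < len(text):
        # Prefer to end the page on a paragraph break so text is not cut mid-paragraph.
        boundary = text.rfind("\n\n", cursor, end)
        if boundary > cursor:
            end = boundary + 2
    next_cursor: Optional[int] = end if end < len(text) else None
    return {"text": text[cursor:end], "next_cursor": next_cursor}


if __name__ == "__main__":
    document = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    cursor: Optional[int] = 0
    while cursor is not None:
        page = paginate(document, page_size=25, cursor=cursor)
        print(repr(page["text"]))
        cursor = page["next_cursor"]
```

The caller keeps requesting pages until `next_cursor` is `None`, so only one page at a time has to sit in the context window.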
### 4. Testing Painful Stuff

We threw 301 random Office documents at mcwaddams. **299 succeeded.** Both failures were empty or corrupt files.

See the [TPS Reports](/tps/dashboard/) for proof.

---

## The Name

Milton Waddams. The guy with the stapler. Relegated to the basement with the old filing cabinets and the roaches.

That's where the legacy documents live too.

*"I could set the building on fire..."*

---

<div style="text-align: center; margin-top: 2rem;">

**Ready to start?** → [Installation](/installation/)

</div>