docs: rewrite LOCAL_DEVELOPMENT.md, delete stale CLAUDE_DESKTOP_SETUP.md

CLAUDE_DESKTOP_SETUP.md was actively misleading: it listed 8 tools (out of 47), referenced the old `mcp-pdf-tools` package name, and had hardcoded user paths. README.md and QUICKSTART.md cover the same territory correctly now, and nothing in the repo links to it.

LOCAL_DEVELOPMENT.md kept its structure (setup → wiring up → testing → publishing → gotchas) but was updated to reflect current reality:

- `claude mcp add` syntax now uses the required `--` separator
- Three patterns shown (local source, pinned PyPI version, latest PyPI with --refresh) since they each serve different dev workflows
- markdown_to_pdf added to manual verification checklist
- Publishing pipeline now matches what we actually do (clean dist/, PII audit per global rules, twine for upload since uv publish doesn't read ~/.pypirc)
- Common gotchas section: mktexfmt errors, FunctionTool test failures, PyPI JSON caching (all real things hit during this session)
- Removed claim that the server has "23 PDF tools"
2026-05-05 17:23:14 -06:00
commit c3dd788120
parent 31b8b2e6d4
2 changed files with 102 additions and 230 deletions
--- a/CLAUDE_DESKTOP_SETUP.md
+++ b/CLAUDE_DESKTOP_SETUP.md
@ -1,88 +0,0 @@
 # Claude Desktop MCP Configuration
 This document explains how the MCP PDF Tools server has been configured for Claude Desktop.
 ## Configuration Location
 The MCP configuration has been added to:
 ```
 /home/rpm/.config/Claude/claude_desktop_config.json
 ```
 ## PDF Tools Server Configuration
 The following configuration has been added to your Claude Desktop:
 ```json
 {
  "mcpServers": {
    "pdf-tools": {
      "command": "uv",
      "args": [
        "--directory",
        "/home/rpm/claude/mcp-pdf-tools",
        "run",
        "mcp-pdf-tools"
      ],
      "env": {
        "PDF_TEMP_DIR": "/tmp/mcp-pdf-processing"
      }
    }
  }
 }
 ```
 ## What This Enables
 With this configuration, all your Claude sessions will have access to:
 - **extract_text**: Extract text from PDFs with multiple method support
 - **extract_tables**: Extract tables from PDFs with intelligent fallbacks
 - **extract_images**: Extract and filter images from PDFs
 - **extract_metadata**: Get comprehensive PDF metadata and file information
 - **get_document_structure**: Analyze PDF structure, outline, and fonts
 - **is_scanned_pdf**: Detect if PDFs are scanned/image-based
 - **ocr_pdf**: Perform OCR on scanned PDFs with preprocessing
 - **pdf_to_markdown**: Convert PDFs to clean markdown format
 ## Environment Variables
 - `PDF_TEMP_DIR`: Set to `/tmp/mcp-pdf-processing` for temporary file processing
 ## Backup
 A backup of your original configuration has been saved to:
 ```
 /home/rpm/.config/Claude/claude_desktop_config.json.backup
 ```
 ## Testing
 The server has been tested and is working correctly. You can verify it's available in new Claude sessions by checking for the `mcp__pdf-tools__*` functions.
 ## Troubleshooting
 If you encounter issues:
 1. **Server not starting**: Check that all dependencies are installed:
   ```bash
   cd /home/rpm/claude/mcp-pdf-tools
   uv sync --dev
   ```
 2. **System dependencies missing**: Install required packages:
   ```bash
   sudo apt-get install tesseract-ocr tesseract-ocr-eng poppler-utils ghostscript python3-tk default-jre-headless
   ```
 3. **Permission issues**: Ensure temp directory exists:
   ```bash
   mkdir -p /tmp/mcp-pdf-processing
   chmod 755 /tmp/mcp-pdf-processing
   ```
 4. **Test server manually**:
   ```bash
   cd /home/rpm/claude/mcp-pdf-tools
   uv run mcp-pdf-tools --help
   ```
--- a/LOCAL_DEVELOPMENT.md
+++ b/LOCAL_DEVELOPMENT.md
@ -1,201 +1,161 @@
-# 🔧 Local Development Guide for MCP PDF
+# Local Development Guide
-This guide shows how to test MCP PDF locally during development before publishing to PyPI.
+This guide is for working **on** mcp-pdf — testing changes against a local Claude Code session before publishing to PyPI. End-user install instructions live in [README.md](README.md) and [QUICKSTART.md](QUICKSTART.md).
-## 📋 Prerequisites
+## Prerequisites
 - Python 3.10+
- uv package manager
+- [uv](https://docs.astral.sh/uv/) package manager
- Claude Desktop app
+- Claude Code CLI (or Claude Desktop app)
- Git repository cloned locally
+- Git
-## 🚀 Quick Start for Local Testing
+System binaries needed at runtime are listed in [README.md → System Dependencies](README.md#system-dependencies). For development you can skip the ones you're not actively touching (e.g. skip `pandoc` if you're not changing `markdown_to_pdf`).
-### 1. Clone and Setup
+## Setup
 ```bash
 # Clone the repository
 git clone https://github.com/rsp2k/mcp-pdf.git
 cd mcp-pdf
-# Install dependencies
+# Install dev deps + all optional extras so every tool is exercised
-uv sync --dev
+uv sync --all-extras --dev
-# Verify installation
+# Smoke test — does the package even import?
-uv run python -c "from mcp_pdf.server import create_server; print('✅ MCP PDF loads successfully')"
+uv run python -c "from mcp_pdf.server import create_server; print('ok')"
 ```
-### 2. Add MCP Server to Claude Desktop
+## Wiring up a local checkout to Claude Code
-#### For Production Use (PyPI Installation)
+Three useful patterns, in order of how often you'll want each:
-Install the published version from PyPI:
+### Pattern 1 — Local source, hot-reload edits
 Best when iterating on tool code. `uv run` resolves to the local checkout, so saves are picked up next time the MCP server restarts.
 ```bash
-# For personal use across all projects
+# From inside the mcp-pdf checkout
-claude mcp add -s local pdf-tools uvx mcp-pdf
+claude mcp add -s project pdf-tools-dev -- uv --directory . run mcp-pdf
-# For project-specific use (isolated to current directory)
+# Or from anywhere, with an absolute path
-claude mcp add -s project pdf-tools uvx mcp-pdf
+claude mcp add -s project pdf-tools-dev -- uv --directory /path/to/mcp-pdf run mcp-pdf
 ```
-#### For Local Development (Source Installation)
+The `--` separator is required so the Claude CLI doesn't try to parse `--directory` as one of its own flags.
-When developing MCP PDF itself, use the local source:
+### Pattern 2 — Specific PyPI version (regression testing)
 Useful for confirming that a regression report on a published version actually reproduces.
 ```bash
-# For development from local source
+claude mcp add -s project pdf-tools-pinned -- uvx --from "mcp-pdf==2.2.0" mcp-pdf
 claude mcp add -s project pdf-tools-dev uv -- --directory /path/to/mcp-pdf-tools run mcp-pdf
 ```
-Or if you're in the mcp-pdf directory:
+### Pattern 3 — Latest from PyPI (post-publish smoke test)
 After uploading a new release to PyPI, verify that it actually works in Claude before announcing it:
 ```bash
-# Development server from current directory
+claude mcp add -s project pdf-tools-released -- uvx --refresh --from "mcp-pdf[markdown]" mcp-pdf
 claude mcp add -s project pdf-tools-dev uv -- --directory . run mcp-pdf
 ```
-### 3. Alternative: Manual Server Testing
+`--refresh` forces uvx to bypass its cache, otherwise it'll keep using the previous release.
-You can also run the server manually for debugging:
+## Running the server outside Claude (for debugging)
 Sometimes the MCP transport hides the real failure. Run the server directly:
 ```bash
 # Run the MCP server directly
 uv run mcp-pdf
-
+# Or, equivalently:
 # Or run with specific FastMCP options
 uv run python -m mcp_pdf.server
 ```
-### 4. Test Core Functionality
+Then use a test harness like the FastMCP in-process client, or hit the stdio transport directly with JSON-RPC messages. The server logs go to stderr.
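For the raw stdio route, a minimal sketch of the opening messages may help. It assumes the server speaks MCP over stdio as newline-delimited JSON-RPC 2.0; the protocol revision string and the `local-debug` client name are illustrative assumptions, not values taken from this repo.

```python
import json

# Assumed protocol revision; substitute whichever revision your client targets.
PROTOCOL_VERSION = "2024-11-05"


def request(msg_id, method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg) + "\n"


def notification(method):
    """Serialize a JSON-RPC 2.0 notification (no id, so no reply is expected)."""
    return json.dumps({"jsonrpc": "2.0", "method": method}) + "\n"


def handshake_frames():
    """Build the three frames that open a session and ask for the tool list."""
    init_params = {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {},
        # Hypothetical client identity, used only for this debugging sketch.
        "clientInfo": {"name": "local-debug", "version": "0.0.0"},
    }
    return [
        request(1, "initialize", init_params),
        notification("notifications/initialized"),
        request(2, "tools/list"),
    ]


if __name__ == "__main__":
    for frame in handshake_frames():
        print(frame, end="")
```

One way to use this is to write the frames to the stdin of a `uv run mcp-pdf` subprocess and read the responses from its stdout line by line, keeping stderr visible for the server logs.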
-Once connected to Claude Code, test these key features:
+## Tests
 #### Basic PDF Processing
 ```
 "Extract text from this PDF file: /path/to/test.pdf"
 "Get metadata from this PDF: /path/to/document.pdf"
 "Check if this PDF is scanned: /path/to/scan.pdf"
 ```
 #### Security Features
 ```
 "Try to extract text from a very large PDF"
 "Process a PDF with 2000 pages" (should be limited to 1000)
 ```
 #### Advanced Features
 ```
 "Extract tables from this PDF: /path/to/tables.pdf"
 "Convert this PDF to markdown: /path/to/document.pdf"
 "Add annotations to this PDF: /path/to/target.pdf"
 ```
 ## 🔒 Security Testing
 Verify the security hardening works:
 ### File Size Limits
 - Try processing a PDF larger than 100MB
 - Should see: "PDF file too large: X bytes > 104857600"
 ### Page Count Limits  
 - Try processing a PDF with >1000 pages
 - Should see: "PDF too large for processing: X pages > 1000"
 ### Path Traversal Protection
 - Test with malicious paths like `../../../etc/passwd`
 - Should be blocked with security error
 ### JSON Input Validation
 - Large JSON inputs (>10KB) should be rejected
 - Malformed JSON should return clean error messages
 ## 🐛 Debugging
 ### Enable Debug Logging
 ```bash
-export DEBUG=true
+# Full suite
-uv run mcp-pdf
+uv run pytest
 ```
-### Check Security Functions
+# A single area
-```bash
+uv run pytest tests/test_server.py
-# Test security validation functions
+
 # With coverage
 uv run pytest --cov=mcp_pdf
 # Security-focused tests live at the repo root (not under tests/)
 uv run python test_security_features.py
 # Run integration tests
 uv run python test_integration.py
 ```
-### Verify Package Structure
+## Manual verification before a release
 A short list to walk through against a real PDF before publishing. Update version in `pyproject.toml` first.
 - [ ] `extract_text` on a small PDF returns text
 - [ ] `extract_text` on a 1000+ page PDF chunks correctly (no MCP "response too large" errors)
 - [ ] `extract_tables` finds tables in a financial-report-style PDF
 - [ ] `pdf_to_markdown` writes to disk by default and the inline escape hatch (`inline=True`) returns full markdown
 - [ ] `markdown_to_pdf` converts a real `.md` → PDF via the auto-detected engine (run with `pdf_engine=None` and check the `engine_used` field in the response)
 - [ ] `extract_form_data` + `fill_form_pdf` round-trip on an interactive form
 - [ ] `add_sticky_notes`, `add_highlights`, `add_stamps` all produce viewable annotations in Acrobat
 - [ ] Path traversal blocked: passing `../../etc/passwd` returns a sanitized error
 - [ ] Page count limit blocked: a >1000-page PDF returns a sanitized error
 - [ ] Error messages don't leak `/home/$USER/` paths (the `sanitize_error_message` helper handles this)
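The last checklist item can be spot-checked mechanically. The sketch below is a hypothetical helper, not part of mcp-pdf: it scans an error string for home-directory prefixes, and both the pattern and the `leaked_paths` name are assumptions made for illustration.

```python
import re

# Hypothetical pattern: an absolute path prefix under a user's home directory
# on Linux (/home/<user>) or macOS (/Users/<user>).
HOME_PATH = re.compile(r"(?:/home/|/Users/)[^/\s]+")


def leaked_paths(message: str) -> list[str]:
    """Return every home-directory prefix found in an error message."""
    return [match.group(0) for match in HOME_PATH.finditer(message)]
```

Pasting an error message returned by a tool into `leaked_paths(...)` should give an empty list if sanitization worked; any non-empty result names the prefix that leaked.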
 ## Publishing pipeline
 ```bash
-# Check package builds correctly
+# 1. Bump version in pyproject.toml (date-based or semver — see global rules)
 # 2. Sync uv.lock — easy to forget; the lockfile and pyproject.toml versions
 #    must agree or the published wheel and sdist disagree
 uv lock
 # 3. Clean dist/: the upload in step 7 sends everything in dist/, so stale
 #    files from previous releases cause duplicate-upload errors
 rm -rf dist/
 # 4. Build
 uv build
-# Verify package metadata
+# 5. Pre-publish PII audit — see ~/.claude/rules/python.md for the
 #    full grep recipe. The unpacked-sdist check is non-negotiable.
 mkdir -p /tmp/sdist-audit && tar -xzf dist/*.tar.gz -C /tmp/sdist-audit
 grep -rnEi 'real-domain|10\.[0-9]+\.|192\.168\.|/home/' /tmp/sdist-audit/
 # 6. Validate metadata
 uv run twine check dist/*
 # 7. Upload (uv publish doesn't read ~/.pypirc — twine does)
 uv run twine upload dist/*
 # 8. Verify the version is live
 curl -s -o /dev/null -w "HTTP %{http_code}\n" \
    https://pypi.org/project/mcp-pdf/$(grep ^version pyproject.toml | cut -d'"' -f2)/
 ```
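Step 8 packs version extraction and URL construction into one shell line. As a rough Python equivalent, assuming the version appears on a line of the form `version = "..."` at the start of a line in `pyproject.toml` (the function names here are illustrative):

```python
import re


def read_version(pyproject_text: str) -> str:
    """Return the first `version = "..."` value that starts a line.

    Mirrors `grep ^version pyproject.toml | cut -d'"' -f2`.
    """
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version line found")
    return match.group(1)


def release_url(package: str, version: str) -> str:
    """Build the version-specific project page URL that step 8 requests."""
    return f"https://pypi.org/project/{package}/{version}/"
```

Reading the file with `pathlib.Path("pyproject.toml").read_text()` and passing the result through both functions yields the URL to check; an HTTP 200 from it indicates the release is live.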
-## 📊 Testing Checklist
+Once published, push tags + commits to both remotes:
 Before publishing, verify:
 - [ ] All 23 PDF tools work correctly
 - [ ] Security limits are enforced (file size, page count)
 - [ ] Error messages are clean and helpful  
 - [ ] No sensitive information leaked in errors
 - [ ] Path traversal protection works
 - [ ] JSON input validation works
 - [ ] Memory limits prevent crashes
 - [ ] CLI command `mcp-pdf` works
 - [ ] Package imports correctly: `from mcp_pdf.server import create_server`
 ## 🚀 Publishing Pipeline
 Once local testing passes:
 1. **Version Bump**: Update version in `pyproject.toml`
 2. **Build**: `uv build`  
 3. **Test Upload**: `uv run twine upload --repository testpypi dist/*`
 4. **Test Install**: `pip install -i https://test.pypi.org/simple/ mcp-pdf`
 5. **Production Upload**: `uv run twine upload dist/*`
 ## 🔧 Development Commands
 ```bash
-# Format code
+git push new-origin main
-uv run black src/ tests/
+git push origin main   # gitea mirror
 # Lint code  
 uv run ruff check src/ tests/
 # Run tests
 uv run pytest
 # Security scan
 uv run pip-audit
 # Build package
 uv build
 # Install editable for development
 pip install -e .  # (in a venv)
 ```
-## 🆘 Troubleshooting
+## Common gotchas
-### "Module not found" errors
+### "Module not found" after a `git pull`
- Ensure you're in the right directory
+Run `uv sync --all-extras` again — new optional dependencies don't auto-install.
 - Run `uv sync` to install dependencies
 - Check Python path with `uv run python -c "import sys; print(sys.path)"`
-### MCP server won't start
+### MCP server connects but tools don't appear
- Check that all system dependencies are installed (tesseract, java, ghostscript)
+The Claude Code session caches the tool list at MCP-server-connect time. After adding a new tool, run `/mcp` and reconnect.
 - Verify with: `uv run python examples/verify_installation.py`
-### Security tests fail
+### `markdown_to_pdf` errors with `mktexfmt: Did not find entry for byfmt=xelatex`
- Run `uv run python test_security_features.py -v` for detailed output
+The host's TeX install is missing format files. Either run `sudo fmtutil-sys --all` to regenerate them, or sidestep by passing `pdf_engine="weasyprint"` (after `pip install weasyprint`) or `pdf_engine="tectonic"`.
 - Check that security constants are properly set
-This setup allows for rapid development and testing without polluting your system Python or needing to publish to PyPI for every change.
+### Tests fail with `'FunctionTool' object is not callable`
 Tests under `tests/test_server.py` call the `@mcp_tool`-decorated methods directly, but `@mcp_tool` wraps them. Call the methods via the mixin instance instead — see `tests/test_mixin_architecture.py` for the working pattern.
 ### Upload succeeds but PyPI shows old version
 PyPI's package-level JSON cache (`/pypi/<pkg>/json`) lags by a minute or two. The version-specific project page (`/project/<pkg>/<ver>/`, the URL checked in step 8 of the publishing pipeline) updates immediately, so use that for verification.