mcp-pdf-tools/LOCAL_DEVELOPMENT.md
Ryan Malloy c3dd788120
Some checks are pending
Security Scan / security-scan (push) Waiting to run
docs: rewrite LOCAL_DEVELOPMENT.md, delete stale CLAUDE_DESKTOP_SETUP.md
CLAUDE_DESKTOP_SETUP.md was actively misleading — listed 8 tools (out of
47), referenced the old `mcp-pdf-tools` package name, and had hardcoded
user paths. README.md and QUICKSTART.md cover the same territory
correctly now, and nothing in the repo links to it.

LOCAL_DEVELOPMENT.md kept its structure (setup → wiring up → testing →
publishing → gotchas) but updated to reflect current reality:

- `claude mcp add` syntax now uses the required `--` separator
- Three patterns shown (local source, pinned PyPI version, latest PyPI
  with --refresh) since they each serve different dev workflows
- markdown_to_pdf added to manual verification checklist
- Publishing pipeline now matches what we actually do (clean dist/,
  PII audit per global rules, twine for upload since uv publish
  doesn't read ~/.pypirc)
- Common gotchas section: mktexfmt errors, FunctionTool test failures,
  PyPI JSON caching — all real things hit during this session
- Removed claim that the server has "23 PDF tools"
2026-05-05 17:23:14 -06:00

6.1 KiB

Local Development Guide

This guide is for working on mcp-pdf — testing changes against a local Claude Code session before publishing to PyPI. End-user install instructions live in README.md and QUICKSTART.md.

Prerequisites

  • Python 3.10+
  • uv package manager
  • Claude Code CLI (or Claude Desktop app)
  • Git

System binaries needed at runtime are listed in README.md → System Dependencies. For development you can skip the ones you're not actively touching (e.g. skip pandoc if you're not changing markdown_to_pdf).

Setup

git clone https://github.com/rsp2k/mcp-pdf.git
cd mcp-pdf

# Install dev deps + all optional extras so every tool is exercised
uv sync --all-extras --dev

# Smoke test — does the package even import?
uv run python -c "from mcp_pdf.server import create_server; print('ok')"

Wiring up a local checkout to Claude Code

Three useful patterns, in order of how often you'll want each:

Pattern 1 — Local source, hot-reload edits

Best when iterating on tool code. uv run resolves to the local checkout, so saves are picked up next time the MCP server restarts.

# From inside the mcp-pdf checkout
claude mcp add -s project pdf-tools-dev -- uv --directory . run mcp-pdf

# Or from anywhere, with an absolute path
claude mcp add -s project pdf-tools-dev -- uv --directory /path/to/mcp-pdf run mcp-pdf

The -- separator is required so the Claude CLI doesn't try to parse --directory as one of its own flags.

Pattern 2 — Specific PyPI version (regression testing)

Useful for confirming that a regression report on a published version actually reproduces.

claude mcp add -s project pdf-tools-pinned -- uvx --from "mcp-pdf==2.2.0" mcp-pdf

Pattern 3 — Latest from PyPI (post-publish smoke test)

After running uv publish, verify the new version actually works in Claude before announcing it:

claude mcp add -s project pdf-tools-released -- uvx --refresh --from "mcp-pdf[markdown]" mcp-pdf

--refresh forces uvx to bypass its cache, otherwise it'll keep using the previous release.

Running the server outside Claude (for debugging)

Sometimes the MCP transport hides the real failure. Run the server directly:

uv run mcp-pdf
# Or, equivalently:
uv run python -m mcp_pdf.server

Then use a test harness like the FastMCP in-process client, or hit the stdio transport directly with JSON-RPC messages. The server logs go to stderr.

Tests

# Full suite
uv run pytest

# A single area
uv run pytest tests/test_server.py

# With coverage
uv run pytest --cov=mcp_pdf

# Security-focused tests live at the repo root (not under tests/)
uv run python test_security_features.py
uv run python test_integration.py

Manual verification before a release

A short list to walk through against a real PDF before publishing. Update version in pyproject.toml first.

  • extract_text on a small PDF returns text
  • extract_text on a 1000+ page PDF chunks correctly (no MCP "response too large" errors)
  • extract_tables finds tables in a financial-report-style PDF
  • pdf_to_markdown writes to disk by default and the inline escape hatch (inline=True) returns full markdown
  • markdown_to_pdf converts a real .md → PDF via the auto-detected engine (run with pdf_engine=None and check the engine_used field in the response)
  • extract_form_data + fill_form_pdf round-trip on an interactive form
  • add_sticky_notes, add_highlights, add_stamps all produce viewable annotations in Acrobat
  • Path traversal blocked: passing ../../etc/passwd returns a sanitized error
  • Page count limit blocked: a >1000-page PDF returns a sanitized error
  • Error messages don't leak /home/$USER/ paths (the sanitize_error_message helper handles this)

Publishing pipeline

# 1. Bump version in pyproject.toml (date-based or semver — see global rules)

# 2. Sync uv.lock — easy to forget; the lockfile and pyproject.toml versions
#    must agree or the published wheel and sdist disagree
uv lock

# 3. Clean dist/ — uv publish uploads everything in dist/, so stale files
#    from previous releases cause duplicate-upload errors
rm -rf dist/

# 4. Build
uv build

# 5. Pre-publish PII audit — see ~/.claude/rules/python.md for the
#    full grep recipe. The unpacked-sdist check is non-negotiable.
mkdir -p /tmp/sdist-audit && tar -xzf dist/*.tar.gz -C /tmp/sdist-audit
grep -rnEi 'real-domain|10\.[0-9]+\.|192\.168\.|/home/' /tmp/sdist-audit/

# 6. Validate metadata
uv run twine check dist/*

# 7. Upload (uv publish doesn't read ~/.pypirc — twine does)
uv run twine upload dist/*

# 8. Verify the version is live
curl -s -o /dev/null -w "HTTP %{http_code}\n" \
    https://pypi.org/project/mcp-pdf/$(grep ^version pyproject.toml | cut -d'"' -f2)/

Once published, push tags + commits to both remotes:

git push new-origin main
git push origin main   # gitea mirror

Common gotchas

"Module not found" after a git pull

Run uv sync --all-extras again — new optional dependencies don't auto-install.

MCP server connects but tools don't appear

The Claude Code session caches the tool list at MCP-server-connect time. After adding a new tool, run /mcp and reconnect.

markdown_to_pdf errors with mktexfmt: Did not find entry for byfmt=xelatex

The host's TeX install is missing format files. Either run sudo fmtutil-sys --all to regenerate them, or sidestep by passing pdf_engine="weasyprint" (after pip install weasyprint) or pdf_engine="tectonic".

Tests fail with 'FunctionTool' object is not callable

Tests under tests/test_server.py call the @mcp_tool-decorated methods directly, but @mcp_tool wraps them. Call the methods via the mixin instance instead — see tests/test_mixin_architecture.py for the working pattern.

uv publish succeeds but PyPI shows old version

PyPI's package-level JSON cache (/pypi/<pkg>/json) lags by a minute or two. The version-specific URL (/pypi/<pkg>/<ver>/) updates immediately — use that for verification.