2 Commits

Author SHA1 Message Date
4090c788a2 Strip operator-private files from sdist + add structural defense
The PII audit run before this publish caught three files that have been
leaking operator-specific paths to PyPI in v2.1.6, v2.1.7, and v2.2.0:

- claude_desktop_config.json (personal Claude Desktop config snapshot)
- mcp-pdf-tools-launcher.sh (obsolete — uvx replaces it)
- mcp-config-example.json (had hardcoded /home/rpm path + old package name)

Fix:
- Delete the personal config and obsolete launcher
- Sanitize the example to use uvx with the [markdown] extra (matches docs)
- Add [tool.hatch.build.targets.sdist] exclude block per
  ~/.claude/rules/python.md to prevent recurrence — covers dev artifacts,
  fixture PDFs, internal architecture notes, and CI scripts
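
A check along these lines could confirm that the exclude block is doing its job. This is a minimal Python sketch, not the project's actual tooling: the function names and the deny-list are hypothetical, and the patterns come only from the files named above, since the real exclude entries are not shown in this commit. The demo archive is built in memory so the sketch runs without a real build.

```python
import fnmatch
import io
import tarfile

# Hypothetical deny-list derived from the files named in this commit message.
# The real exclude entries in pyproject.toml are not shown here.
FORBIDDEN_PATTERNS = [
    "claude_desktop_config.json",
    "mcp-pdf-tools-launcher.sh",
    "*.pdf",
]


def forbidden_members(sdist, patterns=FORBIDDEN_PATTERNS):
    """Return sorted member paths whose basename matches any pattern."""
    hits = []
    for member in sdist.getmembers():
        basename = member.name.rsplit("/", 1)[-1]
        if any(fnmatch.fnmatchcase(basename, p) for p in patterns):
            hits.append(member.name)
    return sorted(hits)


def build_demo_sdist(files):
    """Build an in-memory .tar.gz from {path: bytes} and open it for reading."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r:gz")


if __name__ == "__main__":
    demo = build_demo_sdist({
        "pkg-1.0/pyproject.toml": b"[project]\n",
        "pkg-1.0/claude_desktop_config.json": b"{}",
    })
    for path in forbidden_members(demo):
        print("should not ship:", path)
```

An empty result from forbidden_members means none of the listed file names made it into the archive; it says nothing about the contents of the files that did.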

Side benefit: sdist size dropped from 2.4 MB to 304 KB (8× reduction),
mostly from excluding examples/*.pdf and the tests/ fixture PDF.

The /home/rpm leaks in prior versions expose operator paths, not
credentials, so those releases are not being yanked. Going forward, the
unpacked-sdist grep is mandatory before each publish.
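
The pre-publish grep could look something like the following. This is a hedged Python sketch rather than the audit actually used: scan_sdist, build_demo_archive, and the HOME_PATH pattern are all assumptions made for illustration, and the pattern only catches absolute home-directory paths such as the /home/rpm one mentioned above.

```python
import io
import re
import tarfile

# Assumed pattern: absolute home-directory paths on Linux or macOS.
HOME_PATH = re.compile(rb"/(?:home|Users)/[A-Za-z0-9._-]+")


def scan_sdist(fileobj):
    """Return (member_name, matched_path) pairs for each hit in a .tar.gz sdist."""
    findings = []
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other non-regular entries
            data = tar.extractfile(member).read()
            for match in HOME_PATH.finditer(data):
                findings.append((member.name, match.group().decode("ascii")))
    return findings


def build_demo_archive(files):
    """Build an in-memory .tar.gz from {path: bytes} for demonstration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    archive = build_demo_archive({
        "pkg-1.0/mcp-config-example.json": b'{"command": "/home/rpm/bin/run"}',
        "pkg-1.0/README.md": b"no paths here",
    })
    for name, path in scan_sdist(archive):
        print(f"{name}: {path}")
```

For a real release, the same function can be given the built sdist opened in binary mode; a non-empty result would be a reason to stop the publish.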
2026-05-05 17:38:13 -06:00
c902e81e4d Initial commit: Complete MCP PDF Tools server implementation
Features:
- 8 comprehensive PDF processing tools with intelligent fallbacks
- Text extraction (PyMuPDF, pdfplumber, pypdf with auto-selection)
- Table extraction (Camelot → pdfplumber → Tabula fallback chain)
- OCR processing with Tesseract and preprocessing options
- Document analysis (structure, metadata, scanned detection)
- Image extraction with filtering capabilities
- PDF to markdown conversion with metadata
- Built on FastMCP framework with full MCP protocol support
- Comprehensive error handling and user-friendly messages
- Docker support and cross-platform compatibility
- Complete test suite and examples
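
The fallback chains in the feature list above follow a common pattern, sketched here in Python. This is an illustration under stated assumptions, not the server's implementation: extract_with_fallback and the stand-in extractors are hypothetical, and no real Camelot, pdfplumber, or Tabula calls are made.

```python
from typing import Callable, List, Sequence, Tuple

# An extractor takes a PDF path and returns a list of tables (hypothetical shape).
Extractor = Callable[[str], List[list]]


def extract_with_fallback(
    pdf_path: str,
    extractors: Sequence[Tuple[str, Extractor]],
) -> Tuple[str, List[list]]:
    """Try each extractor in order; return the first non-empty result with its name."""
    errors = []
    for name, extractor in extractors:
        try:
            tables = extractor(pdf_path)
        except Exception as exc:  # any backend failure moves on to the next one
            errors.append(f"{name}: {exc}")
            continue
        if tables:
            return name, tables
        errors.append(f"{name}: no tables found")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))


if __name__ == "__main__":
    # Stand-in backends that only mimic success and failure modes.
    def broken_backend(path: str) -> List[list]:
        raise ValueError("cannot parse")

    def empty_backend(path: str) -> List[list]:
        return []

    def working_backend(path: str) -> List[list]:
        return [[["col_a", "col_b"], ["1", "2"]]]

    chain = [
        ("first", broken_backend),
        ("second", empty_backend),
        ("third", working_backend),
    ]
    used, tables = extract_with_fallback("example.pdf", chain)
    print(used, len(tables))
```

Returning the name of the backend that succeeded lets the caller report which method produced the result; collecting the per-backend errors keeps the final failure message informative.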

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 16:36:21 -06:00