Compare commits

113 Commits

Author SHA1 Message Date
38df6ee12a Expand docs site to 15 pages, add project URLs to pyproject.toml
9 new pages organized by diataxis: guides (workflows, cursor
pagination, troubleshooting), reference (REST API, MCP resources,
configuration), concepts (architecture, prior art), and changelog.

Rewrote mcp-tools.md to cover all 64 tools across 14 categories.
Updated overview with architecture diagram and capability summary.
Added Claude Desktop config paths to installation page.

Sidebar now has 5 sections with 12 navigable entries.
Version bumped to 2026.3.7 with docs/repo/issues URLs for PyPI.
2026-03-07 17:21:03 -07:00
1db36464ed Add docs site for mcghidra.warehack.ing
Starlight/Astro docs site following the warehack.ing cookie-cutter
pattern. Landing page with architecture overview, getting-started
guide with install and firmware import examples, Docker reference
with env vars and port pool docs, and MCP tools reference.

Warm amber/rust color scheme. Caddy prod + Node dev Docker stages.
2026-03-06 22:35:27 -07:00
0250c2df01 Add language, base address, and loader support for raw firmware import
Wire GHIDRA_LANGUAGE, GHIDRA_BASE_ADDRESS, GHIDRA_LOADER through the
Docker entrypoint and MCP tools so raw binaries (e.g., ARM7TDMI firmware)
get the correct processor, memory map, and loader instead of relying on
auto-detection. Auto-sets BinaryLoader when language is specified.

Input validation at both Python and bash layers prevents malformed values
from reaching analyzeHeadless.
2026-03-06 21:46:22 -07:00
83949683ae Add /health endpoint to Java plugin and update health checks
New GET /health endpoint returns status, uptime, api_version, and
loaded program without depending on program state. Lightweight
enough for Docker HEALTHCHECK and monitoring probes.

Python docker_health tool tries /health first, falls back to root
endpoint for older plugin versions. Docker HEALTHCHECK updated to
use /health instead of /.
2026-03-06 14:40:23 -07:00
14b2b575c8 Remove AI buzzwords from descriptions and metadata
Describe what the tool does, not how it works. This builds trust
and sets accurate expectations.
2026-03-02 05:07:44 -07:00
f4cf1cef9e Fix port allocation Groundhog Day loop, expand pool to 128 ports
Port allocator would spin on the same Docker-occupied ports because
releasing a flock and re-calling allocate() restarts from port 8192.
Now holds flocks on occupied ports during the scan so allocate()
advances past them.

Also expands default pool from 32 to 128 ports (8192-8319), and
makes range configurable via MCGHIDRA_PORT_START/MCGHIDRA_PORT_END
environment variables.
2026-03-02 05:06:27 -07:00
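
The scan described in the commit above can be pictured roughly as follows. This is a minimal sketch, not the project's PortPool: the lock-file naming, the `try_lock` and `allocate` helpers, and the `is_occupied` callback (standing in for the check against Docker's published ports) are all assumptions made for illustration. It relies on `fcntl`, so it is Unix-only.

```python
import fcntl
import os


def try_lock(lock_dir, port):
    """Return an fd holding an exclusive flock for `port`, or None if taken."""
    path = os.path.join(lock_dir, "port-%d.lock" % port)  # hypothetical naming
    fd = os.open(path, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def allocate(lock_dir, start, end, is_occupied):
    """Return (port, fd) for the first usable port, or None if the pool is full."""
    held = []  # flocks kept on occupied ports for the duration of the scan
    try:
        for port in range(start, end + 1):
            fd = try_lock(lock_dir, port)
            if fd is None:
                continue  # another session owns this port
            if is_occupied(port):
                held.append(fd)  # keep the lock so the scan moves past it
                continue
            return port, fd
        return None
    finally:
        for fd in held:
            os.close(fd)  # closing the fd releases its flock
```

The caller keeps the returned fd open for as long as it uses the port; closing it hands the port back to the pool.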
112c1969c8 Fix port allocation to skip ports used by external Docker containers
When port 8192 was already in use by a non-MCGhidra container (e.g.,
LTspice), docker_start would fail instead of trying the next port.
Now loops through the pool, checking each candidate against Docker's
published ports before using it.

Also includes Docker build retry improvements from earlier session.
2026-02-11 05:37:40 -07:00
57f042a802 Fix exception handling for functions_create and data_create
- Change from 'except Exception' to bare 'except' to catch Java
  exceptions from Ghidra that don't inherit from Python Exception
- Use sys.exc_info() to safely extract error messages when str(e)
  might fail on certain Java exception types
- Add null checks after getAddress() since it can return None
  instead of throwing for invalid addresses
- Add last-resort response handling to prevent silent connection
  drops when exception handling itself fails

These endpoints now return proper JSON error responses instead of
causing "Empty reply from server" errors.
2026-02-07 06:22:25 -07:00
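
The pattern in the commit above might look roughly like this. It is a sketch under assumptions: `to_address` and `create` are hypothetical callables standing in for the Ghidra calls, and the response shape is invented for illustration. In CPython a bare `except` also catches `KeyboardInterrupt` and `SystemExit`, so it is only reasonable at a boundary like this one where a response must always be produced.

```python
import sys


def safe_error_message():
    """Describe the exception currently being handled without trusting str()."""
    exc_type, exc_value, _ = sys.exc_info()
    try:
        return str(exc_value)
    except:  # str() itself can fail on some exception objects
        return getattr(exc_type, "__name__", "unknown error")


def create_at(to_address, create, address_text):
    """Run `create` at an address and always return a JSON-ready dict."""
    try:
        address = to_address(address_text)
        if address is None:  # lookup returned None instead of raising
            return {"success": False, "error": "invalid address: %s" % address_text}
        return {"success": True, "result": create(address)}
    except:  # deliberately bare: not every raised object derives from Exception
        return {"success": False, "error": safe_error_message()}
```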
842035ca92 Remove dead UI tools that can never work in headless MCP mode
ui_get_current_address and ui_get_current_function require Ghidra GUI
context to know what the user has selected. Since MCP always runs
headless (Docker container), these tools always fail with HEADLESS_MODE
error. Removed them to avoid confusion.

Alternative: Use explicit addresses with functions_get(address=...) or
data_list(addr=...) instead.
2026-02-07 06:01:30 -07:00
c930e7c059 fix: Complete rename of remaining ghydra references
- Rename docker/GhydraMCPServer.py → MCGhidraServer.py
- Update extension.properties, MANIFEST.MF, Module.manifest
- Update .env and .env.example env var names
2026-02-07 02:28:54 -07:00
1143489924 refactor: Rename project from ghydramcp to mcghidra
- Rename src/ghydramcp → src/mcghidra
- Rename GhydraMCPPlugin.java → MCGhidraPlugin.java
- Update all imports, class names, and references
- Update pyproject.toml package name and script entry
- Update Docker image names and container prefixes
- Update environment variables: GHYDRA_* → MCGHIDRA_*
- Update all documentation references
2026-02-07 02:13:53 -07:00
d1750cb339 fix: Address code review issues across core modules
- http_client: Defensive copy before .pop() to avoid mutating caller's dict
- analysis.py: Add debug logging for fallback paths instead of silent swallow
- docker.py: Add debug logging to PortPool exception handlers
- docker.py: Fix file descriptor leak in _try_acquire_port with inner try/except
- docker.py: Lazy PortPool initialization via property to avoid side effects
- server.py: Wrap initial discovery in _instances_lock for thread safety
- server.py: Call configure_logging() at startup with GHYDRAMCP_DEBUG support
- pagination.py: Use SHA-256 instead of MD5 for query hash consistency
- base.py: Add proper type annotations (Dict[str, Any])
- filtering.py: Use List[str] from typing for consistency
- filtering.py: Add docstrings to private helper methods
- structs.py: Rename project_fields param to fields for API consistency
- logging.py: Fix import path from deprecated mcp.server.fastmcp to fastmcp
2026-02-06 04:50:47 -07:00
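
As an illustration of the first item in the commit above (the defensive copy before `.pop()`), here is a small sketch. The `build_request` function and its `timeout` key are hypothetical and are not taken from http_client.

```python
def build_request(endpoint, params=None):
    """Split transport options out of `params` without touching the caller's dict."""
    params = dict(params or {})  # copy first, so .pop() only affects the copy
    timeout = params.pop("timeout", 30)
    return {"endpoint": endpoint, "params": params, "timeout": timeout}
```

Without the copy, a caller that reuses the same dict for a second request would silently lose its `timeout` entry.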
04f3011413 docs: Rewrite README for clarity and current features
- Punchy hero section with terminal-style demo
- Feature table showing all 64 tools across 12 categories
- Docker quickstart as primary path (easiest)
- Clear usage patterns: current instance, Docker workflow, pagination
- Concise tool reference grouped by category
- Architecture notes explaining AI-agent design decisions
- Removed outdated v2.1 references and verbose API examples
2026-02-06 00:58:52 -07:00
41bd8445e9 fix: Make docker_health use current instance port by default
docker_health now uses get_instance_port(port) like all other tools,
so it defaults to the current working instance when no port is specified.

Workflow:
1. docker_auto_start(binary) -> returns port
2. Poll docker_health(port=N) until healthy
3. instances_use(port=N) to set as current
4. All subsequent analysis calls omit port
2026-02-06 00:49:41 -07:00
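
Step 2 of the workflow above, seen from the client side, could be sketched like this. The `check_health` callable stands in for a call to docker_health, and the `"healthy"` key in its result is an assumption; the real response shape is not shown in this log.

```python
import time


def wait_until_healthy(check_health, port, timeout=300.0, interval=2.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll `check_health(port)` until it reports healthy or `timeout` elapses."""
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        status = check_health(port)
        if status.get("healthy"):
            return {"healthy": True, "attempts": attempts}
        if clock() >= deadline:
            return {"healthy": False, "attempts": attempts}
        # Between polls the caller is free to inspect logs or give up early.
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.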
d298a89f5f refactor: Remove docker_wait tool entirely
docker_wait was the same anti-pattern as the wait param: it blocked
a single tool call for up to 5 minutes with no visibility.

LLMs should poll docker_health(port) in their own loop. This gives:
- Visibility into progress between polls
- Ability to check docker_logs while waiting
- Control over timeout and retry logic
- Opportunity to bail out early
2026-02-06 00:48:26 -07:00
5300fb24b8 refactor: Remove wait/timeout params from docker_auto_start
The wait parameter was a convenience anti-pattern that caused LLMs
to block on a single tool call for up to 5 minutes with no visibility
into progress.

Now docker_auto_start always returns immediately. Clients should use
docker_wait(port) separately to poll for container readiness. This
gives visibility into progress and allows early bailout.
2026-02-06 00:44:44 -07:00
6662c8411a fix: Make all Docker subprocess calls non-blocking
Previously only docker_health was fixed to use run_in_executor(),
but all other Docker operations (docker_status, docker_start,
docker_stop, docker_logs, docker_build, docker_cleanup) still
used synchronous subprocess.run() which blocked the async event
loop. This caused docker_auto_start(wait=True) to freeze the
entire MCP server.

Now _run_docker_cmd is async and runs subprocess calls in thread
executor. All callers updated to use await.
2026-02-06 00:41:25 -07:00
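
The general technique named in the commit above, pushing a blocking `subprocess.run()` onto the default thread executor, can be sketched as below. The `run_cmd` name and its parameters are assumptions and are not the project's `_run_docker_cmd`.

```python
import asyncio
import functools
import subprocess
import sys


async def run_cmd(args, timeout=30):
    """Run a command in a worker thread so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    call = functools.partial(
        subprocess.run, args, capture_output=True, text=True, timeout=timeout
    )
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, call)


async def main():
    # The interpreter itself is used as the command so the sketch runs anywhere.
    done = await run_cmd([sys.executable, "-c", "print('ok')"])
    return done.stdout.strip()
```

Callers invoke it with `await run_cmd([...])`; other coroutines keep running while the subprocess is in flight.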
f1986db6cc docs: Update CHANGELOG with Sprint 3+4 features and stability fixes
Added:
- Symbol CRUD operations (create/rename/delete/imports/exports)
- Bookmark management tools
- Enum and typedef creation
- Variable management (list/rename/functions_variables)
- Namespace and class tools
- Memory segment tools

Changed:
- Docker port allocation now auto-allocated from pool (8192-8223)
- docker_auto_start defaults to wait=False

Fixed:
- instances_use hanging (lazy registration pattern)
- Event loop blocking in docker_health (run_in_executor)
- Session isolation for docker_stop/docker_cleanup
- Background discovery thread timeout (30s → 0.5s)
- Typedef/variable type resolution
2026-02-05 10:39:18 -07:00
7eefdda9f8 Merge feat/api-gap-fill: Session isolation, non-blocking I/O, CRUD operations
Sprint 3+4 API gap filling plus critical stability fixes:

Features:
- Symbol CRUD (create, rename, delete)
- Bookmark management (list, create, delete)
- Enum/typedef creation
- Variable rename with type resolution

Stability fixes:
- Lazy instances_use (no blocking HTTP calls)
- Non-blocking health checks via thread executor
- Session isolation for docker_stop/cleanup
- Auto port allocation (removed client-specified ports)
- wait=False default for docker_auto_start
2026-02-05 09:27:26 -07:00
19f1428fb6 chore: track binaries/.gitkeep for test fixtures 2026-02-05 09:05:30 -07:00
48ccc2aff3 fix: non-blocking health checks and wait defaults
- docker_health now runs HTTP call in thread executor instead of
  blocking the async event loop (prevents MCP server freeze during polls)
- docker_auto_start defaults to wait=False so tool returns immediately
  (clients should call docker_wait separately if needed)
2026-02-02 14:25:07 -07:00
77ce01d313 fix: session isolation for docker_stop and docker_cleanup
- docker_stop now validates container belongs to current session
  before stopping (prevents one agent from stopping another's work)
- docker_cleanup now defaults to session_only=True for safety
  (agents can still use session_only=False with caution)

Addresses audit finding: tools could cause cross-session interference
2026-02-02 11:41:08 -07:00
d1f8779f05 fix: Remove client-specified port from docker_start/auto_start
Ports are now always allocated from the pool (8192-8199) automatically.
This prevents session collisions where different agents would specify
the same port and interfere with each other.

Clients can't accidentally (or intentionally) override the port allocation
— the pool manager handles all assignments.
2026-02-02 11:36:45 -07:00
458d4fb35b fix: Eliminate blocking HTTP call from instances_use
instances_use previously called register_instance which made a
blocking safe_get (30s timeout) to validate the connection. If the
Ghidra server was slow or unresponsive, this could hang the MCP tool
call indefinitely from the client's perspective.

Now instances_use creates a lazy stub entry and sets the port
immediately — pure in-memory, no network I/O. The first actual tool
call validates the connection naturally.

Also fix the background discovery thread, which used request_timeout
(30s) instead of discovery_timeout (0.5s) per port. The worst case per
scan cycle drops from 300s to 5s.
2026-01-31 20:20:30 -07:00
c93abaf86c feat: Complete Sprint 3+4 — CRUD operations, bookmarks, enums, typedefs
Sprint 3 (Symbol & Variable CRUD):
- Add symbols_create, symbols_rename, symbols_delete MCP tools
- Add variables_rename MCP tool with optional type change
- Implement corresponding Jython HTTP handlers in headless server

Sprint 4 (Bookmarks & Data Types):
- Add BookmarksMixin (bookmarks_list, bookmarks_create, bookmarks_delete)
- Add DataTypesMixin (enums_list, enums_create, typedefs_list, typedefs_create)
- Register both mixins in server.py, add resource caps in config.py

Fixes:
- Use resolve_data_type() for typedef creation and variable retyping
  (was missing builtin types like int, char, void)
- Fix docker_auto_start reusing containers with wrong binary loaded
  (now compares requested binary name against running instance)

Headless server (GhydraMCPServer.py): +14 routes, 58 total
MCP tools: 75 registered
Tested: 24/24 endpoint tests passing
2026-01-31 15:16:39 -07:00
0d25a0dc24 feat: Add symbols, segments, variables, namespaces mixins and search enhancements
New mixins wrapping existing Java HTTP endpoints:
- SymbolsMixin: symbols_list, symbols_imports, symbols_exports (+3 resources)
- SegmentsMixin: segments_list (+1 resource)
- VariablesMixin: variables_list, functions_variables (+1 resource)
- NamespacesMixin: namespaces_list, classes_list (+2 resources)

Additions to existing mixins:
- comments_get in AnalysisMixin (read complement to comments_set)
- program_info tool + resource in InstancesMixin

Search enhancements (Sprint 2):
- functions_list now passes name_contains, name_regex, addr to Java API
  for server-side filtering on large binaries

Brings tool count from 42 to 52 (excl. feedback), resources from 11 to 19.
2026-01-31 10:05:50 -07:00
1b42ab251e fix: Deadlock in instances_use when port not yet registered
instances_use held _instances_lock while calling register_instance,
which acquires the same lock. Because Lock is non-reentrant, the
second acquire hung forever.

- Release lock before calling register_instance (avoids blocking
  other threads during the HTTP health check too)
- Upgrade Lock → RLock as safety net for any other reentrant paths
2026-01-30 19:25:25 -07:00
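
Both halves of the fix above can be shown in a few lines. The `Registry` class below is a hypothetical stand-in for the server's instance table: `use()` drops the lock before calling `register()`, and the lock is an `RLock` so that a nested acquire from the same thread would still succeed.

```python
import threading


class Registry:
    """Toy instance table guarded by a reentrant lock."""

    def __init__(self):
        self._lock = threading.RLock()  # reentrant: same thread may re-acquire
        self.instances = {}
        self.current = None

    def register(self, port):
        with self._lock:
            self.instances.setdefault(port, {"port": port})

    def use(self, port):
        with self._lock:
            known = port in self.instances
        if not known:
            # Called with the lock released, so slow work in register()
            # does not block other threads.
            self.register(port)
        with self._lock:
            self.current = port
```

With a plain `threading.Lock`, a second acquire from the thread that already holds it never succeeds, which is the hang the commit describes.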
290252c0db feat: Add feedback collection via fastmcp-feedback
Allows AI clients to submit feedback about tool quality, report issues,
and track statistics. Persists to ~/.ghydramcp/feedback.db (SQLite).

- Add fastmcp-feedback dependency
- Add feedback_enabled / feedback_db_path config fields
- Wire add_feedback_tools() into create_server() with graceful fallback
- Show feedback path in startup banner

Disable with GHYDRA_FEEDBACK=false
2026-01-30 10:09:26 -07:00
70f226f68e feat: Add response size guard with field projection and server-side grep
return_all=True on large binaries (1800+ functions) produced 72K char
responses that exceeded the MCP tool result limit. Instead of truncating,
oversized responses now return a structured summary with sample data,
available fields, and actionable instructions for narrowing the query.

Three layers of filtering:
- Server-side grep: Jython HTTP handlers filter during Ghidra iteration
- Field projection: jq-style key selection strips unneeded fields
- Token budget guard: responses exceeding 8k tokens return a summary

New files: core/filtering.py (project_fields, apply_grep, estimate_and_guard)
Modified: config, pagination, base mixin, all 5 domain mixins, headless server
2026-01-29 16:07:06 -07:00
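
The three layers in the commit above could be sketched as follows. The function names come from the commit message, but the signatures, the 4-characters-per-token estimate, and the summary shape are assumptions rather than the contents of core/filtering.py.

```python
import json
import re


def apply_grep(items, pattern):
    """Keep items where any field value matches the regex."""
    rx = re.compile(pattern)
    return [it for it in items if any(rx.search(str(v)) for v in it.values())]


def project_fields(items, fields):
    """Keep only the requested keys on each item."""
    return [{k: it[k] for k in fields if k in it} for it in items]


def estimate_and_guard(items, max_tokens=8000, chars_per_token=4):
    """Return the items, or a summary when the estimated size is over budget."""
    estimated = len(json.dumps(items)) // chars_per_token
    if estimated <= max_tokens:
        return {"result": items}
    return {
        "guarded": True,
        "estimated_tokens": estimated,
        "total": len(items),
        "sample": items[:3],
        "available_fields": sorted({k for it in items for k in it}),
        "hint": "Narrow the query with grep, fields, or pagination.",
    }
```

Applying grep first and projection second keeps the regex able to match on fields that the projection would later drop.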
4c112a2421 feat(headless): Expand Python server to full API parity
Rewrite GhydraMCPServer.py from 348 to 2138 lines, implementing all 45
routes that the MCP client expects. Previously, most endpoints returned
{"error": "Not found"}, breaking tools like data_list, xrefs_list, and
memory_read.

Key changes:
- Regex-based routing table with method-aware dispatch
- Thread-safe Ghidra transactions via threading.Lock()
- Full read endpoints: functions, data, strings, memory, xrefs, structs
- Full write endpoints: rename, comment, signature, create function/data
- Analysis endpoints: callgraph traversal, dataflow, run analysis
- Jython/Python 2 compatible (no f-strings, type hints, or walrus ops)

Tested with Docker build and curl against all major endpoint groups.
MCP client integration verified working.
2026-01-27 16:23:27 -07:00
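
A regex-based routing table with method-aware dispatch, as mentioned in the commit above, might be shaped like this. The paths and handler names are invented for illustration and are not the server's actual routes. The sketch avoids f-strings and type hints in the spirit of the Jython constraint, although it has only been reasoned about under Python 3.

```python
import re

# (method, compiled path pattern, handler name); first full match wins.
ROUTES = [
    ("GET", re.compile(r"^/functions/?$"), "list_functions"),
    ("GET", re.compile(r"^/functions/(?P<address>0x[0-9a-fA-F]+)$"), "get_function"),
    ("PATCH", re.compile(r"^/functions/(?P<address>0x[0-9a-fA-F]+)$"), "rename_function"),
]


def dispatch(method, path):
    """Return (status, handler_name, path_params) for a request."""
    path_known = False
    for route_method, pattern, handler in ROUTES:
        match = pattern.match(path)
        if match is None:
            continue
        path_known = True
        if route_method == method:
            return 200, handler, match.groupdict()
    # A known path with the wrong verb is 405; an unknown path is 404.
    return (405 if path_known else 404), None, {}
```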
2d837d95fc feat: Add dynamic container provisioning for multi-process isolation
Refactors Docker mixin to support multiple Claude processes sharing
the same MCP server without port/container conflicts:

- PortPool class with flock-based cross-process synchronization
- Session-scoped container naming with UUID prefixes
- Docker label-based tracking for cross-process container discovery
- Automatic port allocation from pool (8192-8199)
- Cleanup mechanism for orphaned containers and stale locks
- New tools: docker_cleanup, docker_session_info
2026-01-26 18:40:01 -07:00
28b81ff359 feat: Add Python MCP bridge and build tooling
- Add ghydramcp Python package with FastMCP server implementation
- Add docker-compose.yml for easy container management
- Add Makefile with build/run targets
- Add QUICKSTART.md for getting started
- Add uv.lock for reproducible dependencies
2026-01-26 13:51:12 -07:00
a3ad70d302 refactor: Simplify Docker setup and modernize project structure
- Switch from Java to Python scripts (avoids OSGi bundle issues)
- Update pyproject.toml with proper src layout and ruff config
- Add binaries/ and ghidra-src/ to gitignore
- Clean up Module.manifest
2026-01-26 13:50:36 -07:00
88e1fe6ca8 feat: Add headless HTTP server and entrypoint scripts
- Add GhydraMCPServer.py with fixed strings endpoint (Jython compatible)
- Fix strings endpoint to iterate through defined data instead of using
  DefinedDataIterator.definedStrings() which isn't accessible in Jython
- Add entrypoint.sh for Docker container initialization
2026-01-26 13:11:45 -07:00
ee82f3b100 perf: Optimize Dockerfile for faster rebuilds on code changes
Separate Maven dependency resolution from compilation:
- COPY pom.xml first, run dependency:resolve (cached layer)
- COPY src second (only this invalidates on code changes)
- Build step reuses cached dependencies

Result: Code changes rebuild in ~30s instead of 3-5 min
(Ghidra download and Maven deps stay cached)
2026-01-26 04:34:26 -07:00
ac06111288 fix: Add Gson dependency for headless script support
GhydraMCPServer.java imports Gson but headless scripts run in a
separate OSGi classloader that can't access extension lib JARs.

Fix: Download gson-2.13.1.jar to Framework/Generic/lib/ where it's
available to all scripts regardless of execution mode.

Closes issue documented in BUG_REPORT_HEADLESS_GSON.md
2026-01-26 03:28:33 -07:00
f640df70ca feat: Add progress reporting for long-running prompts
Add real-time progress notifications to 7 MCP prompts that perform
multi-step scanning operations. Uses FastMCP's Context.report_progress()
and Context.info() for numeric progress and descriptive step messages.

Prompts updated:
- malware_triage (21 steps)
- analyze_imports (12 capability categories)
- identify_crypto (20 pattern scans)
- find_authentication (30 auth patterns)
- find_main_logic (22 entry point searches)
- find_error_handlers (35 error patterns)
- find_config_parsing (23 config patterns)

Infrastructure added:
- ProgressReporter class for throttled progress updates
- report_step() and report_progress() helper functions
- Graceful handling when ctx=None for backwards compatibility
2025-12-03 10:55:03 -07:00
c747abe813 feat: Add cursor-based pagination with grep filtering
- Implement pagination system for large responses (10K+ items)
- Add grep/regex filtering capability to results
- Session isolation for multi-client MCP scenarios
- Cursor management tools (next, list, delete, delete_all)
- Upgrade to mcp>=1.22.0 for FastMCP Context support
- Switch to date-based versioning (2025.12.1)
- Add prominent _message field to guide LLMs on cursor usage

10 tools with pagination support:
- functions_list - list all functions
- functions_decompile - decompiled code (line pagination)
- functions_disassemble - assembly (instruction pagination)
- functions_get_variables - function variables
- data_list - defined data items
- data_list_strings - string data
- xrefs_list - cross-references
- structs_list - struct types
- analysis_get_callgraph - call graph edges
- analysis_get_dataflow - data flow steps
2025-12-02 09:53:23 -07:00
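
The cursor mechanism in the commit above can be illustrated with a small in-memory store. This is a sketch only: the `CursorStore` class, its method names, and the page shape are assumptions, apart from the `_message` field, which the commit mentions. Session isolation and grep filtering are left out.

```python
import uuid


class CursorStore:
    """Hands out one page at a time plus an opaque cursor for the next page."""

    def __init__(self):
        self._cursors = {}

    def first_page(self, items, page_size):
        return self._page(list(items), 0, page_size)

    def next(self, cursor_id):
        # Each cursor is single-use; a fresh one is issued with every page.
        items, offset, page_size = self._cursors.pop(cursor_id)
        return self._page(items, offset, page_size)

    def _page(self, items, offset, page_size):
        end = offset + page_size
        page = {"items": items[offset:end], "total": len(items), "cursor_id": None}
        if end < len(items):
            cursor_id = uuid.uuid4().hex
            self._cursors[cursor_id] = (items, end, page_size)
            page["cursor_id"] = cursor_id
            page["_message"] = "More results: pass cursor_id to fetch the next page."
        return page
```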
Teal Bauer
662e202482 feat: add line filtering to decompile for context management
Add start_line, end_line, and max_lines parameters to decompile
functions, allowing AI models to retrieve only specific portions
of decompiled code for better context management.

Parameters:
- start_line: Start at this line number (1-indexed)
- end_line: End at this line number (inclusive)
- max_lines: Maximum lines to return (overrides end_line)

Response includes filter metadata with total_lines when filtering
is applied, helping models understand what portion they're viewing.

Examples:
- Get first 20 lines: max_lines=20
- Get lines 10-30: start_line=10, end_line=30
- Get 15 lines from line 25: start_line=25, max_lines=15
2025-11-14 17:49:19 +01:00
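
The parameter semantics listed in the commit above (1-indexed `start_line`, inclusive `end_line`, `max_lines` overriding `end_line`) can be sketched as below. The `filter_lines` name and the exact layout of the filter metadata are assumptions; only the parameter names and `total_lines` come from the commit message.

```python
def filter_lines(code, start_line=None, end_line=None, max_lines=None):
    """Return a slice of `code` by line number, plus metadata about the slice."""
    lines = code.splitlines()
    total = len(lines)
    start = max(start_line or 1, 1)       # line numbers are 1-indexed
    end = min(end_line or total, total)   # end_line is inclusive
    if max_lines is not None:
        end = min(start + max_lines - 1, total)  # max_lines overrides end_line
    selected = lines[start - 1:end]
    return {
        "code": "\n".join(selected),
        "filter": {"start_line": start, "end_line": end, "total_lines": total},
    }
```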
Teal Bauer
60124d2315 fix: clear sufficient space when changing data types
When applying a larger struct to an address, clear enough space
for the new data type rather than just the old data's length.
This prevents 'Conflicting data exists' errors when the new
type is larger than the existing data.

Fixes issue where ConfigParametersStruct couldn't be applied
due to conflicting smaller data items in the address range.
2025-11-14 13:20:49 +01:00
Teal Bauer
f32dc5504c Merge branch 'feat/struct-management' 2025-11-14 13:20:09 +01:00
Teal Bauer
afc09f1cd9 docs: update README and bump version to v2.1
Comprehensive documentation overhaul reflecting all enhancements since fork:
- Documented new namespaced tool organization (instances_*, functions_*, data_*, structs_*, etc.)
- Added complete struct management API documentation
- Updated example usage with realistic reverse engineering workflows
- Added uvx installation method (recommended: local install for version sync)
- Updated all repository URLs to starsong-consulting
- Bumped plugin and bridge versions to v2.1.0, API version to 2010
2025-11-14 12:58:55 +01:00
Teal Bauer
98a3ec4c34 Merge pull request #10 from nblog/build/uv
add UV management support
2025-11-14 12:19:00 +01:00
Teal Bauer
8268e55a08 Merge pull request #11 from starsong-consulting/feat/struct-management
feat: add struct data type management API
2025-11-14 12:16:53 +01:00
Teal Bauer
30d9bb17da feat: add struct data type management API
Add endpoints and MCP tools to create, read, update, and delete struct
data types in Ghidra's data type manager. Enables programmatic definition
of complex data structures for reverse engineering workflows.

Includes pagination, category filtering, and field-level operations
(add, update by name or offset).
2025-11-14 12:10:34 +01:00
nblog
f4212b8666 feat: update project version to 2.0.0 2025-11-12 15:30:31 +08:00
nblog
301c1849f8 feat: add initial project configuration with python version and script entry point 2025-11-12 14:44:33 +08:00
Teal Bauer
24f5f1698a docs: add Cline and Claude Code configuration
Add setup instructions for Cline and Claude Code MCP clients.
Include configuration examples and OS-specific paths.

Fixes #6
2025-11-11 13:44:06 +01:00
Teal Bauer
0c451a447c docs: clarify max_depth parameter for call graph analysis
Improve documentation to make it clear that max_depth can be increased
for deeper call chains. This helps AI models understand when and how to
use larger depth values.

Refs #8
2025-11-11 13:43:35 +01:00
Teal Bauer
3222cf9866 feat: upgrade to Ghidra 11.4.2 with improved transaction handling
- Update all Ghidra JAR dependencies to 11.4.2
- Improve TransactionHelper to properly handle endTransaction return value
- Add GHIDRA_HOME environment variable support for flexible builds
- Update version references in extension.properties and MANIFEST.MF

The transaction fix now checks the return value from endTransaction() and
properly reports transaction failures, providing better error handling for
Ghidra 11.3.2+ compatibility.

Refs #7
2025-11-11 13:01:47 +01:00
Teal Bauer
bc1e137878 chore: prepare v2.0.0 release
- Update version to v2.0.0 in ApiConstants.java and bridge_mcp_hydra.py
- Create CHANGELOG v2.0.0 section with release date
- Fix Ghidra 11.3.2+ compatibility in TransactionHelper (endTransaction signature)
- Clarify instances_list vs instances_discover usage in documentation
- Remove commented-out code in pom.xml

Fixes #7
Closes #5
2025-11-11 12:54:03 +01:00
Teal Bauer
4379bea14f Release v2.0.0-beta.5: Critical fixes for stable release
- Fixed memory_write() endpoint to use correct ProgramEndpoints path
- Standardized all error responses to structured format with error codes
- Enhanced instances_discover() to return both new and existing instances
- Updated API version to 2005 for compatibility tracking
- Verified all bridge-to-plugin endpoint mappings are correct
- Confirmed route registration order follows proper specificity

Ready for v2.0.0 stable release after final testing.
2025-06-18 00:51:00 +02:00
Teal Bauer
977791432f fix: add addr filter to data list 2025-05-22 15:36:43 +02:00
Teal Bauer
2ec406b913 fix: remove duplicated decompilation text 2025-05-22 15:36:35 +02:00
Teal Bauer
52312f9ed3 chore: bump versions
2025-05-22 14:51:56 +02:00
Teal Bauer
f8cf202a50 fix: add data_list_strings tool for string data access 2025-05-22 14:42:06 +02:00
Teal Bauer
f04223d23a refactor: clean up python comments and consolidate data test files 2025-05-22 14:35:59 +02:00
Teal Bauer
89fa811284 fix: update function signature test to use different data types 2025-05-22 14:35:59 +02:00
Teal Bauer
384f8b272f fix: update callgraph endpoint to accept both name and address parameters
- Modified ProgramEndpoints.java to support the name and address parameters
- Updated bridge MCP tool analysis_get_callgraph to use both parameters
- Updated tests to verify functionality with both parameters
- Added the change to CHANGELOG.md
2025-05-22 14:35:59 +02:00
Teal Bauer
b37be370f8 fix: add URIs to resources and split resource access by name and address 2025-05-22 14:34:40 +02:00
Teal Bauer
8313b2bd7b feat: Add reverse_engineer_binary prompt for comprehensive binary analysis 2025-05-22 14:34:40 +02:00
Teal Bauer
0f9aa2bb47 feat: Refactor bridge for better MCP integration
Implemented the refactoring proposal to optimize the bridge for better MCP integration:
- Added resources for context loading (instances, functions, disassembly)
- Added prompts for common analysis patterns
- Reorganized tools into namespaced functions for better discoverability
- Implemented current working instance concept for simpler usage
- Added documentation for the namespaces-based approach
2025-05-22 14:34:40 +02:00
Teal Bauer
8aded2e6c3 fix: Add another address check for creating functions 2025-05-22 14:25:09 +02:00
Teal Bauer
58f22316bd chore: Bump version identifiers 2025-05-22 09:12:10 +02:00
Teal Bauer
5e1b1d4cbd docs: Add CONTRIBUTING.md 2025-05-22 09:11:00 +02:00
Teal Bauer
24f84fc4ea docs: Update CHANGELOG 2025-05-22 08:17:16 +02:00
Teal Bauer
f377a34442 fix: Fix handling of variable operations in URL paths
This commit fixes an issue where variable operations with paths like
/functions/by-name/FunctionName/variables/varName were not being properly
processed. The handleFunctionResource method now checks for paths that
start with 'variables/' and extracts the variable name for processing.

Added implementation to handle renaming of decompiler-generated variables.
2025-05-22 08:14:57 +02:00
Teal Bauer
c4d170cdca fix: make decompiler variables renameable 2025-05-21 18:04:30 +02:00
Teal Bauer
7cf426ef53 docs: Add strings endpoint documentation
Add documentation for the new strings endpoint in both:
- README.md: List as a feature and add example usage
- GHIDRA_HTTP_API.md: Add complete API reference with parameters and response format
2025-05-21 17:23:17 +02:00
Teal Bauer
f71f4aa43b feat: Add strings endpoint to list string data in the binary
The new endpoint provides:
- GET /strings endpoint with pagination and filtering
- Python bridge support via list_strings() function
- Searching of string data types across memory blocks
- Filtering options for string content
2025-05-21 17:15:53 +02:00
Teal Bauer
25f353a4f3 fix: Attempt to disassemble memory before creating functions 2025-05-21 13:13:15 +02:00
Teal Bauer
4eadbc9859 Prepare CHANGELOG for 2.0.0
2025-04-15 10:39:36 +02:00
Teal Bauer
3fd0cf499a docs: Update README for v2.0.0-beta.1
- Add comprehensive description of v2.0.0 features and capabilities
- Update API reference to include all available tools and operations
- Document HATEOAS architecture and response format
- Add detailed examples of using the new data manipulation API
- Update installation instructions for v2.0.0-beta.1
2025-04-14 21:37:42 +02:00
Teal Bauer
fedd2d0a55 feat: Update version to 2.0.0-beta.1 and add API version check
- Update Java plugin version to v2.0.0-beta.1
- Add version identifier to bridge script
- Add API version check in bridge to verify plugin compatibility
- Bridge script will now check for API version 2 compatibility
2025-04-14 21:28:50 +02:00
Teal Bauer
6c28553c58 fix: Implement create_data and delete_data functionality
- Add handleCreateData method to Java plugin to support creating new data
- Add data type mapping to support common types like byte, word, dword, string
- Implement delete_data functionality with graceful handling of missing data
- Add proper error handling when conflicts are detected
- Add comprehensive tests for both create_data and delete_data functionality
2025-04-14 21:23:45 +02:00
Teal Bauer
28870e9b54 feat: Improve data manipulation API
- Add comprehensive data manipulation capabilities
- Implement separate pathways for rename-only, type-only, and combined operations
- Fix HTTP request body consumption issue in DataEndpoints
- Standardize on 'type' parameter name instead of 'dataType'
- Add thorough test coverage with dedicated test_data_update.py script
- Update API documentation to version 2 with full endpoint descriptions
- Update CHANGELOG with detailed information about data manipulation features
2025-04-14 19:24:14 +02:00
Teal Bauer
5797fb38e7 feat: Add data renaming and type setting capabilities
- Add support for setting data types and renaming data items
- Fix Java API implementation for data operation endpoints
- Create rename_data and set_data_type tools for clearer separation of concerns
- Add comprehensive test scripts for data operations
- Successfully test changing data types and naming
2025-04-14 12:07:44 +02:00
Teal Bauer
2a1607cacf fix: Update set_decompiler_comment to use function comments 2025-04-14 11:44:42 +02:00
Teal Bauer
bd4e6b1194 ci: Update GitHub workflow to run on feature branches 2025-04-14 11:27:47 +02:00
Teal Bauer
9b19011b7d finalize HATEOAS updates 2025-04-14 11:25:22 +02:00
Teal Bauer
4268d3e2c5 test: Add set_function_signature test to MCP client test
- Add comprehensive test for the set_function_signature tool
- Update test_mcp_client.py with modernized API naming
- Fix HATEOAS link detection to handle both _links and api_links
2025-04-14 11:24:51 +02:00
Teal Bauer
48dc0b2e96 fix: Update HTTP test to properly test xrefs 2025-04-14 09:52:23 +02:00
Teal Bauer
9443101298 fix: Update XrefsEndpoints to use Ghidra API correctly
- Replace getDefinedAddresses with proper Listing methods
- Use getInstructions and getDefinedData instead
- Check both instructions and data for a valid starting address
- Improve fallback strategy for current address retrieval
2025-04-14 09:46:45 +02:00
Teal Bauer
0eaa19a6e8 fix: Resolve compatibility issues with Ghidra API
- Add helper method to handle ReferenceIterator conversion
- Support both Reference[] and ReferenceIterator types
- Replace LocationService with reflection-based approach
- Use listing's defined addresses as fallback
2025-04-14 09:27:49 +02:00
Teal Bauer
2b1fe6c4e1 fix: Update XrefsEndpoints.java for Ghidra 11 API compatibility
- Fix reference handling using ReferenceIterator
- Use proper Ghidra 11 services for current address retrieval
- Implement location tracking via LocationService and SelectionService
2025-04-14 08:54:18 +02:00
Teal Bauer
5dc59ced59 fix: Resolve compilation errors in XrefsEndpoints for Ghidra 11+
- Update reference handling to use arrays instead of iterators
- Simplify getCurrentAddress implementation for Ghidra 11+
2025-04-14 01:41:39 +02:00
Teal Bauer
96788f35fc feat: Implement proper cross-references (xrefs) functionality
- Java plugin now uses Ghidra ReferenceManager to find real cross-references
- Added detailed information about xrefs, including related functions and instructions
- Bridge script now provides simplified and human-readable text for xrefs
- Support bi-directional search for references to/from addresses
- Added filtering by reference type
- Properly implement getCurrentAddress using Ghidra service APIs
2025-04-14 01:37:41 +02:00
Teal Bauer
4f3042f6ee feat: Update bridge_mcp_hydra.py to work with HATEOAS API
- Remove all legacy API compatibility code
- Implement simplified response structure for AI agents
- Add text representations of structured data like disassembly
- Support addressing functions by name or address
- Improve parameter validation and error handling
- Implement all endpoints from the API specification
2025-04-14 01:28:35 +02:00
Teal Bauer
6c865c456e perf: Optimize variables endpoint with efficient pagination
- Implemented efficient pagination for variables endpoints to avoid timeout
- Added globalOnly parameter to allow fetching just global variables
- Limited decompilation to only process functions needed for current page
- Improved estimation of total count for better pagination links
- Reduced decompilation timeout to improve performance
2025-04-14 00:49:46 +02:00
Teal Bauer
3df129f3fd fix: Implement real instruction disassembly and fix xrefs and memory endpoints
- Fixed disassembly endpoint to show real instructions instead of placeholders
- Improved memory endpoint to handle address errors gracefully
- Fixed address arithmetic error in xrefs endpoint
- Added proper error handling and fallbacks in all endpoints
2025-04-14 00:16:49 +02:00
Teal Bauer
3311e88565 WIP fix endpoints 2025-04-14 00:08:10 +02:00
Teal Bauer
5d6b202599 Restore the correct API doc 2025-04-13 20:46:49 +02:00
Teal Bauer
4bc22674ec feat: Implement HATEOAS-compliant API endpoints
- Add ProgramEndpoints for proper HATEOAS URL structure
- Fix response structure to include required HATEOAS links
- Ensure proper result formats for segments, decompiled functions, and variables
- Reorganize endpoints to use nested resource pattern (/programs/current/functions/{address})
- Fix all tests to ensure HATEOAS compliance

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-04-13 20:29:11 +02:00
Teal Bauer
41bfa40d3a Fix bridge error 2025-04-13 09:38:13 +02:00
Teal Bauer
9879e71e88 WIP big refactor 2025-04-10 14:42:53 +02:00
Teal Bauer
454c73908c feat: Complete Ghidra HTTP API implementation
- Added memory operations (read/write)
- Implemented cross-reference analysis
- Added program analysis tools (callgraph, dataflow)
- Standardized all endpoints to use structured JSON
- Improved error handling and response metadata
- Added comprehensive API documentation
2025-04-10 00:01:04 +02:00
Teal Bauer
57584581bc WIP update APIs 2025-04-09 23:36:05 +02:00
Teal Bauer
6b2e572bd4 API evolution plan 2025-04-09 20:52:55 +02:00
Teal Bauer
ba7781643f chore: Completed conversion of bridge/plugin protocol to pure JSON 2025-04-09 14:35:58 +02:00
Teal Bauer
9a1f97fa80 Make GHYDRAMCP host and port configurable for tests 2025-04-09 10:15:49 +02:00
Teal Bauer
5d588ba853 Added API documentation and improved bridge functionality
- Added comprehensive GHIDRA_HTTP_API.md documenting the Java plugin's endpoints
- Improved bridge_mcp_hydra.py with better docstrings and type hints
- Updated port range calculations to use DEFAULT_GHIDRA_PORT
- Cleaned up comments and simplified code
- Improved error handling and response formatting
- Standardized API response structure across all endpoints
2025-04-09 10:13:15 +02:00
Teal Bauer
4fe3c16d25 feat: Add decompiler configuration options
Add toggleCCode, toggleSyntaxTree and setSimplificationStyle controls to
the Java plugin. These allow controlling decompiler output format between
C code (default) and raw decompiler output with syntax trees.

Example usage:
decompile_function_by_address(port=8192, address='0x1000', cCode=True)
decompile_function_by_address(port=8192, address='0x1000', syntaxTree=True)
2025-04-08 21:35:37 +02:00
Teal Bauer
bd56f5b6cc fix: failure response for decompile_function with address 2025-04-08 10:50:18 +02:00
Teal Bauer
52e0142f83 Update CHANGELOG.md for v1.4.0 release 2025-04-08 10:43:56 +02:00
Teal Bauer
c1d12a1ac8 fix: build complete package in package phase
2025-04-08 10:40:14 +02:00
Teal Bauer
e94f91dc01 fix: We don't need destName, we can keep the version info
2025-04-08 10:35:08 +02:00
Teal Bauer
c530976a6a fix: Force inner ZIP to be included 2025-04-08 10:34:06 +02:00
Teal Bauer
8f3fa16ad8 Explicitly require ProgramManager 2025-04-08 09:44:11 +02:00
Teal Bauer
a469b38ec4 fix: Fix versioning once and for all, maybe? 2025-04-08 09:43:53 +02:00
Teal Bauer
ea3776e485 fix: make fetching closest git tag work 2025-04-08 09:28:32 +02:00
Teal Bauer
d4122402b4 fix: Make sure jar is always named GhydraMCP.jar 2025-04-08 09:02:59 +02:00
Teal Bauer
3a1772a3c9 fix: ensure extension zip included in complete package
Moved plugin assembly execution to prepare-package phase to ensure
the extension zip artifact is created before the package phase.

Added maven-resources-plugin execution (copy-plugin-zip) to the
package phase to copy the generated extension zip into a reliable
'target/staging' directory.

Updated complete-package.xml assembly descriptor to use a fileSet
referencing 'target/staging' instead of relying on dependencySets
or potentially unstable property expansion for artifact inclusion.

This resolves the issue where the complete package artifact was
missing the Ghidra extension zip.
2025-04-07 22:50:26 +02:00
Teal Bauer
908d0decfe fix: Correct GHA complete package assembly
Updated complete-package.xml to use correct properties
(${git.commit.id.abbrev}-${maven.build.timestamp}) for including the
Ghidra extension zip artifact, resolving the incomplete package issue
in GHA builds.
2025-04-07 22:39:30 +02:00
120 changed files with 40510 additions and 3406 deletions


@@ -2,7 +2,7 @@ name: Build Ghidra Plugin
 on:
   push:
-    branches: [ main ]
+    branches: [ main, 'api-*', 'feature/*', 'bugfix/*' ]
     tags:
       - 'v*'
   pull_request:

7
.gitignore vendored

@@ -50,3 +50,10 @@ hs_err_pid*
 replay_pid*
 /__pycache__
+# Docker volumes and binaries
+/binaries/*
+!/binaries/.gitkeep
+# Ghidra source (fetched separately)
+/ghidra-src/

1
.python-version Normal file

@@ -0,0 +1 @@
+3.11

115
BUG_REPORT_HEADLESS_GSON.md Normal file

@@ -0,0 +1,115 @@
# Bug Report: Docker Headless Mode Fails - Missing Gson Dependency
## Summary
The MCGhidra Docker container fails to start the HTTP API server because `MCGhidraServer.java` imports Gson, but Gson is not available in Ghidra's headless script classpath.
## Environment
- MCGhidra Docker image: `mcghidra:latest`
- Ghidra Version: 11.4.2
- Build Date: 2025-08-26
## Steps to Reproduce
1. Build the Docker image:
```bash
docker build -t mcghidra:latest -f docker/Dockerfile .
```
2. Run with a binary:
```bash
docker run -p 8192:8192 -v /path/to/binary:/binaries/test mcghidra:latest /binaries/test
```
3. Check logs:
```bash
docker logs <container_id>
```
## Expected Behavior
Container should start and expose HTTP API on port 8192.
## Actual Behavior
Analysis completes but the script fails to load:
```
INFO REPORT: Analysis succeeded for file: file:///binaries/cardv (HeadlessAnalyzer)
ERROR REPORT SCRIPT ERROR: MCGhidraServer.java : The class could not be found.
It must be the public class of the .java file: Failed to get OSGi bundle containing script:
/opt/ghidra/scripts/MCGhidraServer.java (HeadlessAnalyzer)
```
The health check fails because the HTTP server never starts:
```json
{"healthy":false,"port":8192,"error":"[Errno 111] Connection refused"}
```
## Root Cause Analysis
`MCGhidraServer.java` (lines 22-24) imports Gson:
```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
```
However:
1. Gson is **not** bundled with Ghidra
2. The MCGhidra extension JAR includes Gson, but headless scripts run in a **separate OSGi classloader** without access to extension lib dependencies
3. The Dockerfile doesn't copy Gson to Ghidra's script classpath
## Verification
```bash
# Check if Gson is in the built extension
unzip -l target/MCGhidra-*.zip | grep -i gson
# Result: No matches
# Check Ghidra's lib directories
ls /opt/ghidra/Ghidra/Framework/*/lib/ | grep -i gson
# Result: No matches
```
## Proposed Solutions
### Option 1: Bundle Gson JAR with Scripts (Recommended)
Add Gson JAR to Ghidra's script classpath in Dockerfile:
```dockerfile
# Download Gson and add to Ghidra lib
RUN curl -fsSL "https://repo1.maven.org/maven2/com/google/gson/gson/2.10.1/gson-2.10.1.jar" \
-o /opt/ghidra/Ghidra/Framework/Generic/lib/gson-2.10.1.jar
```
### Option 2: Use Built-in JSON (No External Dependencies)
Rewrite `MCGhidraServer.java` to use only JDK classes:
- Replace Gson with manual JSON string building (`javax.json` is not part of the JDK, so it would still be an external dependency)
- This ensures the script works without any external dependencies
### Option 3: Pre-compiled Script JAR
Compile `MCGhidraServer.java` with Gson into a JAR and place it in the extension, then reference it differently in headless mode.
## Impact
- **Severity**: High - Docker deployment is completely broken
- **Affected**: All users attempting to use Docker/headless mode
- **Workaround**: None currently (must use GUI mode)
## Additional Context
The main MCGhidra plugin works fine in GUI mode because the extension's lib dependencies are loaded. This only affects the headless Docker workflow where scripts are loaded separately from the extension.
---
**Reported by**: Firmware analysis session
**Date**: 2026-01-26
**Binary being analyzed**: WOLFBOX G850 dashcam `cardv` (ARM 32-bit)


@@ -6,17 +6,197 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]
### Added
- **Symbol CRUD Operations:** Full create/rename/delete support for symbols and labels:
- `symbols_create` - Create new label/symbol at an address
- `symbols_rename` - Rename existing symbol
- `symbols_delete` - Delete symbol at an address
- `symbols_imports` - List imported symbols with pagination
- `symbols_exports` - List exported symbols with pagination
- **Bookmark Management:** Tools for managing Ghidra bookmarks:
- `bookmarks_list` - List bookmarks with type/category filtering
- `bookmarks_create` - Create bookmark at address (Note, Warning, Error, Info types)
- `bookmarks_delete` - Delete bookmarks at an address
- **Enum & Typedef Creation:** Data type creation tools:
- `enums_create` - Create new enum data type
- `enums_list` - List enum types with members
- `typedefs_create` - Create new typedef
- `typedefs_list` - List typedef data types
- **Variable Management:** Enhanced variable operations:
- `variables_list` - List variables with global_only filter
- `variables_rename` - Rename and retype function variables
- `functions_variables` - List local variables and parameters for a function
- **Namespace & Class Tools:**
- `namespaces_list` - List all non-global namespaces
- `classes_list` - List class namespaces with qualified names
- **Memory Segment Tools:**
- `segments_list` - List memory segments with R/W/X permissions and size info
- **Progress Reporting for Long Operations:** 7 MCP prompts now report real-time progress during multi-step scanning operations:
- `malware_triage` - Reports progress across 21 scanning steps
- `analyze_imports` - Reports progress across 12 capability categories
- `identify_crypto` - Reports progress across 20 pattern scans
- `find_authentication` - Reports progress across 30 auth pattern scans
- `find_main_logic` - Reports progress across 22 entry point searches
- `find_error_handlers` - Reports progress across 35 error pattern scans
- `find_config_parsing` - Reports progress across 23 config pattern scans
- Uses FastMCP's `Context.report_progress()` for numeric progress updates
- Uses `Context.info()` for descriptive step notifications
- Helper functions `report_step()` and `report_progress()` for consistent reporting
- **Specialized Analysis Prompts:** 13 new MCP prompts for common reverse engineering workflows:
- `analyze_strings` - String analysis with categorization and cross-reference guidance
- `trace_data_flow` - Data flow and taint analysis through functions
- `identify_crypto` - Cryptographic function and constant identification
- `malware_triage` - Quick malware analysis with capability assessment checklist
- `analyze_protocol` - Network/file protocol reverse engineering framework
- `find_main_logic` - Navigate past CRT initialization to find actual program logic
- `analyze_imports` - Categorize imports by capability with suspicious pattern detection
- `find_authentication` - Locate auth, license checks, and credential handling code
- `analyze_switch_table` - Reverse engineer command dispatchers and jump tables
- `find_config_parsing` - Identify configuration file parsing and settings management
- `compare_functions` - Compare two functions for similarity (patches, variants, libraries)
- `document_struct` - Comprehensively document data structure fields and usage
- `find_error_handlers` - Map error handling, cleanup routines, and exit paths
### Changed
- **Docker Port Allocation:** Ports are now auto-allocated from pool (8192-8223) instead of client-specified. Prevents session collisions in multi-agent environments.
- **docker_auto_start:** Removed `wait` and `timeout` parameters. Always returns immediately after starting container.
- **Removed docker_wait tool:** This tool blocked for up to 5 minutes in a single call. LLMs should poll `docker_health(port)` in their own loop instead — this gives visibility into progress and ability to check logs between polls.
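
The client-side polling recommended in the last item could look roughly like the sketch below. It is an illustration only, not MCGhidra code: `poll_until_healthy`, its parameters, and the retry values are hypothetical, and `check` stands in for whatever invokes `docker_health(port)`. The only assumption about the result is that it is a dict with a `healthy` key, as in the health output quoted in the bug report above.

```python
import time
from typing import Any, Callable, Dict, List


def poll_until_healthy(
    check: Callable[[], Dict[str, Any]],
    max_attempts: int = 30,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `check` until it reports healthy or the attempts run out.

    Hypothetical helper: returns the last result plus the full history so
    the caller can see how the container progressed between polls.
    """
    history: List[Dict[str, Any]] = []
    result: Dict[str, Any] = {"healthy": False, "error": "not polled"}
    for attempt in range(1, max_attempts + 1):
        result = check()
        history.append(result)
        if result.get("healthy"):
            break
        if attempt < max_attempts:
            # Between polls the caller is free to do other work,
            # for example fetching container logs.
            sleep(interval_s)
    return {"result": result, "attempts": len(history), "history": history}
```

Because each poll is a separate short call, the caller keeps control between attempts, which is the visibility the removed blocking tool could not offer.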
### Fixed
- **instances_use Hanging:** Eliminated 4+ hour hangs by removing a blocking HTTP call. `instances_use` now registers lazily: it creates a stub entry and validates the instance on the first real tool call.
- **All Docker Operations Non-Blocking:** ALL Docker subprocess calls (`docker ps`, `docker run`, `docker stop`, etc.) now run in thread executor via `run_in_executor()`. Previously only `docker_health` was fixed, but `docker_status`, `docker_start`, `docker_stop`, `docker_logs`, `docker_build`, and `docker_cleanup` still blocked the event loop. This caused `docker_auto_start(wait=True)` to freeze the MCP server.
- **Session Isolation:** `docker_stop` now validates container belongs to current session before stopping. `docker_cleanup` defaults to `session_only=True` to prevent cross-session interference.
- **Background Discovery Thread:** Reduced the port-scan timeout from 30s to 0.5s, cutting the discovery cycle from 300s+ to ~15s.
- **Typedef/Variable Type Resolution:** Fixed `handle_typedef_create` and `handle_variable_rename` to use shared `resolve_data_type()` for builtin types (int, char, etc.).
- **DockerMixin Inheritance:** Fixed crash when `DockerMixin` called `get_instance_port()` — was inheriting from wrong base class.
- **Deprecated asyncio API:** Replaced `asyncio.get_event_loop()` with `asyncio.get_running_loop()` for Python 3.10+ compatibility.
- **HTTP Client Data Mutation:** `safe_post`, `safe_put`, and `safe_patch` no longer mutate the caller's data dict via `.pop()`.
- **Race Condition in Discovery:** Initial instance discovery in `main()` now uses `_instances_lock` for thread safety.
- **Silent Exception Handling:** Added debug logging to PortPool exception handlers and analysis fallback paths.
- **File Descriptor Leak:** Fixed potential leak in `PortPool._try_acquire_port()` if write operations fail after lock acquisition.
- **Hash Algorithm Consistency:** Changed query hash from MD5 to SHA-256 in pagination module for consistency with cursor ID generation.
- **Lazy PortPool Initialization:** `PortPool` now created on first use, avoiding `/tmp/mcghidra-ports` directory creation when Docker tools are never used.
- **Logging Configuration:** `configure_logging()` is now called during server startup, so debug messages are actually emitted.
- **Type Hint Consistency:** Aligned `filtering.py` to use `List[T]` from the `typing` module, like the rest of the codebase.
- **Parameter Naming:** Renamed `project_fields` to `fields` in `structs_get()` for consistency with other tools.
- **Import Path:** Fixed `logging.py` to import `Context` from `fastmcp` (not deprecated `mcp.server.fastmcp` path).
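
The non-blocking Docker fix listed above relies on a standard asyncio pattern: hand the blocking `subprocess` call to a thread executor so the event loop stays responsive. The following is a minimal sketch of that pattern, not the project's implementation. The helper names `run_command` and `_run_blocking` are hypothetical, and the demo launches the current Python interpreter rather than `docker` so that it stays self-contained.

```python
import asyncio
import subprocess
import sys
from typing import List


def _run_blocking(cmd: List[str]) -> subprocess.CompletedProcess:
    # Blocking call: must not run directly on the event loop thread.
    return subprocess.run(cmd, capture_output=True, text=True, timeout=30)


async def run_command(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run a blocking subprocess in the default thread executor."""
    # get_running_loop() is the call to use inside a coroutine.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, _run_blocking, cmd)


async def demo() -> str:
    # Stand-in for a Docker CLI invocation such as ["docker", "ps"].
    proc = await run_command([sys.executable, "-c", "print('ok')"])
    return proc.stdout.strip()
```

Running `asyncio.run(demo())` drives the coroutine. While the child process runs, other tasks on the same loop can continue to make progress.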
### Added
- **Debug Logging Environment Variable:** Set `MCGHIDRA_DEBUG=1` to enable DEBUG-level logging for troubleshooting.
## [2025.12.1] - 2025-12-01
### Added
- **Cursor-Based Pagination System:** Implemented efficient pagination for large responses (10K+ items) without filling context windows.
- `page_size` parameter (default: 50, max: 500) for controlling items per page
- `cursor_id` returned for navigating to subsequent pages
- Session isolation prevents cursor cross-contamination between MCP clients
- TTL-based cursor expiration (5 minutes) with LRU eviction (max 100 cursors)
- **Grep/Regex Filtering:** Added `grep` and `grep_ignorecase` parameters to filter results with regex patterns before pagination.
- **Bypass Option:** Added `return_all` parameter to retrieve complete datasets (with large response warnings).
- **Cursor Management Tools:** New MCP tools for cursor lifecycle management:
- `cursor_next(cursor_id)` - Fetch next page of results
- `cursor_list()` - List active cursors for current session
- `cursor_delete(cursor_id)` - Delete specific cursor
- `cursor_delete_all()` - Delete all session cursors
- **Enumeration Resources:** New lightweight MCP resources for quick data enumeration (more efficient than tool calls):
- `ghidra://instances` - List all active Ghidra instances
- `ghidra://instance/{port}/summary` - Program overview with statistics
- `ghidra://instance/{port}/functions` - List functions (capped at 1000)
- `ghidra://instance/{port}/strings` - List strings (capped at 500)
- `ghidra://instance/{port}/data` - List data items (capped at 1000)
- `ghidra://instance/{port}/structs` - List struct types (capped at 500)
- `ghidra://instance/{port}/xrefs/to/{address}` - Cross-references to an address
- `ghidra://instance/{port}/xrefs/from/{address}` - Cross-references from an address
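
To make the cursor behaviour above concrete, here is a small sketch of a cursor store with TTL expiry and LRU eviction. It is a simplified illustration under stated assumptions, not the MCGhidra pagination module: the class name `CursorStore`, its method names, and the response keys `items` and `has_more` are hypothetical, and session isolation is left out entirely. Only the limits (page size 50 by default and 500 at most, 5-minute TTL, 100 cursors) come from the changelog entry.

```python
import time
import uuid
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class CursorStore:
    """Hold paginated result sets with TTL expiry and LRU eviction."""

    def __init__(self, ttl_s: float = 300.0, max_cursors: int = 100,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._max = max_cursors
        self._clock = clock
        # cursor_id -> [items, offset, page_size, last_access]
        self._cursors: "OrderedDict[str, list]" = OrderedDict()

    def create(self, items: List[Any], page_size: int = 50) -> Dict[str, Any]:
        page_size = max(1, min(page_size, 500))  # clamp to the documented max
        cursor_id = uuid.uuid4().hex
        self._cursors[cursor_id] = [items, 0, page_size, self._clock()]
        while len(self._cursors) > self._max:
            self._cursors.popitem(last=False)  # evict least recently used
        return self.next(cursor_id)

    def next(self, cursor_id: str) -> Dict[str, Any]:
        entry = self._cursors.get(cursor_id)
        if entry is None or self._clock() - entry[3] > self._ttl_s:
            self._cursors.pop(cursor_id, None)
            raise KeyError(f"cursor {cursor_id} not found or expired")
        items, offset, page_size, _ = entry
        page = items[offset:offset + page_size]
        entry[1] = offset + len(page)
        entry[3] = self._clock()
        self._cursors.move_to_end(cursor_id)  # mark as most recently used
        has_more = entry[1] < len(items)
        if not has_more:
            del self._cursors[cursor_id]  # exhausted cursors are dropped
        return {"items": page,
                "cursor_id": cursor_id if has_more else None,
                "has_more": has_more}
```

The injectable `clock` is a design choice for testability: expiry can be exercised without waiting five minutes.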
### Changed
- **MCP Dependency Upgrade:** Updated from `mcp==1.6.0` to `mcp>=1.22.0` for FastMCP Context support.
- **Version Strategy:** Switched to date-based versioning (YYYY.MM.D format).
- **Tool Updates:** 11 tools now support pagination with grep filtering:
- `functions_list` - List functions with pagination
- `functions_decompile` - Decompiled code with line pagination (grep for code patterns)
- `functions_disassemble` - Assembly with instruction pagination (grep for opcodes)
- `functions_get_variables` - Function variables with pagination
- `data_list` - List data items with pagination
- `data_list_strings` - List strings with pagination
- `xrefs_list` - List cross-references with pagination
- `structs_list` - List struct types with pagination
- `structs_get` - Struct fields with pagination (grep for field names/types)
- `analysis_get_callgraph` - Call graph edges with pagination
- `analysis_get_dataflow` - Data flow steps with pagination
- **LLM-Friendly Responses:** Added prominent `_message` field to guide LLMs on cursor continuation.
### Fixed
- **FastMCP Compatibility:** Removed deprecated `version` parameter from FastMCP constructor.
### Security
- **ReDoS Protection:** Added validation for grep regex patterns to prevent catastrophic backtracking attacks.
- Pattern length limit (500 chars)
- Repetition operator limit (15 max)
- Detection of dangerous nested quantifier patterns like `(a+)+`
- **Session Spoofing Prevention:** Removed user-controllable `session_id` parameter from all tools.
- Sessions now derived from FastMCP context (`ctx.session`, `ctx.client_id`)
- Prevents users from accessing or manipulating other sessions' cursors
- **Recursion Depth Limit:** Added depth limit (10) to grep matching to prevent stack overflow on deeply nested data.
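
The ReDoS checks listed above can be sketched as a pre-compilation validator. This is a heuristic illustration, not the validation MCGhidra ships: the function name `validate_grep_pattern` and both helper regexes are assumptions, and the repetition count is deliberately crude (it also counts escaped characters and the `?` in `(?:`). Only the two numeric limits and the `(a+)+` example come from the changelog entry.

```python
import re

MAX_PATTERN_LENGTH = 500
MAX_REPETITION_OPERATORS = 15

# Heuristic: a group that contains a quantifier and is itself quantified,
# for example (a+)+ or (x*)*.
_NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")
# Crude count of repetition operators: + * ? and {m}, {m,}, {m,n}.
_REPETITION = re.compile(r"[+*?]|\{\d+(?:,\d*)?\}")


def validate_grep_pattern(pattern: str) -> "re.Pattern[str]":
    """Reject patterns likely to backtrack catastrophically, then compile."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError(f"pattern longer than {MAX_PATTERN_LENGTH} chars")
    if len(_REPETITION.findall(pattern)) > MAX_REPETITION_OPERATORS:
        raise ValueError("too many repetition operators")
    if _NESTED_QUANTIFIER.search(pattern):
        raise ValueError("nested quantifier detected")
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regex: {exc}") from exc
```

A static check like this cannot catch every pathological pattern, so it reduces the risk rather than removing it.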
## [2.0.0] - 2025-11-11
### Added
- **MCP Integration Refactor:** Refactored the Python bridge for improved MCP integration. (337f89e)
- Introduced MCP resources for loading context (e.g., instances, functions, disassembly).
- Added namespaced tools (e.g., `instance.*`, `function.*`, `data.*`) for better organization and discoverability.
- Implemented a "current working instance" concept to simplify commands by implicitly targeting the active Ghidra instance.
- **Analysis Prompts:** Added pre-defined prompts for common analysis tasks, including `reverse_engineer_binary` for comprehensive analysis. (337f89e, 3134581)
- **String Data Listing:** Added a new endpoint to list string data in the binary, with pagination and filtering by content. Python bridge support via `list_strings()` function. (f71f4aa)
- **Comprehensive Data Manipulation:** Added tools/endpoints for creating (`create_data`), deleting (`delete_data`), renaming (`rename_data`), changing type (`set_data_type`), and combined updates (`update_data`) for data items. Supports common types (byte, word, dword, string, etc.). (6c28553, 5797fb3, 28870e9)
- **Enhanced Cross-Reference (Xrefs) Analysis:** Implemented accurate xref tools (`get_references_to`, `get_references_from`) using Ghidra's ReferenceManager. Features include detailed info, bi-directional search, type filtering, and simplified bridge output. (96788f3)
- **Memory Operations:** Added tools/endpoints for reading (`read_memory`) and writing (`write_memory`) to program memory. (454c739)
- **Function Addressing Flexibility:** MCP bridge now supports addressing functions by name or address. (4f3042f)
- **API Version Check:** Bridge script now verifies compatibility with the Java plugin (expects API v2). (fedd2d0)
- **Enhanced Decompiler Controls:** Added options for raw vs. clean pseudocode output and multiple simplification styles. (454c739)
### Changed
- **Bridge Refactor & Namespacing:** Reorganized bridge tools into namespaces (e.g., `instance.list_instances`, `function.get_function_details`) as part of the MCP integration refactor. (337f89e)
- **Breaking: HATEOAS API v2 & Bridge Update:** Migrated fully to a HATEOAS-driven API (v2). The Python bridge (`bridge_mcp_hydra.py`) now *exclusively* uses this API, removing legacy support. Responses are simplified for AI agents, including text representations for structured data (e.g., disassembly). All endpoints require HATEOAS compliance (e.g., `_links`). (4bc2267, 4f3042f)
- **Optimized Variable Listing:** Improved performance of the `/variables` endpoint with efficient pagination and a `globalOnly` filter. (6c865c4)
- **Standardized Responses:** Unified all endpoints to use structured JSON and standardized HATEOAS links. (454c739, 4bc2267)
- **Improved Error Handling:** Enhanced error reporting and parameter validation across the API and bridge. (454c739, 4f3042f, 3df129f)
- **API Documentation:** Updated documentation to reflect the HATEOAS v2 API and new features. (28870e9, 3fd0cf4)
### Fixed
- **Real Instruction Disassembly:** The `/disassembly` endpoint now provides actual instruction disassembly instead of placeholders. (3df129f)
- **Ghidra 11+ Compatibility:** Resolved various API compatibility issues, particularly for cross-references (`XrefsEndpoints`). (5dc59ce, 2b1fe6c, 0eaa19a, 9443101)
- **Data Operations:** Fixed issues with HTTP request body consumption, parameter naming (`type` vs `dataType`), and name preservation during type changes. (28870e9)
- **Function Commenting:** Corrected `set_decompiler_comment` to apply comments at the function level. (2a1607c)
- **Call Graph Parameter Handling:** Updated the CallGraph endpoint to properly accept both function name and address parameters for flexibility. (fa8cc64)
- **Endpoint Functionality:** Addressed various issues including endpoint registration, handling of program-dependent endpoints, URL encoding, transaction management, and inconsistent response formats. (various commits, e.g., 4bc2267)
## [1.4.0] - 2025-04-08
### Added
- Structured JSON communication between Python bridge and Java plugin
- Consistent response format with metadata (timestamp, port, instance type)
- Comprehensive test suites for HTTP API and MCP bridge
- Test runner script for easy test execution
- Detailed testing documentation in TESTING.md
- Origin checking for API requests
- Mutating tests for API functionality
### Changed
- Improved error handling in API responses
- Enhanced JSON parsing in the Java plugin
- Updated documentation with JSON communication details
- Standardized API responses across all endpoints
- Improved version handling in build system
### Fixed
- Build complete package in `package` phase
- Versioning and naming of JAR files
- GitHub Actions workflow permissions
- Extension ZIP inclusion in complete package
- ProgramManager requirement
- Git tag fetching functionality
- MCP bridge test failures
## [1.3.0] - 2025-04-02
@@ -59,7 +239,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 ## [1.1] - 2025-03-30
 ### Added
-- Initial release of GhydraMCP bridge
+- Initial release of MCGhidra bridge
 - Basic Ghidra instance management tools
 - Function analysis tools
 - Variable manipulation tools
@@ -70,8 +250,11 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 - Initial project setup
 - Basic MCP bridge functionality
-[unreleased]: https://github.com/teal-bauer/GhydraMCP/compare/v1.3.0...HEAD
-[1.3.0]: https://github.com/teal-bauer/GhydraMCP/compare/v1.2...v1.3.0
-[1.2]: https://github.com/teal-bauer/GhydraMCP/compare/v1.1...v1.2
-[1.1]: https://github.com/teal-bauer/GhydraMCP/compare/1.0...v1.1
-[1.0]: https://github.com/teal-bauer/GhydraMCP/releases/tag/1.0
+[unreleased]: https://github.com/teal-bauer/MCGhidra/compare/v2025.12.1...HEAD
+[2025.12.1]: https://github.com/teal-bauer/MCGhidra/compare/v2.0.0...v2025.12.1
+[2.0.0]: https://github.com/teal-bauer/MCGhidra/compare/v1.4.0...v2.0.0
+[1.4.0]: https://github.com/teal-bauer/MCGhidra/compare/v1.3.0...v1.4.0
+[1.3.0]: https://github.com/teal-bauer/MCGhidra/compare/v1.2...v1.3.0
+[1.2]: https://github.com/teal-bauer/MCGhidra/compare/v1.1...v1.2
+[1.1]: https://github.com/teal-bauer/MCGhidra/compare/1.0...v1.1
+[1.0]: https://github.com/teal-bauer/MCGhidra/releases/tag/1.0

247
CONTRIBUTING.md Normal file

@@ -0,0 +1,247 @@
# Contributing to MCGhidra
Thank you for your interest in contributing to MCGhidra! This document provides guidelines and information for contributors.
## Table of Contents
- [Project Structure](#project-structure)
- [Development Setup](#development-setup)
- [Versioning](#versioning)
- [Code Standards](#code-standards)
- [Pull Request Process](#pull-request-process)
- [Release Process](#release-process)
## Project Structure
MCGhidra consists of two main components:
1. **Java Plugin for Ghidra** (`src/main/java/eu/starsong/ghidra/`):
- Main class: `MCGhidraPlugin.java`
- API constants: `api/ApiConstants.java`
- Endpoints: `endpoints/` directory
- Data models: `model/` directory
- Utilities: `util/` directory
2. **Python MCP Bridge** (`bridge_mcp_hydra.py`):
- Implements the Model Context Protocol (MCP)
- Connects AI assistants to the Ghidra plugin via HTTP
## Development Setup
### Prerequisites
- Java 21 (required for plugin development)
- Maven 3.8+
- Python 3.11+
- Ghidra (latest stable version recommended)
### Building from Source
```bash
# Clone the repository
git clone https://github.com/starsong-consulting/MCGhidra.git
cd MCGhidra
# Build the project
mvn clean package
```
This creates:
- `target/MCGhidra-[version].zip` - The Ghidra plugin only
- `target/MCGhidra-Complete-[version].zip` - Complete package with plugin and bridge script
### Installing for Development
1. Build the project as described above
2. In Ghidra, go to `File` -> `Install Extensions`
3. Click the `+` button
4. Select the `MCGhidra-[version].zip` file
5. Restart Ghidra
6. Enable the plugin in `File` -> `Configure` -> `Developer`
### Python Bridge Setup
Install the required Python dependencies:
```bash
pip install mcp==1.6.0 requests==2.32.3
```
Or use uv:
```bash
uv pip install mcp==1.6.0 requests==2.32.3
```
## Versioning
MCGhidra follows semantic versioning (SemVer) and uses explicit API versions:
### Version Numbers
When making changes, update version numbers in these locations:
1. **Plugin Version** in `src/main/java/eu/starsong/ghidra/api/ApiConstants.java`:
```java
public static final String PLUGIN_VERSION = "v2.0.0";
```
2. **Bridge Version** in `bridge_mcp_hydra.py`:
```python
BRIDGE_VERSION = "v2.0.0"
```
### API Versions
The API version is tracked separately from the implementation version:
1. **API Version** in `src/main/java/eu/starsong/ghidra/api/ApiConstants.java`:
```java
public static final int API_VERSION = 2;
```
2. **Required API Version** in `bridge_mcp_hydra.py`:
```python
REQUIRED_API_VERSION = 2
```
### When to Update Versions
- **Patch Version** (`x.y.Z`): Bug fixes and minor changes that don't affect API compatibility
- **Minor Version** (`x.Y.z`): New features that are backward compatible
- **Major Version** (`X.y.z`): Breaking changes that aren't backward compatible
- **API Version**: Only increment when making incompatible API changes
### Important Versioning Rules
1. **Bridge Modifications**: When modifying the MCP bridge script (`bridge_mcp_hydra.py`), always update the `BRIDGE_VERSION` string. Update `REQUIRED_API_VERSION` only if the bridge now depends on a different API version.
2. **Java Plugin Modifications**: When making changes to the Java plugin:
- Update `PLUGIN_VERSION` string for all changes
- Only increment `API_VERSION` when introducing breaking changes to the API
3. **API Compatibility**: The bridge script and Java plugin must have matching API versions to work together. The bridge verifies this at runtime.
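The runtime check mentioned in rule 3 could look roughly like the sketch below. It is not the bridge's actual code: `check_api_compatibility` and `IncompatibleApiError` are hypothetical names, and the payload shape is assumed to match the `GET /plugin-version` response documented in GHIDRA_HTTP_API.md.

```python
REQUIRED_API_VERSION = 2  # mirrors the constant in bridge_mcp_hydra.py


class IncompatibleApiError(RuntimeError):
    """Hypothetical error raised when plugin and bridge API versions differ."""


def check_api_compatibility(version_response: dict, required: int = REQUIRED_API_VERSION) -> str:
    """Return the plugin version string if the API versions match, else raise."""
    result = version_response.get("result") or {}
    api_version = result.get("api_version")
    if api_version != required:
        raise IncompatibleApiError(
            f"plugin reports API version {api_version!r}, bridge requires {required}"
        )
    return result.get("plugin_version", "unknown")


if __name__ == "__main__":
    sample = {"success": True, "result": {"plugin_version": "v2.0.0", "api_version": 2}}
    print(check_api_compatibility(sample))
```

Comparing only the integer API version, rather than the full SemVer strings, keeps patch and minor releases of either side interchangeable.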
## Code Standards
### Java Code Standards
- Follow Java naming conventions
- Add comprehensive JavaDoc comments for public methods
- Ensure proper exception handling
- Follow Ghidra extension development best practices
- Follow HATEOAS principles for API endpoints
- Implement proper null checks and input validation
### Python Code Standards
- Follow PEP 8 style guidelines
- Add docstrings for all functions
- Use type hints for function parameters and return types
- Implement proper error handling
- Ensure thread-safety for multi-threaded operations
### Commit Message Standards
Follow the conventional commits format:
```
<type>: <description>
[optional body]
[optional footer(s)]
```
Types:
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `style`: Code style changes (formatting, etc.)
- `refactor`: Code refactoring without functionality changes
- `perf`: Performance improvements
- `test`: Adding or updating tests
- `build`: Build system changes
- `ci`: CI configuration changes
- `chore`: Other changes that don't modify src or test files
Example:
```
feat: Add support for string listing in binary files
This adds a new endpoint to list all defined strings in the binary
with pagination and filtering by content.
Closes #123
```
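A subject line can be checked against this format mechanically, for instance in a local commit hook. The following is a minimal sketch under that assumption; `is_valid_subject` is a hypothetical helper, not part of the repository, and it only accepts the types listed above.

```python
import re

# Types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "build", "ci", "chore")

# "<type>: <description>" with a non-empty description.
SUBJECT_RE = re.compile(rf"^({'|'.join(COMMIT_TYPES)}): \S.*$")


def is_valid_subject(subject: str) -> bool:
    """Check the first line of a commit message against the format above."""
    return SUBJECT_RE.match(subject) is not None


if __name__ == "__main__":
    print(is_valid_subject("feat: Add support for string listing in binary files"))
```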
## Pull Request Process
1. **Branch Naming Convention**:
- Features: `feature/short-description`
- Fixes: `fix/issue-description`
- Documentation: `docs/description`
2. **Before Creating a PR**:
- Ensure all tests pass
- Update documentation if needed
- Update version numbers if needed
- Add appropriate entries to CHANGELOG.md
3. **PR Template**:
- Clearly describe the changes
- Reference any related issues
- Include any special testing instructions
- List any breaking changes
4. **Review Process**:
- At least one core contributor must review and approve
- Address all requested changes
- Ensure CI checks pass
## Release Process
1. **Preparation**:
- Ensure all tests pass
- Update version numbers in:
- `src/main/java/eu/starsong/ghidra/api/ApiConstants.java`
- `bridge_mcp_hydra.py`
- Update CHANGELOG.md with release notes
- Move content from "Unreleased" section to a new release section
- Merge all changes to the main branch
2. **Creating a Release**:
- Tag the release commit with the version number (e.g., `v2.0.0`)
- Push the tag to GitHub
- The GitHub Actions workflow will automatically:
- Build the release artifacts
- Create a GitHub release
- Upload the artifacts
3. **Post-Release**:
- Update version numbers to next development version
- Create a new "Unreleased" section in CHANGELOG.md
- Announce the release in appropriate channels
## Testing
Please include appropriate tests for your changes:
1. **HTTP API Tests**: For Java plugin endpoint changes
2. **MCP Bridge Tests**: For Python bridge functionality
Run the tests with:
```bash
python run_tests.py
```
See TESTING.md for more detailed information on testing procedures.
## Questions and Support
If you have questions or need help, please:
1. Open an issue on GitHub
2. Check existing documentation
3. Reach out to the maintainers directly
Thank you for contributing to MCGhidra!

741
GHIDRA_HTTP_API.md Normal file

@ -0,0 +1,741 @@
# MCGhidra Ghidra Plugin HTTP API v2
## Overview
This API provides a Hypermedia-driven interface (HATEOAS) to interact with Ghidra's CodeBrowser, enabling AI-driven and automated reverse engineering workflows. It allows interaction with Ghidra projects, programs (binaries), functions, symbols, data, memory segments, cross-references, and analysis features. Each program open in Ghidra will have its own plugin instance, so all resources are specific to that program.
## General Concepts
### Request Format
- Use standard HTTP verbs:
- `GET`: Retrieve resources or lists.
- `POST`: Create new resources.
- `PATCH`: Modify existing resources partially.
- `PUT`: Replace existing resources entirely (use with caution; `PATCH` is often preferred).
- `DELETE`: Remove resources.
- Request bodies for `POST`, `PUT`, `PATCH` should be JSON (`Content-Type: application/json`).
- Include an optional `X-Request-ID` header with a unique identifier for correlation.
### Response Format
All non-error responses are JSON (`Content-Type: application/json`) containing at least the following keys:
```json
{
"id": "[correlation identifier]",
"instance": "[instance url]",
"success": true,
"result": Object | Array<Object>,
"_links": { // Optional: HATEOAS links
"self": { "href": "/path/to/current/resource" },
"related_resource": { "href": "/path/to/related" }
// ... other relevant links
}
}
```
- `id`: The identifier from the `X-Request-ID` header if provided, or a random opaque identifier otherwise.
- `instance`: The URL of the Ghidra plugin instance that handled the request.
- `success`: Boolean `true` for successful operations.
- `result`: The main data payload, either a single JSON object or an array of objects for lists.
- `_links`: (Optional) Contains HATEOAS-style links to related resources or actions, facilitating discovery.
#### List Responses
List results (arrays in `result`) will typically include pagination information and a total count:
```json
{
"id": "req-123",
"instance": "http://localhost:8192",
"success": true,
"result": [ ... objects ... ],
"size": 150, // Total number of items matching the query across all pages
"offset": 0,
"limit": 50,
"_links": {
"self": { "href": "/functions?offset=0&limit=50" },
"next": { "href": "/functions?offset=50&limit=50" }, // Present if more items exist
"prev": { "href": "/functions?offset=0&limit=50" } // Present if not the first page
}
}
```
### Error Responses
Errors use appropriate HTTP status codes (4xx, 5xx) and have a JSON payload with an `error` key:
```json
{
"id": "[correlation identifier]",
"instance": "[instance url]",
"success": false,
"error": {
"code": "RESOURCE_NOT_FOUND", // Optional: Machine-readable code
"message": "Descriptive error message"
// Potentially other details like invalid parameters
}
}
```
Common HTTP Status Codes:
- `200 OK`: Successful `GET`, `PATCH`, `PUT`, `DELETE`.
- `201 Created`: Successful `POST` resulting in resource creation.
- `204 No Content`: Successful `DELETE` or `PATCH`/`PUT` where no body is returned.
- `400 Bad Request`: Invalid syntax, missing required parameters, invalid data format.
- `401 Unauthorized`: Authentication required or failed (if implemented).
- `403 Forbidden`: Authenticated user lacks permission (if implemented).
- `404 Not Found`: Resource or endpoint does not exist, or query yielded no results.
- `405 Method Not Allowed`: HTTP verb not supported for this endpoint.
- `500 Internal Server Error`: Unexpected error within the Ghidra plugin.
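A client can treat the success and error envelopes uniformly by unwrapping them in one place. The sketch below assumes only the keys documented above (`success`, `result`, `error`, `id`); `ApiError` and `unwrap` are hypothetical names chosen for illustration.

```python
class ApiError(Exception):
    """Hypothetical exception carrying the fields of the documented error object."""

    def __init__(self, code, message, request_id=None):
        super().__init__(f"{code or 'UNKNOWN'}: {message}")
        self.code = code
        self.message = message
        self.request_id = request_id


def unwrap(envelope: dict):
    """Return `result` from a successful response, or raise ApiError."""
    if envelope.get("success"):
        return envelope.get("result")
    error = envelope.get("error") or {}
    raise ApiError(
        error.get("code"),
        error.get("message", "no message provided"),
        envelope.get("id"),
    )
```

Because `code` is optional in the error object, callers should branch on it only when it is present and fall back to the HTTP status code otherwise.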
### Addressing and Searching
Resources like functions, data, and symbols often exist at specific memory addresses and may have names.
- **By Address:** Use the resource's path with the address (hexadecimal, e.g., `0x401000` or `08000004`).
- Example: `GET /functions/0x401000`
- **Querying Lists:** List endpoints (e.g., `/functions`, `/symbols`, `/data`) support filtering via query parameters:
- `?addr=[address in hex]`: Find item at a specific address.
- `?name=[full_name]`: Find item(s) with an exact name match (case-sensitive).
- `?name_contains=[substring]`: Find item(s) whose name contains the substring (case-insensitive).
- `?name_matches_regex=[regex]`: Find item(s) whose name matches the Java-compatible regular expression.
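Filter values, regular expressions in particular, need URL encoding before they are placed in the query string. A minimal sketch using only the standard library follows; `build_list_url` is a hypothetical helper, and the base URL is the example instance used throughout this document.

```python
from urllib.parse import urlencode


def build_list_url(base, resource, **filters):
    """Build a list-endpoint URL, dropping filters whose value is None."""
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)  # percent-encodes characters such as ^ [ ] + $
    url = f"{base.rstrip('/')}/{resource.strip('/')}"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(build_list_url("http://localhost:8192", "functions", name_contains="init", limit=10))
    print(build_list_url("http://localhost:8192", "symbols", name_matches_regex="^FUN_[0-9a-f]+$"))
```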
### Pagination
List endpoints support pagination using query parameters:
- `?offset=[int]`: Number of items to skip (default: 0).
- `?limit=[int]`: Maximum number of items to return (default: implementation-defined, e.g., 100).
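Walking a full list then amounts to advancing `offset` until the response no longer carries a `next` link. The sketch below keeps the HTTP call out of scope: `fetch` is a caller-supplied function (an assumption of this example) that requests one page and returns the decoded JSON envelope.

```python
from typing import Callable, Iterator


def iter_items(fetch: Callable[[int, int], dict], limit: int = 100) -> Iterator[dict]:
    """Yield every item of a list endpoint, one page at a time."""
    offset = 0
    while True:
        page = fetch(offset, limit)
        items = page.get("result") or []
        yield from items
        offset += len(items)
        # The documented `next` link is present only while more items exist.
        if not items or "next" not in page.get("_links", {}):
            break
```

Relying on the `next` link instead of comparing `offset` with `size` keeps the loop correct even if the server caps `limit` below the requested value.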
## Meta Endpoints
### `GET /plugin-version`
Returns the version of the running Ghidra plugin and its API. Essential for compatibility checks by clients like the MCP bridge.
```json
{
"id": "req-meta-ver",
"instance": "http://localhost:8192",
"success": true,
"result": {
"plugin_version": "v2.0.0", // Example plugin build version
"api_version": 2 // Ordinal API version
},
"_links": {
"self": { "href": "/plugin-version" },
"root": { "href": "/" }
}
}
```
### `GET /info`
Returns information about the current plugin instance, including details about the loaded program and project.
```json
{
"id": "req-info",
"instance": "http://localhost:8192",
"success": true,
"result": {
"isBaseInstance": true,
"file": "example.exe",
"architecture": "x86:LE:64:default",
"processor": "x86",
"addressSize": 64,
"creationDate": "2023-01-01T12:00:00Z",
"executable": "/path/to/example.exe",
"project": "MyProject",
"projectLocation": "/path/to/MyProject",
"serverPort": 8192,
"serverStartTime": 1672531200000,
"instanceCount": 1
},
"_links": {
"self": { "href": "/info" },
"root": { "href": "/" },
"instances": { "href": "/instances" },
"program": { "href": "/program" }
}
}
```
### `GET /instances`
Returns information about all active MCGhidra plugin instances.
```json
{
"id": "req-instances",
"instance": "http://localhost:8192",
"success": true,
"result": [
{
"port": 8192,
"url": "http://localhost:8192",
"type": "base",
"project": "MyProject",
"file": "example.exe",
"_links": {
"self": { "href": "/instances/8192" },
"info": { "href": "http://localhost:8192/info" },
"connect": { "href": "http://localhost:8192" }
}
},
{
"port": 8193,
"url": "http://localhost:8193",
"type": "standard",
"project": "MyProject",
"file": "library.dll",
"_links": {
"self": { "href": "/instances/8193" },
"info": { "href": "http://localhost:8193/info" },
"connect": { "href": "http://localhost:8193" }
}
}
],
"_links": {
"self": { "href": "/instances" },
"register": { "href": "/registerInstance", "method": "POST" },
"unregister": { "href": "/unregisterInstance", "method": "POST" },
"programs": { "href": "/programs" }
}
}
```
## Resource Types
Each Ghidra plugin instance runs in the context of a single program, so all resources are relative to the current program. The program's details are available through the `GET /info` and `GET /program` endpoints.
### 1. Project
Represents the current Ghidra project, which is a container for programs.
- **`GET /project`**: Get details about the current project (e.g., location, list of open programs within it via links).
### 2. Program
Represents the current binary loaded in Ghidra.
- **`GET /program`**: Get metadata for the current program (e.g., name, architecture, memory layout, analysis status).
```json
// Example Response Fragment for GET /program
"result": {
"programId": "myproject:/path/to/mybinary.exe",
"name": "mybinary.exe",
"isOpen": true,
"languageId": "x86:LE:64:default",
"compilerSpecId": "gcc",
"imageBase": "0x400000",
"memorySize": 1048576,
"analysisComplete": true
},
"_links": {
"self": { "href": "/program" },
"project": { "href": "/project" },
"functions": { "href": "/functions" },
"symbols": { "href": "/symbols" },
"data": { "href": "/data" },
"segments": { "href": "/segments" },
"memory": { "href": "/memory" },
"xrefs": { "href": "/xrefs" },
"analysis": { "href": "/analysis" }
}
```
### 3. Current Location
Provides information about the current cursor position and function in Ghidra's CodeBrowser.
- **`GET /address`**: Get the current cursor position.
```json
// Example Response
"result": {
"address": "0x401000",
"program": "mybinary.exe"
},
"_links": {
"self": { "href": "/address" },
"program": { "href": "/program" },
"memory": { "href": "/memory/0x401000?length=16" },
"function": { "href": "/functions/0x401000" },
"decompile": { "href": "/functions/0x401000/decompile" }
}
```
- **`GET /function`**: Get information about the function at the current cursor position.
```json
// Example Response
"result": {
"name": "main",
"address": "0x401000",
"signature": "int main(int argc, char** argv)",
"size": 256
},
"_links": {
"self": { "href": "/function" },
"program": { "href": "/program" },
"function": { "href": "/functions/0x401000" },
"decompile": { "href": "/functions/0x401000/decompile" },
"disassembly": { "href": "/functions/0x401000/disassembly" },
"variables": { "href": "/functions/0x401000/variables" },
"xrefs": { "href": "/xrefs?to_addr=0x401000" }
}
```
### 4. Functions
Represents functions within the current program.
- **`GET /functions`**: List functions. Supports searching (by name/address/regex) and pagination.
```json
// Example Response Fragment
"result": [
{ "name": "FUN_08000004", "address": "08000004", "_links": { "self": { "href": "/functions/08000004" } } },
{ "name": "init_peripherals", "address": "08001cf0", "_links": { "self": { "href": "/functions/08001cf0" } } }
]
```
- **`POST /functions`**: Create a function at a specific address. Requires `address` in the request body. Returns the created function resource.
- **`GET /functions/{address}`**: Get details for a specific function (name, signature, size, stack info, etc.).
```json
// Example Response Fragment for GET /functions/0x4010a0
"result": {
"name": "process_data",
"address": "0x4010a0",
"signature": "int process_data(char * data, int size)",
"size": 128,
"stack_depth": 16,
"has_varargs": false,
"calling_convention": "__stdcall"
// ... other details
},
"_links": {
"self": { "href": "/functions/0x4010a0" },
"decompile": { "href": "/functions/0x4010a0/decompile" },
"disassembly": { "href": "/functions/0x4010a0/disassembly" },
"variables": { "href": "/functions/0x4010a0/variables" },
"xrefs_to": { "href": "/xrefs?to_addr=0x4010a0" },
"xrefs_from": { "href": "/xrefs?from_addr=0x4010a0" }
}
```
- **`PATCH /functions/{address}`**: Modify a function. Addressable only by address. Payload can contain:
- `name`: New function name.
- `signature`: Full function signature string (e.g., `void my_func(int p1, char * p2)`).
- `comment`: Set/update the function's primary comment.
```json
// Example PATCH payload
{ "name": "calculate_checksum", "signature": "uint32_t calculate_checksum(uint8_t* buffer, size_t length)" }
```
- **`DELETE /functions/{address}`**: Delete the function definition at the specified address.
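As an illustration of the `PATCH` call, the sketch below prepares a rename request with the standard library without sending it. `build_rename_request` is a hypothetical helper, and the `X-Request-ID` value is an arbitrary example.

```python
import json
import urllib.request


def build_rename_request(base_url: str, address: str, new_name: str) -> urllib.request.Request:
    """Prepare (but do not send) a PATCH that renames the function at `address`."""
    body = json.dumps({"name": new_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/functions/{address}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json", "X-Request-ID": "rename-1"},
    )


if __name__ == "__main__":
    req = build_rename_request("http://localhost:8192", "0x401000", "calculate_checksum")
    print(req.get_method(), req.full_url)
```

With a plugin instance running, the prepared request can be sent with `urllib.request.urlopen(req)`.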
#### Function Sub-Resources
- **`GET /functions/{address}/decompile`**: Get decompiled C-like code for the function.
- Query Parameters:
- `?syntax_tree=true`: Include the decompiler's internal syntax tree (JSON).
- `?style=[style_name]`: Apply a specific decompiler simplification style (e.g., `normalize`, `paramid`).
- `?timeout=[seconds]`: Set a timeout for the decompilation process.
```json
// Example Response Fragment (without syntax tree)
"result": {
"address": "0x4010a0",
"ccode": "int process_data(char *param_1, int param_2)\n{\n // ... function body ...\n return result;\n}\n"
}
```
- **`GET /functions/{address}/disassembly`**: Get assembly listing for the function. Supports pagination (`?offset=`, `?limit=`).
```json
// Example Response Fragment
"result": [
{ "address": "0x4010a0", "mnemonic": "PUSH", "operands": "RBP", "bytes": "55" },
{ "address": "0x4010a1", "mnemonic": "MOV", "operands": "RBP, RSP", "bytes": "4889E5" },
// ... more instructions
]
```
- **`GET /functions/{address}/variables`**: List local variables defined within the function. Supports searching by name.
- **`PATCH /functions/{address}/variables/{variable_name}`**: Modify a local variable (rename, change type). Requires `name` and/or `type` in the payload.
### 5. Symbols & Labels
Represents named locations (functions, data, labels).
- **`GET /symbols`**: List all symbols in the program. Supports searching (by name/address/regex) and pagination. Can filter by type (`?type=function`, `?type=data`, `?type=label`).
- **`POST /symbols`**: Create or rename a symbol at a specific address. Requires `address` and `name` in the payload. If a symbol exists, it's renamed; otherwise, a new label is created.
- **`GET /symbols/{address}`**: Get details of the symbol at the specified address.
- **`PATCH /symbols/{address}`**: Modify properties of the symbol (e.g., set as primary, change namespace). Payload specifies changes.
- **`DELETE /symbols/{address}`**: Remove the symbol at the specified address.
### 6. Data
Represents defined data items in memory.
- **`GET /data`**: List defined data items. Supports searching (by name/address/regex) and pagination. Can filter by type (`?type=string`, `?type=dword`, etc.).
- **`POST /data`**: Define a new data item. Requires `address`, `type`, and optionally `size` or `length` in the payload.
- **`GET /data/{address}`**: Get details of the data item at the specified address (type, size, value representation).
- **`PATCH /data/{address}`**: Modify a data item (e.g., change `name`, `type`, `comment`). Payload specifies changes.
- **`DELETE /data/{address}`**: Undefine the data item at the specified address.
### 6.1 Strings
Provides access to string data in the binary.
- **`GET /strings`**: List all defined strings in the binary. Supports pagination and filtering.
- Query Parameters:
- `?offset=[int]`: Number of strings to skip (default: 0).
- `?limit=[int]`: Maximum number of strings to return (default: 2000).
- `?filter=[string]`: Only include strings containing this substring (case-insensitive).
```json
// Example Response
"result": [
{
"address": "0x00401234",
"value": "Hello, world!",
"length": 14,
"type": "string",
"name": "aHelloWorld"
},
{
"address": "0x00401250",
"value": "Error: could not open file",
"length": 26,
"type": "string",
"name": "aErrorCouldNotO"
}
],
"_links": {
"self": { "href": "/strings?offset=0&limit=10" },
"next": { "href": "/strings?offset=10&limit=10" }
}
```
### 6.2 Structs
Provides functionality for creating and managing struct (composite) data types.
- **`GET /structs`**: List all struct data types in the program. Supports pagination and filtering.
- Query Parameters:
- `?offset=[int]`: Number of structs to skip (default: 0).
- `?limit=[int]`: Maximum number of structs to return (default: 100).
- `?category=[string]`: Filter by category path (e.g. "/winapi").
```json
// Example Response
"result": [
{
"name": "MyStruct",
"path": "/custom/MyStruct",
"size": 16,
"numFields": 4,
"category": "/custom",
"description": "Custom data structure"
},
{
"name": "FileHeader",
"path": "/FileHeader",
"size": 32,
"numFields": 8,
"category": "/",
"description": ""
}
],
"_links": {
"self": { "href": "/structs?offset=0&limit=100" },
"program": { "href": "/program" }
}
```
- **`GET /structs?name={struct_name}`**: Get detailed information about a specific struct including all fields.
```json
// Example Response for GET /structs?name=MyStruct
"result": {
"name": "MyStruct",
"path": "/custom/MyStruct",
"size": 16,
"category": "/custom",
"description": "Custom data structure",
"numFields": 4,
"fields": [
{
"name": "id",
"offset": 0,
"length": 4,
"type": "int",
"typePath": "/int",
"comment": "Unique identifier"
},
{
"name": "flags",
"offset": 4,
"length": 4,
"type": "dword",
"typePath": "/dword",
"comment": ""
},
{
"name": "data_ptr",
"offset": 8,
"length": 4,
"type": "pointer",
"typePath": "/pointer",
"comment": "Pointer to data"
},
{
"name": "size",
"offset": 12,
"length": 4,
"type": "uint",
"typePath": "/uint",
"comment": ""
}
]
},
"_links": {
"self": { "href": "/structs?name=MyStruct" },
"structs": { "href": "/structs" },
"program": { "href": "/program" }
}
```
- **`POST /structs/create`**: Create a new struct data type.
- Request Payload:
- `name`: Name for the new struct (required).
- `category`: Category path (optional, defaults to root).
- `description`: Description for the struct (optional).
```json
// Example Request Payload
{
"name": "NetworkPacket",
"category": "/network",
"description": "Network packet structure"
}
// Example Response
"result": {
"name": "NetworkPacket",
"path": "/network/NetworkPacket",
"category": "/network",
"size": 0,
"message": "Struct created successfully"
}
```
- **`POST /structs/addfield`**: Add a field to an existing struct.
- Request Payload:
- `struct`: Name of the struct to modify (required).
- `fieldName`: Name for the new field (required).
- `fieldType`: Data type for the field (required, e.g. "int", "char", "pointer").
- `offset`: Specific offset to insert field (optional, appends to end if not specified).
- `comment`: Comment for the field (optional).
```json
// Example Request Payload
{
"struct": "NetworkPacket",
"fieldName": "header",
"fieldType": "dword",
"comment": "Packet header"
}
// Example Response
"result": {
"struct": "NetworkPacket",
"fieldName": "header",
"fieldType": "dword",
"offset": 0,
"length": 4,
"structSize": 4,
"message": "Field added successfully"
}
```
- **`POST /structs/updatefield`**: Update an existing field in a struct (rename, change type, or modify comment).
- Request Payload:
- `struct`: Name of the struct to modify (required).
- `fieldOffset` OR `fieldName`: Identify the field to update (one required).
- `newName`: New name for the field (optional).
- `newType`: New data type for the field (optional).
- `newComment`: New comment for the field (optional).
- At least one of `newName`, `newType`, or `newComment` must be provided.
```json
// Example Request Payload - rename a field
{
"struct": "NetworkPacket",
"fieldName": "header",
"newName": "packet_header",
"newComment": "Updated packet header field"
}
// Example Request Payload - change type by offset
{
"struct": "NetworkPacket",
"fieldOffset": 0,
"newType": "qword"
}
// Example Response
"result": {
"struct": "NetworkPacket",
"offset": 0,
"originalName": "header",
"originalType": "dword",
"originalComment": "Packet header",
"newName": "packet_header",
"newType": "dword",
"newComment": "Updated packet header field",
"length": 4,
"message": "Field updated successfully"
}
```
- **`POST /structs/delete`**: Delete a struct data type.
- Request Payload:
- `name`: Name of the struct to delete (required).
```json
// Example Request Payload
{
"name": "NetworkPacket"
}
// Example Response
"result": {
"name": "NetworkPacket",
"path": "/network/NetworkPacket",
"category": "/network",
"message": "Struct deleted successfully"
}
```
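Defining a struct therefore takes one `create` call followed by one `addfield` call per field. The sketch below only assembles that sequence of paths and payloads; `struct_requests` is a hypothetical helper, and sending the requests is left to the caller.

```python
def struct_requests(name, fields, category="/"):
    """Return the ordered (path, payload) pairs that define a struct.

    `fields` is a sequence of (field_name, field_type) tuples; fields are
    appended in order because no explicit offset is given.
    """
    calls = [("/structs/create", {"name": name, "category": category})]
    for field_name, field_type in fields:
        calls.append(
            ("/structs/addfield", {"struct": name, "fieldName": field_name, "fieldType": field_type})
        )
    return calls


if __name__ == "__main__":
    for path, payload in struct_requests("NetworkPacket", [("header", "dword"), ("length", "int")], "/network"):
        print("POST", path, payload)
```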
### 7. Memory Segments
Represents memory blocks/sections defined in the program.
- **`GET /segments`**: List all memory segments (e.g., `.text`, `.data`, `.bss`).
- **`GET /segments/{segment_name}`**: Get details for a specific segment (address range, permissions, size).
### 8. Memory Access
Provides raw memory access.
- **`GET /memory/{address}`**: Read bytes from memory.
- Query Parameters:
- `?length=[bytes]`: Number of bytes to read (required, max limit applies).
- `?format=[hex|base64|string]`: How to encode the returned bytes (default: hex).
```json
// Example Response Fragment for GET /memory/0x402000?length=16&format=hex
"result": {
"address": "0x402000",
"length": 16,
"format": "hex",
"bytes": "48656C6C6F20576F726C642100000000" // "Hello World!...."
}
```
- **`PATCH /memory/{address}`**: Write bytes to memory. Requires `bytes` (in specified `format`) and `format` in the payload. Use with extreme caution.
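On the client side, the `bytes` field has to be decoded according to `format`. The sketch below covers `hex` and `base64`; `decode_memory` is a hypothetical helper, and the `string` format is deliberately left out because its character encoding is not specified here.

```python
import base64


def decode_memory(result: dict) -> bytes:
    """Convert the `bytes` field of a memory read into a Python bytes object."""
    fmt = result.get("format", "hex")  # hex is the documented default
    payload = result["bytes"]
    if fmt == "hex":
        return bytes.fromhex(payload)
    if fmt == "base64":
        return base64.b64decode(payload)
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    sample = {"address": "0x402000", "length": 16, "format": "hex",
              "bytes": "48656C6C6F20576F726C642100000000"}
    print(decode_memory(sample))
```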
### 9. Cross-References (XRefs)
Provides information about references to/from addresses.
- **`GET /xrefs`**: Search for cross-references. Supports pagination.
- Query Parameters (at least one required):
- `?to_addr=[address]`: Find references *to* this address.
- `?from_addr=[address]`: Find references *from* this address or within the function/data at this address.
- `?type=[CALL|READ|WRITE|DATA|POINTER|...]`: Filter by reference type.
- **`GET /functions/{address}/xrefs`**: Convenience endpoint for the cross-references of a single function. It corresponds to `GET /xrefs?to_addr={address}` and may also include, or link to, the results of `GET /xrefs?from_addr={address}`.
### 10. Analysis
Provides access to Ghidra's analysis results.
- **`GET /analysis`**: Get information about the analysis status and available analyzers.
```json
// Example Response
"result": {
"program": "mybinary.exe",
"analysis_enabled": true,
"available_analyzers": [
"Function Start Analyzer",
"Basic Block Model Analyzer",
"Reference Analyzer",
"Call Convention Analyzer",
"Data Reference Analyzer",
"Decompiler Parameter ID",
"Stack Analyzer"
]
},
"_links": {
"self": { "href": "/analysis" },
"program": { "href": "/program" },
"analyze": { "href": "/analysis", "method": "POST" },
"callgraph": { "href": "/analysis/callgraph" }
}
```
- **`POST /analysis`**: Trigger a full or partial re-analysis of the program.
```json
// Example Response
"result": {
"program": "mybinary.exe",
"analysis_triggered": true,
"message": "Analysis initiated on program"
}
```
- **`GET /analysis/callgraph`**: Retrieve the function call graph.
- Query Parameters:
- `?function=[function_name]`: Start the call graph from this function (default: entry point).
- `?max_depth=[int]`: Maximum depth of the call graph (default: 3).
```json
// Example Response
"result": {
"root": "main",
"root_address": "0x401000",
"max_depth": 3,
"nodes": [
{
"id": "0x401000",
"name": "main",
"address": "0x401000",
"depth": 0,
"_links": {
"self": { "href": "/functions/0x401000" }
}
},
// ... more nodes
],
"edges": [
{
"from": "0x401000",
"to": "0x401100",
"type": "call",
"call_site": "0x401050"
},
// ... more edges
]
}
```
- **`GET /analysis/dataflow`**: Perform data flow analysis starting from a specific address.
- Query Parameters:
- `?address=[address]`: Starting address for data flow analysis (required).
- `?direction=[forward|backward]`: Direction of data flow analysis (default: forward).
- `?max_steps=[int]`: Maximum number of steps to analyze (default: 50).
```json
// Example Response
"result": {
"start_address": "0x401050",
"direction": "forward",
"max_steps": 50,
"steps": [
{
"address": "0x401050",
"instruction": "MOV EAX, [RBP+0x8]",
"description": "Starting point of data flow analysis"
},
// ... more steps
]
}
```
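The `nodes` and `edges` arrays of the call graph response shown earlier in this section are convenient to reshape into an adjacency map before further processing. A minimal sketch, assuming the field names from that example (`id`, `name`, `from`, `to`); `callees_by_caller` is a hypothetical helper.

```python
from collections import defaultdict


def callees_by_caller(callgraph: dict) -> dict:
    """Map each caller name to the sorted names of the functions it calls."""
    names = {node["id"]: node["name"] for node in callgraph.get("nodes", [])}
    adjacency = defaultdict(set)
    for edge in callgraph.get("edges", []):
        # Fall back to the raw address when an edge endpoint has no node entry.
        caller = names.get(edge["from"], edge["from"])
        callee = names.get(edge["to"], edge["to"])
        adjacency[caller].add(callee)
    return {caller: sorted(callees) for caller, callees in adjacency.items()}
```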
## Design Considerations for AI Usage
- **Structured responses**: JSON format ensures predictable parsing by AI agents.
- **HATEOAS Links**: `_links` allow agents to discover available actions and related resources without hardcoding paths.
- **Address and Name Resolution**: Key elements like functions and symbols are addressable by both memory address and name where applicable.
- **Explicit Operations**: Actions like decompilation, disassembly, and analysis are distinct endpoints.
- **Pagination & Filtering**: Essential for handling potentially large datasets (symbols, functions, xrefs, disassembly).
- **Clear Error Reporting**: `success: false` and the `error` object provide actionable feedback.
- **No Injected Summaries**: The API should return raw or structured Ghidra data, leaving interpretation and summarization to the AI agent.
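To illustrate the HATEOAS point, a client can resolve a named link from any response instead of assembling paths itself. The sketch below assumes the `instance` and `_links` keys described under Response Format; `link_url` is a hypothetical helper.

```python
from urllib.parse import urljoin


def link_url(envelope: dict, rel: str):
    """Resolve the `rel` link of a response against its `instance` URL."""
    link = envelope.get("_links", {}).get(rel)
    if link is None:
        return None  # the server did not advertise this relation
    return urljoin(envelope.get("instance", "") + "/", link["href"])


if __name__ == "__main__":
    response = {
        "instance": "http://localhost:8192",
        "_links": {"decompile": {"href": "/functions/0x401000/decompile"}},
    }
    print(link_url(response, "decompile"))
```

Absolute `href` values, such as the `info` links returned by `GET /instances`, pass through `urljoin` unchanged.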

171
Makefile Normal file

@ -0,0 +1,171 @@
# MCGhidra Makefile
# Convenient commands for Docker and development operations
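.PHONY: help build build-dev build-all up up-dev down down-dev down-all \
restart restart-dev logs logs-dev shell status clean clean-all prune \
analyze analyze-bg test lint format health mcp mcp-dev check-binaries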
# Default target
help:
@echo "MCGhidra Docker Management"
@echo "============================"
@echo ""
@echo "Build commands:"
@echo " make build Build production Docker image"
@echo " make build-dev Build development Docker image"
@echo " make build-all Build both images"
@echo ""
@echo "Run commands:"
@echo " make up Start production container"
@echo " make up-dev Start development container"
@echo " make down Stop production container"
@echo " make down-dev Stop development container"
@echo " make down-all Stop all containers"
@echo ""
@echo "Analysis commands:"
@echo " make analyze FILE=path/to/binary Analyze a binary"
@echo " make analyze-bg FILE=path/to/binary Analyze in background"
@echo ""
@echo "Utility commands:"
@echo " make shell Start interactive shell in container"
@echo " make logs View production container logs"
@echo " make logs-dev View development container logs"
@echo " make status Check container status"
@echo " make health Check API health"
@echo " make clean Remove containers and volumes"
@echo " make clean-all Remove everything including images"
@echo ""
@echo "MCP Server commands:"
@echo " make mcp Start the MCP server (Python)"
@echo " make mcp-dev Start MCP server in development mode"
@echo ""
# =============================================================================
# Build Commands
# =============================================================================
build:
docker compose build mcghidra
build-dev:
docker compose build mcghidra-dev
build-all: build build-dev
# =============================================================================
# Run Commands
# =============================================================================
up:
docker compose --profile prod up -d mcghidra
@echo "MCGhidra starting... checking health in 30 seconds"
@sleep 30
@$(MAKE) health || echo "Server may still be starting up..."
up-dev:
docker compose --profile dev up -d mcghidra-dev
@echo "MCGhidra (dev) starting..."
down:
docker compose --profile prod down
down-dev:
docker compose --profile dev down
down-all:
docker compose --profile prod --profile dev --profile debug down
restart: down up
restart-dev: down-dev up-dev
# =============================================================================
# Analysis Commands
# =============================================================================
# Analyze a binary file
# Usage: make analyze FILE=/path/to/binary
analyze:
ifndef FILE
@echo "Error: FILE is required. Usage: make analyze FILE=/path/to/binary"
@exit 1
endif
@echo "Analyzing: $(FILE)"
docker compose run --rm -v "$(dir $(FILE)):/binaries:ro" mcghidra /binaries/$(notdir $(FILE))
# Analyze in background (detached)
analyze-bg:
ifndef FILE
@echo "Error: FILE is required. Usage: make analyze-bg FILE=/path/to/binary"
@exit 1
endif
@echo "Starting background analysis of: $(FILE)"
docker compose run -d -v "$(dir $(FILE)):/binaries:ro" mcghidra /binaries/$(notdir $(FILE))
# =============================================================================
# Utility Commands
# =============================================================================
shell:
docker compose --profile debug run --rm mcghidra-shell
logs:
docker compose logs -f mcghidra
logs-dev:
docker compose logs -f mcghidra-dev
status:
@echo "=== Container Status ==="
@docker compose ps -a
@echo ""
@echo "=== Resource Usage ==="
@docker stats --no-stream $$(docker compose ps -q 2>/dev/null) 2>/dev/null || echo "No containers running"
health:
@echo "Checking MCGhidra API health..."
@curl -sf http://localhost:$${MCGHIDRA_PORT:-8192}/ | python3 -m json.tool 2>/dev/null \
|| echo "API not responding (server may be starting or binary being analyzed)"
# =============================================================================
# Cleanup Commands
# =============================================================================
clean:
docker compose --profile prod --profile dev --profile debug down -v
@echo "Containers and volumes removed"
clean-all: clean
docker rmi mcghidra:latest mcghidra:dev 2>/dev/null || true
@echo "Images removed"
prune:
docker system prune -f
@echo "Docker system pruned"
# =============================================================================
# MCP Server Commands
# =============================================================================
mcp:
uv run python -m mcghidra
mcp-dev:
uv run python -m mcghidra --verbose
# =============================================================================
# Development Commands
# =============================================================================
test:
uv run pytest tests/ -v
lint:
uv run ruff check src/
format:
uv run ruff format src/
# Check if binaries directory exists
check-binaries:
@mkdir -p binaries
@echo "Binaries directory ready at ./binaries/"

328
QUICKSTART.md Normal file

@ -0,0 +1,328 @@
# MCGhidra Quick Start Guide
## What is MCGhidra?
MCGhidra is a complete reverse engineering platform that combines:
- **Ghidra** - NSA's powerful binary analysis tool
- **Docker** - Containerized, reproducible analysis environment
- **HTTP REST API** - HATEOAS-compliant REST interface
- **MCP Server** - FastMCP-based Model Context Protocol integration
- **ARM Firmware Support** - Tools for analyzing raw embedded firmware
## 5-Minute Quick Start
### 1. Analyze a Standard Binary (ELF/PE/Mach-O)
```bash
cd /path/to/MCGhidra  # your local checkout of the repository
# Build the Docker image (one time)
docker build -t mcghidra:latest -f docker/Dockerfile .
# Analyze any standard binary
docker run -d --name my-analysis \
-p 8192:8192 \
-v $(pwd)/binaries:/binaries \
mcghidra:latest \
/binaries/your-binary
# Wait ~20 seconds for analysis, then access HTTP API
curl http://localhost:8192/
curl http://localhost:8192/functions | jq '.functions[] | {name, address}'
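# Replace <address> with a function address (e.g. 00101380); quoted so the shell does not treat < > as redirection
curl "http://localhost:8192/functions/<address>/decompile"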
```
### 2. Analyze ARM Firmware (Raw Binary)
```bash
# Step 1: Create ELF wrapper
python3 docker/arm_firmware_prep.py \
your-firmware.bin \
binaries/your-firmware.elf \
0x00000000
# Step 2: Analyze normally
docker run -d --name arm-firmware \
-p 8192:8192 \
-v $(pwd)/binaries:/binaries \
mcghidra:latest \
/binaries/your-firmware.elf
```
### 3. Use the MCP Server
```bash
# Start the MCP server from your MCGhidra checkout:
cd /path/to/MCGhidra
./launch.sh
# Or, from the same directory, with uv:
uv run mcghidra
```
## HTTP API Overview
Once analysis completes, the API is available at `http://localhost:8192/`:
### Core Endpoints
```bash
# Program information
GET /program
# Functions
GET /functions # List all functions
GET /functions/<address> # Function details
GET /functions/<address>/decompile # Decompiled C code
GET /functions/<address>/disassembly # Assembly listing
GET /functions/<address>/variables # Local variables
# Analysis
GET /analysis/callgraph?name=main&max_depth=3
GET /analysis/dataflow?address=<addr>&direction=forward
# Memory
GET /memory/<address>?length=256&format=hex
POST /memory/<address> # Write bytes
# Data & Structures
GET /data/strings
GET /structs
GET /xrefs?to_addr=<addr>
```
### Response Format (HATEOAS)
All responses include navigation links:
```json
{
  "success": true,
  "result": {
    "name": "main",
    "address": "00101380",
    "signature": "int main(void)"
  },
  "_links": {
    "self": "/functions/00101380",
    "decompile": "/functions/00101380/decompile",
    "disassembly": "/functions/00101380/disassembly"
  }
}
```
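A client can follow these links instead of building URLs by hand. The sketch below is a minimal illustration in Python, using the sample response above as input; `resolve_link` is a hypothetical helper, not part of MCGhidra, and it assumes link targets are paths relative to the API root.

```python
from urllib.parse import urljoin


def resolve_link(base_url: str, response: dict, rel: str) -> str:
    """Return the absolute URL for a named relation in a response's _links."""
    links = response.get("_links", {})
    if rel not in links:
        raise KeyError(f"no '{rel}' link in response; available: {sorted(links)}")
    return urljoin(base_url, links[rel])


# Sample payload copied from the response format shown above.
sample = {
    "success": True,
    "result": {"name": "main", "address": "00101380", "signature": "int main(void)"},
    "_links": {
        "self": "/functions/00101380",
        "decompile": "/functions/00101380/decompile",
        "disassembly": "/functions/00101380/disassembly",
    },
}

decompile_url = resolve_link("http://localhost:8192/", sample, "decompile")
print(decompile_url)
```

Resolving links this way keeps client code working if the server changes its URL layout, as long as the relation names stay stable.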
## MCP Tools Overview
Use with Claude Code or any MCP client:
```python
# Functions
functions_list(port=8192, page_size=50)
functions_decompile(address="00101380", port=8192)
functions_get(name="main", port=8192)
# Analysis
analysis_get_callgraph(name="main", max_depth=3, port=8192)
analysis_get_dataflow(address="00101380", direction="forward", port=8192)
# Data
data_list_strings(port=8192, grep="password")
structs_list(port=8192)
# Docker Management
docker_status()
docker_start(binary_path="/path/to/binary", port=8192)
docker_stop(name_or_id="container-name")
docker_logs(name_or_id="container-name", tail=100)
```
## Common Workflows
### Find Interesting Functions
```bash
# List all functions
curl http://localhost:8192/functions | jq '.functions[].name'
# Search for crypto-related functions
curl http://localhost:8192/functions | jq '.functions[] | select(.name | test("crypt|hash|encrypt"; "i"))'
# Get call graph from main
curl 'http://localhost:8192/analysis/callgraph?name=main&max_depth=2' | jq .
```
### Analyze Strings
```bash
# List all strings
curl http://localhost:8192/data/strings | jq '.strings[] | {address, value}'
# Find passwords/keys
curl http://localhost:8192/data/strings | jq '.strings[] | select(.value | test("password|key|secret"; "i"))'
```
### Decompile Entry Point
```bash
# Get program entry point
ENTRY=$(curl -s http://localhost:8192/program | jq -r '.program.entryPoint')
# Decompile it
curl "http://localhost:8192/functions/$ENTRY/decompile" | jq -r '.result'
```
## Docker Management
### List Running Containers
```bash
docker ps | grep mcghidra
```
### View Logs
```bash
docker logs -f my-analysis
```
### Stop Analysis
```bash
docker stop my-analysis
docker rm my-analysis
```
### Persistent Projects
```bash
# Mount project directory for persistence
docker run -d --name persistent \
-p 8192:8192 \
-v $(pwd)/projects:/projects \
-v $(pwd)/binaries:/binaries \
-e PROJECT_NAME=MyProject \
mcghidra:latest \
/binaries/my-binary
# Projects are saved in ./projects/MyProject/
```
## Troubleshooting
### Import Failed
```bash
# Check logs
docker logs my-analysis 2>&1 | grep ERROR
# Common issues:
# 1. Binary not found → Check volume mount path
# 2. AutoImporter failed → Use arm_firmware_prep.py for raw binaries
# 3. Unsupported format → Check file type with `file binary`
```
### Script Errors
```bash
# If you see "Failed to get OSGi bundle" errors
# Fix script permissions in running container:
docker exec my-analysis sh -c 'chmod 644 /opt/ghidra/scripts/*.java'
# Then restart the analysis
```
### Port Already in Use
```bash
# Use different port
docker run -d --name analysis2 \
-p 8193:8192 \
-v $(pwd)/binaries:/binaries \
mcghidra:latest \
/binaries/binary
# Access at http://localhost:8193/
```
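Rather than guessing the next port, a script can probe for one. This is a rough Python sketch; `first_free_port` is a hypothetical helper and the 8192-8200 range is an assumption chosen for illustration. The check is inherently racy: another process can take the port between the probe and `docker run`.

```python
import socket


def first_free_port(start: int = 8192, end: int = 8200, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, end] that can currently be bound on host."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is in use (or otherwise unavailable); try the next one.
                continue
            return port
    raise RuntimeError(f"no free port between {start} and {end}")
```

The returned value goes on the host side of the mapping, for example `-p <port>:8192`, since the container still listens on 8192 internally.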
## Examples
### Example 1: Analyze Test Binary
```bash
# Create simple test binary
cat > test.c << 'EOF'
#include <stdio.h>
int secret_value = 0x42;
void hidden() { printf("Hidden: %d\n", secret_value); }
int main() { printf("Hello!\n"); return 0; }
EOF
gcc -o binaries/test test.c
# Analyze
docker run -d --name test-analysis \
-p 8192:8192 \
-v $(pwd)/binaries:/binaries \
mcghidra:latest \
/binaries/test
# Find hidden function
sleep 15
curl http://localhost:8192/functions | jq '.functions[] | select(.name == "hidden")'
```
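The fixed `sleep 15` above is only a guess at how long analysis takes. A minimal Python sketch of polling instead, assuming the root endpoint answers with a 2xx status once the API is ready (`api_is_up` and `wait_for_api` are hypothetical helper names, not part of MCGhidra):

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def api_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to url yields a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, or a dropped connection.
        return False


def wait_for_api(
    url: str,
    attempts: int = 60,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = api_is_up,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `attempts` times, sleeping between tries; True once it responds."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(interval_s)
    return False
```

Calling `wait_for_api("http://localhost:8192/")` before the first `curl` returns as soon as the server responds, and returns `False` after roughly two minutes with the defaults shown. The `probe` and `sleep` parameters exist so the loop can be exercised without a running container.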
### Example 2: Cisco Phone Firmware
```bash
# Prepare firmware
python3 docker/arm_firmware_prep.py \
cisco-firmware/P003-8-12-00.bin \
binaries/cisco.elf \
0x00000000
# Analyze
docker run -d --name cisco \
-p 8192:8192 \
-v $(pwd)/binaries:/binaries \
mcghidra:latest \
/binaries/cisco.elf
# Explore
sleep 30
curl http://localhost:8192/functions | jq '.functions | length' # Function count
curl http://localhost:8192/data/strings | jq '.strings[] | select(.value | test("SIP|RTP"))'
```
## Next Steps
- **Read ARM_FIRMWARE_SUCCESS.md** for ARM firmware details
- **Check docker/README_ARM_SOLUTION.md** for advanced ARM workflows
- **Explore MCP integration** with Claude Code
- **Build automations** using the HTTP API
## Project Structure
```
MCGhidra/
├── docker/
│   ├── Dockerfile              # Main Docker image
│   ├── entrypoint.sh           # Container entry point
│   ├── MCGhidraServer.java     # HTTP API server (1724 lines)
│   ├── ImportRawARM.java       # Raw binary import script
│   ├── arm_firmware_prep.py    # ELF wrapper tool ⭐
│   └── README*.md              # Documentation
├── src/mcghidra/               # MCP server implementation
│   ├── __init__.py
│   ├── server.py               # FastMCP server
│   └── mixins/                 # Modular functionality
│       ├── docker.py           # Docker management
│       ├── instances.py        # Instance registry
│       ├── functions.py        # Function operations
│       ├── analysis.py         # Analysis tools
│       └── ...
├── binaries/                   # Binary files for analysis
├── projects/                   # Ghidra project persistence
└── launch.sh                   # MCP server launcher
```
Happy reverse engineering! 🔍

566
README.md

@ -1,340 +1,302 @@
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/releases)
[![GitHub stars](https://img.shields.io/github/stars/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/stargazers)
[![GitHub forks](https://img.shields.io/github/forks/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/network/members)
[![GitHub contributors](https://img.shields.io/github/contributors/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/graphs/contributors)
[![Build Status](https://github.com/teal-bauer/GhydraMCP/actions/workflows/build.yml/badge.svg)](https://github.com/teal-bauer/GhydraMCP/actions/workflows/build.yml)
# MCGhidra
# GhydraMCP
**AI-native reverse engineering.** Give Claude (or any MCP client) direct access to Ghidra's analysis engine.
GhydraMCP is a bridge between [Ghidra](https://ghidra-sre.org/) and AI assistants that enables AI-assisted reverse engineering through the [Model Context Protocol (MCP)](https://github.com/modelcontextprotocol/mcp).
![GhydraMCP logo](https://github.com/user-attachments/assets/86b9b2de-767c-4ed5-b082-510b8109f00f)
## Overview
GhydraMCP consists of:
1. **Ghidra Plugin**: Exposes Ghidra's powerful reverse engineering capabilities through a REST API
2. **MCP Bridge**: A Python script that translates MCP requests into API calls
3. **Multi-instance Support**: Connect multiple Ghidra instances to analyze different binaries simultaneously
This allows AI assistants like Claude to directly:
- Decompile functions and analyze binary code
- Understand program structure, function relationships, and data types
- Perform binary analysis tasks (identify cross-references, data flow, etc.)
- Make meaningful changes to the analysis (rename functions, add comments, etc.)
GhydraMCP is based on [GhidraMCP by Laurie Wired](https://github.com/LaurieWired/GhidraMCP/) with added multi-instance support and numerous enhancements.
# Features
GhydraMCP combines a Ghidra plugin with an MCP server to provide a comprehensive set of reverse engineering capabilities to AI assistants:
## Program Analysis
- **Decompilation**: Convert binary functions to readable C code
- **Static Analysis**:
- Cross-reference analysis (find who calls what)
- Data flow analysis
- Type propagation and reconstruction
- **Symbol Management**:
- View and analyze imports and exports
- Identify library functions and dependencies
## Interactive Reverse Engineering
- **Code Understanding**:
- Explore function code and relationships
- Analyze data structures and types
- **Annotation**:
- Rename functions, variables, and data
- Add comments and documentation
- Create and modify data types
## Multi-instance Support
- Run multiple Ghidra instances simultaneously
- Analyze different binaries in parallel
- Connect to specific instances using port numbers
- Auto-discovery of running Ghidra instances
- Instance metadata with project and file information
## Program Navigation
- List and search functions, classes, and namespaces
- View memory segments and layout
- Search by name, pattern, or signature
# Installation
## Prerequisites
- Install [Ghidra](https://ghidra-sre.org)
- Python3
- MCP [SDK](https://github.com/modelcontextprotocol/python-sdk)
## Ghidra
First, download the latest [release](https://github.com/teal-bauer/GhydraMCP/releases) from this repository. The "Complete" artifact contains the zipped Ghidra plugin and the Python MCP bridge. Unpack the outer archive, then, add the plugin to Ghidra:
1. Run Ghidra
2. Select `File` -> `Install Extensions`
3. Click the `+` button
4. Select the `GhydraMCP-1.1.zip` (or your chosen version) from the downloaded release
5. Restart Ghidra
6. Make sure the GhydraMCPPlugin is enabled in `File` -> `Configure` -> `Developer`
> **Note:** By default, the first CodeBrowser opened in Ghidra gets port 8192, the second gets 8193, and so on. You can check which ports are being used by looking at the Console in the Ghidra main (project) window - click the computer icon in the bottom right to "Open Console". Look for log entries like:
> ```
> (HydraMCPPlugin) Plugin loaded on port 8193
> (HydraMCPPlugin) HydraMCP HTTP server started on port 8193
> ```
>
> GhydraMCP now includes auto-discovery of running Ghidra instances, so manually registering each instance is typically not necessary. The MCP bridge will automatically discover and register instances on startup and periodically check for new ones.
Video Installation Guide:
https://github.com/user-attachments/assets/75f0c176-6da1-48dc-ad96-c182eb4648c3
## MCP Clients
Theoretically, any MCP client should work with GhydraMCP. Two examples are given below.
## API Reference
### Available Tools
**Program Analysis**:
- `list_methods`: List all functions (params: offset, limit)
- `list_classes`: List all classes/namespaces (params: offset, limit)
- `decompile_function`: Get decompiled C code (params: name)
- `rename_function`: Rename a function (params: old_name, new_name)
- `rename_data`: Rename data at address (params: address, new_name)
- `list_segments`: View memory segments (params: offset, limit)
- `list_imports`: List imported symbols (params: offset, limit)
- `list_exports`: List exported functions (params: offset, limit)
- `list_namespaces`: Show namespaces (params: offset, limit)
- `list_data_items`: View data labels (params: offset, limit)
- `search_functions_by_name`: Find functions (params: query, offset, limit)
**Instance Management**:
- `list_instances`: List active Ghidra instances (no params)
- `register_instance`: Register new instance (params: port, url)
- `unregister_instance`: Remove instance (params: port)
- `discover_instances`: Auto-discover running instances (params: host [optional])
**Example Usage**:
```python
# Program analysis
client.use_tool("ghydra", "decompile_function", {"name": "main"})
# Instance management
client.use_tool("ghydra", "register_instance", {"port": 8192, "url": "http://localhost:8192/"})
client.use_tool("ghydra", "register_instance", {"port": 8193})
# Auto-discover instances
client.use_tool("ghydra", "discover_instances") # Default host
client.use_tool("ghydra", "discover_instances", {"host": "192.168.1.10"}) # Custom host
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ "Analyze the authentication bypass in this firmware" │
│ │
│ Claude: I'll decompile the auth functions and trace the validation logic. │
│ │
│ [functions_list grep="auth|login|verify"] │
│ [functions_decompile name="verify_password"] │
│ [xrefs_list to_addr="0x0040156c"] │
│ [analysis_get_dataflow address="0x00401234" direction="backward"] │
│ │
│ Found it. The password check at 0x401580 compares against a hardcoded │
│ hash, but there's a debug backdoor at 0x401590 that bypasses validation │
│ when the username starts with "debug_". Let me show you the call graph... │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Client Setup
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)
### Claude Desktop Configuration
## What You Get
**64 MCP tools** across 12 categories:
| Category | Tools | What it does |
|----------|-------|--------------|
| **Functions** | 11 | Decompile, disassemble, rename, set signatures, list variables |
| **Data** | 8 | Create/modify data items, list strings, set types |
| **Structs** | 7 | Create structs, add/update fields, manage data types |
| **Symbols** | 9 | Create labels, rename symbols, list imports/exports |
| **Analysis** | 6 | Call graphs, data flow, cross-references, run analysis |
| **Memory** | 2 | Read/write raw bytes |
| **Variables** | 4 | List/rename function variables, set types |
| **Bookmarks** | 3 | Create/list/delete analysis bookmarks |
| **Enums/Typedefs** | 4 | Create enum and typedef data types |
| **Namespaces** | 2 | List namespaces and classes |
| **Segments** | 1 | List memory segments with permissions |
| **Docker** | 7 | Auto-start containers, health checks, session management |
**13 analysis prompts** for common RE workflows:
- `malware_triage` — Quick capability assessment
- `identify_crypto` — Find crypto functions and constants
- `find_authentication` — Locate auth, license checks, credentials
- `analyze_protocol` — Reverse network/file protocols
- `trace_data_flow` — Taint analysis through functions
- And 8 more specialized prompts...
**11 MCP resources** for quick enumeration without tool calls.
---
## Quick Start
### Option 1: Docker (Easiest)
No Ghidra installation needed. Analyze binaries in isolated containers.
```bash
# Build the image (once)
cd MCGhidra && docker build -t mcghidra:latest -f docker/Dockerfile .
# Add to your MCP config
claude mcp add mcghidra -- uv run --directory /path/to/MCGhidra mcghidra
```
Then in Claude:
```
Analyze /path/to/suspicious.exe
```
Claude will auto-start a container, wait for analysis, and begin work.
### Option 2: Native Ghidra
1. **Install the Ghidra plugin:**
- Download latest [release](https://github.com/starsong-consulting/MCGhidra/releases)
- In Ghidra: `File → Install Extensions → +` → select the `.zip`
- Restart Ghidra
- Enable in `File → Configure → Developer → MCGhidraPlugin`
2. **Add MCP server:**
```bash
claude mcp add mcghidra -- uv run --directory /path/to/MCGhidra mcghidra
```
3. **Open a binary in Ghidra**, then ask Claude to analyze it.
---
## How It Works
```
┌──────────────┐     MCP      ┌──────────────┐     HTTP     ┌──────────────┐
│   Claude     │◄────────────►│   MCGhidra   │◄────────────►│   Ghidra     │
│  (or other   │    stdio     │   (Python)   │   REST API   │   Plugin     │
│  MCP client) │              │              │              │   (Java)     │
└──────────────┘              └──────────────┘              └──────────────┘
```
- **Ghidra Plugin**: Exposes Ghidra's analysis via HTTP REST API (HATEOAS)
- **MCGhidra Server**: Translates MCP tool calls to API requests
- **Multi-instance**: Analyze multiple binaries simultaneously on different ports
- **Session isolation**: Docker containers get unique ports, preventing conflicts
---
## Usage Patterns
### Set Current Instance (Then Forget About Ports)
```python
instances_list() # Discover running Ghidra instances
instances_use(port=8192) # Set as current
functions_list() # No port needed!
data_list_strings(grep="password") # Uses current instance
```
### Docker Workflow
```python
# Start container (returns immediately)
result = docker_auto_start(binary_path="/path/to/malware.exe")
# → {port: 8195, message: "Poll docker_health(port=8195)..."}
# Poll until ready
while True:
    health = docker_health(port=8195)
    if health["healthy"]:
        break
    # Can check docker_logs() while waiting
# Register and use
instances_use(port=8195)
functions_list() # Ready to analyze
```
### Cursor-Based Pagination
Large binaries can have 100K+ functions. Use cursors:
```python
result = functions_list(page_size=100)
# → {items: [...], cursor_id: "abc123", has_more: true}
# Get next page
cursor_next(cursor_id="abc123")
# Or filter server-side
functions_list(grep="crypto|encrypt", page_size=50)
```
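When every item is genuinely needed, the cursor can be drained in a loop. The sketch below is a hedged illustration: it assumes follow-up pages have the same `items` / `cursor_id` / `has_more` shape as the first page shown above, and `iter_all_items` and `fetch_next` are hypothetical names. In a real MCP session these would be tool calls, not Python functions, so an in-memory stand-in plays the server here.

```python
from typing import Callable, Iterator


def iter_all_items(first_page: dict, fetch_next: Callable[[str], dict]) -> Iterator[dict]:
    """Yield items from the first page, then keep following the cursor until has_more is false."""
    page = first_page
    while True:
        yield from page.get("items", [])
        if not page.get("has_more"):
            return
        page = fetch_next(page["cursor_id"])


# In-memory stand-in for cursor_next(cursor_id=...), keyed by cursor id.
pages = {
    "abc123": {"items": [{"name": "FUN_00101380"}], "cursor_id": "def456", "has_more": True},
    "def456": {"items": [{"name": "main"}], "cursor_id": None, "has_more": False},
}
first = {"items": [{"name": "entry"}], "cursor_id": "abc123", "has_more": True}

names = [item["name"] for item in iter_all_items(first, pages.__getitem__)]
print(names)
```

Because this is a generator, a caller can stop iterating as soon as it finds what it needs, which avoids pulling the remaining pages into context.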
### Analysis Prompts
Built-in prompts for common workflows:
```
/prompt malware_triage
/prompt identify_crypto
/prompt find_authentication
```
These guide Claude through systematic analysis with progress reporting.
---
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_HYDRA_HOST` | `localhost` | Ghidra instance host |
| `GHIDRA_HYDRA_PORT` | `8192` | Default port |
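
As a rough illustration of how these two variables combine, the sketch below builds a base URL from them using the defaults in the table. How MCGhidra reads them internally is not shown here; `ghidra_base_url` is a hypothetical helper and the port range check is an added assumption.

```python
import os
from typing import Mapping, Optional


def ghidra_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the Ghidra API base URL from GHIDRA_HYDRA_HOST / GHIDRA_HYDRA_PORT."""
    env = os.environ if env is None else env
    host = env.get("GHIDRA_HYDRA_HOST", "localhost")
    port = int(env.get("GHIDRA_HYDRA_PORT", "8192"))
    if not 1 <= port <= 65535:
        raise ValueError(f"GHIDRA_HYDRA_PORT out of range: {port}")
    return f"http://{host}:{port}/"


# With neither variable set, the documented defaults apply.
print(ghidra_base_url({}))
# Overriding both, e.g. for a Ghidra instance on another machine.
print(ghidra_base_url({"GHIDRA_HYDRA_HOST": "192.168.1.10", "GHIDRA_HYDRA_PORT": "8193"}))
```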
### MCP Config Examples
**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
```json
{
"mcpServers": {
"ghydra": {
"mcghidra": {
"command": "uv",
"args": [
"run",
"/ABSOLUTE_PATH_TO/bridge_mcp_hydra.py"
],
"env": {
"GHIDRA_HYDRA_HOST": "localhost" // Optional - defaults to localhost
}
"args": ["run", "--directory", "/path/to/MCGhidra", "mcghidra"]
}
}
}
```
> **Note:** You can also use `python` instead of `uv run`, but then you'll need to manually install the requirements first with `pip install mcp requests`.
### 5ire Configuration
1. Tool Key: ghydra
2. Name: GhydraMCP
3. Command: `uv run /ABSOLUTE_PATH_TO/bridge_mcp_hydra.py`
## Example Session
Below is an example of how an AI assistant might interact with multiple Ghidra instances using GhydraMCP:
**User:**
```
Check which Ghidra instances are currently running.
**Claude Code**:
```bash
claude mcp add mcghidra -- uv run --directory /path/to/MCGhidra mcghidra
```
**Assistant:**
---
## Tool Reference
### Instance Management
```
I'll check which Ghidra instances are currently running by discovering any active instances.
View result from discover_instances from ghydra (local)
{
"found": 2,
"instances": [
{
"port": 8192,
"url": "http://localhost:8192",
"result": "Registered instance on port 8192 at http://localhost:8192"
},
{
"port": 8193,
"url": "http://localhost:8193",
"result": "Registered instance on port 8193 at http://localhost:8193"
}
]
}
View result from list_instances from ghydra (local)
{
"instances": [
{
"port": 8192,
"url": "http://localhost:8192",
"project": "TestProject",
"file": "application.exe"
},
{
"port": 8193,
"url": "http://localhost:8193",
"project": "FirmwareAnalysis",
"file": "firmware.bin"
}
]
}
I found two active Ghidra instances:
1. Port 8192: Analyzing "application.exe" in project "TestProject"
2. Port 8193: Analyzing "firmware.bin" in project "FirmwareAnalysis"
I'm ready to analyze these binaries. What would you like me to examine?
instances_list # Discover Ghidra instances (use this first!)
instances_use # Set current working instance
instances_current # Show current instance info
```
**User:**
### Function Analysis
```
List the functions through ghydra for both instances and identify any related functionality.
functions_list # List functions (supports grep, pagination)
functions_get # Get function details by name or address
functions_decompile # Decompile to C pseudocode
functions_disassemble # Get assembly instructions
functions_rename # Rename a function
functions_set_signature # Set function prototype
functions_set_comment # Add decompiler comment
functions_create # Create function at address
functions_variables # List local variables and parameters
```
**Assistant:**
### Data Operations
```
I'll list the functions from both instances and analyze them.
View result from list_functions from ghydra (local){
`port`: 8192
}
[Function list for application.exe...]
View result from list_functions from ghydra (local){
`port`: 8193
}
[Function list for firmware.bin...]
I notice some interesting functions in both binaries. Let me decompile a few to understand their behavior.
View result from get_function from ghydra (local){
`port`: 8192,
`name`: "process_command"
}
[Decompiled code for process_command...]
View result from get_function from ghydra (local){
`port`: 8193,
`name`: "handle_message"
}
[Decompiled code for handle_message...]
Based on this analysis, I can see these binaries communicate using a simple protocol where...
data_list # List defined data items
data_list_strings # List strings (with grep filtering)
data_create # Define data at address
data_rename # Rename data item
data_set_type # Change data type
data_delete # Remove data definition
```
# JSON Communication
GhydraMCP uses structured JSON for all communication between the Python bridge and Java plugin. This ensures consistent and reliable data exchange.
## Response Format
All responses follow a standard format:
```json
{
"success": true,
"result": "...",
"timestamp": 1712159482123,
"port": 8192,
"instanceType": "base"
}
### Cross-References & Analysis
```
xrefs_list # Find cross-references to/from address
analysis_get_callgraph # Generate call graph
analysis_get_dataflow # Trace data flow forward/backward
analysis_run # Trigger Ghidra auto-analysis
```
Error responses include additional information:
```json
{
"success": false,
"error": "Error message",
"status_code": 404,
"timestamp": 1712159482123
}
### Structs & Types
```
structs_list # List struct definitions
structs_get # Get struct with all fields
structs_create # Create new struct
structs_add_field # Add field to struct
structs_update_field # Modify existing field
structs_delete # Remove struct
enums_list / enums_create
typedefs_list / typedefs_create
```
This structured approach makes the communication more reliable and easier to debug.
# Testing
GhydraMCP includes comprehensive test suites for both the HTTP API and MCP bridge. See [TESTING.md](TESTING.md) for details on running the tests.
## HTTP API Tests
Tests the HTTP endpoints exposed by the Java plugin:
- Response format and structure
- JSON structure consistency
- Required fields in responses
- Error handling
## MCP Bridge Tests
Tests the MCP bridge functionality:
- MCP protocol communication
- Tool availability and structure
- Response format and structure
- JSON structure consistency
# Building from Source
You can build different artifacts with Maven:
## Build Everything (Default)
Build both the Ghidra plugin and the complete package:
### Docker Management
```
docker_auto_start # Start container for binary (auto port allocation)
docker_health # Check if container API is responding
docker_status # List all containers and images
docker_start # Manual container start
docker_stop # Stop container (session-scoped)
docker_logs # Get container logs
docker_cleanup # Remove orphaned containers
```
See `--help` or the [API docs](GHIDRA_HTTP_API.md) for full parameter details.
---
## Building from Source
```bash
# Clone
git clone https://github.com/starsong-consulting/MCGhidra
cd MCGhidra
# Build Ghidra plugin
mvn clean package
# → target/MCGhidra-[version].zip
# Build Docker image
docker build -t mcghidra:latest -f docker/Dockerfile .
# Run MCP server (for development)
uv run mcghidra
```
This creates:
- `target/GhydraMCP-[version].zip` - The Ghidra plugin only
- `target/GhydraMCP-Complete-[version].zip` - Complete package with plugin and bridge script
---
## Build Ghidra Plugin Only
If you only need the Ghidra plugin:
## Architecture
```
mvn clean package -P plugin-only
```
MCGhidra is designed for AI agents:
## Build Complete Package Only
If you only need the combined package:
- **Lazy registration**: `instances_use` doesn't block — validates on first real call
- **Non-blocking I/O**: All Docker/HTTP operations run in thread executors
- **Session isolation**: Each MCP session gets unique container ports
- **Cursor pagination**: Handle 100K+ item responses without context overflow
- **Server-side grep**: Filter results before they hit the wire
```
mvn clean package -P complete-only
```
Based on [GhidraMCP by Laurie Wired](https://github.com/LaurieWired/GhidraMCP/), evolved into a comprehensive RE platform.
The Ghidra plugin includes these files required for Ghidra to recognize the extension:
- lib/GhydraMCP.jar
- extension.properties
- Module.manifest
---
## License
Apache 2.0


@ -1,11 +1,11 @@
# Testing GhydraMCP
# Testing MCGhidra
This document describes how to test the GhydraMCP plugin and bridge.
This document describes how to test the MCGhidra plugin and bridge.
## Prerequisites
- Python 3.11 or higher
- Ghidra with the GhydraMCP plugin installed and running
- Ghidra with the MCGhidra plugin installed and running
- The `requests` Python package (`pip install requests`)
## Running All Tests
@ -34,7 +34,7 @@ The `test_http_api.py` script tests the HTTP API exposed by the Java plugin. It
### Running the HTTP API Tests
1. Make sure Ghidra is running with the GhydraMCP plugin loaded
1. Make sure Ghidra is running with the MCGhidra plugin loaded
2. Run the tests:
```bash
@ -57,7 +57,7 @@ The `test_mcp_client.py` script tests the MCP bridge functionality using the MCP
### Running the MCP Bridge Tests
1. Make sure Ghidra is running with the GhydraMCP plugin loaded
1. Make sure Ghidra is running with the MCGhidra plugin loaded
2. Run the tests:
```bash
@ -89,7 +89,7 @@ The test script will:
### HTTP API Tests
- If tests are skipped with "Ghidra server not running or not accessible", make sure Ghidra is running and the GhydraMCP plugin is loaded.
- If tests are skipped with "Ghidra server not running or not accessible", make sure Ghidra is running and the MCGhidra plugin is loaded.
- If tests fail with connection errors, check that the plugin is listening on the expected port (default: 8192).
### MCP Bridge Tests
@ -103,7 +103,7 @@ The test script will:
To add a new test for an HTTP endpoint:
1. Add a new test method to the `GhydraMCPHttpApiTests` class
1. Add a new test method to the `MCGhidraHttpApiTests` class
2. Use the `requests` library to make HTTP requests to the endpoint
3. Verify the response using assertions

1
binaries/.gitkeep Normal file

@ -0,0 +1 @@
# Add binaries here for analysis

File diff suppressed because it is too large

116
docker-compose.yml Normal file

@ -0,0 +1,116 @@
# MCGhidra Docker Compose Configuration
# Provides both development and production modes for Ghidra + MCGhidra
#
# Usage:
# Development: docker compose up mcghidra-dev
# Production: docker compose up mcghidra
#
# Set MODE in .env file to switch between dev/prod behaviors
services:
  # =============================================================================
  # Production Service - Optimized for stability and security
  # =============================================================================
  mcghidra:
    build:
      context: .
      dockerfile: docker/Dockerfile
      args:
        GHIDRA_VERSION: ${GHIDRA_VERSION:-11.4.2}
        GHIDRA_DATE: ${GHIDRA_DATE:-20250826}
    image: mcghidra:${MCGHIDRAMCP_VERSION:-latest}
    container_name: ${COMPOSE_PROJECT_NAME:-mcghidra}-server
    restart: unless-stopped
    ports:
      - "${MCGHIDRA_PORT:-8192}:8192"
    volumes:
      # Mount binaries to analyze (read-only in prod)
      - ${BINARIES_PATH:-./binaries}:/binaries:ro
      # Persist Ghidra projects between runs
      - mcghidra-projects:/projects
    environment:
      - MCGHIDRA_MODE=${MCGHIDRA_MODE:-headless}
      - MCGHIDRA_PORT=8192
      - MCGHIDRA_MAXMEM=${MCGHIDRA_MAXMEM:-2G}
      - PROJECT_NAME=${PROJECT_NAME:-MCGhidra}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8192/"]
      interval: 30s
      timeout: 10s
      start_period: 60s
      retries: 3
    deploy:
      resources:
        limits:
          memory: ${MCGHIDRA_MAXMEM:-2G}
    profiles:
      - prod
      - default
  # =============================================================================
  # Development Service - Hot-reload and debugging friendly
  # =============================================================================
  mcghidra-dev:
    build:
      context: .
      dockerfile: docker/Dockerfile
      args:
        GHIDRA_VERSION: ${GHIDRA_VERSION:-11.4.2}
        GHIDRA_DATE: ${GHIDRA_DATE:-20250826}
    image: mcghidra:dev
    container_name: ${COMPOSE_PROJECT_NAME:-mcghidra}-dev
    ports:
      - "${MCGHIDRA_PORT:-8192}:8192"
      # Additional ports for debugging/multiple instances
      - "8193:8193"
      - "8194:8194"
    volumes:
      # Mount binaries (read-write in dev)
      - ${BINARIES_PATH:-./binaries}:/binaries:rw
      # Persist projects
      - mcghidra-projects-dev:/projects
      # Mount scripts for live editing (development only)
      - ./docker/MCGhidraServer.java:/opt/ghidra/scripts/MCGhidraServer.java:ro
      - ./docker/entrypoint.sh:/entrypoint.sh:ro
    environment:
      - MCGHIDRA_MODE=${MCGHIDRA_MODE:-headless}
      - MCGHIDRA_PORT=8192
      - MCGHIDRA_MAXMEM=${MCGHIDRA_MAXMEM:-4G}
      - PROJECT_NAME=${PROJECT_NAME:-MCGhidra-Dev}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8192/"]
      interval: 15s
      timeout: 5s
      start_period: 120s
      retries: 5
    profiles:
      - dev
  # =============================================================================
  # Shell Service - Interactive debugging container
  # =============================================================================
  mcghidra-shell:
    build:
      context: .
      dockerfile: docker/Dockerfile
    image: mcghidra:${MCGHIDRAMCP_VERSION:-latest}
    container_name: ${COMPOSE_PROJECT_NAME:-mcghidra}-shell
    stdin_open: true
    tty: true
    volumes:
      - ${BINARIES_PATH:-./binaries}:/binaries:rw
      - mcghidra-projects-dev:/projects
    environment:
      - MCGHIDRA_MODE=shell
    profiles:
      - debug
volumes:
  mcghidra-projects:
    name: ${COMPOSE_PROJECT_NAME:-mcghidra}-projects
  mcghidra-projects-dev:
    name: ${COMPOSE_PROJECT_NAME:-mcghidra}-projects-dev
networks:
  default:
    name: ${COMPOSE_PROJECT_NAME:-mcghidra}-network

152
docker/Dockerfile Normal file

@ -0,0 +1,152 @@
# MCGhidra Docker Image
# Ghidra + MCGhidra Plugin pre-installed for headless binary analysis
#
# Build: docker build -t mcghidra:latest -f docker/Dockerfile .
# Run: docker run -p 8192:8192 -v /path/to/binaries:/binaries mcghidra:latest
ARG GHIDRA_VERSION=11.4.2
ARG GHIDRA_DATE=20250826
# =============================================================================
# Stage 1: Build the MCGhidra plugin
# =============================================================================
FROM eclipse-temurin:21-jdk-jammy AS builder
ARG GHIDRA_VERSION
ARG GHIDRA_DATE
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
unzip \
maven \
git \
&& rm -rf /var/lib/apt/lists/*
# Download and extract Ghidra
WORKDIR /opt
# Download with retries and resume support for unreliable connections
RUN for i in 1 2 3 4 5; do \
curl -fSL --http1.1 -C - \
"https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_DATE}.zip" \
-o ghidra.zip && break || sleep 30; \
done \
&& unzip -q ghidra.zip \
&& rm ghidra.zip \
&& mv ghidra_${GHIDRA_VERSION}_PUBLIC ghidra
ENV GHIDRA_HOME=/opt/ghidra
# Copy MCGhidra source and build
WORKDIR /build
# Copy pom.xml first and download dependencies (cached until pom.xml changes)
COPY pom.xml .
RUN mvn dependency:resolve -P plugin-only -q \
-Dghidra.generic.jar=${GHIDRA_HOME}/Ghidra/Framework/Generic/lib/Generic.jar \
-Dghidra.softwaremodeling.jar=${GHIDRA_HOME}/Ghidra/Framework/SoftwareModeling/lib/SoftwareModeling.jar \
-Dghidra.project.jar=${GHIDRA_HOME}/Ghidra/Framework/Project/lib/Project.jar \
-Dghidra.docking.jar=${GHIDRA_HOME}/Ghidra/Framework/Docking/lib/Docking.jar \
-Dghidra.decompiler.jar=${GHIDRA_HOME}/Ghidra/Features/Decompiler/lib/Decompiler.jar \
-Dghidra.utility.jar=${GHIDRA_HOME}/Ghidra/Framework/Utility/lib/Utility.jar \
-Dghidra.base.jar=${GHIDRA_HOME}/Ghidra/Features/Base/lib/Base.jar \
|| true
# Now copy source - only this layer rebuilds on code changes
COPY src ./src
# Build the plugin (skip git-commit-id plugin since .git isn't in Docker context)
RUN mvn package -P plugin-only -DskipTests \
-Dmaven.gitcommitid.skip=true \
-Dghidra.generic.jar=${GHIDRA_HOME}/Ghidra/Framework/Generic/lib/Generic.jar \
-Dghidra.softwaremodeling.jar=${GHIDRA_HOME}/Ghidra/Framework/SoftwareModeling/lib/SoftwareModeling.jar \
-Dghidra.project.jar=${GHIDRA_HOME}/Ghidra/Framework/Project/lib/Project.jar \
-Dghidra.docking.jar=${GHIDRA_HOME}/Ghidra/Framework/Docking/lib/Docking.jar \
-Dghidra.decompiler.jar=${GHIDRA_HOME}/Ghidra/Features/Decompiler/lib/Decompiler.jar \
-Dghidra.utility.jar=${GHIDRA_HOME}/Ghidra/Framework/Utility/lib/Utility.jar \
-Dghidra.base.jar=${GHIDRA_HOME}/Ghidra/Features/Base/lib/Base.jar
# =============================================================================
# Stage 2: Runtime image with Ghidra + MCGhidra
# =============================================================================
# NOTE: Ghidra requires JDK (not JRE) - it checks for javac in LaunchSupport
FROM eclipse-temurin:21-jdk-jammy AS runtime
ARG GHIDRA_VERSION
ARG GHIDRA_DATE
LABEL org.opencontainers.image.title="mcghidra" \
org.opencontainers.image.description="Ghidra + MCGhidra Plugin for AI-assisted reverse engineering" \
org.opencontainers.image.source="https://github.com/starsong-consulting/MCGhidra" \
org.opencontainers.image.licenses="Apache-2.0"
# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
unzip \
fontconfig \
libfreetype6 \
&& rm -rf /var/lib/apt/lists/*
# Create non-root user
RUN groupadd -g 1001 ghidra && useradd -u 1001 -g ghidra -m -s /bin/bash ghidra
# Download and extract Ghidra (in runtime stage for cleaner image)
WORKDIR /opt
# Download with retries and resume support for unreliable connections
RUN for i in 1 2 3 4 5; do \
curl -fSL --http1.1 -C - \
"https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_DATE}.zip" \
-o ghidra.zip && break || sleep 30; \
done \
&& unzip -q ghidra.zip \
&& rm ghidra.zip \
&& mv ghidra_${GHIDRA_VERSION}_PUBLIC ghidra \
&& chown -R ghidra:ghidra /opt/ghidra
ENV GHIDRA_HOME=/opt/ghidra
ENV PATH="${GHIDRA_HOME}:${PATH}"
# Install the MCGhidra plugin
COPY --from=builder /build/target/MCGhidra-*.zip /tmp/
RUN mkdir -p /opt/ghidra/Ghidra/Extensions \
&& unzip -q /tmp/MCGhidra-*.zip -d /opt/ghidra/Ghidra/Extensions/ \
&& rm /tmp/MCGhidra-*.zip \
&& chown -R ghidra:ghidra /opt/ghidra/Ghidra/Extensions/
# Create directories for projects and binaries
RUN mkdir -p /projects /binaries /home/ghidra/.ghidra \
&& chown -R ghidra:ghidra /projects /binaries /home/ghidra
# Copy MCGhidra Python scripts to user scripts directory
# Python/Jython scripts don't require OSGi bundle registration - they work without issue
RUN mkdir -p /home/ghidra/ghidra_scripts
COPY docker/MCGhidraServer.py /home/ghidra/ghidra_scripts/
COPY docker/ImportRawARM.java /home/ghidra/ghidra_scripts/
# Set proper ownership and permissions
RUN chown -R ghidra:ghidra /home/ghidra/ghidra_scripts \
&& chmod 755 /home/ghidra/ghidra_scripts/*.py 2>/dev/null || true \
&& chmod 755 /home/ghidra/ghidra_scripts/*.java 2>/dev/null || true
# Copy entrypoint script (755 so ghidra user can read and execute)
COPY docker/entrypoint.sh /entrypoint.sh
RUN chmod 755 /entrypoint.sh
# Switch to non-root user
USER ghidra
WORKDIR /home/ghidra
# Expose the MCGhidra HTTP API port (and additional ports for multiple instances)
EXPOSE 8192 8193 8194 8195
# Default environment
ENV MCGHIDRA_MODE=headless
ENV MCGHIDRA_PORT=8192
ENV MCGHIDRA_MAXMEM=2G
# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD curl -f http://localhost:${MCGHIDRA_PORT}/health || exit 1
ENTRYPOINT ["/entrypoint.sh"]

204
docker/ImportRawARM.java Normal file

@ -0,0 +1,204 @@
// Import and analyze raw ARM firmware binary
// This script imports a raw binary file with specified ARM processor and load address
// @author MCGhidra
// @category Binary.Import
// @keybinding
// @menupath
// @toolbar
import ghidra.app.script.GhidraScript;
import ghidra.app.util.bin.ByteArrayProvider;
import ghidra.app.util.importer.MessageLog;
import ghidra.app.util.opinion.BinaryLoader;
import ghidra.app.util.opinion.LoadSpec;
import ghidra.framework.model.DomainFile;
import ghidra.framework.model.DomainFolder;
import ghidra.program.model.address.Address;
import ghidra.program.model.lang.LanguageCompilerSpecPair;
import ghidra.program.model.listing.Program;
import ghidra.util.task.TaskMonitor;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.List;
public class ImportRawARM extends GhidraScript {
@Override
public void run() throws Exception {
String[] args = getScriptArgs();
if (args.length < 2) {
println("========================================");
println(" Import Raw ARM Firmware Binary");
println("========================================");
println("");
println("Usage: -postScript ImportRawARM.java <binary_path> <load_address> [<language_id>]");
println("");
println("Arguments:");
println(" binary_path - Path to raw binary file");
println(" load_address - Base address in hex (e.g., 0x00000000)");
println(" language_id - ARM language ID (default: ARM:LE:32:v5t)");
println("");
println("Common ARM Language IDs:");
println(" ARM:LE:32:v4 - ARMv4 little-endian");
println(" ARM:LE:32:v5 - ARMv5 little-endian");
println(" ARM:LE:32:v5t - ARMv5T little-endian (Thumb)");
println(" ARM:LE:32:v6 - ARMv6 little-endian");
println(" ARM:LE:32:v7 - ARMv7 little-endian");
println(" ARM:LE:32:Cortex - ARM Cortex");
println(" ARM:BE:32:v5t - ARMv5T big-endian");
println("");
println("Example:");
println(" analyzeHeadless /projects MyProject \\");
println(" -postScript ImportRawARM.java /binaries/firmware.bin 0x00000000 ARM:LE:32:v5t");
println("");
return;
}
String binaryPath = args[0];
String loadAddressStr = args[1];
String languageIDStr = args.length > 2 ? args[2] : "ARM:LE:32:v5t";
// Parse load address
long loadAddress;
if (loadAddressStr.toLowerCase().startsWith("0x")) {
loadAddress = Long.parseLong(loadAddressStr.substring(2), 16);
} else {
loadAddress = Long.parseLong(loadAddressStr, 16);
}
File binaryFile = new File(binaryPath);
if (!binaryFile.exists()) {
printerr("ERROR: Binary file not found: " + binaryPath);
return;
}
long fileSize = binaryFile.length();
println("========================================");
println(" Importing Raw ARM Firmware");
println("========================================");
println(" Binary: " + binaryFile.getName());
println(" Size: " + fileSize + " bytes (0x" + Long.toHexString(fileSize) + ")");
println(" Load Address: 0x" + String.format("%08X", loadAddress));
println(" Language: " + languageIDStr);
println("========================================");
println("");
try {
// Read the binary file
byte[] bytes = Files.readAllBytes(binaryFile.toPath());
ByteArrayProvider provider = new ByteArrayProvider(bytes);
// Parse language string (format: "ARM:LE:32:v5t")
String[] parts = languageIDStr.split(":");
if (parts.length < 4) {
printerr("ERROR: Invalid language ID format. Expected format: ARCH:ENDIAN:SIZE:VARIANT");
printerr(" Example: ARM:LE:32:v5t");
return;
}
// Create language compiler spec pair
LanguageCompilerSpecPair lcsPair = new LanguageCompilerSpecPair(languageIDStr, "default");
// Use BinaryLoader to import
BinaryLoader loader = new BinaryLoader();
// Create load spec with our language
LoadSpec loadSpec = new LoadSpec(loader, loadAddress, lcsPair, false);
// Get project folder
DomainFolder rootFolder = state.getProject().getProjectData().getRootFolder();
String programName = binaryFile.getName();
println("Creating program: " + programName);
println("Using language: " + languageIDStr);
println("");
// Import using BinaryLoader
MessageLog log = new MessageLog();
List<Program> programs = loader.load(
provider,
programName,
rootFolder,
loadSpec,
List.of(), // options
log,
this, // consumer
monitor
);
if (programs == null || programs.isEmpty()) {
printerr("ERROR: Failed to load binary");
println("");
println("Loader messages:");
println(log.toString());
return;
}
Program program = programs.get(0);
println("Import successful!");
println(" Program: " + program.getName());
println(" Base Address: " + program.getImageBase());
println(" Memory blocks: " + program.getMemory().getBlocks().length);
println("");
// Set entry point at load address
Address entryAddr = program.getAddressFactory().getDefaultAddressSpace().getAddress(loadAddress);
int txId = program.startTransaction("Set Entry Point");
try {
program.getSymbolTable().addExternalEntryPoint(entryAddr);
program.getSymbolTable().createLabel(entryAddr, "entry",
ghidra.program.model.symbol.SourceType.ANALYSIS);
program.endTransaction(txId, true);
println("Entry point set at: " + entryAddr);
println("");
} catch (Exception e) {
program.endTransaction(txId, false);
printerr("Warning: Could not set entry point: " + e.getMessage());
}
// Save the program
try {
program.save("ARM firmware import", monitor);
println("Program saved successfully");
println("");
} catch (Exception e) {
printerr("Warning: Could not save program: " + e.getMessage());
}
// Run analysis
println("Starting auto-analysis...");
println("(This may take a while for large binaries)");
println("");
analyzeAll(program);
println("");
println("========================================");
println(" Import Complete!");
println("========================================");
println(" Program: " + program.getName());
println(" Functions found: " + program.getFunctionManager().getFunctionCount());
println(" Defined data: " + program.getListing().getNumDefinedData());
println("========================================");
println("");
// Set as current program for subsequent scripts
state.setCurrentProgram(program);
} catch (IOException e) {
printerr("ERROR: Failed to read binary file: " + e.getMessage());
e.printStackTrace();
} catch (Exception e) {
printerr("ERROR: Import failed: " + e.getMessage());
e.printStackTrace();
}
}
}

2843
docker/MCGhidraServer.py Normal file

File diff suppressed because it is too large

197
docker/README.md Normal file

@ -0,0 +1,197 @@
# MCGhidra Docker Setup
This directory contains Docker configuration for running MCGhidra in headless mode.
## Quick Start
```bash
# Build the image
docker build -t mcghidra:latest -f docker/Dockerfile .
# Analyze a binary
docker run -p 8192:8192 -v /path/to/binaries:/binaries mcghidra /binaries/sample.exe
# Check API health
curl http://localhost:8192/
```
## Architecture
The Docker container includes:
1. **Ghidra 11.4.2** - Full headless installation
2. **MCGhidra Extension** - The Java plugin (installed in Extensions/)
3. **MCGhidraServer.py** - Headless HTTP server (Jython, full API parity)
### Why Two HTTP Servers?
The MCGhidra plugin (`MCGhidraPlugin.java`) is a full Ghidra GUI plugin that requires:
- Ghidra's `PluginTool` framework
- `ProgramManager` service for program access
- GUI event handling
These GUI services don't exist in headless mode. Instead, the container uses `MCGhidraServer.py`, a Jython script that:
- Runs via `analyzeHeadless -postScript`
- Has direct access to `currentProgram` from the script context
- Provides **full API parity** with the GUI plugin (45 routes)
- Supports all read and write operations
### Available Endpoints (Headless Mode)
The headless server implements the complete MCGhidra HTTP API:
| Category | Endpoints | Description |
|----------|-----------|-------------|
| **Info** | `GET /`, `/info`, `/program` | API info, program metadata |
| **Functions** | `GET /functions`, `/functions/{addr}`, `/functions/by-name/{name}` | List and detail |
| **Decompile** | `GET /functions/{addr}/decompile`, `/functions/by-name/{name}/decompile` | C pseudocode |
| **Disassembly** | `GET /functions/{addr}/disassembly`, `/functions/by-name/{name}/disassembly` | Assembly listing |
| **Data** | `GET /data`, `/strings` | Defined data and strings |
| **Memory** | `GET /memory`, `/memory/blocks` | Read bytes, list segments |
| **Xrefs** | `GET /xrefs` | Cross-references (to/from) |
| **Structs** | `GET /structs` | Data type structures |
| **Symbols** | `GET /symbols`, `/imports`, `/exports` | Symbol tables |
| **Analysis** | `GET /analysis/callgraph`, `/analysis/dataflow` | Static analysis |
| **Write Ops** | `PATCH /functions/*`, `POST /data`, `POST /structs/*` | Rename, annotate, create |
See [GHIDRA_HTTP_API.md](../GHIDRA_HTTP_API.md) for the complete API specification.
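As a rough illustration of calling these endpoints from a script, the sketch below builds request URLs and fetches a response with the Python standard library only. The helper names (`endpoint_url`, `decompile_path`, `get_json`) are hypothetical, and the code assumes the server answers with JSON; the actual response fields are defined in the API specification, not here.

```python
import json
import urllib.request
from urllib.parse import quote


def endpoint_url(path, port=8192, host="localhost"):
    """Build a URL for the headless HTTP API (helper name is illustrative)."""
    return "http://{}:{}/{}".format(host, port, path.lstrip("/"))


def decompile_path(name):
    """Path for decompiling a function by name, following the table above."""
    # Percent-encode the name so spaces or slashes cannot break the path.
    return "/functions/by-name/{}/decompile".format(quote(name, safe=""))


def get_json(path, port=8192, timeout=10):
    """GET an endpoint and parse the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(endpoint_url(path, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a container running, `get_json("/functions")` or `get_json(decompile_path("main"))` would return whatever the server sends back for those routes.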
## Container Modes
### Headless Mode (Default)
Imports a binary, analyzes it, and starts the HTTP API server:
```bash
docker run -p 8192:8192 \
-v ./samples:/binaries \
mcghidra /binaries/sample.exe
```
### Server Mode
Opens an existing project and program:
```bash
docker run -p 8192:8192 \
-e MCGHIDRA_MODE=server \
-v ./projects:/projects \
mcghidra program_name
```
### Analyze Mode
Imports and analyzes without starting HTTP server:
```bash
docker run \
-e MCGHIDRA_MODE=analyze \
-v ./samples:/binaries \
-v ./projects:/projects \
mcghidra /binaries/sample.exe
```
### Shell Mode
Interactive debugging:
```bash
docker run -it \
-e MCGHIDRA_MODE=shell \
mcghidra
```
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_MODE` | `headless` | Container mode (headless, server, analyze, shell) |
| `MCGHIDRA_PORT` | `8192` | HTTP API port |
| `MCGHIDRA_MAXMEM` | `2G` | JVM heap memory |
| `PROJECT_DIR` | `/projects` | Ghidra project directory |
| `PROJECT_NAME` | `MCGhidra` | Ghidra project name |
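The variables above can also be set programmatically. The following is a minimal sketch, assuming a hypothetical helper `docker_run_args` that assembles a `docker run` argument list; it is not part of MCGhidra, and the `./samples` host directory simply mirrors the examples in this README.

```python
def docker_run_args(binary, port=8192, maxmem="2G", mode="headless",
                    host_dir="./samples", image="mcghidra"):
    """Assemble a `docker run` argv using the variables in the table above.

    `binary` is a file name inside `host_dir`, which is mounted at /binaries.
    """
    return [
        "docker", "run",
        "-p", "{0}:{0}".format(port),          # publish the API port
        "-e", "MCGHIDRA_MODE={}".format(mode),
        "-e", "MCGHIDRA_PORT={}".format(port),
        "-e", "MCGHIDRA_MAXMEM={}".format(maxmem),
        "-v", "{}:/binaries".format(host_dir),
        image,
        "/binaries/{}".format(binary),
    ]
```

The resulting list can be passed to `subprocess.run(...)` unchanged; building a list rather than a shell string avoids quoting problems with unusual file names.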
## Docker Compose
Use Docker Compose for easier management:
```bash
# Development mode (hot-reload scripts)
docker compose --profile dev up mcghidra-dev
# Production mode
docker compose --profile prod up mcghidra
# Interactive shell
docker compose --profile debug run --rm mcghidra-shell
```
## MCP Integration
The MCGhidra Python server includes Docker management tools:
```python
# Check Docker status
await docker_status()
# Start container for a binary
await docker_start(binary_path="/path/to/binary.exe", port=8192)
# Wait for container to be ready
await docker_wait(port=8192, timeout=300)
# Automatic mode - starts container if no Ghidra available
await docker_auto_start(binary_path="/path/to/binary.exe")
# Get container logs
await docker_logs("mcghidra-server")
# Stop container
await docker_stop("mcghidra-server")
```
## Building
```bash
# Using Make
make build
# Using Docker directly
docker build -t mcghidra:latest -f docker/Dockerfile .
# Build with specific Ghidra version
docker build -t mcghidra:latest \
--build-arg GHIDRA_VERSION=11.4.2 \
--build-arg GHIDRA_DATE=20250826 \
-f docker/Dockerfile .
```
## Troubleshooting
### Container starts but API doesn't respond
Analysis takes time. Monitor progress with:
```bash
docker logs -f mcghidra-server
```
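If a script needs to block until the API is reachable, a polling loop is one option. This is a sketch under stated assumptions: `api_is_up` and `wait_for_api` are hypothetical helpers, the probe simply treats an HTTP 200 from `GET /` as "ready", and the 300-second default mirrors the `docker_wait` example above.

```python
import http.client
import time
import urllib.request


def api_is_up(port=8192, timeout=2):
    """Return True if GET / answers with HTTP 200, False on any failure."""
    url = "http://localhost:{}/".format(port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a garbled reply.
        return False


def wait_for_api(probe=api_is_up, timeout=300, interval=5,
                 sleep=time.sleep, clock=time.monotonic):
    """Call `probe` until it succeeds or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

`sleep` and `clock` are injectable so the loop can be exercised without real delays.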
### Port already in use
Stop existing containers:
```bash
docker stop $(docker ps -q --filter "name=mcghidra")
```
### Memory issues with large binaries
Increase JVM heap:
```bash
docker run -e MCGHIDRA_MAXMEM=4G -p 8192:8192 -v ./samples:/binaries mcghidra /binaries/large.exe
```
### Permission denied on volumes
The container runs as user `ghidra` (UID 1001). Make sure that UID can access the mounted directories:
```bash
sudo chown -R 1001:1001 /path/to/binaries
```

178
docker/entrypoint.sh Executable file

@ -0,0 +1,178 @@
#!/bin/bash
# MCGhidra Docker Entrypoint
# Starts Ghidra in headless mode with HTTP API server
set -e
MCGHIDRA_MODE=${MCGHIDRA_MODE:-headless}
MCGHIDRA_PORT=${MCGHIDRA_PORT:-8192}
MCGHIDRA_MAXMEM=${MCGHIDRA_MAXMEM:-2G}
GHIDRA_HOME=${GHIDRA_HOME:-/opt/ghidra}
# User scripts directory - Python scripts don't need OSGi bundle registration
SCRIPT_DIR=${SCRIPT_DIR:-/home/ghidra/ghidra_scripts}
# Project settings
PROJECT_DIR=${PROJECT_DIR:-/projects}
PROJECT_NAME=${PROJECT_NAME:-MCGhidra}
echo "=============================================="
echo " MCGhidra Docker Container"
echo "=============================================="
echo " Mode: ${MCGHIDRA_MODE}"
echo " Port: ${MCGHIDRA_PORT}"
echo " Memory: ${MCGHIDRA_MAXMEM}"
echo " Project: ${PROJECT_DIR}/${PROJECT_NAME}"
echo "=============================================="
# Ensure directories exist
mkdir -p "${PROJECT_DIR}"
# Handle different modes
case "${MCGHIDRA_MODE}" in
headless)
# Headless mode: Import a binary and start HTTP server
if [ $# -eq 0 ]; then
echo ""
echo "Usage: docker run mcghidra:latest [binary_path] [options]"
echo ""
echo "Examples:"
echo " # Analyze a binary mounted at /binaries/sample.exe"
echo " docker run -p 8192:8192 -v ./samples:/binaries mcghidra /binaries/sample.exe"
echo ""
echo " # With custom project name"
echo " docker run -p 8192:8192 -v ./samples:/binaries -e PROJECT_NAME=malware mcghidra /binaries/sample.exe"
echo ""
echo "Environment variables:"
echo " MCGHIDRA_PORT - HTTP API port (default: 8192)"
echo " MCGHIDRA_MAXMEM - Max JVM heap (default: 2G)"
echo " PROJECT_NAME - Ghidra project name (default: MCGhidra)"
echo " PROJECT_DIR - Project directory (default: /projects)"
echo " GHIDRA_LANGUAGE - Processor language ID (e.g., ARM:LE:32:v4t)"
echo " GHIDRA_BASE_ADDRESS - Base address for raw binaries (e.g., 0x00000000)"
echo " GHIDRA_LOADER - Loader type (e.g., BinaryLoader for raw firmware)"
echo ""
echo "Starting in wait mode..."
echo "Container will stay running for debugging or manual operation."
echo "You can exec into this container to run analyzeHeadless manually."
echo ""
# Keep container alive for debugging/manual operation
tail -f /dev/null
else
BINARY_PATH="$1"
shift
if [ ! -f "${BINARY_PATH}" ]; then
echo "ERROR: Binary not found: ${BINARY_PATH}"
echo "Make sure to mount the binary directory with -v /host/path:/binaries"
exit 1
fi
BINARY_NAME=$(basename "${BINARY_PATH}")
echo "Importing and analyzing: ${BINARY_NAME}"
echo ""
# Build the analyzeHeadless command
ANALYZE_CMD="${GHIDRA_HOME}/support/analyzeHeadless"
ANALYZE_ARGS=(
"${PROJECT_DIR}"
"${PROJECT_NAME}"
-import "${BINARY_PATH}"
-max-cpu 2
-scriptPath "${SCRIPT_DIR}"
-postScript "MCGhidraServer.py" "${MCGHIDRA_PORT}"
)
# Optional: processor/language for raw binaries
if [ -n "${GHIDRA_LANGUAGE}" ]; then
if ! echo "${GHIDRA_LANGUAGE}" | grep -qE '^[A-Za-z0-9_]+:[A-Z]{2}:[0-9]+:[A-Za-z0-9._-]+$'; then
echo "ERROR: Invalid GHIDRA_LANGUAGE format: ${GHIDRA_LANGUAGE}"
echo "Expected: ARCH:ENDIAN:SIZE:VARIANT (e.g., ARM:LE:32:v4t)"
exit 1
fi
ANALYZE_ARGS+=(-processor "${GHIDRA_LANGUAGE}")
fi
# Optional: base address
if [ -n "${GHIDRA_BASE_ADDRESS}" ]; then
if ! echo "${GHIDRA_BASE_ADDRESS}" | grep -qE '^(0x)?[0-9a-fA-F]+$'; then
echo "ERROR: Invalid GHIDRA_BASE_ADDRESS format: ${GHIDRA_BASE_ADDRESS}"
echo "Expected hex: 0x00000000 or 00000000"
exit 1
fi
ANALYZE_ARGS+=(-loader-baseAddr "${GHIDRA_BASE_ADDRESS}")
fi
# Optional: explicit loader (e.g., BinaryLoader for raw firmware)
if [ -n "${GHIDRA_LOADER}" ]; then
if ! echo "${GHIDRA_LOADER}" | grep -qE '^[A-Za-z0-9_]+$'; then
echo "ERROR: Invalid GHIDRA_LOADER format: ${GHIDRA_LOADER}"
echo "Expected alphanumeric name (e.g., BinaryLoader)"
exit 1
fi
ANALYZE_ARGS+=(-loader "${GHIDRA_LOADER}")
fi
# Add any extra arguments passed
ANALYZE_ARGS+=("$@")
echo "Running: ${ANALYZE_CMD} ${ANALYZE_ARGS[*]}"
echo ""
exec "${ANALYZE_CMD}" "${ANALYZE_ARGS[@]}"
fi
;;
server)
# Server mode: Open existing project with HTTP server
echo "Starting MCGhidra server on existing project..."
if [ $# -eq 0 ]; then
echo "Usage: docker run -e MCGHIDRA_MODE=server mcghidra [program_name]"
echo ""
echo " program_name: Name of program in the project to open"
exit 1
fi
PROGRAM_NAME="$1"
shift
exec "${GHIDRA_HOME}/support/analyzeHeadless" \
"${PROJECT_DIR}" "${PROJECT_NAME}" \
-process "${PROGRAM_NAME}" \
-noanalysis \
-scriptPath "${SCRIPT_DIR}" \
-postScript "MCGhidraServer.py" "${MCGHIDRA_PORT}" \
"$@"
;;
analyze)
# Analyze mode: Import and analyze, then exit (no HTTP server)
if [ $# -eq 0 ]; then
echo "Usage: docker run -e MCGHIDRA_MODE=analyze mcghidra [binary_path]"
exit 1
fi
BINARY_PATH="$1"
shift
echo "Analyzing binary: ${BINARY_PATH}"
exec "${GHIDRA_HOME}/support/analyzeHeadless" \
"${PROJECT_DIR}" "${PROJECT_NAME}" \
-import "${BINARY_PATH}" \
-max-cpu 2 \
"$@"
;;
shell)
# Interactive shell
exec /bin/bash
;;
*)
echo "Unknown mode: ${MCGHIDRA_MODE}"
echo "Valid modes: headless, server, analyze, shell"
exit 1
;;
esac

6
docs-site/.dockerignore Normal file

@ -0,0 +1,6 @@
node_modules
dist
.astro
.env
.env.*
!.env.example

6
docs-site/.gitignore vendored Normal file

@ -0,0 +1,6 @@
node_modules/
dist/
.astro/
.env
.env.local
.env.production

61
docs-site/Dockerfile Normal file

@ -0,0 +1,61 @@
# syntax=docker/dockerfile:1
# -- Build stage --
FROM node:22-alpine AS builder
RUN corepack enable && corepack prepare pnpm@latest --activate
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
pnpm install --frozen-lockfile
COPY . .
ENV ASTRO_TELEMETRY_DISABLED=1
RUN pnpm build
# -- Production stage --
FROM caddy:2-alpine AS production
COPY --from=builder /app/dist /srv
COPY <<EOF /etc/caddy/Caddyfile
:80 {
root * /srv
encode gzip
try_files {path} {path}/
file_server
header Cache-Control "public, max-age=3600"
@immutable path /_astro/*
header @immutable Cache-Control "public, max-age=31536000, immutable"
handle_errors {
rewrite * /404.html
file_server
}
}
EOF
EXPOSE 80
CMD ["caddy", "run", "--config", "/etc/caddy/Caddyfile"]
# -- Dev stage --
FROM node:22-alpine AS dev
RUN corepack enable && corepack prepare pnpm@latest --activate
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN --mount=type=cache,target=/root/.local/share/pnpm/store \
pnpm install --frozen-lockfile
COPY . .
ENV ASTRO_TELEMETRY_DISABLED=1
EXPOSE 4321
CMD ["pnpm", "dev"]

13
docs-site/Makefile Normal file

@ -0,0 +1,13 @@
.PHONY: prod dev down logs
prod:
docker compose up -d --build
dev:
docker compose --profile dev up --build
down:
docker compose down
logs:
docker compose logs -f


@ -0,0 +1,75 @@
// @ts-check
import { defineConfig } from 'astro/config';
import starlight from '@astrojs/starlight';
export default defineConfig({
site: 'https://mcghidra.warehack.ing',
devToolbar: { enabled: false },
integrations: [
starlight({
title: 'MCGhidra',
tagline: 'Reverse engineering bridge between Ghidra and MCP',
description: 'Multi-instance Ghidra plugin with HATEOAS REST API and MCP server for decompilation, analysis, and binary manipulation.',
social: [
{ icon: 'external', label: 'Gitea', href: 'https://git.supported.systems/MCP/mcghidra' },
{ icon: 'external', label: 'PyPI', href: 'https://pypi.org/project/mcghidra/' },
],
customCss: ['./src/styles/custom.css'],
editLink: {
baseUrl: 'https://git.supported.systems/MCP/mcghidra/_edit/main/docs-site/',
},
head: [
{
tag: 'meta',
attrs: {
name: 'theme-color',
content: '#c85533',
},
},
],
sidebar: [
{
label: 'Getting Started',
items: [
{ label: 'Overview', slug: 'getting-started/overview' },
{ label: 'Installation', slug: 'getting-started/installation' },
],
},
{
label: 'Guides',
items: [
{ label: 'Analysis Workflows', slug: 'guides/workflows' },
{ label: 'Cursor Pagination', slug: 'guides/cursor-pagination' },
{ label: 'Troubleshooting', slug: 'guides/troubleshooting' },
],
},
{
label: 'Reference',
collapsed: true,
items: [
{ label: 'MCP Tools', slug: 'reference/mcp-tools' },
{ label: 'MCP Resources', slug: 'reference/resources' },
{ label: 'REST API', slug: 'reference/rest-api' },
{ label: 'Configuration', slug: 'reference/configuration' },
{ label: 'Docker Usage', slug: 'reference/docker' },
],
},
{
label: 'Concepts',
collapsed: true,
items: [
{ label: 'Architecture', slug: 'concepts/architecture' },
{ label: 'Prior Art', slug: 'concepts/prior-art' },
],
},
{
label: 'About',
collapsed: true,
items: [
{ label: 'Changelog', slug: 'about/changelog' },
],
},
],
}),
],
});


@ -0,0 +1,38 @@
services:
docs:
build:
context: .
dockerfile: Dockerfile
target: production
restart: unless-stopped
networks:
- caddy
labels:
caddy: ${DOMAIN:-mcghidra.l.warehack.ing}
caddy.reverse_proxy: "{{upstreams 80}}"
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:80/"]
interval: 30s
timeout: 5s
retries: 3
start_period: 10s
docs-dev:
build:
context: .
dockerfile: Dockerfile
target: dev
profiles: ["dev"]
volumes:
- ./src:/app/src
- ./public:/app/public
- ./astro.config.mjs:/app/astro.config.mjs
networks:
- caddy
labels:
caddy: ${DOMAIN:-mcghidra.l.warehack.ing}
caddy.reverse_proxy: "{{upstreams 4321}}"
networks:
caddy:
external: true

24
docs-site/package.json Normal file

@ -0,0 +1,24 @@
{
"name": "mcghidra-docs",
"type": "module",
"version": "0.0.1",
"scripts": {
"dev": "astro dev --host",
"start": "astro dev --host",
"build": "astro build",
"preview": "astro preview",
"astro": "astro"
},
"dependencies": {
"@astrojs/mdx": "^4.3.13",
"@astrojs/starlight": "^0.37.4",
"astro": "^5.6.1",
"sharp": "^0.34.2"
},
"pnpm": {
"onlyBuiltDependencies": [
"esbuild",
"sharp"
]
}
}

4355
docs-site/pnpm-lock.yaml generated Normal file

File diff suppressed because it is too large


@ -0,0 +1,2 @@
User-agent: *
Allow: /


@ -0,0 +1,7 @@
import { defineCollection } from 'astro:content';
import { docsLoader } from '@astrojs/starlight/loaders';
import { docsSchema } from '@astrojs/starlight/schema';
export const collections = {
docs: defineCollection({ loader: docsLoader(), schema: docsSchema() }),
};


@ -0,0 +1,88 @@
---
title: Changelog
description: Version history and release notes
---
This page summarizes each release. For full commit-level detail, see the repository history.
## Unreleased
### Added
- Symbol CRUD operations: `symbols_create`, `symbols_rename`, `symbols_delete`, `symbols_imports`, `symbols_exports`
- Bookmark management: `bookmarks_list`, `bookmarks_create`, `bookmarks_delete`
- Enum and typedef creation: `enums_create`, `enums_list`, `typedefs_create`, `typedefs_list`
- Variable management: `variables_list`, `variables_rename`, `functions_variables`
- Namespace and class tools: `namespaces_list`, `classes_list`
- Memory segment listing: `segments_list`
- 13 analysis prompts with progress reporting
- Docker port auto-allocation from a configurable pool (default 8192-8223)
- Lazy `instances_use` -- returns immediately, validates on first real call
- All Docker operations non-blocking via thread executors
- Session isolation for `docker_stop` and `docker_cleanup`
### Fixed
- Eliminated 4+ hour hangs when switching to slow or unreachable instances
- Multiple bug fixes across Docker lifecycle and session management
## 2025.12.1
### Added
- Cursor-based pagination with configurable `page_size` and `cursor_id`
- Grep and regex filtering applied before pagination
- 8 enumeration resources using `ghidra://` URIs
### Security
- ReDoS protection on regex filters
- Session spoofing prevention for cursor operations
## 2.0.0
### Changed
- Full MCP integration refactor using FastMCP
- HATEOAS-driven API v2 with hypermedia links on all responses
### Added
- String listing across program memory
- Data manipulation tools
- Cross-reference analysis tools
- Memory read and write operations
## 1.4.0
### Changed
- Communication between bridge and plugin switched to structured JSON
### Added
- Test suites for bridge and plugin
- Origin checking on HTTP requests
## 1.3.0
### Added
- Variable manipulation tools (rename and retype)
- Dynamic version reporting in API responses
## 1.2.0
### Added
- Multi-instance support -- connect to multiple Ghidra sessions and switch between them
## 1.1.0
### Added
- Initial bridge release connecting MCP server to Ghidra plugin
## 1.0.0
- Initial project setup


@ -0,0 +1,87 @@
---
title: Architecture
description: How MCGhidra's components fit together and why
---
MCGhidra is a three-layer stack. Each layer operates independently, communicates over well-defined boundaries, and can be replaced without affecting the others.
```
┌─────────────────────────────────────────────────────────────────┐
│ MCP Client (Claude Code, Claude Desktop, etc.) │
└──────────────────────────┬──────────────────────────────────────┘
│ stdio (MCP protocol)
┌──────────────────────────┴──────────────────────────────────────┐
│ MCGhidra Python Server (FastMCP) │
│ ┌─────────┐ ┌──────────┐ ┌─────────┐ ┌────────────┐ │
│ │Functions │ │ Data │ │Analysis │ │ Docker │ ... │
│ │ Mixin │ │ Mixin │ │ Mixin │ │ Mixin │ │
│ └────┬────┘ └────┬─────┘ └────┬────┘ └─────┬─────┘ │
│ └───────────┴────────────┴─────────────┘ │
│ HTTP Client │
└──────────────────────────┬──────────────────────────────────────┘
│ HTTP REST (HATEOAS)
┌──────────────────────────┴──────────────────────────────────────┐
│ Ghidra Plugin (Java, runs inside JVM) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ MCGhidraServer.py → HTTP endpoints │ │
│ │ Functions / Data / Memory / Xrefs / Analysis │ │
│ └────────────────────────────────────────────────────────┘ │
│ Ghidra Analysis Engine (decompiler, disassembler, types) │
└─────────────────────────────────────────────────────────────────┘
```
## The Three-Layer Stack
The top layer is the MCP client. Claude Code, Claude Desktop, or any MCP-compatible tool connects to MCGhidra over stdio using the Model Context Protocol. The client sees MCP tools, resources, and prompts -- it never deals with HTTP or Ghidra's internals directly.
The middle layer is the Python MCP server, built on FastMCP. It translates MCP tool calls into HTTP requests against the Ghidra plugin's REST API. The server is organized as a set of mixins -- Functions, Data, Analysis, Docker, and others -- each of which registers its own tools. This keeps the codebase navigable despite having 64+ tools.
The bottom layer is the Ghidra plugin. In a GUI session it is the Java plugin; in headless mode (including the Docker image) it is the Jython script `MCGhidraServer.py`. Either way it runs inside Ghidra's JVM and starts an HTTP server that exposes Ghidra's analysis engine. The plugin does not know or care about MCP. It serves a HATEOAS REST API that any HTTP client can consume.
### Why a REST intermediary
A direct JVM-to-MCP bridge sounds simpler, but Ghidra's runtime imposes real constraints. The JVM uses OSGi classloading, the scripting environment is Jython (Python 2.7), and Ghidra's internal APIs are not designed for external consumption. HTTP sidesteps all of this. The Ghidra plugin speaks HTTP; the Python server speaks MCP. Each layer uses the language and runtime best suited to its job.
This separation also enables multi-instance support. Multiple Ghidra processes can run on different ports, each analyzing a different binary, and the MCP server routes requests to the right one. If the REST layer were baked into the MCP transport, this routing would be much harder.
Finally, the REST layer is language-independent. The Python server could be replaced with a Go or Rust implementation without touching the Ghidra plugin. This is not a theoretical benefit -- it means the plugin's API is usable outside of MCP entirely.
## HATEOAS Design
Most APIs that call themselves RESTful skip the hypermedia constraint. MCGhidra does not. Every response from the Ghidra plugin includes `_links` pointing to related resources.
A request to `GET /functions/0x401000` returns the function metadata along with links to decompile it, disassemble it, list its variables, and find cross-references. The client follows links rather than constructing URLs from templates.
This matters more for MCP agents than for human users. An agent that follows links does not need to memorize URL patterns or understand the API's URL structure upfront. It reads a response, sees what actions are available, and picks the relevant one. The API is self-describing at every step.
The practical effect: when the Ghidra plugin adds a new capability, the agent can discover and use it without any changes to the MCP server -- as long as the server forwards the link.
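As a rough illustration of link-following, the sketch below resolves a link relation from a response body instead of building the URL by hand. The response shape (a `_links` map whose entries carry an `href`), the relation names, and the `follow` helper are assumptions made for the example; the only detail taken from the text above is that responses include `_links`.

```python
from urllib.parse import urljoin

# Hypothetical response body; only the presence of `_links` comes from the docs.
response = {
    "result": {"name": "main", "address": "0x401000"},
    "_links": {
        "self": {"href": "/functions/0x401000"},
        "decompile": {"href": "/functions/0x401000/decompile"},
        "disassembly": {"href": "/functions/0x401000/disassembly"},
    },
}


def follow(base_url: str, body: dict, rel: str) -> str:
    """Return the absolute URL for a link relation advertised in the response."""
    links = body.get("_links", {})
    if rel not in links:
        raise KeyError(f"{rel!r} not advertised; available: {sorted(links)}")
    return urljoin(base_url, links[rel]["href"])


next_url = follow("http://localhost:8192", response, "decompile")
```

The client never needs to know how the decompile URL is formed; if a relation is missing, the error lists what the response does offer.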
## Session Isolation
Each MCP client gets a session ID, derived from the FastMCP context. This ID scopes all stateful operations.
Pagination cursors are session-scoped. If two clients are paging through the same function list, their cursors are independent -- advancing one does not affect the other. Docker containers track which session started them, and `docker_stop` validates ownership before killing a container. One client cannot shut down another client's analysis session.
`docker_cleanup` follows the same rule. It only removes containers and port locks belonging to the calling session, unless explicitly asked to clean up orphans.
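The ownership rule can be sketched as a small registry keyed by host port. This is an illustration under assumptions, not MCGhidra's code: the `ContainerRegistry` class and its method names are hypothetical, and a real implementation would also stop the container and release its port lock.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerRegistry:
    """Tracks which session started each container (keyed by host port)."""

    owners: dict[int, str] = field(default_factory=dict)

    def register(self, port: int, session_id: str) -> None:
        self.owners[port] = session_id

    def stop(self, port: int, session_id: str) -> None:
        """Forget a container only if the calling session owns it."""
        owner = self.owners.get(port)
        if owner is None:
            raise LookupError(f"no container tracked on port {port}")
        if owner != session_id:
            raise PermissionError(f"port {port} belongs to another session")
        del self.owners[port]

    def cleanup(self, session_id: str) -> list[int]:
        """Remove only the caller's containers and return the released ports."""
        mine = sorted(p for p, s in self.owners.items() if s == session_id)
        for port in mine:
            del self.owners[port]
        return mine
```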
## Port Pooling
When Docker provisioning starts a new container, it needs a host port to map the container's HTTP API. Ports come from a configurable pool, defaulting to 8192-8319 (128 ports).
Allocation uses `flock`-based file locking. Each port has a lock file, and the allocator takes an exclusive lock before assigning it. This is safe across multiple processes -- if two MCP servers run on the same host, they will not collide.
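A minimal sketch of `flock`-based allocation, assuming one lock file per port named `<port>.lock` in a lock directory (the naming, the `allocate_port` helper, and the temporary directory are all assumptions for illustration). It uses the standard `fcntl` module, so it runs on Unix-like systems only, and it does not check whether some unrelated application has already bound the port.

```python
import fcntl
import os
import tempfile


def allocate_port(lock_dir: str, start: int = 8192, end: int = 8319):
    """Return (port, lock_fd) for the first port whose lock file is free.

    The caller keeps lock_fd open while the port is in use;
    closing it releases the lock.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for port in range(start, end + 1):
        path = os.path.join(lock_dir, f"{port}.lock")
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
        try:
            # Non-blocking exclusive lock: fails at once if someone holds it.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)
            continue
        return port, fd
    raise RuntimeError("no ports available in pool")


lock_dir = tempfile.mkdtemp(prefix="portpool-")
first, fd1 = allocate_port(lock_dir)
second, fd2 = allocate_port(lock_dir)  # skips the port locked above
```

Because the kernel drops the lock when the holding process exits, a crashed server does not leave a port permanently reserved.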
The `PortPool` is lazy. It is not created until the first Docker operation that needs a port. If a user never touches Docker, the lock directory is never created and no background work occurs.
A background discovery thread scans the port range every 30 seconds, probing each port with a 0.5-second timeout. This is how the server finds Ghidra instances that were started outside of MCGhidra -- manually launched containers, or Ghidra instances running the plugin natively.
## Non-Blocking Design
The MCP server runs an asyncio event loop. Blocking that loop would freeze all connected clients. MCGhidra avoids this in several ways.
All Docker subprocess calls (`docker run`, `docker stop`, `docker logs`) run in thread pool executors via `asyncio.to_thread`. The event loop stays responsive while containers start, stop, or produce output.
`instances_use` is lazy. When a client switches to a new Ghidra instance, the server creates a stub immediately and returns. It does not validate the connection until the first real tool call against that instance. This avoids the situation where a slow or unreachable Ghidra instance blocks the `instances_use` call for minutes.
`docker_auto_start` returns as soon as the container is running. It does not wait for Ghidra to finish loading and analyzing the binary -- that can take minutes for large files. The client is expected to poll `docker_health` until the API responds.
The background port discovery thread runs on its own schedule and never blocks the event loop. It updates the instance list atomically, so clients always see a consistent snapshot.
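The effect of moving blocking work off the event loop can be shown with a small, self-contained sketch. `blocking_call` stands in for a slow subprocess call such as `docker run`, and the heartbeat task exists only to show that the loop keeps running; both names are made up for this example.

```python
import asyncio
import time


def blocking_call(seconds: float) -> str:
    """Stand-in for a slow, blocking operation such as a subprocess call."""
    time.sleep(seconds)
    return "done"


async def heartbeat(stop: asyncio.Event) -> int:
    """Count event-loop ticks to show the loop stays responsive."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def main() -> tuple[str, int]:
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(stop))
    # The blocking work runs in a worker thread; the loop keeps scheduling tasks.
    result = await asyncio.to_thread(blocking_call, 0.2)
    stop.set()
    return result, await beat


result, ticks = asyncio.run(main())
```

If `blocking_call` were invoked directly inside `main`, the heartbeat would not tick at all until it returned.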


@ -0,0 +1,39 @@
---
title: Prior Art
description: Acknowledgments and related projects
---
MCGhidra builds on the work of many people and projects. This page gives credit where it is due.
## Ghidra
NSA released Ghidra as open source in 2019 after years of internal development. MCGhidra would not exist without the decade of investment the agency put into building a full-featured analysis engine. The decompiler alone represents person-years of work on intermediate representations, type inference, and control flow recovery. The fact that it runs headless, supports scripting, and handles dozens of processor architectures out of the box made this project feasible.
Ghidra is available at [ghidra-sre.org](https://ghidra-sre.org/).
## GhidraMCP by LaurieWired
The direct inspiration. [LaurieWired's GhidraMCP](https://github.com/LaurieWired/GhidraMCP/) demonstrated that connecting Ghidra to the Model Context Protocol was viable and useful. MCGhidra started as a fork of her project and evolved into a different architecture -- a HATEOAS REST intermediary, multi-instance support, Docker provisioning, cursor-based pagination -- but the core idea of letting an MCP agent drive Ghidra traces back to her work. The proof of concept she built made the case that this was worth pursuing further.
## FastMCP
[FastMCP](https://github.com/jlowin/fastmcp) by Jeremiah Lowin is the Python framework that MCGhidra's MCP server is built on. Its decorator-based tool registration and mixin composition pattern made it practical to organize 64+ tools into maintainable domain modules. The `Context` system for session isolation and progress reporting is central to how MCGhidra handles multi-client scenarios. FastMCP removed a large amount of boilerplate that would otherwise dominate the codebase.
## HATEOAS and REST
The Hypermedia as the Engine of Application State constraint comes from Roy Fielding's 2000 dissertation, where he formalized the REST architectural style. Most APIs that call themselves RESTful ignore this constraint. MCGhidra embraces it because agents benefit from self-describing responses -- when every result includes `_links` to related resources, the agent does not need to memorize URL patterns or maintain a hardcoded API map.
## Model Context Protocol
Anthropic's [MCP specification](https://modelcontextprotocol.io/) provides the transport layer between MCGhidra and its clients. The protocol's tool/resource/prompt abstraction maps naturally to reverse engineering workflows: tools for mutating operations like renaming symbols, resources for read-only enumeration like listing functions, and prompts for guided analysis workflows.
## Related Projects
MCGhidra is part of a broader ecosystem of people bridging reverse engineering tools with external interfaces. Notable related work includes:
- Binary Ninja MCP servers that expose BN's API over the same protocol
- IDA Pro scripting bridges that have connected IDA to external tools for years
- Radare2 and rizin automation frameworks, which pioneered the idea of a scriptable RE command interface
- The growing community of MCP server authors connecting domain-specific tools to language model agents
Each of these projects informed the design decisions in MCGhidra, whether by example or by contrast. The RE tooling community has a long history of building bridges between analysis engines and the outside world -- MCGhidra is one more entry in that tradition.


@ -0,0 +1,99 @@
---
title: Installation
description: Installing MCGhidra from PyPI and configuring your MCP client
---
## Install from PyPI
```bash
pip install mcghidra
```
Or with [uv](https://docs.astral.sh/uv/) for isolated execution:
```bash
uvx mcghidra
```
This installs the MCP server and bundles the Ghidra plugin JAR. No separate plugin installation is needed — the server deploys the JAR automatically when starting Docker containers.
## MCP Client Configuration
### Claude Desktop
Add to your Claude Desktop configuration file:
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
```json
{
  "mcpServers": {
    "mcghidra": {
      "command": "uvx",
      "args": ["mcghidra"]
    }
  }
}
```
### Claude Code
```bash
claude mcp add mcghidra -- uvx mcghidra
```
Or if running from a local clone:
```bash
claude mcp add mcghidra -- uv run --directory /path/to/MCGhidra mcghidra
```
## Docker Setup (Optional)
If you want automatic container provisioning:
1. Clone the repository:
```bash
git clone https://git.supported.systems/MCP/mcghidra.git
cd mcghidra
```
2. Build the Docker image:
```bash
cd docker && docker build -t mcghidra:latest -f Dockerfile ..
```
Or use the MCP tool:
```
docker_build()
```
3. The `docker_auto_start` tool handles everything else — port allocation, container naming, health polling.
## Analyzing Raw Firmware
For raw binary blobs (firmware dumps, bootloaders), specify the processor architecture:
```
docker_auto_start(
    binary_path="/path/to/firmware.bin",
    language="ARM:LE:32:v4t",
    base_address="0x00000000"
)
```
Common language IDs:
| Target | Language ID |
|--------|------------|
| ARM7TDMI (Thumb) | `ARM:LE:32:v4t` |
| ARM Cortex-M | `ARM:LE:32:Cortex` |
| ARMv7 | `ARM:LE:32:v7` |
| MIPS32 LE | `MIPS:LE:32:default` |
| MIPS32 BE | `MIPS:BE:32:default` |
| x86 32-bit | `x86:LE:32:default` |
| x86-64 | `x86:LE:64:default` |
When `language` is specified, the loader is automatically set to `BinaryLoader` (raw binary import). Override with `loader="AutoImporter"` if your file has a recognized header format.


@ -0,0 +1,69 @@
---
title: Overview
description: What MCGhidra does and how the pieces fit together
---
MCGhidra is a two-part system that bridges NSA's [Ghidra](https://ghidra-sre.org/) reverse engineering framework with [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) clients.
## What You Get
- **64 MCP tools** across 12 categories (functions, data, structs, symbols, analysis, memory, variables, bookmarks, enums, namespaces, segments, Docker)
- **13 analysis prompts** for guided RE workflows (malware triage, crypto identification, auth bypass hunting, protocol analysis, and more)
- **19 MCP resources** for quick enumeration without tool calls
- **Cursor-based pagination** for handling binaries with 100K+ functions
- **Server-side grep** filtering before results hit the wire
- **Docker provisioning** with automatic port pooling and session isolation
## Components
### Ghidra Plugin (Java)
A headless-compatible Ghidra script that starts an HTTP server inside the JVM. The server exposes a HATEOAS REST API — every response includes hypermedia links to related resources, so clients discover the API by following links rather than memorizing paths.
The plugin supports:
- Function listing, decompilation, and disassembly
- Memory reads and data type inspection
- Symbol renaming and annotation
- Cross-reference navigation
- Program metadata and analysis status
- Health checks for container orchestration
### MCP Server (Python)
A [FastMCP](https://github.com/jlowin/fastmcp) server that wraps the REST API as MCP tools. It adds:
- **Multi-instance management** — connect to multiple Ghidra sessions, switch between them
- **Docker provisioning** — automatic container lifecycle with port pooling
- **Raw firmware support** — specify processor language, base address, and loader for binary blobs
- **Session isolation** — each MCP client gets its own session ID, preventing cross-talk
## How the Pieces Connect
```
┌──────────────┐     MCP      ┌──────────────┐     HTTP     ┌──────────────┐
│  MCP Client  │◄────────────►│   MCGhidra   │◄────────────►│    Ghidra    │
│  (Claude,    │    stdio     │   Python     │   REST API   │    Plugin    │
│  Cursor,     │              │   Server     │  (HATEOAS)   │    (Java)    │
│  etc.)       │              │              │              │              │
└──────────────┘              └──────────────┘              └──────────────┘
                               ┌─────┴─────┐
                               │  Docker   │
                               │  Engine   │
                               └───────────┘
```
The MCP client communicates with MCGhidra over stdio using the Model Context Protocol. MCGhidra translates tool calls into HTTP requests against the Ghidra plugin's REST API. When Docker is available, MCGhidra can also provision and manage Ghidra containers automatically.
## Typical Workflow
1. **Start Ghidra** — either via Docker (`docker_auto_start`) or by running the plugin in an existing Ghidra instance
2. **Connect** — the MCP server discovers the Ghidra instance via health checks
3. **Analyze** — use MCP tools to decompile functions, read memory, navigate references
4. **Iterate** — rename symbols, annotate findings, re-decompile to see updated output
## Requirements
- Python 3.11+
- Ghidra 11.0+ (for the plugin)
- Docker (optional, for container management)


@ -0,0 +1,158 @@
---
title: Cursor Pagination
description: Working with large binaries using cursor-based pagination
---
Large binaries can contain tens of thousands of functions, hundreds of thousands of cross-references, and thousands of strings. Returning all of that in a single tool response would overflow the MCP client's context window and produce unusable output. MCGhidra uses cursor-based pagination to deliver results in controlled pages.
## How it works
When a paginated tool returns more items than the page size, the response includes a `cursor_id`. Pass that cursor ID to `cursor_next` to get the next page. Continue until `has_more` is `false`.
```
# First page: get 100 functions matching a pattern
result = functions_list(page_size=100, grep="crypt|hash")
# Returns:
# {
#   result: [...],
#   pagination: {
#     cursor_id: "a1b2c3d4e5f67890",
#     total_count: 12847,
#     filtered_count: 847,
#     page_size: 100,
#     current_page: 1,
#     total_pages: 9,
#     has_more: true
#   }
# }
# Next page
result = cursor_next(cursor_id="a1b2c3d4e5f67890")
# Returns page 2 of 9
# Continue until has_more is false
result = cursor_next(cursor_id="a1b2c3d4e5f67890")
# ...
```
Each response also includes a `_message` field with a human-readable summary like "Showing 100 of 847 items (page 2/9). To get the next 100 items, call: cursor_next(cursor_id='a1b2c3d4e5f67890')". MCP clients use this to decide whether to continue fetching.
## Server-side grep filtering
The `grep` parameter filters results on the server before pagination. This is much more efficient than fetching everything and filtering client-side, because only matching items are stored in the cursor and counted toward page totals.
```
# Only functions with "auth" or "login" in their name/address
functions_list(grep="auth|login", page_size=50)
# Case-sensitive search (grep_ignorecase defaults to true)
data_list_strings(grep="BEGIN CERTIFICATE", grep_ignorecase=false, page_size=50)
```
The grep pattern is a regular expression. It matches against all string values in each result item -- for a function, that means the name, address, and signature fields are all searched.
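The matching rule described above (a hit in any string-valued field keeps the item) can be sketched in a few lines. The `grep_items` helper and the sample records are invented for illustration and are not taken from MCGhidra's source.

```python
import re


def grep_items(items: list[dict], pattern: str, ignorecase: bool = True) -> list[dict]:
    """Keep items where the pattern matches any string-valued field."""
    regex = re.compile(pattern, re.IGNORECASE if ignorecase else 0)
    return [
        item
        for item in items
        if any(isinstance(v, str) and regex.search(v) for v in item.values())
    ]


# Made-up function records with the three fields mentioned in the text.
functions = [
    {"name": "aes_encrypt", "address": "00401000", "signature": "void aes_encrypt(char *buf)"},
    {"name": "FUN_00402000", "address": "00402000", "signature": "int FUN_00402000(void)"},
    {"name": "check", "address": "00403000", "signature": "int check(HASH_CTX *ctx)"},
]
matches = grep_items(functions, "crypt|hash")
```

Note that `check` is kept even though its name does not match: the pattern hits `HASH_CTX` in the signature field.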
### Pattern safety
Patterns are validated before execution to prevent runaway matches:
- Maximum 500 characters
- Maximum 15 repetition operators (`*`, `+`, `?`, `{n,m}`)
- Nested quantifiers like `(a+)+` are rejected
If a pattern fails validation, the tool returns an error with code `INVALID_GREP_PATTERN` explaining what to fix.
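One way such checks could be written is sketched below. This is a heuristic illustration of the three documented limits, not MCGhidra's validator: the helper name and the two regexes are assumptions, and they do not account for escaped metacharacters or deeper nesting.

```python
import re

MAX_PATTERN_LENGTH = 500
MAX_REPETITION_OPS = 15

# Counts *, +, ? and {n}, {n,}, {n,m}; escaped characters are not special-cased.
_REPETITION_OP = re.compile(r"[*+?]|\{\d+(?:,\d*)?\}")
# A quantified group whose body already ends in a quantifier, e.g. (a+)+.
_NESTED_QUANTIFIER = re.compile(r"\([^()]*[*+?}]\)[*+?{]")


def validate_grep_pattern(pattern: str) -> None:
    """Raise ValueError when a pattern breaks one of the documented limits."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern is longer than 500 characters")
    if len(_REPETITION_OP.findall(pattern)) > MAX_REPETITION_OPS:
        raise ValueError("pattern has more than 15 repetition operators")
    if _NESTED_QUANTIFIER.search(pattern):
        raise ValueError("nested quantifiers are not allowed")
    try:
        re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regular expression: {exc}") from exc
```

Rejecting nested quantifiers up front matters because patterns like `(a+)+` can trigger catastrophic backtracking in backtracking regex engines.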
## The return_all option
When you need all matching results without paging through cursors, pass `return_all=True`:
```
functions_list(grep="crypt", return_all=True)
```
This bypasses pagination and returns every matching item in a single response. There is a token budget guard (default: 8,000 estimated tokens) that kicks in if the response would be too large. When the guard triggers, the response includes:
- A sample of the first 3 items
- The available field names
- Suggested narrower queries (grep patterns, field projections, or pagination)
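A guard of this kind might look like the sketch below. The four-characters-per-token estimate, the `guard_return_all` helper, and the keys of the fallback response are assumptions for illustration; only the 8,000-token default and the three-item sample come from the text above.

```python
import json

TOKEN_BUDGET = 8000


def estimate_tokens(payload) -> int:
    """Rough estimate, assuming about four characters of JSON per token."""
    return len(json.dumps(payload)) // 4


def guard_return_all(items: list[dict], budget: int = TOKEN_BUDGET) -> dict:
    """Return everything if it fits the budget, otherwise a short summary."""
    if estimate_tokens(items) <= budget:
        return {"result": items}
    return {
        "error": "response too large",
        "sample": items[:3],
        "available_fields": sorted({key for item in items for key in item}),
        "hint": "narrow the query with grep, fields, or pagination",
    }
```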
Combine `return_all` with `grep` and `fields` to keep the response size down:
```
# Get all crypto-related function names and addresses (nothing else)
functions_list(grep="crypt|aes|sha", fields=["name", "address"], return_all=True)
```
## Page size
The `page_size` parameter controls how many items each page contains.
| Parameter | Default | Maximum |
|-----------|---------|---------|
| `page_size` | 50 | 500 |
For most MCP client contexts, 50-100 items per page is a good balance between making progress and keeping individual responses readable. Going above 200 is rarely useful unless you are scripting.
## Cursor lifecycle
### TTL and eviction
Cursors expire after 5 minutes of inactivity (no `cursor_next` calls). The timer resets each time a cursor is accessed.
When more than 100 cursors exist for a session, the least-recently-used cursor is evicted to make room. In practice, you will rarely hit this limit unless you start many queries without finishing them.
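The two rules (an inactivity TTL that resets on access, plus least-recently-used eviction past 100 cursors) can be sketched with an `OrderedDict`. The `CursorStore` class is hypothetical and holds one session's cursors; the injectable clock exists only to make the behavior easy to test.

```python
import time
from collections import OrderedDict

CURSOR_TTL_SECONDS = 300  # 5 minutes of inactivity
MAX_CURSORS = 100         # per session


class CursorStore:
    """One session's cursors, with an inactivity TTL and LRU eviction."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._cursors: OrderedDict[str, tuple[float, object]] = OrderedDict()

    def put(self, cursor_id: str, state: object) -> None:
        self._expire()
        self._cursors[cursor_id] = (self._clock(), state)
        self._cursors.move_to_end(cursor_id)
        while len(self._cursors) > MAX_CURSORS:
            self._cursors.popitem(last=False)  # drop the least recently used

    def get(self, cursor_id: str) -> object:
        self._expire()
        _, state = self._cursors[cursor_id]  # KeyError if expired or evicted
        self._cursors[cursor_id] = (self._clock(), state)  # reset the timer
        self._cursors.move_to_end(cursor_id)
        return state

    def _expire(self) -> None:
        now = self._clock()
        stale = [cid for cid, (seen, _) in self._cursors.items()
                 if now - seen > CURSOR_TTL_SECONDS]
        for cid in stale:
            del self._cursors[cid]
```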
### Session isolation
Each MCP client session gets its own set of cursors. You cannot access or interfere with another session's cursors. Session IDs are derived from the MCP client context -- they are not user-controllable.
### Management tools
| Tool | What it does |
|------|-------------|
| `cursor_list()` | Show all active cursors for the current session: IDs, page progress, TTL remaining, grep pattern |
| `cursor_delete(cursor_id="...")` | Delete a specific cursor to free memory |
| `cursor_delete_all()` | Delete all cursors for the current session |
These are useful for cleanup during long analysis sessions or when you want to re-run a query from scratch.
## Example: scanning all strings for credentials
```
# Start with a broad credential search
result = data_list_strings(grep="password|secret|key|token|api_key|credential", page_size=100)
# Process first page of results
# ... examine the strings ...
# Get more if there are additional pages
if result["pagination"]["has_more"]:
    result = cursor_next(cursor_id=result["pagination"]["cursor_id"])
```
## Example: iterating through all functions matching a pattern
```
# First page
result = functions_list(grep="handle_|process_|parse_", page_size=50)
# Loop through pages
while True:
    # Decompile interesting functions from this page
    for func in result["result"]:
        if looks_relevant(func):  # your own filter, not an MCGhidra tool
            functions_decompile(name=func["name"])
    # Stop once the last page has been processed
    if not result["pagination"]["has_more"]:
        break
    # Advance
    result = cursor_next(cursor_id=result["pagination"]["cursor_id"])
```
## Tips
- Prefer server-side `grep` over fetching everything. A query like `functions_list(grep="ssl")` is far cheaper than `functions_list(return_all=True)` followed by manual filtering.
- Use `fields` to reduce response size. If you only need names and addresses, `functions_list(fields=["name", "address"], page_size=100)` cuts the per-item size significantly.
- Small page sizes (50-100) keep individual responses from consuming too much context. You can always fetch more pages.
- If a cursor expires (5-minute TTL), just re-run the original query. The cursor IDs are not reusable -- you get a new one each time.
- For very large binaries (100K+ functions), start with grep-filtered queries rather than listing everything. Even paginated, iterating through 2,000 pages of 50 items each is slow and rarely what you actually need.


@ -0,0 +1,257 @@
---
title: Troubleshooting
description: Common issues and solutions when using MCGhidra
---
## Container Issues
### Container will not start
Check that the binary path is correct and accessible from the Docker daemon. The path you pass to `docker_auto_start` must exist on the host machine, and the Docker volume mount must be able to reach it.
```
docker_auto_start(binary_path="/path/to/binary")
```
If this fails, verify:
- The file exists at the specified path
- The `mcghidra:latest` Docker image is built (run `docker_status()` to check)
- Docker is running and your user has permission to access it
### Health check timeouts
Analysis takes time. A small binary (under 1 MB) typically finishes in about 20 seconds. Larger binaries -- especially firmware images or complex C++ programs -- can take several minutes.
Poll `docker_health` to check readiness:
```
docker_health(port=8195)
```
While waiting, check what Ghidra is doing:
```
docker_logs(port=8195)
```
If you see Ghidra import and analysis messages in the logs but the health check never succeeds, the analysis is still running. If the logs show errors or the container has exited, the import likely failed (see "Import failed" below).
### Port conflicts
MCGhidra allocates ports from a pool (default 8192-8319). If another application is using a port in this range, the allocator skips it. If you run many concurrent containers and exhaust the pool, `docker_auto_start` will report that no ports are available.
Check current allocations with:
```
docker_status()
```
You can adjust the port range with environment variables:
| Variable | Default |
|----------|---------|
| `MCGHIDRA_PORT_START` | `8192` |
| `MCGHIDRA_PORT_END` | `8319` |
### Viewing container logs
```
docker_logs(port=8195, tail=200)
```
This shows stdout and stderr from the Ghidra headless process. Look for lines containing `ERROR`, `WARN`, or `Exception` to diagnose import or analysis failures.
---
## Connection Issues
### "No Ghidra instance specified"
This means no current instance is set. First, discover available instances, then select one:
```
instances_list()
instances_use(port=8195)
```
If `instances_list` returns no instances, either no Ghidra process is running or it is on a port outside the discovery range.
### Instance not found after starting a container
`docker_auto_start` returns a port, but the MCP server does not automatically register it as the current instance. You need to call:
```
instances_use(port=8195)
```
If `instances_list` does not show the container, the API may not be ready yet. Poll `docker_health` first.
### API version mismatch
If you see version mismatch errors, the Ghidra plugin is older than the MCP server expects. The current server expects API v2. Update the plugin by rebuilding the Docker image or installing the latest MCGhidra release.
### Timeout on first tool call after instances_use
`instances_use` is lazy -- it creates a stub entry without connecting to Ghidra. The first real tool call (like `functions_list`) validates the connection. If Ghidra is not ready yet, that call will time out.
Wait for `docker_health` to report healthy before calling `instances_use`.
---
## Analysis Issues
### Import failed
Raw binaries (firmware, bootloaders) need the `language` parameter to tell Ghidra which processor architecture to use. Without it, Ghidra tries to auto-detect the format and will fail on headerless files.
```
docker_auto_start(
    binary_path="/path/to/firmware.bin",
    language="ARM:LE:32:v4t",
    base_address="0x00000000"
)
```
Check the logs if auto-import fails:
```
docker_logs(port=8195)
```
Common causes:
- Missing `language` for raw binaries
- Incorrect base address
- Corrupted or truncated binary file
- Unsupported file format (check with the `file` command on the host)
### OSGi bundle error
This is a known Ghidra limitation that can occur with certain script configurations. It appears as "Failed to get OSGi bundle" in the container logs. It does not usually affect analysis results -- the API still functions. If it blocks operation, rebuilding the Docker image with the latest scripts resolves it in most cases.
### Analysis incomplete
If decompiled output looks wrong (missing function boundaries, incorrect types), Ghidra's auto-analysis may not have finished or may need a second pass:
```
analysis_run()
```
This triggers a full re-analysis of the current program. It can take a while on large binaries.
### Decompilation timeout
For very large or complex functions, the decompiler can take longer than the default timeout. If `functions_decompile` times out, the function may have deeply nested loops, heavy inlining, or obfuscated control flow.
Try disassembly instead for a faster view:
```
functions_disassemble(address="00401234")
```
---
## Pagination Issues
### Cursor expired
Cursors have a 5-minute inactivity TTL. If you wait too long between `cursor_next` calls, the cursor is deleted. Re-run the original query to get a fresh cursor:
```
functions_list(grep="crypt", page_size=100)
```
See [Cursor Pagination](/guides/cursor-pagination/) for details on cursor lifecycle.
### Context window overflow
If tool responses are consuming too much context, reduce the page size:
```
functions_list(page_size=25, grep="your_pattern")
```
Use `fields` to limit which fields are returned:
```
functions_list(page_size=50, fields=["name", "address"])
```
And always prefer `grep` to filter results before they reach the client.
### "Session spoofing" errors
Session IDs are derived from the MCP client context and cannot be set manually. If you see session-related errors, it means a cursor belongs to a different MCP session. Each session (each Claude conversation, for example) has its own isolated cursor space.
---
## Docker-Specific Issues
### docker_auto_start appears to hang
`docker_auto_start` returns immediately after starting the container. It does not wait for analysis to complete. If it seems to hang, the issue is likely Docker itself taking time to pull or start the container. Check:
```
docker_status()
```
### Cross-session interference
Each MCP session has a unique session ID. Docker containers are tagged with their owning session. `docker_stop` validates that the container belongs to your session before stopping it. You cannot stop another session's container.
If you need to clean up containers from a previous session that is no longer active, use:
```
docker_cleanup(session_only=False)
```
Be careful with this -- it removes all orphaned MCGhidra containers, not just yours.
### Stale containers
If containers from previous sessions are still running, they consume ports from the pool. Use `docker_cleanup()` (which defaults to `session_only=True`) to clean up your own stale containers, or `docker_cleanup(session_only=False)` to remove all orphaned containers.
### Build failures
If `docker_build()` fails, make sure:
- The Dockerfile context is correct (it needs both the `docker/` directory and the project root)
- Docker has enough disk space
- The base Ghidra image layers download successfully (network access required for first build)
---
## Debug Mode
Set the `MCGHIDRAMCP_DEBUG` environment variable before starting the MCP server to enable verbose logging:
```bash
MCGHIDRAMCP_DEBUG=1 uvx mcghidra
```
Or in your MCP client configuration:
```json
{
  "mcpServers": {
    "mcghidra": {
      "command": "uvx",
      "args": ["mcghidra"],
      "env": {
        "MCGHIDRAMCP_DEBUG": "1"
      }
    }
  }
}
```
Debug output goes to stderr and includes:
- Instance discovery attempts and results
- HTTP request/response details for Ghidra API calls
- Cursor creation, access, and expiration events
- Docker container lifecycle events
- Port pool allocation and release
Check the MCP server's stderr output in your terminal or in the MCP client's server log viewer.


@ -0,0 +1,236 @@
---
title: Analysis Workflows
description: Common reverse engineering workflows with MCGhidra
---
These workflows assume you have MCGhidra installed and configured as described in the [Installation guide](/getting-started/installation/).
## Triage a Binary
The fastest way to get oriented in an unknown binary. Start a container, wait for Ghidra to finish analysis, then survey the surface area.
### 1. Start analysis
```
docker_auto_start(binary_path="/path/to/target.exe")
```
This returns immediately with a port number. It does not block while Ghidra runs.
### 2. Wait for analysis to complete
Poll until the HTTP API responds:
```
docker_health(port=8195)
```
For a small binary (under 1 MB), expect about 20 seconds. Larger binaries can take several minutes. Check `docker_logs(port=8195)` while waiting to see Ghidra's progress.
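For scripted use, the polling step can be wrapped in a small helper. In this sketch `check` is any callable that returns `True` once the API responds (for example, a wrapper around `docker_health`); the helper name, the timeout, and the interval are assumptions, and the injectable `sleep` and `clock` are there only to keep the function testable.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `check` every `interval` seconds until it succeeds or time runs out."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A `False` return is the cue to look at the container logs rather than keep waiting.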
### 3. Set the instance as current
```
instances_use(port=8195)
```
After this, every tool call defaults to this instance. No need to pass `port` again.
### 4. Get the program overview
```
program_info()
functions_list(page_size=100)
data_list_strings(page_size=100)
```
`program_info` returns architecture, compiler, and image base address. The function and string listings give a first sense of scale and naming conventions.
### 5. Search for interesting patterns
Use server-side grep to find functions and strings related to security-sensitive behavior:
```
functions_list(grep="password|key|auth|crypt|login|verify", page_size=100)
data_list_strings(grep="password|secret|key|token|credential", page_size=100)
```
From here, decompile anything that looks relevant and follow cross-references to understand the surrounding logic.
---
## Rename and Annotate Loop
Ghidra auto-analysis produces generic names like `FUN_00401234`. As you reverse engineer, renaming functions and adding comments makes the decompiled output progressively easier to read.
### 1. Decompile a function
```
functions_decompile(address="00401234")
```
### 2. Identify what it does
Read the pseudocode. Look at string references, called functions, and parameter usage to determine the function's purpose.
### 3. Rename it
```
functions_rename(address="00401234", new_name="validate_user_credentials")
```
### 4. Set the signature
If you can determine the parameter types and return type:
```
functions_set_signature(
    address="00401234",
    signature="int validate_user_credentials(char *username, char *password)"
)
```
### 5. Add a comment
```
functions_set_comment(
    address="00401234",
    comment="Checks username/password against the SQLite user table. Returns 1 on success."
)
```
### 6. Re-decompile
```
functions_decompile(address="00401234")
```
The decompiled output now uses your names, types, and annotations. Functions called from `validate_user_credentials` also reflect the updated name wherever they reference it. Repeat this loop for each function you investigate.
---
## Firmware Reverse Engineering
Raw firmware (bootloaders, embedded system images, bare-metal code) requires extra setup because there is no ELF/PE header for Ghidra to parse.
### 1. Start with the right loader
Specify the processor language and base address:
```
docker_auto_start(
    binary_path="/path/to/firmware.bin",
    language="ARM:LE:32:v4t",
    base_address="0x00000000"
)
```
When `language` is set, MCGhidra uses `BinaryLoader` to map the raw bytes at the given address. See the [Installation guide](/getting-started/installation/) for a table of common language IDs.
### 2. Find the entry point
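ARM firmware typically starts with an exception vector table at address 0x00000000, but the layout depends on the core. On Cortex-M, the first entry is the initial stack pointer and the second (at 0x00000004) is a pointer to the reset handler. On classic ARM cores such as the ARM7TDMI used in the example above, each vector is an executable instruction, and the reset vector is the instruction at 0x00000000:
```
memory_read(address="0x00000000", length=32, format="hex")
# ARM7TDMI: execution starts at the reset vector instruction itself
functions_decompile(address="0x00000000")
```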
### 3. Identify peripherals
Embedded firmware talks to hardware through memory-mapped I/O. Look for reads and writes to addresses outside the firmware's code and data regions:
```
data_list_strings(grep="UART|SPI|I2C|GPIO")
functions_list(grep="init_periph|hw_init|bsp_")
```
Constants like `0x40000000`, `0x48000000`, or `0xE000E000` (the Cortex-M System Control Space, which contains the NVIC) are strong indicators of peripheral access.
### 4. Trace interrupt handlers
Interrupt vector tables are typically at fixed offsets. For Cortex-M, the vector table starts at the base address. Each 4-byte entry points to a handler:
```
memory_read(address="0x00000000", length=256, format="hex")
```
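Once the table bytes are in hand, decoding them is mechanical. The sketch below assumes a Cortex-M layout (little-endian 32-bit words, initial stack pointer first) and that the dump is available as raw bytes; the `parse_vector_table` helper and the sample dump are made up for illustration.

```python
import struct


def parse_vector_table(raw: bytes) -> tuple[int, list[int]]:
    """Split a Cortex-M vector table into (initial_sp, handler_addresses).

    Entries are little-endian 32-bit words. Handler entries have bit 0 set
    to mark Thumb state, so that bit is cleared to get the code address.
    """
    count = len(raw) // 4
    words = struct.unpack(f"<{count}I", raw[: count * 4])
    initial_sp, vectors = words[0], words[1:]
    handlers = [v & ~1 for v in vectors if v != 0]
    return initial_sp, handlers


# Hypothetical dump: initial SP, reset, NMI, and one null (reserved) entry.
dump = bytes.fromhex("00200020" "c1000000" "d5000000" "00000000")
sp, handlers = parse_vector_table(dump)
```

Here `sp` is `0x20002000` and `handlers` is `[0xC0, 0xD4]`; those cleared addresses are the ones worth turning into functions.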
Create a function at the handler address stored in each non-null entry (for Thumb handlers, clear bit 0 of the stored value first):
```
functions_create(address="0x00000040")
functions_decompile(address="0x00000040")
```
### 5. Map protocol implementations
Firmware that communicates over a bus (UART, SPI, USB, CAN) will have recognizable patterns: ring buffers, state machines with packet parsing, and checksum calculations. Use call graph analysis to trace from peripheral init functions to protocol handlers:
```
analysis_get_callgraph(name="uart_init", max_depth=4)
```
---
## Using Analysis Prompts
MCGhidra includes 13 built-in prompts that guide Claude through structured analysis workflows. Each prompt defines a series of steps, tool calls, and checks for a specific reverse engineering task.
### Running a prompt
In Claude Code or Claude Desktop, use the `/prompt` command:
```
/prompt malware_triage
```
Claude will then execute a multi-step analysis: listing functions, scanning strings, checking imports, and producing a structured report. Prompts that involve scanning (like `malware_triage` or `identify_crypto`) report progress as they work through each step.
### Available prompts
| Prompt | What it does |
|--------|-------------|
| `malware_triage` | Quick capability assessment across 21 scanning steps: checks for network activity, file manipulation, process injection, anti-analysis tricks, and persistence mechanisms |
| `identify_crypto` | Scans for known crypto constants (AES S-boxes, SHA magic numbers), function names matching crypto libraries, and common key schedule patterns |
| `find_authentication` | Searches for password checks, credential storage, license validation, certificate handling, and authentication bypass patterns |
| `analyze_protocol` | Framework for reversing network or file format protocols: identifies packet structures, state machines, serialization routines |
| `trace_data_flow` | Follows data forward or backward through a program to map how input reaches sensitive operations |
| `find_main_logic` | Navigates past CRT startup, compiler-generated wrappers, and initialization to find the actual application entry point |
| `analyze_imports` | Categorizes imported functions by capability (file I/O, networking, crypto, process management) and flags suspicious combinations |
| `analyze_strings` | Groups strings by category (URLs, file paths, error messages, format strings) and cross-references them to find their usage |
| `analyze_switch_table` | Identifies jump tables and command dispatchers, maps case values to handler functions |
| `find_config_parsing` | Locates configuration file readers, command-line parsers, registry access, and environment variable lookups |
| `compare_functions` | Side-by-side comparison of two functions to identify patches, variants, or shared library code |
| `document_struct` | Traces struct usage across the binary to document field types, offsets, sizes, and purpose |
| `find_error_handlers` | Maps error handling paths, cleanup routines, exception handlers, and exit patterns |
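As an illustration of the constant scanning that `identify_crypto` describes, the sketch below searches a byte blob for two well-known values: the first eight bytes of the AES S-box and the first SHA-256 initial hash word. It is not the prompt's actual implementation; `find_crypto_constants` and the signature names are hypothetical.

```python
# First eight bytes of the AES forward S-box.
AES_SBOX_PREFIX = bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5])
# First SHA-256 initial hash value (H0).
SHA256_H0 = 0x6A09E667


def find_crypto_constants(blob: bytes) -> dict[str, list[int]]:
    """Return the offsets at which each known signature appears."""
    signatures = {
        "aes_sbox": AES_SBOX_PREFIX,
        "sha256_h0_le": SHA256_H0.to_bytes(4, "little"),
        "sha256_h0_be": SHA256_H0.to_bytes(4, "big"),
    }
    hits: dict[str, list[int]] = {}
    for name, needle in signatures.items():
        offsets = []
        start = blob.find(needle)
        while start != -1:
            offsets.append(start)
            start = blob.find(needle, start + 1)
        if offsets:
            hits[name] = offsets
    return hits


blob = b"\x00" * 4 + AES_SBOX_PREFIX + b"\xff" + bytes.fromhex("67e6096a")
print(find_crypto_constants(blob))
```

Searching both byte orders matters for firmware, where the target's endianness decides how a 32-bit constant is laid out in memory.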
### Prompt examples
Triage an unknown binary for malicious capabilities:
```
/prompt malware_triage
```
Find all cryptographic implementations:
```
/prompt identify_crypto
```
Trace how user input flows to a specific sink:
```
/prompt trace_data_flow
```
### What happens during a prompt
Each prompt orchestrates a series of MCP tool calls. For example, `malware_triage` will:
1. Call `program_info()` to determine the architecture and format
2. Call `functions_list(grep=...)` repeatedly with patterns for each capability category (networking, file ops, process injection, etc.)
3. Call `data_list_strings(grep=...)` to find suspicious string patterns
4. Call `symbols_imports(grep=...)` to categorize imported APIs
5. Produce a summary with findings organized by risk category
Prompts that scan many patterns report numeric progress (e.g., "Step 12/21: Checking for anti-debug techniques") so you can see where they are in the analysis.
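The numeric progress format can be pictured as a loop over labelled checks. This is only a sketch of the reporting pattern: `run_steps`, the step labels, and the callback signature are assumptions for illustration, and the real prompts drive MCP tool calls rather than local functions.

```python
from typing import Callable

Step = tuple[str, Callable[[], list[str]]]


def run_steps(steps: list[Step], report: Callable[[str], None]) -> dict[str, list[str]]:
    """Run each check in order, reporting 'Step N/total' before it starts."""
    findings: dict[str, list[str]] = {}
    total = len(steps)
    for number, (label, check) in enumerate(steps, start=1):
        report(f"Step {number}/{total}: {label}")
        result = check()
        if result:
            # Keep only the steps that produced findings.
            findings[label] = result
    return findings


steps = [
    ("Checking for network activity", lambda: ["connect", "recv"]),
    ("Checking for anti-debug techniques", lambda: []),
]
print(run_steps(steps, print))
```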

@ -0,0 +1,67 @@
---
title: MCGhidra
description: Reverse engineering bridge between Ghidra and MCP
template: splash
hero:
  tagline: Multi-instance Ghidra with a REST API and MCP server for decompilation, analysis, and binary manipulation.
  actions:
    - text: Getting Started
      link: /getting-started/overview/
      icon: right-arrow
    - text: View on PyPI
      link: https://pypi.org/project/mcghidra/
      icon: external
      variant: minimal
---
import { Card, CardGrid } from '@astrojs/starlight/components';
## What is MCGhidra?
MCGhidra connects [Ghidra](https://ghidra-sre.org/) to the [Model Context Protocol](https://modelcontextprotocol.io/) (MCP), turning Ghidra's analysis engine into a set of tools that any MCP client can use. It consists of two parts: a **Ghidra plugin** that exposes a HATEOAS REST API from within Ghidra's JVM, and a **Python MCP server** that translates those endpoints into MCP tools.
The result is that an MCP client can decompile functions, read memory, rename symbols, navigate cross-references, and manage Docker-based Ghidra instances — all through standard tool calls.
<CardGrid>
<Card title="HATEOAS REST API" icon="puzzle">
The Ghidra plugin serves a discoverable API with hypermedia links. Clients navigate resources by following links rather than hardcoding endpoint paths.
</Card>
<Card title="Multi-Instance" icon="list-format">
Run multiple Ghidra sessions in parallel, each analyzing a different binary. The MCP server tracks instances and routes requests to the right one.
</Card>
<Card title="Docker Support" icon="rocket">
Automatic container provisioning with port pooling. Pass a binary path and MCGhidra handles image building, container startup, and health polling.
</Card>
<Card title="Raw Firmware" icon="setting">
Specify processor architecture, base address, and loader type for raw firmware binaries. Supports targets like ARM7TDMI, MIPS, x86, and anything Ghidra recognizes.
</Card>
</CardGrid>
## Quick Install
```bash
# Install the MCP server
pip install mcghidra
# Or use uvx for isolated execution
uvx mcghidra
```
The Ghidra plugin JAR ships inside the Python package and is automatically deployed to running Ghidra instances or Docker containers.
## Architecture
```
MCP Client (Claude, etc.)
    ↓
MCGhidra MCP Server (Python, FastMCP)
    ↓
Ghidra Plugin REST API (Java, runs inside Ghidra's JVM)
    ↓
Ghidra Analysis Engine
```
Each layer is independently useful. The REST API works without MCP. The MCP server works without Docker. Docker support is optional for automated container management.

@ -0,0 +1,115 @@
---
title: Configuration
description: Environment variables and settings for the MCP server, Docker containers, and port pool
---
MCGhidra is configured through environment variables. No configuration file is required -- defaults work out of the box for local development.
## MCP Server
These variables control the Python MCP server process.
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_HOST` | `localhost` | Hostname for connecting to Ghidra instances. Change this when Ghidra runs on a remote host. |
| `MCGHIDRAMCP_DEBUG` | *unset* | Set to `1` to enable DEBUG-level logging. Shows HTTP requests, pagination details, and discovery results. |
| `MCGHIDRA_FEEDBACK` | `true` | Enable or disable feedback collection. Set to `false` to disable. |
| `MCGHIDRA_FEEDBACK_DB` | `~/.mcghidra/feedback.db` | Path to the SQLite database for feedback data. The parent directory is created automatically. |
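For reference, the table above could be read into a typed object along these lines. The variable names and defaults come from the table; `ServerEnv`, `load_server_env`, and the exact boolean parsing are assumptions, and the real server may interpret values differently.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerEnv:
    ghidra_host: str
    debug: bool
    feedback: bool
    feedback_db: Path


def load_server_env(environ: Optional[Mapping[str, str]] = None) -> ServerEnv:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    db_path = env.get("MCGHIDRA_FEEDBACK_DB", "~/.mcghidra/feedback.db")
    return ServerEnv(
        ghidra_host=env.get("GHIDRA_HOST", "localhost"),
        # Assumption: only the literal "1" enables debug logging.
        debug=env.get("MCGHIDRAMCP_DEBUG") == "1",
        # Assumption: anything other than "false" leaves feedback enabled.
        feedback=env.get("MCGHIDRA_FEEDBACK", "true").strip().lower() != "false",
        feedback_db=Path(os.path.expanduser(db_path)),
    )


print(load_server_env({"GHIDRA_HOST": "re-box.local", "MCGHIDRAMCP_DEBUG": "1"}))
```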
### Internal Defaults
These values are set in `MCGhidraConfig` and are not currently exposed as environment variables, but can be overridden programmatically when creating the server:
| Setting | Default | Description |
|---------|---------|-------------|
| `quick_discovery_range` | 18489-18498 | Port range for quick instance discovery scans |
| `full_discovery_range` | 18400-18599 | Port range for full discovery scans (`instances_discover`) |
| `request_timeout` | 30.0s | HTTP request timeout for Ghidra API calls |
| `discovery_timeout` | 0.5s | HTTP timeout per port during discovery scans |
| `default_page_size` | 50 | Default pagination page size |
| `max_page_size` | 500 | Maximum allowed page size |
| `cursor_ttl_seconds` | 300 | Cursor expiration time (5 minutes) |
| `max_cursors_per_session` | 100 | Maximum active cursors per MCP session |
| `max_response_tokens` | 8000 | Hard token budget -- the return_all guard triggers above this |
| `expected_api_version` | 2 | Minimum API version required from the Ghidra plugin |
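The interaction between `cursor_ttl_seconds` and `max_cursors_per_session` can be sketched as a small store that expires entries by age and caps their number. This is a hypothetical model, not the server's code: `CursorStore` is an invented name, and evicting the oldest cursor when the cap is reached is an assumed policy.

```python
import time
from typing import Any, Callable, Optional


class CursorStore:
    """Holds cursor state with a time-to-live and a per-session cap."""

    def __init__(self, ttl_seconds: float = 300, max_cursors: int = 100,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.max_cursors = max_cursors
        self.clock = clock
        self._items: dict[str, tuple[float, Any]] = {}

    def _evict_expired(self) -> None:
        now = self.clock()
        expired = [key for key, (created, _) in self._items.items()
                   if now - created >= self.ttl_seconds]
        for key in expired:
            del self._items[key]

    def put(self, cursor_id: str, state: Any) -> None:
        self._evict_expired()
        if cursor_id not in self._items and len(self._items) >= self.max_cursors:
            # Assumed policy: drop the oldest cursor to make room.
            oldest = min(self._items, key=lambda key: self._items[key][0])
            del self._items[oldest]
        self._items[cursor_id] = (self.clock(), state)

    def get(self, cursor_id: str) -> Optional[Any]:
        self._evict_expired()
        entry = self._items.get(cursor_id)
        return None if entry is None else entry[1]
```

Passing the clock in as a parameter keeps expiry testable without waiting five minutes.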
---
## Docker Image
These variables control the MCP server's Docker integration -- how it builds, tags, and starts containers.
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRAMCP_VERSION` | `latest` | Docker image tag to use when starting containers. |
| `MCGHIDRA_PORT` | `8192` | Default port for container API mapping. Overridden by auto-allocation in multi-container mode. |
| `MCGHIDRA_MAXMEM` | `2G` | Max JVM heap size passed to containers. Increase for large binaries. |
| `MCGHIDRA_DOCKER_AUTO` | `false` | When `true`, the server will automatically start a Docker container when a binary is loaded and no Ghidra instance is available. |
---
## Port Pool
The port pool prevents conflicts when multiple MCP sessions run containers simultaneously. Ports are allocated using `flock`-based locking.
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_PORT_START` | `8192` | First port in the allocation pool. |
| `MCGHIDRA_PORT_END` | `8319` | Last port in the allocation pool (128 ports total). |
| `MCGHIDRA_PORT_LOCK_DIR` | `/tmp/mcghidra-ports` | Directory for port lock files. Created automatically on first use. |
Port lock files are named `port-{N}.lock` and contain JSON with the session ID, PID, and timestamp. The `docker_cleanup` tool removes stale locks from crashed processes.
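A `flock`-based allocator along these lines would behave as described. It is a sketch under assumptions rather than the project's implementation: `allocate_port` is a hypothetical name, the JSON key names are guesses at the documented fields, and it relies on the POSIX-only `fcntl` module.

```python
import fcntl
import json
import os
import time
from pathlib import Path


def allocate_port(lock_dir, start=8192, end=8319, session_id="demo"):
    """Claim the first free port; return (port, open lock-file handle)."""
    lock_dir = Path(lock_dir)
    lock_dir.mkdir(parents=True, exist_ok=True)
    for port in range(start, end + 1):
        handle = open(lock_dir / f"port-{port}.lock", "a+")
        try:
            # Non-blocking exclusive lock: fails at once if another holder exists.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            handle.close()
            continue
        handle.seek(0)
        handle.truncate()
        # Key names are assumptions based on the documented contents.
        json.dump({"session_id": session_id, "pid": os.getpid(),
                   "timestamp": time.time()}, handle)
        handle.flush()
        return port, handle
    raise RuntimeError("port pool exhausted")
```

The caller keeps the returned handle open for as long as it owns the port; closing it releases the lock, and the kernel does the same when a process crashes, which leaves only the stale file behind for cleanup.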
---
## Container Environment
These variables are read by the Docker entrypoint script (`entrypoint.sh`) inside the container. They configure how Ghidra runs in headless mode.
### Core Settings
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_MODE` | `headless` | Container operating mode. See modes below. |
| `MCGHIDRA_PORT` | `8192` | HTTP API port inside the container. The MCP server maps this to a host port from the pool. |
| `MCGHIDRA_MAXMEM` | `2G` | Max JVM heap size. Passed to Ghidra's `analyzeHeadless` command. |
### Ghidra Paths
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_HOME` | `/opt/ghidra` | Ghidra installation directory inside the container. |
| `SCRIPT_DIR` | `/home/ghidra/ghidra_scripts` | Directory for Ghidra Python scripts (MCGhidraServer.py lives here). |
| `PROJECT_DIR` | `/projects` | Directory where Ghidra stores its project files (.gpr, .rep). |
| `PROJECT_NAME` | `MCGhidra` | Name of the Ghidra project created for the imported binary. |
### Firmware Import Options
These are optional. When omitted, Ghidra auto-detects the binary format.
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_LANGUAGE` | *auto-detect* | Processor language ID. Must match `ARCH:ENDIAN:SIZE:VARIANT` format (e.g., `ARM:LE:32:v4t`). Setting this causes the container to use `BinaryLoader` unless `GHIDRA_LOADER` overrides it. |
| `GHIDRA_BASE_ADDRESS` | *auto-detect* | Base address for raw binaries. Hex format: `0x00000000` or `00000000`. |
| `GHIDRA_LOADER` | *auto-detect* | Loader type override. Common values: `BinaryLoader` (raw bytes), `AutoImporter` (header-based detection). Must be alphanumeric with underscores. |
### Container Modes
The `MCGHIDRA_MODE` variable selects the operating mode:
| Mode | Description |
|------|-------------|
| `headless` | Default. Imports the binary, runs auto-analysis, starts the HTTP API server. This is what `docker_auto_start` and `docker_start` use. |
| `server` | Opens an existing project (no import). Requires a program name as an argument. Useful for re-analyzing a previously imported binary. |
| `analyze` | Imports and analyzes a binary, then exits. No HTTP server. Use for batch processing. |
| `shell` | Drops into an interactive bash shell. Useful for debugging the container environment. |
### Validation
All firmware import parameters are validated before reaching Ghidra:
- `GHIDRA_LANGUAGE` must match `ARCH:ENDIAN:SIZE:VARIANT` (regex-validated).
- `GHIDRA_BASE_ADDRESS` must be valid hex, max 64-bit.
- `GHIDRA_LOADER` must be alphanumeric with underscores.
Invalid values are rejected with a descriptive error before any Docker or Ghidra operations run. The MCP server validates these on the client side as well, so errors surface in tool responses rather than buried in container logs.
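The three rules above can be approximated with regular expressions. The patterns below are assumptions written to match the documented formats, not the expressions MCGhidra uses, and `validate_import_options` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed patterns; the real validators may be stricter or looser.
LANGUAGE_RE = re.compile(r"[A-Za-z0-9_]+:[A-Za-z]+:\d+:[A-Za-z0-9_.\-]+")
BASE_ADDRESS_RE = re.compile(r"(0x)?[0-9A-Fa-f]{1,16}")  # up to 64 bits
LOADER_RE = re.compile(r"[A-Za-z0-9_]+")


def validate_import_options(language: Optional[str] = None,
                            base_address: Optional[str] = None,
                            loader: Optional[str] = None) -> list[str]:
    """Return a list of error messages; an empty list means all values pass."""
    errors = []
    if language is not None and not LANGUAGE_RE.fullmatch(language):
        errors.append(f"language must look like ARCH:ENDIAN:SIZE:VARIANT, got {language!r}")
    if base_address is not None and not BASE_ADDRESS_RE.fullmatch(base_address):
        errors.append(f"base_address must be hex of at most 64 bits, got {base_address!r}")
    if loader is not None and not LOADER_RE.fullmatch(loader):
        errors.append(f"loader must be alphanumeric with underscores, got {loader!r}")
    return errors


print(validate_import_options("ARM:LE:32:v4t", "0x00000000", "BinaryLoader"))
print(validate_import_options("ARM-LE", "0xZZ", "Binary Loader"))
```

Using `fullmatch` rather than `search` is what keeps shell metacharacters or trailing text from slipping through to `analyzeHeadless`.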

@ -0,0 +1,87 @@
---
title: Docker Usage
description: Container management, port pooling, and firmware import options
---
MCGhidra can automatically provision Docker containers running Ghidra in headless mode. Each container gets a dedicated port from a configurable pool, and containers are tracked by session to prevent cross-talk between concurrent MCP clients.
## Quick Start
```
docker_auto_start(binary_path="/path/to/binary")
```
This checks for an existing instance analyzing the same binary, and if none is found, starts a new container with an auto-allocated port.
## Environment Variables
The Docker entrypoint accepts these environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_PORT` | `8192` | HTTP API port inside the container |
| `MCGHIDRA_MAXMEM` | `2G` | Max JVM heap size |
| `PROJECT_NAME` | `MCGhidra` | Ghidra project name |
| `PROJECT_DIR` | `/projects` | Project directory inside container |
| `GHIDRA_LANGUAGE` | *(auto-detect)* | Processor language ID (e.g., `ARM:LE:32:v4t`) |
| `GHIDRA_BASE_ADDRESS` | *(auto-detect)* | Base address for raw binaries (e.g., `0x00000000`) |
| `GHIDRA_LOADER` | *(auto-detect)* | Loader type (e.g., `BinaryLoader`) |
## Port Pool
Ports are allocated from a pool to prevent conflicts:
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_PORT_START` | `8192` | First port in the pool |
| `MCGHIDRA_PORT_END` | `8319` | Last port in the pool (128 ports) |
| `MCGHIDRA_PORT_LOCK_DIR` | `/tmp/mcghidra-ports` | Lock file directory |
Port allocation uses `flock`-based locking for cross-process safety.
## Raw Firmware Import
For binaries without recognized headers (raw firmware dumps, bootloader images):
```
docker_start(
    binary_path="/path/to/firmware.bin",
    language="ARM:LE:32:v4t",
    base_address="0x00000000"
)
```
When `language` is specified, `loader` is automatically set to `BinaryLoader`. This tells Ghidra to treat the file as raw bytes mapped at the given base address, rather than trying to parse it as an ELF, PE, or Mach-O.
To override the auto-loader (e.g., if you have an ELF but need a specific language):
```
docker_start(
    binary_path="/path/to/firmware.elf",
    language="ARM:LE:32:v7",
    loader="AutoImporter"
)
```
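The loader defaulting rule in this section can be summarized as a small function that maps tool arguments to container environment variables. This is a sketch of the documented behavior only; `build_container_env` is a hypothetical name and the real server may assemble the environment differently.

```python
from typing import Optional


def build_container_env(language: Optional[str] = None,
                        base_address: Optional[str] = None,
                        loader: Optional[str] = None) -> dict[str, str]:
    """Translate firmware import options into container environment variables."""
    env: dict[str, str] = {}
    if language:
        env["GHIDRA_LANGUAGE"] = language
        if loader is None:
            # Documented default: a language without a loader implies raw bytes.
            loader = "BinaryLoader"
    if base_address:
        env["GHIDRA_BASE_ADDRESS"] = base_address
    if loader:
        env["GHIDRA_LOADER"] = loader
    return env


print(build_container_env(language="ARM:LE:32:v4t", base_address="0x00000000"))
print(build_container_env(language="ARM:LE:32:v7", loader="AutoImporter"))
```

With no options at all the result is empty, which leaves Ghidra's auto-detection in charge.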
## Container Lifecycle
| Tool | Description |
|------|-------------|
| `docker_auto_start` | Find existing or start new container |
| `docker_start` | Start a container explicitly |
| `docker_stop` | Stop and remove a container |
| `docker_health` | Check if API is responding |
| `docker_logs` | View container output |
| `docker_status` | List all containers and images |
| `docker_cleanup` | Remove orphaned containers and stale locks |
| `docker_session_info` | Show this session's containers |
## Input Validation
All firmware import parameters are validated before reaching the container:
- **Language**: Must match `ARCH:ENDIAN:SIZE:VARIANT` pattern (e.g., `ARM:LE:32:v4t`)
- **Base address**: Must be a valid hex string (e.g., `0x00000000` or `00000000`), max 64-bit
- **Loader**: Must be alphanumeric with underscores (e.g., `BinaryLoader`)
Invalid values are rejected with a descriptive error before any Docker operations occur.

@ -0,0 +1,742 @@
---
title: MCP Tools
description: Complete reference for all MCGhidra MCP tools, grouped by domain
---
MCGhidra exposes Ghidra's capabilities as MCP tools. There are 64 tools across 14 categories.
## Pagination Convention
Most list tools share a common set of pagination and filtering parameters. Rather than repeating them in every table, they are documented once here:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | Ghidra instance port. Uses the active instance if omitted. |
| `page_size` | int | `50` | Items per page. Maximum: 500. |
| `grep` | string | *none* | Client-side regex pattern applied to results after fetching. |
| `grep_ignorecase` | bool | `true` | Case-insensitive grep matching. |
| `return_all` | bool | `false` | Bypass pagination and return everything. Triggers a budget guard if the response exceeds ~8000 tokens. |
| `fields` | list[str] | *none* | Field projection -- keep only these keys per item. Reduces response size. |
Tools that accept these parameters are marked with "Supports pagination" below. Use `cursor_next(cursor_id)` to advance through pages.
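How these parameters combine can be modelled in a few lines. The sketch below is a simplified stand-in, not the server's pagination code: it uses a plain `offset` instead of cursor IDs, `page_items` is a hypothetical name, and applying `grep` before `fields` is an assumption about ordering.

```python
import re
from typing import Any, Optional


def page_items(items: list[dict[str, Any]], page_size: int = 50,
               grep: Optional[str] = None, grep_ignorecase: bool = True,
               fields: Optional[list[str]] = None, offset: int = 0) -> dict[str, Any]:
    """Filter, project, and slice a list of result items."""
    page_size = max(1, min(page_size, 500))  # documented maximum of 500
    if grep:
        pattern = re.compile(grep, re.IGNORECASE if grep_ignorecase else 0)
        # Keep an item if any of its values matches the pattern.
        items = [item for item in items
                 if any(pattern.search(str(value)) for value in item.values())]
    if fields:
        items = [{key: item[key] for key in fields if key in item} for item in items]
    end = offset + page_size
    return {
        "items": items[offset:end],
        "total": len(items),
        "next_offset": end if end < len(items) else None,
    }


functions = [
    {"name": "uart_init", "address": "08000100", "size": 40},
    {"name": "main", "address": "08000200", "size": 12},
    {"name": "UART_send", "address": "08000300", "size": 64},
]
print(page_items(functions, page_size=1, grep="uart", fields=["name"]))
```

Combining `grep` with `fields` is usually the cheapest way to stay under the token budget without resorting to `return_all`.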
---
## Instance Management
Tools for discovering, registering, and switching between Ghidra instances.
### `instances_list`
List all active Ghidra instances. Runs a quick discovery scan before returning results.
Returns a dict with an `instances` list containing port, URL, project, and file for each instance.
### `instances_use`
Set the current working instance. All subsequent tool calls default to this instance.
Uses lazy registration -- the instance is recorded immediately without a blocking HTTP call. If the instance is unreachable, the next actual tool call will fail with a clear error.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *required* | Port number of the instance to activate |
Returns confirmation with instance details.
### `instances_current`
Show which instance is currently active, including its port, URL, project, and file. Returns an error message with available instance ports if none is set.
### `instances_register`
Manually register an instance by port. Verifies the instance is responsive and checks API version compatibility before registering.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *required* | Port number |
| `url` | string | *auto* | URL override (defaults to `http://{GHIDRA_HOST}:{port}`) |
Returns confirmation or error message.
### `instances_unregister`
Remove an instance from the registry. If the unregistered instance was the current working instance, the current selection is cleared.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *required* | Port number to unregister |
### `instances_discover`
Force a full discovery scan across the configured port range (ports 18400-18599 by default, matching `full_discovery_range`). Use this when you need to find instances on a different host. For normal use, `instances_list` already runs a quick scan.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `host` | string | *configured* | Host to scan |
### `program_info`
Get full program metadata from the current Ghidra instance: architecture, language ID, compiler spec, image base address, and total memory size.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | Ghidra instance port |
---
## Functions
Tools for listing, decompiling, disassembling, and modifying functions.
### `functions_list`
List functions with cursor-based pagination and server-side filtering. For large binaries, use `name_contains` or `name_regex` for server-side filtering before results reach the client.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name_contains` | string | *none* | Server-side substring filter (faster than grep for large binaries) |
| `name_regex` | string | *none* | Server-side regex filter on function name |
| `address` | string | *none* | Filter by exact function address (hex) |
Supports pagination.
### `functions_get`
Get detailed information about a single function: name, address, signature, size, stack depth, calling convention.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name (mutually exclusive with address) |
| `address` | string | *none* | Function address in hex |
| `port` | int | *current* | Ghidra instance port |
### `functions_decompile`
Decompile a function to C pseudocode. Output is split into lines for pagination -- use `grep` to search within the decompiled code.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name (mutually exclusive with address) |
| `address` | string | *none* | Function address in hex |
| `syntax_tree` | bool | `false` | Include the decompiler syntax tree (JSON) |
| `style` | string | `"normalize"` | Decompiler simplification style |
Supports pagination (over decompiled lines).
### `functions_disassemble`
Get assembly-level disassembly for a function. Output is split into instruction lines for pagination.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name (mutually exclusive with address) |
| `address` | string | *none* | Function address in hex |
Supports pagination (over instruction lines).
### `functions_rename`
Rename a function. Identify it by current name or address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `old_name` | string | *none* | Current function name |
| `address` | string | *none* | Function address in hex |
| `new_name` | string | *required* | New name for the function |
| `port` | int | *current* | Ghidra instance port |
### `functions_set_signature`
Set the full prototype of a function, including return type, name, and parameter types.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name |
| `address` | string | *none* | Function address in hex |
| `signature` | string | *required* | Full signature (e.g., `"int foo(char* arg1, int arg2)"`) |
| `port` | int | *current* | Ghidra instance port |
### `functions_set_comment`
Set a decompiler-level comment on a function. Tries the function comment first, then falls back to a pre-comment if the address is not a function entry point.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address (preferably function entry point) |
| `comment` | string | `""` | Comment text. Empty string removes the comment. |
| `port` | int | *current* | Ghidra instance port |
### `functions_create`
Create a new function definition at the specified address. Ghidra will attempt to determine the function boundaries automatically.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
### `functions_variables`
List local variables and parameters for a specific function.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Function address in hex |
Supports pagination.
---
## Data
Tools for working with defined data items and strings.
### `data_list`
List defined data items with filtering and cursor-based pagination.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `addr` | string | *none* | Filter by address (hex) |
| `name` | string | *none* | Exact name match (case-sensitive) |
| `name_contains` | string | *none* | Substring name filter (case-insensitive) |
| `type` | string | *none* | Filter by data type (e.g., `"string"`, `"dword"`) |
Supports pagination.
### `data_list_strings`
List all defined strings in the binary. Use `filter` for server-side content matching, or `grep` for client-side regex.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `filter` | string | *none* | Server-side string content filter |
Supports pagination.
### `data_create`
Define a new data item at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `data_type` | string | *required* | Data type (e.g., `"string"`, `"dword"`, `"byte"`) |
| `size` | int | *none* | Size in bytes (optional) |
| `port` | int | *current* | Ghidra instance port |
### `data_rename`
Rename a data item at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `name` | string | *required* | New name |
| `port` | int | *current* | Ghidra instance port |
### `data_set_type`
Change the data type of an existing data item.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `data_type` | string | *required* | New data type (e.g., `"uint32_t"`, `"char[10]"`) |
| `port` | int | *current* | Ghidra instance port |
### `data_delete`
Remove a data definition at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
---
## Structs
Tools for creating and modifying struct (composite) data types.
### `structs_list`
List all struct data types.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `category` | string | *none* | Filter by category path (e.g., `"/winapi"`) |
Supports pagination.
### `structs_get`
Get a struct with all its fields. If the struct has more than 10 fields, the field list is paginated. Use `fields` projection to reduce response size.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Struct name |
Supports pagination (over struct fields).
### `structs_create`
Create a new struct data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Name for the struct |
| `category` | string | *none* | Category path (e.g., `"/custom"`) |
| `description` | string | *none* | Description text |
| `port` | int | *current* | Ghidra instance port |
### `structs_add_field`
Add a field to an existing struct. If `offset` is omitted, the field is appended to the end of the struct.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `struct_name` | string | *required* | Name of the struct |
| `field_name` | string | *required* | Name for the new field |
| `field_type` | string | *required* | Data type (e.g., `"int"`, `"char"`, `"pointer"`) |
| `offset` | int | *none* | Byte offset within the struct |
| `comment` | string | *none* | Field comment |
| `port` | int | *current* | Ghidra instance port |
### `structs_update_field`
Modify an existing field in a struct. Identify the field by name or offset. At least one of `new_name`, `new_type`, or `new_comment` must be provided.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `struct_name` | string | *required* | Name of the struct |
| `field_name` | string | *none* | Current field name (or use `field_offset`) |
| `field_offset` | int | *none* | Field offset (or use `field_name`) |
| `new_name` | string | *none* | New name |
| `new_type` | string | *none* | New data type |
| `new_comment` | string | *none* | New comment |
| `port` | int | *current* | Ghidra instance port |
### `structs_delete`
Remove a struct data type definition.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Struct name to delete |
| `port` | int | *current* | Ghidra instance port |
---
## Symbols
Tools for working with the symbol table: labels, imports, and exports.
### `symbols_list`
List all symbols in the program.
Supports pagination.
### `symbols_create`
Create a new label/symbol at the specified address. If a symbol already exists at that address, it is renamed.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Symbol name |
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
### `symbols_rename`
Rename the primary symbol at an address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `new_name` | string | *required* | New name |
| `port` | int | *current* | Ghidra instance port |
### `symbols_delete`
Delete the primary symbol at an address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
### `symbols_imports`
List imported symbols (external references).
Supports pagination.
### `symbols_exports`
List exported symbols (entry points).
Supports pagination.
---
## Analysis
Tools for triggering and inspecting Ghidra analysis results.
### `analysis_run`
Trigger Ghidra's auto-analysis on the current program.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | Ghidra instance port |
| `analysis_options` | dict | *none* | Analysis options to enable/disable |
### `analysis_get_callgraph`
Generate a call graph starting from a function. Returns nodes and edges. Edges are paginated.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Starting function name |
| `address` | string | *none* | Starting function address |
| `max_depth` | int | `3` | Maximum call depth |
Supports pagination (over edges).
### `analysis_get_dataflow`
Trace data flow forward or backward from an address. Returns a list of steps showing how data propagates through the program.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Starting address in hex |
| `direction` | string | `"forward"` | `"forward"` or `"backward"` |
| `max_steps` | int | `50` | Maximum analysis steps |
Supports pagination (over steps).
### `xrefs_list`
Find cross-references to or from an address. At least one of `to_addr` or `from_addr` is required.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `to_addr` | string | *none* | Find references to this address |
| `from_addr` | string | *none* | Find references from this address |
| `type` | string | *none* | Filter by type: `"CALL"`, `"READ"`, `"WRITE"`, `"DATA"`, `"POINTER"` |
Supports pagination.
### `comments_get`
Get a comment at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `comment_type` | string | `"plate"` | Type: `"plate"`, `"pre"`, `"post"`, `"eol"`, `"repeatable"` |
| `port` | int | *current* | Ghidra instance port |
### `comments_set`
Set a comment at the specified address. Pass an empty string to remove the comment.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `comment` | string | `""` | Comment text |
| `comment_type` | string | `"plate"` | Type: `"plate"`, `"pre"`, `"post"`, `"eol"`, `"repeatable"` |
| `port` | int | *current* | Ghidra instance port |
---
## Memory
Direct memory access tools.
### `memory_read`
Read bytes from a memory address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `length` | int | `16` | Number of bytes to read |
| `format` | string | `"hex"` | Output format: `"hex"`, `"base64"`, or `"string"` |
| `port` | int | *current* | Ghidra instance port |
Returns the bytes in the requested format along with the actual byte count.
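On the client side, each output format can be turned back into raw bytes. This sketch assumes the payload is a plain hex string, standard base64, or text with one character per byte; the exact encoding MCGhidra uses for the `"string"` format is not stated here, so the Latin-1 choice is an assumption, and `decode_memory` is a hypothetical helper.

```python
import base64


def decode_memory(payload: str, fmt: str = "hex") -> bytes:
    """Convert a memory_read style payload back into raw bytes."""
    if fmt == "hex":
        return bytes.fromhex(payload)
    if fmt == "base64":
        return base64.b64decode(payload)
    if fmt == "string":
        # Assumption: one character per byte.
        return payload.encode("latin-1")
    raise ValueError(f"unsupported format: {fmt!r}")


print(decode_memory("deadbeef"))
print(decode_memory("3q2+7w==", "base64"))
```

Hex is the easiest format to eyeball, while base64 is more compact for large reads.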
### `memory_write`
Write bytes to a memory address. Use with caution -- this modifies the program state.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `bytes_data` | string | *required* | Data to write |
| `format` | string | `"hex"` | Input format: `"hex"`, `"base64"`, or `"string"` |
| `port` | int | *current* | Ghidra instance port |
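As a rough illustration of the three input formats, the sketch below renders the same payload as a `bytes_data` string. The helper `encode_bytes_data` is hypothetical and not part of MCGhidra. It also assumes that `"string"` means the text is passed through verbatim as UTF-8, which should be checked against the server.

```python
import base64


def encode_bytes_data(payload: bytes, fmt: str = "hex") -> str:
    """Render raw bytes as a bytes_data string for the given format."""
    if fmt == "hex":
        return payload.hex().upper()
    if fmt == "base64":
        return base64.b64encode(payload).decode("ascii")
    if fmt == "string":
        # Assumption: the server takes the text verbatim (UTF-8).
        return payload.decode("utf-8")
    raise ValueError(f"unsupported format: {fmt!r}")


# Arguments for a memory_write call; the address is only an example.
args = {
    "address": "0x402000",
    "bytes_data": encode_bytes_data(b"\xde\xad\xbe\xef"),
    "format": "hex",
}
```

Hex is the default and the easiest format to compare against a disassembly listing; base64 is more compact for larger patches.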
---
## Variables
Tools for querying and renaming variables.
### `variables_list`
List variables with optional global-only filtering. Supports pagination.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `global_only` | bool | `false` | Return only global variables |
### `variables_rename`
Rename a variable within a function, and optionally change its data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `function_address` | string | *required* | Function address in hex |
| `variable_name` | string | *required* | Current variable name |
| `new_name` | string | *required* | New name |
| `new_type` | string | *none* | New data type (e.g., `"int"`, `"char*"`) |
| `port` | int | *current* | Ghidra instance port |
---
## Bookmarks
Tools for managing Ghidra bookmarks (annotations at addresses).
### `bookmarks_list`
List bookmarks with optional type and category filtering. Supports pagination.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `type` | string | *none* | Filter by type: `"Note"`, `"Warning"`, `"Error"`, `"Info"` |
| `category` | string | *none* | Filter by category |
### `bookmarks_create`
Create a bookmark at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `type` | string | `"Note"` | Bookmark type: `Note`, `Warning`, `Error`, `Info` |
| `category` | string | `""` | Category string for grouping |
| `comment` | string | `""` | Bookmark comment text |
| `port` | int | *current* | Ghidra instance port |
### `bookmarks_delete`
Delete all bookmarks at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
---
## Enums and Typedefs
Tools for managing enum and typedef data types.
### `enums_list`
List enum data types with their members. Supports pagination.
### `enums_create`
Create a new enum data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Enum name |
| `size` | int | `4` | Size in bytes |
| `port` | int | *current* | Ghidra instance port |
### `typedefs_list`
List typedef data types. Supports pagination.
### `typedefs_create`
Create a new typedef data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Typedef name |
| `base_type` | string | *required* | Base data type (e.g., `"int"`, `"uint32_t"`, `"char*"`) |
| `port` | int | *current* | Ghidra instance port |
---
## Namespaces
Tools for querying namespaces and class definitions.
### `namespaces_list`
List all non-global namespaces in the program. Supports pagination.
### `classes_list`
List class namespaces with qualified names. Supports pagination.
---
## Segments
### `segments_list`
List memory segments (`.text`, `.data`, `.bss`, etc.) with read/write/execute permissions, start address, and size.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Filter by segment name (exact match) |
Supports pagination.
---
## Cursors
Tools for managing pagination state. Every paginated tool response includes a `cursor_id` in the pagination metadata when more pages are available.
### `cursor_next`
Fetch the next page of results for a cursor. Cursors expire after 5 minutes of inactivity.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `cursor_id` | string | *required* | Cursor identifier from a previous paginated response |
Returns the next page of results with updated pagination info.
### `cursor_list`
List active cursors for the current session.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `all_sessions` | bool | `false` | Include cursors from all sessions |
### `cursor_delete`
Delete a specific cursor to free resources.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `cursor_id` | string | *required* | Cursor identifier to delete |
### `cursor_delete_all`
Delete all cursors for the current session.
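The cursor tools are typically combined into a loop that keeps calling `cursor_next` until no `cursor_id` comes back. The sketch below is one way to write that loop. It assumes each response is a dict with the items under `result` and the cursor under `pagination.cursor_id`; the real field layout may differ, so check an actual response first. `call_tool` is a stand-in for whatever function your MCP client uses to invoke a tool.

```python
from typing import Any, Callable, Dict, Iterator

ToolCall = Callable[..., Dict[str, Any]]


def iter_all(call_tool: ToolCall, tool: str, **params: Any) -> Iterator[Any]:
    """Yield every item from a paginated tool by following cursor_id."""
    page = call_tool(tool, **params)
    while True:
        # Assumed layout: items under "result", cursor under "pagination".
        yield from page.get("result", [])
        cursor_id = page.get("pagination", {}).get("cursor_id")
        if not cursor_id:
            break
        page = call_tool("cursor_next", cursor_id=cursor_id)
```

Because cursors expire after 5 minutes of inactivity, a consumer that pauses for a long time between pages should be prepared to restart the original query.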
---
## Docker
Tools for managing Ghidra Docker containers. See the [Docker Usage](/reference/docker/) page for environment variables and firmware import details.
### `docker_auto_start`
The primary entry point for automatic container management. Checks all pooled ports for an existing instance analyzing the same binary. If none is found, allocates a port and starts a new container. Returns connection info immediately -- poll `docker_health` to check when the API is ready.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `binary_path` | string | *required* | Path to the binary file |
| `language` | string | *none* | Ghidra processor language ID (e.g., `"ARM:LE:32:v4t"`) |
| `base_address` | string | *none* | Base address for raw binaries (e.g., `"0x00000000"`) |
| `loader` | string | *none* | Loader type. Auto-set to `"BinaryLoader"` when language is specified. |
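Since `docker_auto_start` returns before analysis finishes, callers usually wrap `docker_health` in a polling loop. The following is only a sketch: `wait_until_healthy` is a hypothetical helper, and `check` stands in for a wrapper you write around the `docker_health` tool. How that tool's result maps to a boolean is an assumption.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `sleep` and `clock` parameters exist so the loop can be tested without real delays. Large binaries can take minutes to analyze, so the timeout is deliberately generous.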
### `docker_start`
Start a container with explicit control over all parameters. Ports are auto-allocated from the pool (8192-8319).
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `binary_path` | string | *required* | Path to the binary file |
| `memory` | string | `"2G"` | Max JVM heap size |
| `name` | string | *auto* | Container name (auto-generated with session ID) |
| `language` | string | *none* | Ghidra processor language ID |
| `base_address` | string | *none* | Base address (hex) |
| `loader` | string | *none* | Loader type |
### `docker_stop`
Stop and optionally remove a container. Session-scoped: you can only stop containers started by your own MCP session.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name_or_id` | string | *required* | Container name or ID |
| `remove` | bool | `true` | Also remove the container |
### `docker_health`
Check if a container's HTTP API is responding. Tries `/health` first, then falls back to the root endpoint for older plugin versions.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | API port to check |
| `timeout` | float | `5.0` | Request timeout in seconds |
### `docker_logs`
Get stdout/stderr from a container. Useful for monitoring analysis progress.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name_or_id` | string | *required* | Container name or ID |
| `tail` | int | `100` | Number of lines to show |
| `follow` | bool | `false` | Follow log output (not recommended for MCP) |
### `docker_status`
List all MCGhidra containers, Docker images, port pool allocation status, and whether Docker/Compose are available.
### `docker_build`
Build the MCGhidra Docker image from source.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `tag` | string | `"latest"` | Image tag |
| `no_cache` | bool | `false` | Build without Docker cache |
| `project_dir` | string | *auto* | Path to MCGhidra project root |
### `docker_cleanup`
Remove orphaned containers and stale port lock files. By default, only cleans containers from the current session for safety.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `session_only` | bool | `true` | Only clean this session's containers |
| `max_age_hours` | float | `24.0` | Max age for orphaned containers |
| `dry_run` | bool | `false` | Report what would be cleaned without acting |
### `docker_session_info`
Show containers and allocated ports for the current MCP session.


@ -0,0 +1,203 @@
---
title: MCP Resources
description: Reference for MCGhidra's read-only MCP resource URIs
---
MCGhidra registers 19 MCP resources that provide read-only access to Ghidra data. Resources are a good fit for quick enumeration -- they return data without requiring tool calls and work well for populating context at the start of a conversation.
## Resources vs Tools
Resources and tools serve different purposes:
- **Resources** return a capped snapshot of data. They have no pagination controls, no filtering, and a fixed maximum result size. Use them for a quick overview: "what functions exist in this binary?" or "what strings are defined?"
- **Tools** support pagination, grep filtering, field projection, and mutation operations. Use them when you need to page through large result sets, search for specific items, or modify the program.
If a resource hits its cap, the response includes a `_hint` field suggesting which tool to use for full pagination.
## Result Caps
Each resource type has a configurable maximum number of items it will return. These defaults are set in `MCGhidraConfig.resource_caps`:
| Resource Type | Default Cap |
|---------------|-------------|
| functions | 1000 |
| strings | 500 |
| data | 1000 |
| structs | 500 |
| xrefs | 500 |
| symbols | 1000 |
| segments | 500 |
| variables | 1000 |
| namespaces | 500 |
| classes | 500 |
| bookmarks | 1000 |
| enums | 500 |
| typedefs | 500 |
---
## Instance Resources
### `ghidra://instances`
List all active Ghidra instances. Runs a quick discovery scan before returning.
Returns: `instances` array (port, project, file), `count`, and `current_port`.
### `ghidra://instance/{port}`
Get detailed information about a specific Ghidra instance, including program metadata from the plugin's root endpoint.
**URI parameter:** `port` -- the instance port number.
### `ghidra://instance/{port}/summary`
Program overview with aggregate statistics. Fetches function count and string count in addition to basic program metadata (name, language, processor, format).
**URI parameter:** `port` -- the instance port number.
### `ghidra://instance/{port}/program`
Program metadata: architecture, language ID, compiler spec, image base address, and memory size. This is the same data returned by the REST API's `GET /program` endpoint.
**URI parameter:** `port` -- the instance port number.
---
## Function Resources
### `ghidra://instance/{port}/functions`
List functions in the program. Capped at 1000 items.
Returns: `functions` array, `count`, and `capped_at` (non-null if the cap was reached).
If capped, use the `functions_list()` tool for full pagination.
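A client can detect truncation by checking `capped_at` and surfacing the `_hint` text. The helper below is a hypothetical sketch that assumes the resource payload has already been parsed from JSON into a dict.

```python
from typing import Any, Dict, Optional


def cap_warning(resource: Dict[str, Any]) -> Optional[str]:
    """Return a message if the resource payload was truncated, else None."""
    capped_at = resource.get("capped_at")
    if capped_at is None:
        return None
    # Fall back to a generic suggestion if the payload carries no _hint.
    hint = resource.get("_hint", "use the matching list tool")
    return f"result capped at {capped_at} items; {hint}"
```

The same check applies to the other capped resources on this page, since they all report `capped_at`.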
### `ghidra://instance/{port}/function/decompile/address/{address}`
Decompile a function by its address. Returns the C pseudocode as a plain text string.
**URI parameters:** `port`, `address` (hex, e.g., `0x401000`).
### `ghidra://instance/{port}/function/decompile/name/{name}`
Decompile a function by name. Returns the C pseudocode as a plain text string.
**URI parameters:** `port`, `name` (function name, e.g., `main`).
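The two decompile URIs differ only in their final path segments, so a small builder can cover both. `decompile_uri` is a hypothetical helper written for illustration. It does no escaping, so names containing characters that are special in URIs may need extra handling.

```python
from typing import Optional


def decompile_uri(
    port: int,
    address: Optional[str] = None,
    name: Optional[str] = None,
) -> str:
    """Build a decompile resource URI from exactly one of address or name."""
    if (address is None) == (name is None):
        raise ValueError("pass exactly one of address or name")
    base = f"ghidra://instance/{port}/function/decompile"
    if address is not None:
        return f"{base}/address/{address}"
    return f"{base}/name/{name}"
```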
---
## Data Resources
### `ghidra://instance/{port}/strings`
List defined strings in the binary. Capped at 500 items.
Returns: `strings` array, `count`, and `capped_at`.
If capped, use `data_list_strings()` for full pagination.
### `ghidra://instance/{port}/data`
List defined data items. Capped at 1000 items.
Returns: `data` array, `count`, and `capped_at`.
If capped, use `data_list()` for full pagination.
### `ghidra://instance/{port}/structs`
List struct data types. Capped at 500 items.
Returns: `structs` array, `count`, and `capped_at`.
If capped, use `structs_list()` for full pagination.
---
## Cross-Reference Resources
### `ghidra://instance/{port}/xrefs/to/{address}`
Get all cross-references pointing to the specified address. Capped at 500 items.
**URI parameters:** `port`, `address` (hex).
Returns: `address`, `xrefs_to` array, `count`, and `capped_at`.
If capped, use `xrefs_list(to_addr=...)` for full pagination.
### `ghidra://instance/{port}/xrefs/from/{address}`
Get all cross-references originating from the specified address. Capped at 500 items.
**URI parameters:** `port`, `address` (hex).
Returns: `address`, `xrefs_from` array, `count`, and `capped_at`.
If capped, use `xrefs_list(from_addr=...)` for full pagination.
---
## Symbol Resources
### `ghidra://instance/{port}/symbols`
List all symbols in the program. Capped at 1000 items.
Returns: `symbols` array, `count`, and `capped_at`.
If capped, use `symbols_list()` for full pagination.
### `ghidra://instance/{port}/symbols/imports`
List imported symbols (external references). Capped at 1000 items.
Returns: `imports` array, `count`, and `capped_at`.
If capped, use `symbols_imports()` for full pagination.
### `ghidra://instance/{port}/symbols/exports`
List exported symbols (entry points). Capped at 1000 items.
Returns: `exports` array, `count`, and `capped_at`.
If capped, use `symbols_exports()` for full pagination.
---
## Other Resources
### `ghidra://instance/{port}/segments`
List memory segments with names, address ranges, sizes, and permissions. Capped at 500 items.
Returns: `segments` array, `count`, and `capped_at`.
If capped, use `segments_list()` for full pagination.
### `ghidra://instance/{port}/namespaces`
List all non-global namespaces. Capped at 500 items.
Returns: `namespaces` array, `count`, and `capped_at`.
If capped, use `namespaces_list()` for full pagination.
### `ghidra://instance/{port}/classes`
List class namespaces with qualified names. Capped at 500 items.
Returns: `classes` array, `count`, and `capped_at`.
If capped, use `classes_list()` for full pagination.
### `ghidra://instance/{port}/variables`
List variables. Capped at 1000 items.
Returns: `variables` array, `count`, and `capped_at`.
If capped, use `variables_list()` for full pagination.


@ -0,0 +1,423 @@
---
title: REST API
description: Reference for the Ghidra plugin's HATEOAS HTTP API
---
The Ghidra plugin runs an HTTP server inside the JVM and exposes a HATEOAS REST API. Every response includes hypermedia links (`_links`) to related resources, so clients can discover the API by following links rather than hardcoding paths.
The MCP server wraps this API as MCP tools. You generally do not need to call the REST API directly, but understanding it helps when debugging or building custom integrations.
## General Concepts
### Request Format
The API uses standard HTTP verbs: `GET` to read, `POST` to create, `PATCH` to modify, `PUT` to replace, and `DELETE` to remove. Request bodies are JSON (`Content-Type: application/json`). Include an `X-Request-ID` header if you need to correlate requests with responses.
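For custom integrations, a request with a JSON body and a correlation header can be assembled with the Python standard library. The sketch below only builds the request object and does not send it. `build_request` is a hypothetical helper, and the URL in the usage note assumes a plugin listening on port 8192 as in the examples on this page.

```python
import json
import uuid
from typing import Any, Optional
from urllib.request import Request


def build_request(
    method: str,
    url: str,
    body: Optional[Any] = None,
    request_id: Optional[str] = None,
) -> Request:
    """Build (but do not send) a request with JSON body and X-Request-ID."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"X-Request-ID": request_id or str(uuid.uuid4())}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)
```

For example, `build_request("PATCH", "http://localhost:8192/functions/0x401000", {"name": "calculate_checksum"})` produces a request that can be passed to `urllib.request.urlopen`.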
### Response Envelope
Every response follows this structure:
```json
{
"id": "req-123",
"instance": "http://localhost:8192",
"success": true,
"result": { ... },
"_links": {
"self": { "href": "/path/to/resource" },
"related": { "href": "/path/to/related" }
}
}
```
- `id` -- Correlation identifier from `X-Request-ID`, or a generated value.
- `instance` -- URL of the plugin instance that handled the request.
- `result` -- The payload. A single object for detail endpoints, an array for list endpoints.
- `_links` -- HATEOAS links to related resources and actions.
### Error Responses
Errors use standard HTTP status codes and include a structured error object:
```json
{
"id": "req-456",
"instance": "http://localhost:8192",
"success": false,
"error": {
"code": "RESOURCE_NOT_FOUND",
"message": "No function at address 0x999999"
}
}
```
Common status codes: `200` OK, `201` Created, `400` Bad Request, `404` Not Found, `500` Internal Server Error.
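Client code can treat the envelope uniformly by unwrapping `result` on success and raising on failure. The sketch below uses only the fields shown in the two envelopes above; the `GhidraApiError` class and the `unwrap` function are hypothetical names, not part of MCGhidra.

```python
from typing import Any, Dict


class GhidraApiError(Exception):
    """Raised when an envelope reports success: false."""

    def __init__(self, code: str, message: str, request_id: str = "") -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.request_id = request_id


def unwrap(envelope: Dict[str, Any]) -> Any:
    """Return envelope['result'], or raise GhidraApiError on failure."""
    if envelope.get("success"):
        return envelope.get("result")
    error = envelope.get("error") or {}
    raise GhidraApiError(
        error.get("code", "UNKNOWN"),
        error.get("message", ""),
        envelope.get("id", ""),
    )
```

Keeping the request `id` on the exception makes it easier to match a failure against plugin-side logs.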
### Pagination
List endpoints accept `offset` and `limit` query parameters. Responses include `size` (total count), `offset`, `limit`, and `_links` with `next`/`prev` when applicable.
```
GET /functions?offset=50&limit=50
```
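The `size`, `offset`, and `limit` fields are enough to compute the next page without following `_links`. A minimal sketch (the helper `next_offset` is hypothetical, not part of the API):

```python
from typing import Optional


def next_offset(size: int, offset: int, limit: int) -> Optional[int]:
    """Return the offset of the next page, or None on the last page."""
    following = offset + limit
    return following if following < size else None
```

With 150 functions and a limit of 50, this yields offsets 50 and 100, then `None`.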
### Addressing and Search
Resources can be accessed by hex address or searched by name:
- By address: `GET /functions/0x401000`
- By exact name: `GET /functions?name=main`
- By substring: `GET /functions?name_contains=init`
- By regex: `GET /functions?name_matches_regex=^FUN_`
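When building these URLs programmatically, query values should be percent-encoded, which matters for regex characters such as `^`. The sketch below uses the standard library for that. `functions_url` is a hypothetical helper, and the default base URL assumes a plugin on port 8192.

```python
from urllib.parse import urlencode


def functions_url(base: str = "http://localhost:8192", **query: object) -> str:
    """Build a GET /functions URL, percent-encoding the query values."""
    path = f"{base}/functions"
    return f"{path}?{urlencode(query)}" if query else path


print(functions_url(name_matches_regex="^FUN_"))
print(functions_url(offset=50, limit=50))
```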
---
## Meta Endpoints
### `GET /plugin-version`
Returns the plugin build version and API version number. The MCP server uses this for compatibility checks.
```json
{
"result": {
"plugin_version": "v2.0.0",
"api_version": 2
}
}
```
### `GET /info`
Returns details about the current plugin instance: loaded file, architecture, processor, address size, project name, and server port.
```json
{
"result": {
"file": "example.exe",
"architecture": "x86:LE:64:default",
"processor": "x86",
"addressSize": 64,
"project": "MyProject",
"serverPort": 8192,
"instanceCount": 1
}
}
```
### `GET /instances`
Lists all active plugin instances (one per open program in the Ghidra project). Each entry includes port, type, project, file, and links to connect.
### `GET /program`
Returns program metadata: language ID, compiler spec, image base address, memory size, and analysis status.
```json
{
"result": {
"name": "mybinary.exe",
"languageId": "x86:LE:64:default",
"compilerSpecId": "gcc",
"imageBase": "0x400000",
"memorySize": 1048576,
"analysisComplete": true
}
}
```
---
## Functions
### `GET /functions`
List functions. Supports pagination and search parameters (`name`, `name_contains`, `name_matches_regex`, `addr`).
```json
{
"result": [
{ "name": "main", "address": "0x401000" },
{ "name": "init_peripherals", "address": "0x08001cf0" }
],
"size": 150,
"offset": 0,
"limit": 50
}
```
### `POST /functions`
Create a function at an address. Body: `{ "address": "0x401000" }`.
### `GET /functions/{address}`
Get function details: name, signature, size, stack depth, calling convention, varargs status.
```json
{
"result": {
"name": "process_data",
"address": "0x4010a0",
"signature": "int process_data(char * data, int size)",
"size": 128,
"calling_convention": "__stdcall"
}
}
```
### `PATCH /functions/{address}`
Modify a function. Payload can include `name`, `signature`, and `comment`.
```json
{ "name": "calculate_checksum", "signature": "uint32_t calculate_checksum(uint8_t* buffer, size_t length)" }
```
### `DELETE /functions/{address}`
Delete the function definition at the specified address.
### `GET /functions/{address}/decompile`
Get decompiled C pseudocode. Optional query parameters:
| Parameter | Description |
|-----------|-------------|
| `syntax_tree` | `true` to include the syntax tree as JSON |
| `style` | Decompiler simplification style (e.g., `normalize`) |
| `timeout` | Decompilation timeout in seconds |
```json
{
"result": {
"address": "0x4010a0",
"ccode": "int process_data(char *param_1, int param_2)\n{\n ...\n}\n"
}
}
```
### `GET /functions/{address}/disassembly`
Get assembly listing. Supports pagination (`offset`, `limit`).
```json
{
"result": [
{ "address": "0x4010a0", "mnemonic": "PUSH", "operands": "RBP", "bytes": "55" },
{ "address": "0x4010a1", "mnemonic": "MOV", "operands": "RBP, RSP", "bytes": "4889E5" }
]
}
```
### `GET /functions/{address}/variables`
List local variables for a function. Supports name search.
### `PATCH /functions/{address}/variables/{variable_name}`
Modify a local variable. Payload: `{ "name": "new_name", "type": "int" }`.
---
## Data
### `GET /data`
List defined data items. Supports search (`name`, `name_contains`, `addr`, `type`) and pagination.
### `POST /data`
Define data at an address. Body: `{ "address": "0x402000", "type": "dword" }`.
### `GET /data/{address}`
Get data item details (type, size, value representation).
### `PATCH /data/{address}`
Modify a data item: change `name`, `type`, or `comment`.
### `DELETE /data/{address}`
Undefine the data item at the specified address.
### `GET /strings`
List defined strings. Supports pagination and a `filter` parameter for substring matching.
```json
{
"result": [
{ "address": "0x00401234", "value": "Hello, world!", "length": 13, "type": "string" },
{ "address": "0x00401250", "value": "Error: could not open file", "length": 26, "type": "string" }
]
}
```
---
## Structs
### `GET /structs`
List struct data types. Supports pagination and `category` filtering.
### `GET /structs?name={name}`
Get detailed struct information including all fields with offsets, types, and comments.
```json
{
"result": {
"name": "MyStruct",
"size": 16,
"category": "/custom",
"fields": [
{ "name": "id", "offset": 0, "length": 4, "type": "int", "comment": "Unique identifier" },
{ "name": "flags", "offset": 4, "length": 4, "type": "dword", "comment": "" }
]
}
}
```
### `POST /structs/create`
Create a struct. Body: `{ "name": "NetworkPacket", "category": "/network" }`.
### `POST /structs/addfield`
Add a field. Body: `{ "struct": "NetworkPacket", "fieldName": "header", "fieldType": "dword" }`.
### `POST /structs/updatefield`
Update a field. Identify by `fieldName` or `fieldOffset`, then provide `newName`, `newType`, and/or `newComment`.
### `POST /structs/delete`
Delete a struct. Body: `{ "name": "NetworkPacket" }`.
---
## Symbols
### `GET /symbols`
List all symbols. Supports search and pagination. Can filter by `type` (`function`, `data`, `label`).
### `POST /symbols`
Create or rename a symbol. Body: `{ "address": "0x401000", "name": "my_label" }`.
### `PATCH /symbols/{address}`
Modify a symbol (rename, change namespace, set as primary).
### `DELETE /symbols/{address}`
Remove the symbol at the specified address.
---
## Memory
### `GET /memory/{address}`
Read bytes from memory.
| Parameter | Description |
|-----------|-------------|
| `length` | Number of bytes to read (required; the server enforces a maximum) |
| `format` | `hex`, `base64`, or `string` (default: `hex`) |
```json
{
"result": {
"address": "0x402000",
"length": 16,
"format": "hex",
"bytes": "48656C6C6F20576F726C642100000000"
}
}
```
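Turning the `bytes` field back into raw bytes depends on `format`. The sketch below handles the three documented formats. `decode_memory_result` is a hypothetical helper, and it assumes the payload is always under the `bytes` key and that `string` output is UTF-8 text.

```python
import base64
from typing import Any, Dict


def decode_memory_result(result: Dict[str, Any]) -> bytes:
    """Convert the result object of GET /memory/{address} to raw bytes."""
    fmt = result.get("format", "hex")
    data = result["bytes"]
    if fmt == "hex":
        return bytes.fromhex(data)
    if fmt == "base64":
        return base64.b64decode(data)
    if fmt == "string":
        # Assumption: string output is UTF-8 text.
        return data.encode("utf-8")
    raise ValueError(f"unsupported format: {fmt!r}")
```

Applied to the response above, this returns the 16 bytes `b"Hello World!"` followed by four null bytes.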
### `PATCH /memory/{address}`
Write bytes. Body: `{ "bytes": "DEADBEEF", "format": "hex" }`. Use with caution.
---
## Segments
### `GET /segments`
List memory segments (`.text`, `.data`, `.bss`, etc.) with address ranges, sizes, and R/W/X permissions.
### `GET /segments/{name}`
Get details for a specific segment.
---
## Cross-References
### `GET /xrefs`
Find cross-references. At least one query parameter is required:
| Parameter | Description |
|-----------|-------------|
| `to_addr` | References pointing to this address |
| `from_addr` | References originating from this address |
| `type` | Filter: `CALL`, `READ`, `WRITE`, `DATA`, `POINTER` |
Supports pagination.
---
## Analysis
### `GET /analysis`
Get analysis status and list of available analyzers.
```json
{
"result": {
"program": "mybinary.exe",
"analysis_enabled": true,
"available_analyzers": [
"Function Start Analyzer",
"Reference Analyzer",
"Decompiler Parameter ID"
]
}
}
```
### `POST /analysis`
Trigger re-analysis of the program.
### `GET /analysis/callgraph`
Generate a call graph.
| Parameter | Default | Description |
|-----------|---------|-------------|
| `function` | *entry point* | Starting function name |
| `max_depth` | `3` | Maximum call depth |
Returns `nodes` (functions) and `edges` (calls between them with call-site addresses).
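For traversal, the `edges` list is often easier to work with as an adjacency map. The sketch below builds one. The field names of an edge are not documented on this page, so the `"from"` and `"to"` keys are assumptions exposed as parameters; adjust them to match a real response. `adjacency` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def adjacency(
    edges: Iterable[Dict[str, Any]],
    src_key: str = "from",  # assumed field name for the caller
    dst_key: str = "to",    # assumed field name for the callee
) -> Dict[str, List[str]]:
    """Group call-graph edges into a caller -> [callees] mapping."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for edge in edges:
        graph[edge[src_key]].append(edge[dst_key])
    return dict(graph)
```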
### `GET /analysis/dataflow`
Trace data flow from an address.
| Parameter | Default | Description |
|-----------|---------|-------------|
| `address` | *required* | Starting address |
| `direction` | `forward` | `forward` or `backward` |
| `max_steps` | `50` | Maximum analysis steps |
Returns a list of `steps`, each with an address, instruction, and description.

1
docs-site/src/env.d.ts vendored Normal file

@ -0,0 +1 @@
/// <reference path="../.astro/types.d.ts" />


@ -0,0 +1,62 @@
/* MCGhidra reverse engineering tooling aesthetic
* Warm amber/rust accent (#c85533), dark grays
*/
/* Light mode */
:root {
--sl-color-accent-low: #f8e4db;
--sl-color-accent: #b5452a;
--sl-color-accent-high: #6e2915;
--sl-color-white: #1a1614;
--sl-color-gray-1: #3a3230;
--sl-color-gray-2: #5c524e;
--sl-color-gray-3: #8a7e78;
--sl-color-gray-4: #b0a49e;
--sl-color-gray-5: #d0c8c4;
--sl-color-gray-6: #eae4e0;
--sl-color-gray-7: #f6f2f0;
--sl-color-black: #faf8f6;
--sl-font: 'Segoe UI', system-ui, -apple-system, sans-serif;
--sl-font-mono: 'Cascadia Code', 'Fira Code', 'JetBrains Mono', ui-monospace, monospace;
}
/* Dark mode */
:root[data-theme='dark'] {
--sl-color-accent-low: #3a1a0c;
--sl-color-accent: #e07040;
--sl-color-accent-high: #f5c4aa;
--sl-color-white: #f6f2f0;
--sl-color-gray-1: #d0c8c4;
--sl-color-gray-2: #b0a49e;
--sl-color-gray-3: #8a7e78;
--sl-color-gray-4: #5c524e;
--sl-color-gray-5: #3a3230;
--sl-color-gray-6: #261e1a;
--sl-color-gray-7: #1a1614;
--sl-color-black: #100c0a;
}
/* Inline code background */
:root {
--sl-color-bg-inline-code: #eae4e0;
}
:root[data-theme='dark'] {
--sl-color-bg-inline-code: #261e1a;
}
/* Tighter sidebar spacing */
nav.sidebar .top-level > li + li {
margin-top: 0.25rem;
}
/* Code blocks and disassembly tables */
table {
font-size: 0.9rem;
}
th {
background: var(--sl-color-gray-6);
font-weight: 600;
}

3
docs-site/tsconfig.json Normal file

@ -0,0 +1,3 @@
{
"extends": "astro/tsconfigs/strict"
}

Binary file not shown.

135
pom.xml

@ -4,20 +4,28 @@
<modelVersion>4.0.0</modelVersion>
<groupId>eu.starsong.ghidra</groupId>
<artifactId>GhydraMCP</artifactId>
<artifactId>MCGhidra</artifactId>
<packaging>jar</packaging>
<version>${revision}</version>
<name>GhydraMCP</name>
<url>https://github.com/teal-bauer/GhydraMCP</url>
<version>dev</version>
<name>MCGhidra</name>
<url>https://github.com/starsong-consulting/MCGhidra</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<ghidra.jar.location>${project.basedir}/lib</ghidra.jar.location>
<!-- Default paths to local lib - will be overridden if GHIDRA_HOME is set -->
<ghidra.generic.jar>${project.basedir}/lib/Generic.jar</ghidra.generic.jar>
<ghidra.softwaremodeling.jar>${project.basedir}/lib/SoftwareModeling.jar</ghidra.softwaremodeling.jar>
<ghidra.project.jar>${project.basedir}/lib/Project.jar</ghidra.project.jar>
<ghidra.docking.jar>${project.basedir}/lib/Docking.jar</ghidra.docking.jar>
<ghidra.decompiler.jar>${project.basedir}/lib/Decompiler.jar</ghidra.decompiler.jar>
<ghidra.utility.jar>${project.basedir}/lib/Utility.jar</ghidra.utility.jar>
<ghidra.base.jar>${project.basedir}/lib/Base.jar</ghidra.base.jar>
<maven.deploy.skip>true</maven.deploy.skip>
<maven.install.skip>true</maven.install.skip>
<maven.build.timestamp.format>yyyyMMdd-HHmmss</maven.build.timestamp.format>
<revision>dev-SNAPSHOT</revision>
<revision>dev</revision>
<inner.zip.filename>MCGhidra-${git.commit.id.describe}-${maven.build.timestamp}.zip</inner.zip.filename>
</properties>
<dependencies>
@ -26,57 +34,59 @@
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.10.1</version>
<scope>provided</scope>
</dependency>
<!-- Ghidra JARs as system-scoped dependencies -->
<!-- Paths can be overridden via GHIDRA_HOME environment variable -->
<dependency>
<groupId>ghidra</groupId>
<artifactId>Generic</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/Generic.jar</systemPath>
<systemPath>${ghidra.generic.jar}</systemPath>
</dependency>
<dependency>
<groupId>ghidra</groupId>
<artifactId>SoftwareModeling</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/SoftwareModeling.jar</systemPath>
<systemPath>${ghidra.softwaremodeling.jar}</systemPath>
</dependency>
<dependency>
<groupId>ghidra</groupId>
<artifactId>Project</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/Project.jar</systemPath>
<systemPath>${ghidra.project.jar}</systemPath>
</dependency>
<dependency>
<groupId>ghidra</groupId>
<artifactId>Docking</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/Docking.jar</systemPath>
<systemPath>${ghidra.docking.jar}</systemPath>
</dependency>
<dependency>
<groupId>ghidra</groupId>
<artifactId>Decompiler</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/Decompiler.jar</systemPath>
<systemPath>${ghidra.decompiler.jar}</systemPath>
</dependency>
<dependency>
<groupId>ghidra</groupId>
<artifactId>Utility</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/Utility.jar</systemPath>
<systemPath>${ghidra.utility.jar}</systemPath>
</dependency>
<dependency>
<groupId>ghidra</groupId>
<artifactId>Base</artifactId>
<version>11.3.1</version>
<version>11.4.2</version>
<scope>system</scope>
<systemPath>${ghidra.jar.location}/Base.jar</systemPath>
<systemPath>${ghidra.base.jar}</systemPath>
</dependency>
<!-- Test dependencies -->
@ -97,16 +107,6 @@
</resources>
<plugins>
<!-- Set Java version -->
<!-- Resources plugin to handle filtering -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>3.3.1</version>
<configuration>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
@ -114,16 +114,15 @@
<configuration>
<source>21</source>
<target>21</target>
<!-- Ignore warning about system paths -->
<compilerArgument>-Xlint:-path</compilerArgument>
</configuration>
</plugin>
<!-- Git Commit ID plugin to generate version from git -->
<!-- Git Commit ID plugin to generate version info -->
<plugin>
<groupId>io.github.git-commit-id</groupId>
<artifactId>git-commit-id-maven-plugin</artifactId>
<version>5.0.0</version>
<version>9.0.1</version>
<executions>
<execution>
<id>get-git-info</id>
@ -134,40 +133,15 @@
</execution>
</executions>
<configuration>
<gitDescribe>
<tags>true</tags>
</gitDescribe>
<generateGitPropertiesFile>true</generateGitPropertiesFile>
<generateGitPropertiesFilename>${project.build.outputDirectory}/git.properties</generateGitPropertiesFilename>
<includeOnlyProperties>
<includeOnlyProperty>git.commit.id.abbrev</includeOnlyProperty>
<includeOnlyProperty>git.commit.time</includeOnlyProperty>
<includeOnlyProperty>git.closest.tag.name</includeOnlyProperty>
<includeOnlyProperty>git.build.version</includeOnlyProperty>
</includeOnlyProperties>
<commitIdGenerationMode>full</commitIdGenerationMode>
</configuration>
</plugin>
<!-- Set revision property from git info -->
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<version>3.4.0</version>
<executions>
<execution>
<id>set-revision-from-git</id>
<phase>initialize</phase>
<goals>
<goal>regex-property</goal>
</goals>
<configuration>
<name>revision</name>
<value>${git.commit.id.abbrev}-${maven.build.timestamp}</value>
<regex>.*</regex>
<replacement>$0</replacement>
<failIfNoMatch>false</failIfNoMatch>
</configuration>
</execution>
</executions>
</plugin>
<!-- Use custom MANIFEST.MF -->
<plugin>
@ -179,16 +153,16 @@
<addDefaultImplementationEntries>false</addDefaultImplementationEntries>
</manifest>
<manifestEntries>
<Implementation-Title>GhydraMCP</Implementation-Title>
<Implementation-Title>MCGhidra</Implementation-Title>
<Implementation-Version>${git.commit.id.abbrev}-${maven.build.timestamp}</Implementation-Version>
<Plugin-Class>eu.starsong.ghidra.GhydraMCP</Plugin-Class>
<Plugin-Name>GhydraMCP</Plugin-Name>
<Plugin-Class>eu.starsong.ghidra.MCGhidra</Plugin-Class>
<Plugin-Name>MCGhidra</Plugin-Name>
<Plugin-Version>${git.commit.id.abbrev}-${maven.build.timestamp}</Plugin-Version>
<Plugin-Author>LaurieWired, Teal Bauer</Plugin-Author>
<Plugin-Description>Expose multiple Ghidra tools to MCP servers with variable management</Plugin-Description>
</manifestEntries>
</archive>
<finalName>GhydraMCP</finalName>
<finalName>MCGhidra</finalName>
<excludes>
<exclude>**/App.class</exclude>
</excludes>
@ -196,16 +170,16 @@
</configuration>
</plugin>
<!-- The Assembly Plugin for creating the Ghidra extension ZIP -->
<!-- The Assembly Plugin -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.3.0</version>
<executions>
<!-- Default execution for the plugin only -->
<!-- Execution for the Ghidra extension zip -->
<execution>
<id>plugin-assembly</id>
<phase>package</phase>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
@ -213,7 +187,7 @@
<descriptors>
<descriptor>src/assembly/ghidra-extension.xml</descriptor>
</descriptors>
<finalName>GhydraMCP-${git.commit.id.abbrev}-${maven.build.timestamp}</finalName>
<finalName>MCGhidra-${git.commit.id.describe}-${maven.build.timestamp}</finalName>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
</execution>
@ -221,7 +195,7 @@
<!-- Execution for the complete package -->
<execution>
<id>complete-package</id>
<phase>package</phase>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
@ -229,7 +203,7 @@
<descriptors>
<descriptor>src/assembly/complete-package.xml</descriptor>
</descriptors>
<finalName>GhydraMCP-Complete-${git.commit.id.abbrev}-${maven.build.timestamp}</finalName>
<finalName>MCGhidra-Complete-${git.commit.id.describe}-${maven.build.timestamp}</finalName>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
</execution>
@ -285,6 +259,25 @@
</build>
<profiles>
<!-- Profile activated when GHIDRA_HOME environment variable is set -->
<profile>
<id>use-ghidra-home</id>
<activation>
<property>
<name>env.GHIDRA_HOME</name>
</property>
</activation>
<properties>
<ghidra.generic.jar>${env.GHIDRA_HOME}/Ghidra/Framework/Generic/lib/Generic.jar</ghidra.generic.jar>
<ghidra.softwaremodeling.jar>${env.GHIDRA_HOME}/Ghidra/Framework/SoftwareModeling/lib/SoftwareModeling.jar</ghidra.softwaremodeling.jar>
<ghidra.project.jar>${env.GHIDRA_HOME}/Ghidra/Framework/Project/lib/Project.jar</ghidra.project.jar>
<ghidra.docking.jar>${env.GHIDRA_HOME}/Ghidra/Framework/Docking/lib/Docking.jar</ghidra.docking.jar>
<ghidra.decompiler.jar>${env.GHIDRA_HOME}/Ghidra/Features/Decompiler/lib/Decompiler.jar</ghidra.decompiler.jar>
<ghidra.utility.jar>${env.GHIDRA_HOME}/Ghidra/Framework/Utility/lib/Utility.jar</ghidra.utility.jar>
<ghidra.base.jar>${env.GHIDRA_HOME}/Ghidra/Features/Base/lib/Base.jar</ghidra.base.jar>
</properties>
</profile>
<!-- Profile for building just the Ghidra plugin -->
<profile>
<id>plugin-only</id>

pyproject.toml Normal file

@ -0,0 +1,48 @@
[project]
name = "mcghidra"
version = "2026.3.7"
description = "Reverse engineering bridge: multi-instance Ghidra plugin with HATEOAS REST API and MCP server for decompilation, analysis & binary manipulation"
readme = "README.md"
requires-python = ">=3.11"
authors = [
{name = "Ryan Malloy", email = "ryan@supported.systems"}
]
dependencies = [
"mcp>=1.22.0",
"requests>=2.32.3",
"fastmcp>=2.0.0",
"fastmcp-feedback>=1.0.0",
]
[project.scripts]
mcghidra = "mcghidra:main"
[project.urls]
Documentation = "https://mcghidra.warehack.ing"
Repository = "https://git.supported.systems/MCP/mcghidra"
Issues = "https://git.supported.systems/MCP/mcghidra/issues"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["mcghidra"]
[tool.hatch.build]
sources = ["src"]
[tool.hatch.build.targets.sdist]
include = [
"/src",
"/README.md",
"/pyproject.toml",
]
[tool.ruff]
line-length = 100
target-version = "py311"
[tool.ruff.lint]
select = ["E", "F", "I", "W"]
ignore = ["E501"] # Line too long - handled by formatter


@ -1,6 +1,6 @@
#!/usr/bin/env python3
"""
Test runner for GhydraMCP tests.
Test runner for MCGhidra tests.
This script runs both the HTTP API tests and the MCP bridge tests.
"""
import os
@ -21,10 +21,10 @@ def run_http_api_tests():
# Import and run the tests
try:
from test_http_api import GhydraMCPHttpApiTests
from test_http_api import MCGhidraHttpApiTests
# Create a test suite with all tests from GhydraMCPHttpApiTests
suite = unittest.TestLoader().loadTestsFromTestCase(GhydraMCPHttpApiTests)
# Create a test suite with all tests from MCGhidraHttpApiTests
suite = unittest.TestLoader().loadTestsFromTestCase(MCGhidraHttpApiTests)
# Run the tests
result = unittest.TextTestRunner(verbosity=2).run(suite)
@ -68,24 +68,73 @@ def run_mcp_bridge_tests():
print(f"Error running MCP bridge tests: {str(e)}")
return False
def run_data_tests():
"""Run the data operations tests."""
print_header("Running Data Operations Tests")
try:
result = subprocess.run(
[sys.executable, "test_data_operations.py"],
capture_output=True,
text=True
)
if result.stdout:
print("STDOUT:")
print(result.stdout)
if result.stderr:
print("STDERR:")
print(result.stderr)
return result.returncode == 0
except Exception as e:
print(f"Error running data operations tests: {str(e)}")
return False
def run_comment_tests():
"""Run the comment functionality tests."""
print_header("Running Comment Tests")
try:
result = subprocess.run(
[sys.executable, "test_comments.py"],
capture_output=True,
text=True
)
if result.stdout:
print("STDOUT:")
print(result.stdout)
if result.stderr:
print("STDERR:")
print(result.stderr)
return result.returncode == 0
except Exception as e:
print(f"Error running comment tests: {str(e)}")
return False
def run_all_tests():
"""Run all tests"""
print_header("GhydraMCP Test Suite")
print_header("MCGhidra Test Suite")
# Run the HTTP API tests
# Run test suites
http_api_success = run_http_api_tests()
# Run the MCP bridge tests
mcp_bridge_success = run_mcp_bridge_tests()
data_tests_success = run_data_tests()
comment_tests_success = run_comment_tests()
# Print a summary
print_header("Test Summary")
print(f"HTTP API Tests: {'PASSED' if http_api_success else 'FAILED'}")
print(f"MCP Bridge Tests: {'PASSED' if mcp_bridge_success else 'FAILED'}")
print(f"Overall: {'PASSED' if http_api_success and mcp_bridge_success else 'FAILED'}")
print(f"Data Operations Tests: {'PASSED' if data_tests_success else 'FAILED'}")
print(f"Comment Tests: {'PASSED' if comment_tests_success else 'FAILED'}")
print(f"Overall: {'PASSED' if (http_api_success and mcp_bridge_success and data_tests_success and comment_tests_success) else 'FAILED'}")
# Return True if all tests passed, False otherwise
return http_api_success and mcp_bridge_success
return http_api_success and mcp_bridge_success and data_tests_success and comment_tests_success
if __name__ == "__main__":
# Check if we have the required dependencies
@ -104,9 +153,15 @@ if __name__ == "__main__":
elif sys.argv[1] == "--mcp":
# Run only the MCP bridge tests
success = run_mcp_bridge_tests()
elif sys.argv[1] == "--data":
# Run only the data operations tests
success = run_data_tests()
elif sys.argv[1] == "--comments":
# Run only the comment tests
success = run_comment_tests()
else:
print(f"Unknown argument: {sys.argv[1]}")
print("Usage: python run_tests.py [--http|--mcp]")
print("Usage: python run_tests.py [--http|--mcp|--data|--comments]")
sys.exit(1)
else:
# Run all tests
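
The two runners added in this change, run_data_tests and run_comment_tests, differ only in the script name and the label they print. A minimal sketch of a shared helper is shown here; the name run_script and its parameters are hypothetical and not part of the change itself.

```python
import subprocess
import sys


def run_script(script_path, label):
    """Run a test script in a child interpreter and report whether it passed.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    try:
        result = subprocess.run(
            [sys.executable, script_path],
            capture_output=True,
            text=True,
        )
    except OSError as e:
        # Raised when the interpreter itself cannot be launched.
        print(f"Error running {label}: {e}")
        return False

    if result.stdout:
        print("STDOUT:")
        print(result.stdout)
    if result.stderr:
        print("STDERR:")
        print(result.stderr)
    # A zero exit status is treated as success, as in the runners above.
    return result.returncode == 0
```

With such a helper, each runner would reduce to a header print followed by one call, for example run_script("test_comments.py", "comment tests").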


@ -19,17 +19,16 @@
<include>README.md</include>
<include>LICENSE</include>
</includes>
<outputDirectory></outputDirectory>
</fileSet>
<!-- Include pre-built Ghidra plugin zip -->
<fileSet>
<directory>${project.build.directory}</directory>
<includes>
<!-- Use build.identifier to match the actual filename -->
<include>GhydraMCP-${build.identifier}.zip</include>
</includes>
<outputDirectory></outputDirectory>
<outputDirectory>/</outputDirectory>
</fileSet>
</fileSets>
<files>
<!-- Include the inner ZIP directly from target -->
<file>
<source>${project.build.directory}/${inner.zip.filename}</source>
<outputDirectory>/</outputDirectory>
<fileMode>0644</fileMode>
</file>
</files>
</assembly>


@ -4,37 +4,31 @@
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.3
http://maven.apache.org/xsd/assembly-1.1.3.xsd">
<!-- Just a name for reference -->
<id>ghidra-extension</id>
<!-- We want a .zip file -->
<formats>
<format>zip</format>
</formats>
<!-- Don't put everything in an extra top-level directory named after the assembly ID -->
<includeBaseDirectory>false</includeBaseDirectory>
<fileSets>
<!-- 1) Copy extension.properties and Module.manifest into the top level
of a folder named GhydraMCP/ (the actual extension folder). -->
<!-- Copy extension files to MCGhidra/ directory -->
<fileSet>
<directory>src/main/resources</directory>
<includes>
<include>extension.properties</include>
<include>Module.manifest</include>
</includes>
<outputDirectory>GhydraMCP</outputDirectory>
</fileSet>
<!-- 2) Copy your built plugin JAR into GhydraMCP/lib -->
<fileSet>
<directory>${project.build.directory}</directory>
<includes>
<!-- Use the finalized JAR name from the maven-jar-plugin -->
<include>GhydraMCP.jar</include>
</includes>
<outputDirectory>GhydraMCP/lib</outputDirectory>
<outputDirectory>MCGhidra</outputDirectory>
</fileSet>
</fileSets>
<dependencySets>
<!-- Include the main project JAR as MCGhidra.jar -->
<dependencySet>
<useProjectArtifact>true</useProjectArtifact>
<outputDirectory>MCGhidra/lib</outputDirectory>
<outputFileNameMapping>MCGhidra.jar</outputFileNameMapping>
<unpack>false</unpack>
</dependencySet>
</dependencySets>
</assembly>

File diff suppressed because it is too large


@ -0,0 +1,507 @@
package eu.starsong.ghidra;
// Imports for refactored structure
import eu.starsong.ghidra.api.*;
import eu.starsong.ghidra.endpoints.*;
import eu.starsong.ghidra.util.*;
import eu.starsong.ghidra.model.*;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
// For JSON response handling
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import com.sun.net.httpserver.Headers;
import ghidra.app.plugin.PluginCategoryNames;
import ghidra.app.services.ProgramManager;
import ghidra.framework.main.ApplicationLevelPlugin;
import ghidra.framework.model.Project;
import ghidra.framework.plugintool.Plugin;
import ghidra.framework.plugintool.PluginInfo;
import ghidra.framework.plugintool.PluginTool;
import ghidra.framework.plugintool.util.PluginStatus;
import ghidra.program.model.listing.Program;
import ghidra.util.Msg;
@PluginInfo(
status = PluginStatus.RELEASED,
packageName = ghidra.app.DeveloperPluginPackage.NAME,
category = PluginCategoryNames.ANALYSIS,
shortDescription = "MCGhidra Plugin for MCP Analysis",
description = "Exposes program data via HATEOAS HTTP API for automated reverse engineering with MCP (Model Context Protocol).",
servicesRequired = { ProgramManager.class }
)
public class MCGhidraPlugin extends Plugin implements ApplicationLevelPlugin {
// Made public static to be accessible by InstanceEndpoints
public static final Map<Integer, MCGhidraPlugin> activeInstances = new ConcurrentHashMap<>();
private static final Object baseInstanceLock = new Object();
private HttpServer server;
private int port;
private boolean isBaseInstance = false;
private long serverStartTimeMs;
/**
* Constructor for MCGhidra Plugin.
* @param tool The Ghidra PluginTool
*/
public MCGhidraPlugin(PluginTool tool) {
super(tool);
this.port = findAvailablePort();
activeInstances.put(port, this);
synchronized (baseInstanceLock) {
if (port == ApiConstants.DEFAULT_PORT || activeInstances.get(ApiConstants.DEFAULT_PORT) == null) {
this.isBaseInstance = true;
Msg.info(this, "Starting as base instance on port " + port);
}
}
Msg.info(this, "MCGhidraPlugin loaded on port " + port);
System.out.println("[MCGhidra] Plugin loaded on port " + port);
try {
startServer();
} catch (IOException e) {
Msg.error(this, "Failed to start HTTP server on port " + port, e);
if (e.getMessage() != null && e.getMessage().contains("Address already in use")) {
Msg.showError(this, null, "Port Conflict",
"Port " + port + " is already in use. Please specify a different port with -Dghidra.mcp.port=NEW_PORT");
}
}
}
/**
* Starts the HTTP server and registers all endpoints
*/
private void startServer() throws IOException {
server = HttpServer.create(new InetSocketAddress(port), 0);
// Use a cached thread pool for better performance with multiple concurrent requests
server.setExecutor(Executors.newCachedThreadPool());
// --- Register Endpoints ---
Program currentProgram = getCurrentProgram(); // Get program once
// Register Meta Endpoints (these don't require a program)
registerMetaEndpoints(server);
// Register endpoints that don't require a program
registerProjectEndpoints(server);
new InstanceEndpoints(currentProgram, port, activeInstances).registerEndpoints(server);
// Register Resource Endpoints that require a program
registerProgramDependentEndpoints(server);
// Register Root Endpoint (should be last to include links to all other endpoints)
registerRootEndpoint(server);
serverStartTimeMs = System.currentTimeMillis();
new Thread(() -> {
server.start();
Msg.info(this, "MCGhidra HTTP server started on port " + port);
System.out.println("[MCGhidra] HTTP server started on port " + port);
}, "MCGhidra-HTTP-Server").start();
}
/**
* Register all endpoints that require a program to function.
* This method always registers all endpoints, even when no program is loaded.
* The endpoints will check for program availability at runtime when they're called.
*
* IMPORTANT: Endpoints are registered in order from most specific to least specific
* to ensure proper URL path matching.
*/
private void registerProgramDependentEndpoints(HttpServer server) {
// Register all endpoints without checking for a current program
// The endpoints will check for the current program at runtime when they're called
Msg.info(this, "Registering program-dependent endpoints. Programs will be checked at runtime.");
Program currentProgram = getCurrentProgram();
Msg.info(this, "Current program at registration time: " + (currentProgram != null ? currentProgram.getName() : "none"));
new FunctionEndpoints(currentProgram, port, tool).registerEndpoints(server);
new VariableEndpoints(currentProgram, port, tool).registerEndpoints(server);
new ClassEndpoints(currentProgram, port, tool).registerEndpoints(server);
new SegmentEndpoints(currentProgram, port, tool).registerEndpoints(server);
new SymbolEndpoints(currentProgram, port, tool).registerEndpoints(server);
new NamespaceEndpoints(currentProgram, port, tool).registerEndpoints(server);
new DataEndpoints(currentProgram, port, tool).registerEndpoints(server);
new StructEndpoints(currentProgram, port, tool).registerEndpoints(server);
new MemoryEndpoints(currentProgram, port, tool).registerEndpoints(server);
new XrefsEndpoints(currentProgram, port, tool).registerEndpoints(server);
new AnalysisEndpoints(currentProgram, port, tool).registerEndpoints(server);
new ProgramEndpoints(currentProgram, port, tool).registerEndpoints(server);
Msg.info(this, "Registered program-dependent endpoints. Programs will be checked at runtime.");
}
/**
* Register additional endpoints for current program state
*/
private void registerProgramStateEndpoints(HttpServer server) {
// Any additional endpoints can be added here if needed
// But prefer to use the HATEOAS endpoints in ProgramEndpoints, FunctionEndpoints, etc.
}
// --- Endpoint Registration Methods ---
/**
* Register meta endpoints that provide plugin information
*/
private void registerMetaEndpoints(HttpServer server) {
// Plugin version endpoint
server.createContext("/plugin-version", exchange -> {
try {
if ("GET".equals(exchange.getRequestMethod())) {
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(Map.of(
"plugin_version", ApiConstants.PLUGIN_VERSION,
"api_version", ApiConstants.API_VERSION
))
.addLink("self", "/plugin-version")
.addLink("root", "/");
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, port);
} else {
HttpUtil.sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED", port);
}
} catch (IOException e) {
Msg.error(this, "Error handling /plugin-version", e);
}
});
// Health endpoint: lightweight, no program dependency
server.createContext("/health", exchange -> {
try {
if ("GET".equals(exchange.getRequestMethod())) {
long uptimeMs = System.currentTimeMillis() - serverStartTimeMs;
Program program = getCurrentProgram();
Map<String, Object> healthData = new HashMap<>();
healthData.put("status", "up");
healthData.put("port", port);
healthData.put("api_version", ApiConstants.API_VERSION);
healthData.put("uptime_ms", uptimeMs);
healthData.put("program", program != null ? program.getName() : null);
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(healthData)
.addLink("self", "/health")
.addLink("root", "/");
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, port);
} else {
HttpUtil.sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED", port);
}
} catch (IOException e) {
Msg.error(this, "Error handling /health", e);
}
});
// Info endpoint
server.createContext("/info", exchange -> {
try {
Map<String, Object> infoData = new HashMap<>();
infoData.put("isBaseInstance", isBaseInstance);
Program program = getCurrentProgram();
if (program != null) {
infoData.put("file", program.getName());
infoData.put("architecture", program.getLanguage().getLanguageID().getIdAsString());
infoData.put("processor", program.getLanguage().getProcessor().toString());
infoData.put("addressSize", program.getAddressFactory().getDefaultAddressSpace().getSize());
infoData.put("creationDate", program.getCreationDate());
infoData.put("executable", program.getExecutablePath());
}
Project project = tool.getProject();
if (project != null) {
infoData.put("project", project.getName());
infoData.put("projectLocation", project.getProjectLocator().toString());
}
// Add server details
infoData.put("serverPort", port);
infoData.put("serverStartTime", serverStartTimeMs);
infoData.put("instanceCount", activeInstances.size());
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(infoData)
.addLink("self", "/info")
.addLink("root", "/")
.addLink("instances", "/instances");
// Add program link if available
if (program != null) {
builder.addLink("program", "/program");
}
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, port);
} catch (Exception e) {
Msg.error(this, "Error serving /info endpoint", e);
try {
HttpUtil.sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR", port);
} catch (IOException ioEx) {
Msg.error(this, "Failed to send error for /info", ioEx);
}
}
});
}
/**
* Register project-related endpoints
*/
private void registerProjectEndpoints(HttpServer server) {
server.createContext("/projects", exchange -> {
try {
if ("GET".equals(exchange.getRequestMethod())) {
List<Map<String, String>> projects = new ArrayList<>();
Project project = tool.getProject();
if (project != null) {
Map<String, String> projInfo = new HashMap<>();
projInfo.put("name", project.getName());
projInfo.put("location", project.getProjectLocator().toString());
projects.add(projInfo);
}
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(projects)
.addLink("self", "/projects")
.addLink("create", "/projects", "POST");
// Add link to current project if available
if (project != null) {
builder.addLink("current", "/projects/" + project.getName());
}
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, port);
} else if ("POST".equals(exchange.getRequestMethod())) {
// Creating projects is not yet implemented
HttpUtil.sendErrorResponse(exchange, 501, "Creating projects via API is not implemented", "NOT_IMPLEMENTED", port);
} else {
HttpUtil.sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED", port);
}
} catch (Exception e) {
Msg.error(this, "Error serving /projects endpoint", e);
try {
HttpUtil.sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR", port);
} catch (IOException ioEx) {
Msg.error(this, "Failed to send error for /projects", ioEx);
}
}
});
// Specific project endpoint
server.createContext("/projects/", exchange -> {
try {
String path = exchange.getRequestURI().getPath();
if (path.equals("/projects/") || path.equals("/projects")) {
// This should be handled by the /projects context
exchange.getResponseHeaders().set("Location", "/projects");
exchange.sendResponseHeaders(302, -1);
return;
}
// Extract project name from path
String projectName = path.substring("/projects/".length());
if ("GET".equals(exchange.getRequestMethod())) {
Project currentProject = tool.getProject();
if (currentProject == null) {
HttpUtil.sendErrorResponse(exchange, 404, "No project is currently open", "NO_PROJECT_OPEN", port);
return;
}
if (!currentProject.getName().equals(projectName)) {
HttpUtil.sendErrorResponse(exchange, 404, "Project not found: " + projectName, "PROJECT_NOT_FOUND", port);
return;
}
// Build project details
Map<String, Object> projectDetails = new HashMap<>();
projectDetails.put("name", currentProject.getName());
projectDetails.put("location", currentProject.getProjectLocator().toString());
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(projectDetails)
.addLink("self", "/projects/" + projectName)
.addLink("programs", "/programs?project=" + projectName);
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, port);
} else {
HttpUtil.sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED", port);
}
} catch (Exception e) {
Msg.error(this, "Error serving /projects/{name} endpoint", e);
try {
HttpUtil.sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR", port);
} catch (IOException ioEx) {
Msg.error(this, "Failed to send error for /projects/{name}", ioEx);
}
}
});
}
/**
* Register the root endpoint which provides links to all other API endpoints
*/
private void registerRootEndpoint(HttpServer server) {
server.createContext("/", exchange -> {
try {
// Check if this is actually a CORS preflight request
if (exchange.getAttribute("cors.handled") != null) {
// CORS was already handled
return;
}
// Check if this is a request for the root endpoint specifically
if (!exchange.getRequestURI().getPath().equals("/")) {
HttpUtil.sendErrorResponse(exchange, 404, "Endpoint not found", "ENDPOINT_NOT_FOUND", port);
return;
}
Map<String, Object> rootData = new HashMap<>();
rootData.put("message", "MCGhidra API " + ApiConstants.API_VERSION);
rootData.put("documentation", "See GHIDRA_HTTP_API.md for full API documentation");
rootData.put("isBaseInstance", isBaseInstance);
// Build the HATEOAS response
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(rootData)
.addLink("self", "/")
.addLink("health", "/health")
.addLink("info", "/info")
.addLink("plugin-version", "/plugin-version")
.addLink("projects", "/projects")
.addLink("instances", "/instances")
.addLink("programs", "/programs");
// Add links to program-dependent endpoints if a program is loaded
if (getCurrentProgram() != null) {
Project project = tool.getProject();
String projectName = (project != null) ? project.getName() : "unknown";
builder.addLink("program", "/program")
.addLink("project", "/projects/" + projectName)
.addLink("functions", "/functions")
.addLink("symbols", "/symbols")
.addLink("data", "/data")
.addLink("strings", "/strings")
.addLink("segments", "/segments")
.addLink("structs", "/structs")
.addLink("memory", "/memory")
.addLink("xrefs", "/xrefs")
.addLink("analysis", "/analysis")
.addLink("address", "/address")
.addLink("function", "/function");
}
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, port);
} catch (Exception e) {
Msg.error(this, "Error serving / endpoint", e);
try {
HttpUtil.sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR", port);
} catch (IOException ioEx) {
Msg.error(this, "Failed to send error for /", ioEx);
}
}
});
}
// ----------------------------------------------------------------------------------
// Core Plugin Methods
// ----------------------------------------------------------------------------------
/**
* Gets the current program from the Ghidra tool
* @return The current program or null if no program is loaded
*/
public Program getCurrentProgram() {
if (tool == null) {
Msg.debug(this, "Tool is null when trying to get current program");
return null;
}
ProgramManager pm = tool.getService(ProgramManager.class);
if (pm == null) {
Msg.debug(this, "ProgramManager service is not available");
return null;
}
return pm.getCurrentProgram();
}
/**
* Find an available port for the HTTP server
* @return An available port number
*/
private int findAvailablePort() {
int basePort = ApiConstants.DEFAULT_PORT;
int maxAttempts = ApiConstants.MAX_PORT_ATTEMPTS;
for (int attempt = 0; attempt < maxAttempts; attempt++) {
int candidate = basePort + attempt;
if (!activeInstances.containsKey(candidate)) {
try (ServerSocket s = new ServerSocket(candidate)) {
return candidate;
} catch (IOException e) {
Msg.debug(this, "Port " + candidate + " is not available, trying next.");
}
} else {
Msg.debug(this, "Port " + candidate + " already tracked as active instance.");
}
}
Msg.error(this, "Could not find an available port between " + basePort + " and " + (basePort + maxAttempts - 1));
throw new RuntimeException("Could not find available port after " + maxAttempts + " attempts");
}
/**
* Called when the plugin is disposed
*/
@Override
public void dispose() {
if (server != null) {
server.stop(0); // Stop immediately
Msg.info(this, "MCGhidra HTTP server stopped on port " + port);
System.out.println("[MCGhidra] HTTP server stopped on port " + port);
}
activeInstances.remove(port);
super.dispose();
}
/**
* Get the port this plugin instance is running on
* @return The HTTP server port
*/
public int getPort() {
return port;
}
/**
* Check if this is the base instance
* @return true if this is the base instance
*/
public boolean isBaseInstance() {
return isBaseInstance;
}
}
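
The /health handler above returns its data inside the standard response envelope (success, result, _links). A minimal client-side sketch of a readiness check follows, assuming the field names shown in the handler; check_health, fetch_json, and the injectable fetch parameter are hypothetical names introduced for illustration.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_health(base_url, fetch=fetch_json):
    """Return (is_up, program_name) from the /health envelope.

    Field names follow the handler above; `fetch` is injectable so the
    function can be exercised without a running Ghidra instance.
    """
    payload = fetch(base_url.rstrip("/") + "/health")
    result = payload.get("result") or {}
    is_up = bool(payload.get("success")) and result.get("status") == "up"
    # "program" is null in the response when no program is loaded.
    return is_up, result.get("program")
```

Because the handler does not depend on a loaded program, a check like this can be polled while Ghidra is still importing or analyzing a binary.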


@ -0,0 +1,8 @@
package eu.starsong.ghidra.api;
public class ApiConstants {
public static final String PLUGIN_VERSION = "v2.1.0";
public static final int API_VERSION = 2010;
public static final int DEFAULT_PORT = 8192;
public static final int MAX_PORT_ATTEMPTS = 10;
}
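
DEFAULT_PORT and MAX_PORT_ATTEMPTS bound the range that findAvailablePort scans: 8192 through 8201. The scan can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's code: the function name and the tracked parameter are hypothetical, and the sketch probes the loopback address, whereas the Java code binds a ServerSocket without specifying an address.

```python
import socket


def find_available_port(base_port=8192, max_attempts=10, tracked=frozenset()):
    """Return the first port in [base_port, base_port + max_attempts) that
    is not tracked and can be bound; raise RuntimeError if none is free."""
    for attempt in range(max_attempts):
        candidate = base_port + attempt
        if candidate in tracked:
            continue  # already claimed by another instance in this process
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind(("127.0.0.1", candidate))
            except OSError:
                continue  # in use by another process; try the next port
        # The probe socket is closed here, so the caller can bind the port.
        return candidate
    raise RuntimeError(f"no free port after {max_attempts} attempts")
```

As with any probe-then-bind approach, another process can take the port between the probe and the real bind, so the caller still needs to handle a bind failure.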


@ -0,0 +1,8 @@
package eu.starsong.ghidra.api;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
public interface GhidraJsonEndpoint extends HttpHandler {
void registerEndpoints(com.sun.net.httpserver.HttpServer server);
}


@ -0,0 +1,88 @@
package eu.starsong.ghidra.api;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import java.util.Map;
import java.util.UUID;
/**
* Builder for standardized API responses (following GHIDRA_HTTP_API.md v1).
* This should be used by endpoint handlers to construct responses.
*/
public class ResponseBuilder {
private final HttpExchange exchange;
private final int port; // Port of the current Ghidra instance handling the request
private JsonObject response;
private JsonObject links; // For HATEOAS links
private final Gson gson = new Gson(); // Gson instance for serialization
public ResponseBuilder(HttpExchange exchange, int port) {
this.exchange = exchange;
this.port = port;
this.response = new JsonObject();
this.links = new JsonObject();
// Add standard fields
String requestId = exchange.getRequestHeaders().getFirst("X-Request-ID");
response.addProperty("id", requestId != null ? requestId : UUID.randomUUID().toString());
response.addProperty("instance", "http://localhost:" + port); // URL of this instance
}
public ResponseBuilder success(boolean success) {
response.addProperty("success", success);
return this;
}
public ResponseBuilder result(Object data) {
response.add("result", gson.toJsonTree(data));
return this;
}
public ResponseBuilder error(String message, String code) {
JsonObject error = new JsonObject();
error.addProperty("message", message);
if (code != null) {
error.addProperty("code", code);
}
response.add("error", error);
return this;
}
/**
* Add metadata to the response (e.g., pagination info)
* @param metadata Map of metadata key-value pairs
* @return this builder
*/
public ResponseBuilder metadata(Map<String, Object> metadata) {
if (metadata != null) {
for (Map.Entry<String, Object> entry : metadata.entrySet()) {
response.add(entry.getKey(), gson.toJsonTree(entry.getValue()));
}
}
return this;
}
public ResponseBuilder addLink(String rel, String href) {
JsonObject link = new JsonObject();
link.addProperty("href", href);
links.add(rel, link);
return this;
}
// Overload to add link with method
public ResponseBuilder addLink(String rel, String href, String method) {
JsonObject link = new JsonObject();
link.addProperty("href", href);
link.addProperty("method", method);
links.add(rel, link);
return this;
}
public JsonObject build() {
if (links.size() > 0) {
response.add("_links", links);
}
return response;
}
}
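
The envelope that ResponseBuilder produces can be pictured as a plain dictionary. The sketch below mirrors the fields set in the class above (id, instance, success, result, _links); build_envelope and its parameters are hypothetical names used only to show the shape.

```python
import uuid


def build_envelope(port, result, links=None, request_id=None, success=True):
    """Assemble a response dict shaped like ResponseBuilder's output."""
    envelope = {
        # Echo the caller's X-Request-ID if given, otherwise generate one.
        "id": request_id if request_id is not None else str(uuid.uuid4()),
        "instance": f"http://localhost:{port}",
        "success": success,
        "result": result,
    }
    if links:
        # Each link is {"href": ...} with an optional "method".
        envelope["_links"] = {
            rel: ({"href": href} if method is None else {"href": href, "method": method})
            for rel, (href, method) in links.items()
        }
    return envelope
```

As in build(), the _links key is omitted entirely when no links were added.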


@ -0,0 +1,182 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import eu.starsong.ghidra.api.GhidraJsonEndpoint;
import eu.starsong.ghidra.api.ResponseBuilder; // Import ResponseBuilder
import eu.starsong.ghidra.util.GhidraUtil; // Import GhidraUtil
import eu.starsong.ghidra.util.HttpUtil; // Import HttpUtil
import ghidra.app.services.ProgramManager;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.listing.Program;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public abstract class AbstractEndpoint implements GhidraJsonEndpoint {
@Override
public void handle(HttpExchange exchange) throws IOException {
// Handle OPTIONS requests
if (HttpUtil.handleOptionsRequest(exchange)) {
return;
}
// This method is required by HttpHandler interface
// Each endpoint will register its own context handlers with specific paths
// so this default implementation should never be called
sendErrorResponse(exchange, 404, "Endpoint not found", "ENDPOINT_NOT_FOUND");
}
/**
* Helper method to handle pagination of collections and add pagination links to the response.
*
* @param <T> the type of items in the collection
* @param items the full collection to paginate
* @param offset the starting offset for pagination
* @param limit the maximum number of items per page
* @param builder the ResponseBuilder to add pagination links to
* @param basePath the base path for pagination links (without query parameters)
* @param additionalQueryParams additional query parameters to include in pagination links or null
* @return a list containing the paginated items
*/
protected <T> List<T> applyPagination(List<T> items, int offset, int limit,
eu.starsong.ghidra.api.ResponseBuilder builder, String basePath, String additionalQueryParams) {
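// Apply pagination, clamping the window so an out-of-range offset or
// negative limit yields an empty page instead of an exception from subList
int start = Math.min(Math.max(0, offset), items.size());
int end = Math.min(items.size(), start + Math.max(0, limit));
List<T> paginated = items.subList(start, end);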
// Add pagination metadata
Map<String, Object> metadata = new HashMap<>();
metadata.put("size", items.size());
metadata.put("offset", offset);
metadata.put("limit", limit);
builder.metadata(metadata);
// Format the query string
String queryParams = (additionalQueryParams != null && !additionalQueryParams.isEmpty())
? additionalQueryParams + "&"
: "";
// Add HATEOAS links
builder.addLink("self", basePath + "?" + queryParams + "offset=" + offset + "&limit=" + limit);
// Add next/prev links if applicable
if (end < items.size()) {
builder.addLink("next", basePath + "?" + queryParams + "offset=" + end + "&limit=" + limit);
}
if (offset > 0) {
int prevOffset = Math.max(0, offset - limit);
builder.addLink("prev", basePath + "?" + queryParams + "offset=" + prevOffset + "&limit=" + limit);
}
return paginated;
}
/**
* Overload of applyPagination without additional query parameters
*/
protected <T> List<T> applyPagination(List<T> items, int offset, int limit,
eu.starsong.ghidra.api.ResponseBuilder builder, String basePath) {
return applyPagination(items, offset, limit, builder, basePath, null);
}
protected final Gson gson = new Gson(); // Keep Gson if needed for specific object handling
protected Program currentProgram;
protected int port; // Add port field
// Constructor to receive Program and Port
public AbstractEndpoint(Program program, int port) {
this.currentProgram = program;
this.port = port;
}
// Get the current program - dynamically checks for program availability at runtime
protected Program getCurrentProgram() {
// ALWAYS try to get the current program from the tool first, regardless of the stored program
// This ensures we get the most up-to-date program state
try {
PluginTool tool = getTool();
if (tool != null) {
ProgramManager programManager = tool.getService(ProgramManager.class);
if (programManager != null) {
Program current = programManager.getCurrentProgram();
if (current != null) {
return current; // Return the current program from the tool
}
}
}
} catch (Exception e) {
Msg.error(this, "Error getting current program from tool", e);
}
// Only fall back to the stored program (which may itself be null) if dynamic lookup fails
return currentProgram;
}
// Can be overridden by subclasses that have a tool reference
protected PluginTool getTool() {
return null;
}
// --- Methods using HttpUtil ---
protected void sendJsonResponse(HttpExchange exchange, JsonObject data, int statusCode) throws IOException {
HttpUtil.sendJsonResponse(exchange, data, statusCode, this.port);
}
// Overload for sending success responses easily using ResponseBuilder
protected void sendSuccessResponse(HttpExchange exchange, Object resultData) throws IOException {
// No longer check if program is required here
// Each handler method should check for program availability at runtime if needed
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(resultData);
// Add common links if desired here
HttpUtil.sendJsonResponse(exchange, builder.build(), 200, this.port);
}
/**
* Override this method in endpoint implementations that require a program to function.
* @return true if this endpoint requires a program, false otherwise
*/
protected boolean requiresProgram() {
// Default implementation returns true for most endpoints
return true;
}
protected void sendErrorResponse(HttpExchange exchange, int code, String message, String errorCode) throws IOException {
HttpUtil.sendErrorResponse(exchange, code, message, errorCode, this.port);
}
// Overload without error code
protected void sendErrorResponse(HttpExchange exchange, int code, String message) throws IOException {
HttpUtil.sendErrorResponse(exchange, code, message, null, this.port);
}
protected Map<String, String> parseQueryParams(HttpExchange exchange) {
return HttpUtil.parseQueryParams(exchange);
}
protected Map<String, String> parseJsonPostParams(HttpExchange exchange) throws IOException {
return HttpUtil.parseJsonPostParams(exchange);
}
// --- Methods using GhidraUtil ---
protected int parseIntOrDefault(String val, int defaultValue) {
return GhidraUtil.parseIntOrDefault(val, defaultValue);
}
// Add other common Ghidra related utilities here or call GhidraUtil directly
}
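
The offset/limit windowing and the self/next/prev links built by applyPagination can be tried outside Ghidra. The following is a minimal sketch under stated assumptions: `PaginationSketch`, `window`, `page`, and `links` are hypothetical names that are not part of the plugin, and a plain map stands in for ResponseBuilder.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of the plugin.
class PaginationSketch {

    /** Returns the [start, end) window for a page, clamped to the collection bounds. */
    static int[] window(int size, int offset, int limit) {
        int start = Math.min(Math.max(0, offset), size);
        // long arithmetic so that a very large limit cannot overflow int
        int end = (int) Math.min((long) size, (long) start + Math.max(0, limit));
        return new int[] { start, end };
    }

    /** Returns one page of items; out-of-range offsets give an empty page. */
    static <T> List<T> page(List<T> items, int offset, int limit) {
        int[] w = window(items.size(), offset, limit);
        return new ArrayList<>(items.subList(w[0], w[1]));
    }

    /** Builds self/next/prev links in the same style as applyPagination. */
    static Map<String, String> links(String basePath, String extraQuery,
                                     int size, int offset, int limit) {
        int[] w = window(size, offset, limit);
        int safeLimit = Math.max(0, limit);
        String prefix = basePath + "?"
                + (extraQuery == null || extraQuery.isEmpty() ? "" : extraQuery + "&");
        Map<String, String> links = new LinkedHashMap<>();
        links.put("self", prefix + "offset=" + w[0] + "&limit=" + safeLimit);
        if (w[1] < size) {
            links.put("next", prefix + "offset=" + w[1] + "&limit=" + safeLimit);
        }
        if (w[0] > 0) {
            int prev = Math.max(0, w[0] - safeLimit);
            links.put("prev", prefix + "offset=" + prev + "&limit=" + safeLimit);
        }
        return links;
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "b", "c", "d", "e");
        System.out.println(page(names, 2, 2));
        System.out.println(page(names, 99, 2));
        System.out.println(links("/classes", null, names.size(), 2, 2));
    }
}
```

Clamping both ends of the window means a client that pages past the end receives an empty result with a prev link rather than an error response.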

@@ -0,0 +1,113 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import ghidra.program.model.listing.Program;
import ghidra.framework.plugintool.PluginTool;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class AnalysisEndpoints extends AbstractEndpoint {
private PluginTool tool;
public AnalysisEndpoints(Program program, int port) {
super(program, port);
}
public AnalysisEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/analysis", this::handleAnalysisRequest);
// NOTE: The callgraph endpoint is now registered in ProgramEndpoints
// This comment is to avoid confusion during future maintenance
}
private void handleAnalysisRequest(HttpExchange exchange) throws IOException {
try {
String method = exchange.getRequestMethod();
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
// Create ResponseBuilder for HATEOAS-compliant response
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.addLink("self", "/analysis");
// Add common links
builder.addLink("program", "/program");
// Get analysis status
Map<String, Object> status = new HashMap<>();
// Add program information
status.put("processor", program.getLanguage().getProcessor().toString());
status.put("addressSize", program.getAddressFactory().getDefaultAddressSpace().getSize());
status.put("programName", program.getName());
status.put("programLanguage", program.getLanguage().toString());
// Analyzer status is a hardcoded placeholder: we don't have access to the Analysis API
// directly, and listing all analyzers would require access to internal Ghidra APIs
Map<String, Boolean> analyzerStatus = new HashMap<>();
analyzerStatus.put("basicAnalysis", true);
analyzerStatus.put("advancedAnalysis", false);
int totalAnalyzers = 2;
int enabledAnalyzers = 1;
// Add counts to status report
status.put("totalAnalyzers", totalAnalyzers);
status.put("enabledAnalyzers", enabledAnalyzers);
status.put("analyzerStatus", analyzerStatus);
// Handle different request types
if ("GET".equals(method)) {
builder.result(status);
sendJsonResponse(exchange, builder.build(), 200);
} else if ("POST".equals(method)) {
// We can't directly start/stop analysis without direct AutoAnalysisManager access,
// so return a placeholder response
Map<String, String> params = parseJsonPostParams(exchange);
String action = params.get("action");
Map<String, Object> result = new HashMap<>();
result.put("success", true);
result.put("message", "Analysis action '" + action + "' requested, but not fully implemented yet.");
result.put("status", status);
builder.result(result);
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /analysis endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
}

@@ -0,0 +1,144 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.listing.Program;
import ghidra.program.model.symbol.Namespace;
import ghidra.program.model.symbol.Symbol;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class ClassEndpoints extends AbstractEndpoint {
private PluginTool tool;
// Updated constructor to accept port
public ClassEndpoints(Program program, int port) {
super(program, port); // Call super constructor
}
public ClassEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/classes", this::handleClasses);
}
private void handleClasses(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
// Always get the most current program from the tool
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
// Get all class names
Set<String> classNames = new HashSet<>();
for (Symbol symbol : program.getSymbolTable().getAllSymbols(true)) {
Namespace ns = symbol.getParentNamespace();
// Check that the namespace is non-null, non-global, and backed by a namespace-type symbol
if (ns != null && !ns.isGlobal() && ns.getSymbol().getSymbolType().isNamespace()) {
classNames.add(ns.getName(true)); // Get fully qualified name
}
}
// Sort and paginate
List<String> sorted = new ArrayList<>(classNames);
Collections.sort(sorted);
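// Clamp the window so out-of-range offsets and negative limits produce an empty page
int start = Math.min(Math.max(0, offset), sorted.size());
int end = (int) Math.min((long) sorted.size(), (long) start + Math.max(0, limit));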
List<Map<String, Object>> paginatedClasses = new ArrayList<>();
// Create full class objects with namespace info
for (int i = start; i < end; i++) {
String className = sorted.get(i);
Map<String, Object> classInfo = new HashMap<>();
classInfo.put("name", className);
// Add namespace info if it contains a dot
if (className.contains(".")) {
String namespace = className.substring(0, className.lastIndexOf('.'));
classInfo.put("namespace", namespace);
classInfo.put("simpleName", className.substring(className.lastIndexOf('.') + 1));
} else {
classInfo.put("namespace", "default");
classInfo.put("simpleName", className);
}
// Add HATEOAS links for each class
Map<String, Object> links = new HashMap<>();
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/classes/" + className);
links.put("self", selfLink);
// Add link to program if relevant
Map<String, String> programLink = new HashMap<>();
programLink.put("href", "/program");
links.put("program", programLink);
classInfo.put("_links", links);
paginatedClasses.add(classInfo);
}
// We need to separately create the full class objects with details
// so we can't apply pagination directly to sorted list
// Build response with HATEOAS links
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(paginatedClasses);
// Add pagination metadata
Map<String, Object> metadata = new HashMap<>();
metadata.put("size", sorted.size());
metadata.put("offset", offset);
metadata.put("limit", limit);
builder.metadata(metadata);
// Add HATEOAS links
builder.addLink("self", "/classes?offset=" + offset + "&limit=" + limit);
builder.addLink("program", "/program");
// Add next/prev links if applicable
if (end < sorted.size()) {
builder.addLink("next", "/classes?offset=" + end + "&limit=" + limit);
}
if (offset > 0) {
int prevOffset = Math.max(0, offset - limit);
builder.addLink("prev", "/classes?offset=" + prevOffset + "&limit=" + limit);
}
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /classes endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
// parseIntOrDefault is inherited from AbstractEndpoint
}
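
The namespace/simpleName split in handleClasses looks for a dot, while Ghidra qualifies namespace paths with `::` (the value of `Namespace.DELIMITER`). The sketch below shows one way to make the delimiter a parameter. `QualifiedNameSketch` and `split` are hypothetical names introduced here for illustration, and the "default" placeholder mirrors the value the endpoint already uses.

```java
// Hypothetical stand-alone sketch; not part of the plugin.
class QualifiedNameSketch {

    /**
     * Splits a fully qualified name at the last occurrence of the delimiter.
     * Returns { namespace, simpleName }; namespace is "default" when the
     * name has no delimiter.
     */
    static String[] split(String qualified, String delimiter) {
        int idx = qualified.lastIndexOf(delimiter);
        if (idx < 0) {
            return new String[] { "default", qualified };
        }
        return new String[] {
            qualified.substring(0, idx),
            qualified.substring(idx + delimiter.length())
        };
    }

    public static void main(String[] args) {
        String[] parts = split("std::vector::iterator", "::");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```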

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -0,0 +1,151 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import eu.starsong.ghidra.MCGhidraPlugin; // Need access to activeInstances
import ghidra.program.model.listing.Program;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class InstanceEndpoints extends AbstractEndpoint {
// The plugin's map of active instances, keyed by port, is passed in through the constructor.
// This is a bit awkward and suggests the instance management might need
// a different design, perhaps a dedicated manager class.
private final Map<Integer, MCGhidraPlugin> activeInstances;
// Note: program may be null here if no program is open;
// the AbstractEndpoint constructor accepts a null program.
public InstanceEndpoints(Program program, int port, Map<Integer, MCGhidraPlugin> instances) {
super(program, port); // Call super constructor
this.activeInstances = instances;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/instances", this::handleInstances);
server.createContext("/registerInstance", this::handleRegisterInstance);
server.createContext("/unregisterInstance", this::handleUnregisterInstance);
}
@Override
protected boolean requiresProgram() {
// This endpoint doesn't require a program to function
return false;
}
private void handleInstances(HttpExchange exchange) throws IOException {
try {
List<Map<String, Object>> instanceData = new ArrayList<>();
// Accessing the static map directly - requires it to be accessible
// or passed in constructor.
for (Map.Entry<Integer, MCGhidraPlugin> entry : activeInstances.entrySet()) {
Map<String, Object> instance = new HashMap<>();
int instancePort = entry.getKey();
instance.put("port", instancePort);
instance.put("url", "http://localhost:" + instancePort);
instance.put("type", entry.getValue().isBaseInstance() ? "base" : "standard");
// Get program info if available
Program program = entry.getValue().getCurrentProgram();
if (program != null) {
instance.put("project", program.getDomainFile().getParent().getName());
instance.put("file", program.getName());
} else {
instance.put("project", "");
instance.put("file", "");
}
// Add HATEOAS links for each instance
Map<String, Object> links = new HashMap<>();
// Self link for this instance
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/instances/" + instancePort);
links.put("self", selfLink);
// Info link for this instance
Map<String, String> infoLink = new HashMap<>();
infoLink.put("href", "http://localhost:" + instancePort + "/info");
links.put("info", infoLink);
// Connect link
Map<String, String> connectLink = new HashMap<>();
connectLink.put("href", "http://localhost:" + instancePort);
links.put("connect", connectLink);
// Add links to object
instance.put("_links", links);
instanceData.add(instance);
}
// Build response with HATEOAS links
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(instanceData);
// Add HATEOAS links
builder.addLink("self", "/instances");
builder.addLink("register", "/registerInstance", "POST");
builder.addLink("unregister", "/unregisterInstance", "POST");
builder.addLink("programs", "/programs");
sendJsonResponse(exchange, builder.build(), 200);
} catch (Exception e) {
Msg.error(this, "Error in /instances endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
private void handleRegisterInstance(HttpExchange exchange) throws IOException {
try {
Map<String, String> params = parseJsonPostParams(exchange);
int regPort = parseIntOrDefault(params.get("port"), 0);
if (regPort > 0) {
// Logic to actually register/track the instance should happen elsewhere (e.g., main plugin or dedicated manager)
sendSuccessResponse(exchange, Map.of("message", "Instance registration request received for port " + regPort)); // Use helper
} else {
sendErrorResponse(exchange, 400, "Invalid or missing port number", "INVALID_PARAMETER");
}
} catch (IOException e) {
Msg.error(this, "Error parsing POST params for registerInstance", e);
sendErrorResponse(exchange, 400, "Invalid request body: " + e.getMessage(), "INVALID_REQUEST"); // Use helper
} catch (Exception e) {
Msg.error(this, "Error in /registerInstance", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR"); // Use helper
}
}
private void handleUnregisterInstance(HttpExchange exchange) throws IOException {
try {
Map<String, String> params = parseJsonPostParams(exchange);
int unregPort = parseIntOrDefault(params.get("port"), 0);
if (unregPort > 0 && activeInstances.containsKey(unregPort)) {
// Actual removal should likely happen in the main plugin's map or dedicated manager
activeInstances.remove(unregPort); // Potential ConcurrentModificationException if map is iterated elsewhere
sendSuccessResponse(exchange, Map.of("message", "Instance unregistered for port " + unregPort)); // Use helper
} else {
sendErrorResponse(exchange, 404, "No instance found on port " + unregPort, "RESOURCE_NOT_FOUND"); // Use helper
}
} catch (IOException e) {
Msg.error(this, "Error parsing POST params for unregisterInstance", e);
sendErrorResponse(exchange, 400, "Invalid request body: " + e.getMessage(), "INVALID_REQUEST"); // Use helper
} catch (Exception e) {
Msg.error(this, "Error in /unregisterInstance", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR"); // Use helper
}
}
// --- Helper Methods Removed (Inherited or internal logic adjusted) ---
// parseIntOrDefault is inherited from AbstractEndpoint
}
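
The comments in InstanceEndpoints suggest a dedicated manager class and warn about a possible ConcurrentModificationException when the map is modified during iteration. One possible shape for such a manager is sketched below, assuming a `ConcurrentHashMap` is acceptable for the registry. `InstanceRegistrySketch` and its methods are hypothetical, and a String label stands in for the MCGhidraPlugin value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntPredicate;

// Hypothetical stand-alone sketch; not part of the plugin.
class InstanceRegistrySketch {
    // ConcurrentHashMap iterators are weakly consistent, so entries can be
    // removed while another thread (or this loop) iterates.
    private final Map<Integer, String> instances = new ConcurrentHashMap<>();

    /** Registers a port; returns false for invalid or already registered ports. */
    boolean register(int port, String label) {
        if (port <= 0 || port > 65535) {
            return false;
        }
        return instances.putIfAbsent(port, label) == null;
    }

    /** Removes a port in a single atomic call; returns false if it was not registered. */
    boolean unregister(int port) {
        return instances.remove(port) != null;
    }

    /** Removes every port matching the predicate while iterating the key set. */
    int unregisterIf(IntPredicate matches) {
        int removed = 0;
        for (Integer port : instances.keySet()) {
            if (matches.test(port) && instances.remove(port) != null) {
                removed++;
            }
        }
        return removed;
    }

    /** Returns a sorted snapshot of the registered ports. */
    List<Integer> ports() {
        List<Integer> snapshot = new ArrayList<>(instances.keySet());
        Collections.sort(snapshot);
        return snapshot;
    }

    public static void main(String[] args) {
        InstanceRegistrySketch registry = new InstanceRegistrySketch();
        registry.register(8192, "base");
        registry.register(8193, "standard");
        registry.unregisterIf(port -> port % 2 == 1);
        System.out.println(registry.ports());
    }
}
```

Folding the existence check and the removal into one `remove` call also avoids the check-then-act gap between `containsKey` and `remove`.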

@@ -0,0 +1,442 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import eu.starsong.ghidra.util.TransactionHelper;
import ghidra.program.model.address.Address;
import ghidra.program.model.address.AddressFactory;
import ghidra.program.model.mem.Memory;
import ghidra.program.model.mem.MemoryAccessException;
import ghidra.program.model.mem.MemoryBlock;
import ghidra.program.model.listing.CodeUnit;
import ghidra.program.model.listing.Program;
import ghidra.framework.plugintool.PluginTool;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class MemoryEndpoints extends AbstractEndpoint {
private static final int DEFAULT_MEMORY_LENGTH = 16;
private static final int MAX_MEMORY_LENGTH = 4096;
private PluginTool tool;
public MemoryEndpoints(Program program, int port) {
super(program, port);
}
public MemoryEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
// Per HttpServer docs: a request is dispatched to the context whose path is the
// longest matching prefix, independent of registration order.
// "/memory/" receives every sub-path, for example /memory/blocks or
// /memory/0x1000/comments/plate, and routes it to the matching handler.
server.createContext("/memory/", exchange -> {
String path = exchange.getRequestURI().getPath();
if (path.equals("/memory/blocks")) {
handleMemoryBlocksRequest(exchange);
} else {
// Address requests, including child resources such as comments
handleMemoryAddressRequest(exchange);
}
});
// "/memory" handles the bare path with query parameters
server.createContext("/memory", this::handleMemoryRequest);
}
private void handleMemoryRequest(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
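String addressStr = qparams.get("address");
// Requests delegated from /memory/{address} carry the address in the path rather than
// in the query string, so fall back to the path segment when the parameter is absent
if (addressStr == null || addressStr.isEmpty()) {
String requestPath = exchange.getRequestURI().getPath();
if (requestPath.startsWith("/memory/") && requestPath.length() > "/memory/".length()) {
addressStr = requestPath.substring("/memory/".length());
}
}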
String lengthStr = qparams.get("length");
// Create ResponseBuilder for HATEOAS-compliant response
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.addLink("self", "/memory" + (exchange.getRequestURI().getRawQuery() != null ?
"?" + exchange.getRequestURI().getRawQuery() : ""));
// Add common links
builder.addLink("program", "/program");
builder.addLink("blocks", "/memory/blocks");
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
if (addressStr == null || addressStr.isEmpty()) {
sendErrorResponse(exchange, 400, "Address parameter is required", "MISSING_PARAMETER");
return;
}
// Parse length parameter
int length = DEFAULT_MEMORY_LENGTH;
if (lengthStr != null && !lengthStr.isEmpty()) {
try {
length = Integer.parseInt(lengthStr);
if (length <= 0) {
sendErrorResponse(exchange, 400, "Length must be positive", "INVALID_PARAMETER");
return;
}
if (length > MAX_MEMORY_LENGTH) {
length = MAX_MEMORY_LENGTH;
}
} catch (NumberFormatException e) {
sendErrorResponse(exchange, 400, "Invalid length parameter", "INVALID_PARAMETER");
return;
}
}
// Parse address with safety fallbacks
AddressFactory addressFactory = program.getAddressFactory();
Address address;
try {
// Try to use provided address
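address = addressFactory.getAddress(addressStr);
if (address == null) {
// getAddress returns null for unparseable input; route it through the fallback below
throw new IllegalArgumentException("Unparseable address: " + addressStr);
}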
} catch (Exception e) {
try {
// If there's an exception, try to get the image base address instead
address = program.getImageBase();
Msg.warn(this, "Invalid address format. Using image base address: " + address);
} catch (Exception e2) {
// If image base fails, use min address from default space
address = addressFactory.getDefaultAddressSpace().getMinAddress();
Msg.warn(this, "Could not get image base. Using default address: " + address);
}
}
// Read memory
Memory memory = program.getMemory();
if (!memory.contains(address)) {
// Try to find a valid memory block
MemoryBlock[] blocks = memory.getBlocks();
if (blocks.length > 0) {
// Use the first memory block
address = blocks[0].getStart();
Msg.info(this, "Using first memory block address: " + address);
} else {
sendErrorResponse(exchange, 404, "No valid memory blocks found", "NO_MEMORY_BLOCKS");
return;
}
}
try {
// Read bytes
byte[] bytes = new byte[length];
int bytesRead = memory.getBytes(address, bytes, 0, length);
// Format as hex string
StringBuilder hexString = new StringBuilder();
for (int i = 0; i < bytesRead; i++) {
String hex = Integer.toHexString(bytes[i] & 0xFF).toUpperCase();
if (hex.length() == 1) {
hexString.append('0');
}
hexString.append(hex);
if (i < bytesRead - 1) {
hexString.append(' ');
}
}
// Build result object
Map<String, Object> result = new HashMap<>();
result.put("address", address.toString());
result.put("bytesRead", bytesRead);
result.put("hexBytes", hexString.toString());
// Encode only the bytes actually read, not the whole buffer
result.put("rawBytes", Base64.getEncoder().encodeToString(Arrays.copyOf(bytes, bytesRead)));
// Add next/prev links
builder.addLink("next", "/memory?address=" + address.add(length) + "&length=" + length);
if (address.getOffset() >= length) {
builder.addLink("prev", "/memory?address=" + address.subtract(length) + "&length=" + length);
}
// Add result and send response
builder.result(result);
sendJsonResponse(exchange, builder.build(), 200);
} catch (MemoryAccessException e) {
sendErrorResponse(exchange, 404, "Cannot read memory at address: " + e.getMessage(), "MEMORY_ACCESS_ERROR");
}
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /memory endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Handle requests to /memory/{address} including child resources like comments
*/
private void handleMemoryAddressRequest(HttpExchange exchange) throws IOException {
try {
// Extract address from path: /memory/{address}/...
String path = exchange.getRequestURI().getPath();
if (path.equals("/memory/") || path.equals("/memory")) {
handleMemoryRequest(exchange);
return;
}
// Parse address from path
String remainingPath = path.substring("/memory/".length());
// Check if this is a request for a specific address's comments
if (remainingPath.contains("/comments/")) {
// Format: /memory/{address}/comments/{comment_type}
String[] parts = remainingPath.split("/comments/", 2);
String addressStr = parts[0];
String commentType = parts.length > 1 ? parts[1] : "plate"; // Default to plate comments
handleMemoryComments(exchange, addressStr, commentType);
return;
}
// Otherwise, treat as a direct memory request with address in the path
String addressStr = remainingPath;
Map<String, String> params = parseQueryParams(exchange);
// Handle same as the query parameter version
params.put("address", addressStr);
exchange.setAttribute("address", addressStr);
// Delegate to the main memory handler
handleMemoryRequest(exchange);
} catch (Exception e) {
Msg.error(this, "Error handling memory address endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Handle requests to set or get comments at a specific memory address
*/
private void handleMemoryComments(HttpExchange exchange, String addressStr, String commentType) throws IOException {
try {
String method = exchange.getRequestMethod();
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
// Parse address
AddressFactory addressFactory = program.getAddressFactory();
Address address;
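try {
address = addressFactory.getAddress(addressStr);
} catch (Exception e) {
address = null;
}
// getAddress signals an unparseable string by returning null, so check for that as well
if (address == null) {
sendErrorResponse(exchange, 400, "Invalid address format: " + addressStr, "INVALID_ADDRESS");
return;
}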
// Validate comment type
if (!isValidCommentType(commentType)) {
sendErrorResponse(exchange, 400, "Invalid comment type: " + commentType, "INVALID_COMMENT_TYPE");
return;
}
if ("GET".equals(method)) {
// Get existing comment
String comment = getCommentByType(program, address, commentType);
Map<String, Object> result = new HashMap<>();
result.put("address", addressStr);
result.put("comment_type", commentType);
result.put("comment", comment != null ? comment : "");
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(result)
.addLink("self", "/memory/" + addressStr + "/comments/" + commentType);
sendJsonResponse(exchange, builder.build(), 200);
} else if ("POST".equals(method)) {
// Set comment
Map<String, String> params = parseJsonPostParams(exchange);
String comment = params.get("comment");
if (comment == null) {
sendErrorResponse(exchange, 400, "Comment parameter is required", "MISSING_PARAMETER");
return;
}
boolean success = setCommentByType(program, address, commentType, comment);
if (success) {
Map<String, Object> result = new HashMap<>();
result.put("address", addressStr);
result.put("comment_type", commentType);
result.put("comment", comment);
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(result)
.addLink("self", "/memory/" + addressStr + "/comments/" + commentType);
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 500, "Failed to set comment", "COMMENT_SET_FAILED");
}
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error handling memory comments", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Check if the comment type is valid
*/
private boolean isValidCommentType(String commentType) {
return commentType.equals("plate") ||
commentType.equals("pre") ||
commentType.equals("post") ||
commentType.equals("eol") ||
commentType.equals("repeatable");
}
/**
* Get a comment by type at the specified address
*/
private String getCommentByType(Program program, Address address, String commentType) {
if (program == null) return null;
int type = getCommentTypeInt(commentType);
return program.getListing().getComment(type, address);
}
/**
* Set a comment by type at the specified address
*/
private boolean setCommentByType(Program program, Address address, String commentType, String comment) {
if (program == null) return false;
int type = getCommentTypeInt(commentType);
try {
return TransactionHelper.executeInTransaction(program, "Set Comment", () -> {
program.getListing().setComment(address, type, comment);
return true;
});
} catch (Exception e) {
Msg.error(this, "Error setting comment", e);
return false;
}
}
/**
* Convert comment type string to Ghidra's internal comment type constants
*/
private int getCommentTypeInt(String commentType) {
switch (commentType.toLowerCase()) {
case "plate":
return CodeUnit.PLATE_COMMENT;
case "pre":
return CodeUnit.PRE_COMMENT;
case "post":
return CodeUnit.POST_COMMENT;
case "eol":
return CodeUnit.EOL_COMMENT;
case "repeatable":
return CodeUnit.REPEATABLE_COMMENT;
default:
return CodeUnit.PLATE_COMMENT;
}
}
private void handleMemoryBlocksRequest(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
// Create ResponseBuilder for HATEOAS-compliant response
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.addLink("self", "/memory/blocks" + (exchange.getRequestURI().getRawQuery() != null ?
"?" + exchange.getRequestURI().getRawQuery() : ""));
// Add common links
builder.addLink("program", "/program");
builder.addLink("memory", "/memory");
// Get memory blocks
Memory memory = program.getMemory();
List<Map<String, Object>> blocks = new ArrayList<>();
for (MemoryBlock block : memory.getBlocks()) {
Map<String, Object> blockInfo = new HashMap<>();
blockInfo.put("name", block.getName());
blockInfo.put("start", block.getStart().toString());
blockInfo.put("end", block.getEnd().toString());
blockInfo.put("size", block.getSize());
blockInfo.put("permissions", getPermissionString(block));
blockInfo.put("isInitialized", block.isInitialized());
blockInfo.put("isLoaded", block.isLoaded());
blockInfo.put("isMapped", block.isMapped());
blocks.add(blockInfo);
}
// Apply pagination and add it to result
List<Map<String, Object>> paginatedBlocks =
applyPagination(blocks, offset, limit, builder, "/memory/blocks");
// Add the result to the builder
builder.result(paginatedBlocks);
// Send the HATEOAS-compliant response
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /memory/blocks endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
private String getPermissionString(MemoryBlock block) {
StringBuilder perms = new StringBuilder();
perms.append(block.isRead() ? "r" : "-");
perms.append(block.isWrite() ? "w" : "-");
perms.append(block.isExecute() ? "x" : "-");
perms.append(block.isVolatile() ? "v" : "-");
return perms.toString();
}
}
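
The hexBytes and rawBytes formatting in handleMemoryRequest can be exercised without a loaded program. The sketch below assumes a hypothetical `HexDumpSketch` helper (not part of the plugin) and formats only the bytes that were actually read, so a short read does not leak trailing zero padding into the output.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical stand-alone sketch; not part of the plugin.
class HexDumpSketch {

    /** Formats the first count bytes as space-separated, upper-case hex pairs. */
    static String toHex(byte[] bytes, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes do not sign-extend
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Base64-encodes only the first count bytes of the buffer. */
    static String toBase64(byte[] bytes, int count) {
        return Base64.getEncoder().encodeToString(Arrays.copyOf(bytes, count));
    }

    public static void main(String[] args) {
        // A 5-byte buffer of which only 3 bytes were filled by the read
        byte[] buffer = { 0x00, 0x0A, (byte) 0xFF, 0, 0 };
        int bytesRead = 3;
        System.out.println(toHex(buffer, bytesRead));
        System.out.println(toBase64(buffer, bytesRead));
    }
}
```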

@@ -0,0 +1,87 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.address.GlobalNamespace;
import ghidra.program.model.listing.Program;
import ghidra.program.model.symbol.Namespace;
import ghidra.program.model.symbol.Symbol;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class NamespaceEndpoints extends AbstractEndpoint {
private PluginTool tool;
public NamespaceEndpoints(Program program, int port) {
super(program, port);
}
public NamespaceEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/namespaces", this::handleNamespaces);
}
public void handleNamespaces(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
Set<String> namespaces = new HashSet<>();
for (Symbol symbol : program.getSymbolTable().getAllSymbols(true)) {
Namespace ns = symbol.getParentNamespace();
if (ns != null && !(ns instanceof GlobalNamespace)) {
namespaces.add(ns.getName(true)); // Get fully qualified name
}
}
List<String> sorted = new ArrayList<>(namespaces);
Collections.sort(sorted);
// Build response with HATEOAS links
eu.starsong.ghidra.api.ResponseBuilder builder = new eu.starsong.ghidra.api.ResponseBuilder(exchange, port)
.success(true);
// Apply pagination and get paginated items
List<String> paginated = applyPagination(sorted, offset, limit, builder, "/namespaces");
// Set the paginated result
builder.result(paginated);
// Add program link
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /namespaces endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
// parseIntOrDefault is inherited from AbstractEndpoint
}
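
The handler above collects, sorts, and then slices the namespace list by `offset` and `limit`. The body of `applyPagination` is not shown in this diff, so the following is only a sketch of the slicing step under the assumption that out-of-range values are clamped rather than rejected; `PaginationSketch` and `paginate` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: offset/limit slicing with clamping, independent of the plugin. */
final class PaginationSketch {
    /** Returns the items in [offset, offset + limit), clamped to the list bounds. */
    static <T> List<T> paginate(List<T> items, int offset, int limit) {
        int start = Math.max(0, Math.min(offset, items.size()));
        // Use long arithmetic so a very large limit cannot overflow int.
        long rawEnd = (long) start + Math.max(0, limit);
        int end = (int) Math.min((long) items.size(), rawEnd);
        // Copy the view so the caller cannot mutate the source list through it.
        return new ArrayList<>(items.subList(start, end));
    }

    public static void main(String[] args) {
        List<String> sorted = List.of("a", "b", "c", "d", "e");
        System.out.println(paginate(sorted, 1, 2)); // prints "[b, c]"
    }
}
```

Sorting before slicing, as the handler does, is what makes successive pages stable across requests.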

File diff suppressed because it is too large


@ -0,0 +1,115 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.listing.Program;
import ghidra.program.model.mem.MemoryBlock;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class SegmentEndpoints extends AbstractEndpoint {
private PluginTool tool;
// Updated constructor to accept port
public SegmentEndpoints(Program program, int port) {
super(program, port); // Call super constructor
}
public SegmentEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/segments", this::handleSegments);
}
public void handleSegments(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
String nameFilter = qparams.get("name");
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
List<Map<String, Object>> segments = new ArrayList<>();
for (MemoryBlock block : program.getMemory().getBlocks()) {
// Apply name filter if present
if (nameFilter != null && !block.getName().contains(nameFilter)) {
continue;
}
Map<String, Object> segment = new HashMap<>();
segment.put("name", block.getName());
segment.put("start", block.getStart().toString());
segment.put("end", block.getEnd().toString());
segment.put("size", block.getSize());
// Add permissions
segment.put("readable", block.isRead());
segment.put("writable", block.isWrite());
segment.put("executable", block.isExecute());
segment.put("initialized", block.isInitialized());
// Add HATEOAS links for this segment
Map<String, Object> links = new HashMap<>();
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/segments/" + block.getName());
links.put("self", selfLink);
Map<String, String> memoryLink = new HashMap<>();
memoryLink.put("href", "/memory/" + block.getStart());
links.put("memory", memoryLink);
segment.put("_links", links);
segments.add(segment);
}
// Build response with HATEOAS links
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true);
// Handle optional name filter
String queryParams = nameFilter != null ? "name=" + nameFilter : null;
// Apply pagination and get paginated items
List<Map<String, Object>> paginatedSegments = applyPagination(
segments, offset, limit, builder, "/segments", queryParams);
// Set the paginated result
builder.result(paginatedSegments);
// Add program link
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /segments endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
// parseIntOrDefault is inherited from AbstractEndpoint
}
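
The segments handler forwards the `name` filter into the pagination links as a raw `name=<value>` string. If a block name ever contains characters such as `&`, `=`, or spaces, the generated links could be ambiguous. A possible hardening, sketched here with a hypothetical `QueryParamSketch.nameQuery` helper (whether `applyPagination` already encodes its input is not visible in this diff), is to percent-encode the value first:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch only: builds an encoded "name=..." query fragment for pagination links. */
final class QueryParamSketch {
    /** Returns null when no filter is set, matching the handler's convention. */
    static String nameQuery(String nameFilter) {
        if (nameFilter == null) {
            return null;
        }
        // URLEncoder applies form encoding: reserved characters become %XX, spaces become '+'.
        return "name=" + URLEncoder.encode(nameFilter, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(nameQuery(".text")); // prints "name=.text"
        System.out.println(nameQuery("a&b=c")); // prints "name=a%26b%3Dc"
    }
}
```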


@ -0,0 +1,776 @@
package eu.starsong.ghidra.endpoints;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import eu.starsong.ghidra.util.TransactionHelper;
import eu.starsong.ghidra.util.TransactionHelper.TransactionException;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.data.*;
import ghidra.program.model.listing.Program;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
/**
* Endpoints for managing struct (composite) data types in Ghidra.
* Provides REST API for creating, listing, modifying, and deleting structs.
*/
public class StructEndpoints extends AbstractEndpoint {
private PluginTool tool;
public StructEndpoints(Program program, int port) {
super(program, port);
}
public StructEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/structs", this::handleStructs);
server.createContext("/structs/create", exchange -> {
try {
if ("POST".equals(exchange.getRequestMethod())) {
Map<String, String> params = parseJsonPostParams(exchange);
handleCreateStruct(exchange, params);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed");
}
} catch (Exception e) {
Msg.error(this, "Error in /structs/create endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage());
}
});
server.createContext("/structs/delete", exchange -> {
try {
if ("POST".equals(exchange.getRequestMethod())) {
Map<String, String> params = parseJsonPostParams(exchange);
handleDeleteStruct(exchange, params);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed");
}
} catch (Exception e) {
Msg.error(this, "Error in /structs/delete endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage());
}
});
server.createContext("/structs/addfield", exchange -> {
try {
if ("POST".equals(exchange.getRequestMethod())) {
Map<String, String> params = parseJsonPostParams(exchange);
handleAddField(exchange, params);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed");
}
} catch (Exception e) {
Msg.error(this, "Error in /structs/addfield endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage());
}
});
server.createContext("/structs/updatefield", exchange -> {
try {
if ("POST".equals(exchange.getRequestMethod()) || "PATCH".equals(exchange.getRequestMethod())) {
Map<String, String> params = parseJsonPostParams(exchange);
handleUpdateField(exchange, params);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed");
}
} catch (Exception e) {
Msg.error(this, "Error in /structs/updatefield endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage());
}
});
}
/**
* Handle GET /structs - list all structs, or GET /structs?name=X - get specific struct details
*/
private void handleStructs(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
String structName = qparams.get("name");
if (structName != null && !structName.isEmpty()) {
handleGetStruct(exchange, structName);
} else {
handleListStructs(exchange);
}
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed");
}
} catch (Exception e) {
Msg.error(this, "Error in /structs endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage());
}
}
/**
* List all struct data types in the program
*/
private void handleListStructs(HttpExchange exchange) throws IOException {
try {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
String categoryFilter = qparams.get("category");
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
DataTypeManager dtm = program.getDataTypeManager();
List<Map<String, Object>> structList = new ArrayList<>();
// Iterate through all data types and filter for structures
dtm.getAllDataTypes().forEachRemaining(dataType -> {
if (dataType instanceof Structure) {
Structure struct = (Structure) dataType;
// Apply category filter if specified
if (categoryFilter != null && !categoryFilter.isEmpty()) {
CategoryPath catPath = struct.getCategoryPath();
if (!catPath.getPath().contains(categoryFilter)) {
return;
}
}
Map<String, Object> structInfo = new HashMap<>();
structInfo.put("name", struct.getName());
structInfo.put("path", struct.getPathName());
structInfo.put("size", struct.getLength());
structInfo.put("numFields", struct.getNumComponents());
structInfo.put("category", struct.getCategoryPath().getPath());
structInfo.put("description", struct.getDescription() != null ? struct.getDescription() : "");
// Add HATEOAS links
Map<String, Object> links = new HashMap<>();
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/structs?name=" + struct.getName());
links.put("self", selfLink);
structInfo.put("_links", links);
structList.add(structInfo);
}
});
// Sort by name for consistency
structList.sort(Comparator.comparing(s -> (String) s.get("name")));
// Build response with pagination
ResponseBuilder builder = new ResponseBuilder(exchange, port).success(true);
List<Map<String, Object>> paginated = applyPagination(structList, offset, limit, builder, "/structs");
builder.result(paginated);
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} catch (Exception e) {
Msg.error(this, "Error listing structs", e);
sendErrorResponse(exchange, 500, "Error listing structs: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Get details of a specific struct including all fields
*/
private void handleGetStruct(HttpExchange exchange, String structName) throws IOException {
try {
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
DataTypeManager dtm = program.getDataTypeManager();
// Try to find the struct - support both full paths and simple names
DataType dataType = null;
// If it looks like a full path (starts with /), try direct lookup
if (structName.startsWith("/")) {
dataType = dtm.getDataType(structName);
if (dataType == null) {
dataType = dtm.findDataType(structName);
}
} else {
// Search by simple name using the helper method
dataType = findStructByName(dtm, structName);
}
if (dataType == null || !(dataType instanceof Structure)) {
sendErrorResponse(exchange, 404, "Struct not found: " + structName, "STRUCT_NOT_FOUND");
return;
}
Structure struct = (Structure) dataType;
Map<String, Object> structInfo = buildStructInfo(struct);
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(structInfo);
builder.addLink("self", "/structs?name=" + struct.getName());
builder.addLink("structs", "/structs");
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} catch (Exception e) {
Msg.error(this, "Error getting struct details", e);
sendErrorResponse(exchange, 500, "Error getting struct: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Create a new struct data type
* POST /structs/create
* Required params: name
* Optional params: category, size, description
*/
private void handleCreateStruct(HttpExchange exchange, Map<String, String> params) throws IOException {
try {
String structName = params.get("name");
String category = params.get("category");
String sizeStr = params.get("size");
String description = params.get("description");
if (structName == null || structName.isEmpty()) {
sendErrorResponse(exchange, 400, "Missing required parameter: name", "MISSING_PARAMETERS");
return;
}
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
Map<String, Object> resultMap = new HashMap<>();
resultMap.put("name", structName);
try {
TransactionHelper.executeInTransaction(program, "Create Struct", () -> {
DataTypeManager dtm = program.getDataTypeManager();
// Determine category path
CategoryPath catPath;
if (category != null && !category.isEmpty()) {
catPath = new CategoryPath(category);
} else {
catPath = CategoryPath.ROOT;
}
// Check if a data type with this name already exists in the target category
DataType existing = dtm.getDataType(catPath, structName);
if (existing != null) {
throw new Exception("Struct already exists: " + structName);
}
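// Create the structure, honoring the optional size parameter (defaults to 0)
int initialSize = 0;
if (sizeStr != null && !sizeStr.isEmpty()) {
try {
initialSize = Integer.parseInt(sizeStr);
} catch (NumberFormatException nfe) {
throw new Exception("Invalid size parameter: must be an integer");
}
if (initialSize < 0) {
throw new Exception("Invalid size parameter: must not be negative");
}
}
StructureDataType struct = new StructureDataType(catPath, structName, initialSize);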
if (description != null && !description.isEmpty()) {
struct.setDescription(description);
}
// Add to data type manager
Structure addedStruct = (Structure) dtm.addDataType(struct, DataTypeConflictHandler.DEFAULT_HANDLER);
resultMap.put("path", addedStruct.getPathName());
resultMap.put("category", addedStruct.getCategoryPath().getPath());
resultMap.put("size", addedStruct.getLength());
return null;
});
resultMap.put("message", "Struct created successfully");
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(resultMap);
builder.addLink("self", "/structs?name=" + structName);
builder.addLink("structs", "/structs");
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 201);
} catch (TransactionException e) {
Msg.error(this, "Transaction failed: Create Struct", e);
sendErrorResponse(exchange, 500, "Failed to create struct: " + e.getMessage(), "TRANSACTION_ERROR");
} catch (Exception e) {
Msg.error(this, "Error creating struct", e);
sendErrorResponse(exchange, 400, "Error creating struct: " + e.getMessage(), "INVALID_PARAMETER");
}
} catch (Exception e) {
Msg.error(this, "Unexpected error creating struct", e);
sendErrorResponse(exchange, 500, "Error creating struct: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Add a field to an existing struct
* POST /structs/addfield
* Required params: struct, fieldName, fieldType
* Optional params: offset, comment
*/
private void handleAddField(HttpExchange exchange, Map<String, String> params) throws IOException {
try {
String structName = params.get("struct");
String fieldName = params.get("fieldName");
String fieldType = params.get("fieldType");
String offsetStr = params.get("offset");
String comment = params.get("comment");
if (structName == null || structName.isEmpty()) {
sendErrorResponse(exchange, 400, "Missing required parameter: struct", "MISSING_PARAMETERS");
return;
}
if (fieldName == null || fieldName.isEmpty()) {
sendErrorResponse(exchange, 400, "Missing required parameter: fieldName", "MISSING_PARAMETERS");
return;
}
if (fieldType == null || fieldType.isEmpty()) {
sendErrorResponse(exchange, 400, "Missing required parameter: fieldType", "MISSING_PARAMETERS");
return;
}
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
Integer offset = null;
if (offsetStr != null && !offsetStr.isEmpty()) {
try {
offset = Integer.parseInt(offsetStr);
} catch (NumberFormatException e) {
sendErrorResponse(exchange, 400, "Invalid offset parameter: must be an integer", "INVALID_PARAMETER");
return;
}
}
Map<String, Object> resultMap = new HashMap<>();
resultMap.put("struct", structName);
resultMap.put("fieldName", fieldName);
resultMap.put("fieldType", fieldType);
final Integer finalOffset = offset;
try {
TransactionHelper.executeInTransaction(program, "Add Struct Field", () -> {
DataTypeManager dtm = program.getDataTypeManager();
// Find the struct - handle both full paths and simple names
DataType dataType = null;
if (structName.startsWith("/")) {
dataType = dtm.getDataType(structName);
if (dataType == null) {
dataType = dtm.findDataType(structName);
}
} else {
dataType = findStructByName(dtm, structName);
}
if (dataType == null || !(dataType instanceof Structure)) {
throw new Exception("Struct not found: " + structName);
}
Structure struct = (Structure) dataType;
// Find the field type
DataType fieldDataType = findDataType(dtm, fieldType);
if (fieldDataType == null) {
throw new Exception("Field type not found: " + fieldType);
}
// Add the field
DataTypeComponent component;
if (finalOffset != null) {
// Insert at specific offset
component = struct.insertAtOffset(finalOffset, fieldDataType,
fieldDataType.getLength(), fieldName, comment);
} else {
// Append to end
component = struct.add(fieldDataType, fieldName, comment);
}
resultMap.put("offset", component.getOffset());
resultMap.put("length", component.getLength());
resultMap.put("structSize", struct.getLength());
return null;
});
resultMap.put("message", "Field added successfully");
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(resultMap);
builder.addLink("struct", "/structs?name=" + structName);
builder.addLink("structs", "/structs");
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} catch (TransactionException e) {
Msg.error(this, "Transaction failed: Add Struct Field", e);
sendErrorResponse(exchange, 500, "Failed to add field: " + e.getMessage(), "TRANSACTION_ERROR");
} catch (Exception e) {
Msg.error(this, "Error adding field", e);
sendErrorResponse(exchange, 400, "Error adding field: " + e.getMessage(), "INVALID_PARAMETER");
}
} catch (Exception e) {
Msg.error(this, "Unexpected error adding field", e);
sendErrorResponse(exchange, 500, "Error adding field: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Update an existing field in a struct
* POST/PATCH /structs/updatefield
* Required params: struct, fieldOffset (or fieldName)
* Optional params: newName, newType, newComment
*/
private void handleUpdateField(HttpExchange exchange, Map<String, String> params) throws IOException {
try {
String structName = params.get("struct");
String fieldOffsetStr = params.get("fieldOffset");
String fieldName = params.get("fieldName");
String newName = params.get("newName");
String newType = params.get("newType");
String newComment = params.get("newComment");
if (structName == null || structName.isEmpty()) {
sendErrorResponse(exchange, 400, "Missing required parameter: struct", "MISSING_PARAMETERS");
return;
}
// Must have either fieldOffset or fieldName to identify the field
if ((fieldOffsetStr == null || fieldOffsetStr.isEmpty()) && (fieldName == null || fieldName.isEmpty())) {
sendErrorResponse(exchange, 400, "Missing required parameter: either fieldOffset or fieldName must be provided", "MISSING_PARAMETERS");
return;
}
// Must have at least one update parameter
if ((newName == null || newName.isEmpty()) &&
(newType == null || newType.isEmpty()) &&
(newComment == null || newComment.isEmpty())) {
sendErrorResponse(exchange, 400, "At least one of newName, newType, or newComment must be provided", "MISSING_PARAMETERS");
return;
}
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
Integer fieldOffset = null;
if (fieldOffsetStr != null && !fieldOffsetStr.isEmpty()) {
try {
fieldOffset = Integer.parseInt(fieldOffsetStr);
} catch (NumberFormatException e) {
sendErrorResponse(exchange, 400, "Invalid fieldOffset parameter: must be an integer", "INVALID_PARAMETER");
return;
}
}
Map<String, Object> resultMap = new HashMap<>();
resultMap.put("struct", structName);
final Integer finalFieldOffset = fieldOffset;
final String finalFieldName = fieldName;
try {
TransactionHelper.executeInTransaction(program, "Update Struct Field", () -> {
DataTypeManager dtm = program.getDataTypeManager();
// Find the struct
DataType dataType = null;
if (structName.startsWith("/")) {
dataType = dtm.getDataType(structName);
if (dataType == null) {
dataType = dtm.findDataType(structName);
}
} else {
dataType = findStructByName(dtm, structName);
}
if (dataType == null || !(dataType instanceof Structure)) {
throw new Exception("Struct not found: " + structName);
}
Structure struct = (Structure) dataType;
// Find the field to update
DataTypeComponent component = null;
if (finalFieldOffset != null) {
component = struct.getComponentAt(finalFieldOffset);
} else {
// Search by field name
for (DataTypeComponent comp : struct.getComponents()) {
if (finalFieldName.equals(comp.getFieldName())) {
component = comp;
break;
}
}
}
if (component == null) {
throw new Exception("Field not found in struct: " + (finalFieldOffset != null ? "offset " + finalFieldOffset : finalFieldName));
}
int componentOffset = component.getOffset();
int componentLength = component.getLength();
DataType originalType = component.getDataType();
String originalName = component.getFieldName();
String originalComment = component.getComment();
// Store original values
resultMap.put("originalName", originalName);
resultMap.put("originalType", originalType.getName());
resultMap.put("originalComment", originalComment != null ? originalComment : "");
resultMap.put("offset", componentOffset);
// Determine new values
String updatedName = (newName != null && !newName.isEmpty()) ? newName : originalName;
String updatedComment = (newComment != null) ? newComment : originalComment;
DataType updatedType = originalType;
if (newType != null && !newType.isEmpty()) {
updatedType = findDataType(dtm, newType);
if (updatedType == null) {
throw new Exception("Field type not found: " + newType);
}
}
// Update the field by deleting it and re-inserting at the same offset.
// Note: Structure.replaceAtOffset(...) is an alternative that replaces in a single call.
struct.deleteAtOffset(componentOffset);
DataTypeComponent newComponent = struct.insertAtOffset(componentOffset, updatedType,
updatedType.getLength(),
updatedName, updatedComment);
resultMap.put("newName", newComponent.getFieldName());
resultMap.put("newType", newComponent.getDataType().getName());
resultMap.put("newComment", newComponent.getComment() != null ? newComponent.getComment() : "");
resultMap.put("length", newComponent.getLength());
return null;
});
resultMap.put("message", "Field updated successfully");
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(resultMap);
builder.addLink("struct", "/structs?name=" + structName);
builder.addLink("structs", "/structs");
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} catch (TransactionException e) {
Msg.error(this, "Transaction failed: Update Struct Field", e);
sendErrorResponse(exchange, 500, "Failed to update field: " + e.getMessage(), "TRANSACTION_ERROR");
} catch (Exception e) {
Msg.error(this, "Error updating field", e);
sendErrorResponse(exchange, 400, "Error updating field: " + e.getMessage(), "INVALID_PARAMETER");
}
} catch (Exception e) {
Msg.error(this, "Unexpected error updating field", e);
sendErrorResponse(exchange, 500, "Error updating field: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Delete a struct data type
* POST /structs/delete
* Required params: name
*/
private void handleDeleteStruct(HttpExchange exchange, Map<String, String> params) throws IOException {
try {
String structName = params.get("name");
if (structName == null || structName.isEmpty()) {
sendErrorResponse(exchange, 400, "Missing required parameter: name", "MISSING_PARAMETERS");
return;
}
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
Map<String, Object> resultMap = new HashMap<>();
resultMap.put("name", structName);
try {
TransactionHelper.executeInTransaction(program, "Delete Struct", () -> {
DataTypeManager dtm = program.getDataTypeManager();
// Find the struct - handle both full paths and simple names
DataType dataType = null;
if (structName.startsWith("/")) {
dataType = dtm.getDataType(structName);
if (dataType == null) {
dataType = dtm.findDataType(structName);
}
} else {
dataType = findStructByName(dtm, structName);
}
if (dataType == null) {
throw new Exception("Struct not found: " + structName);
}
if (!(dataType instanceof Structure)) {
throw new Exception("Data type is not a struct: " + structName);
}
// Store info before deletion
resultMap.put("path", dataType.getPathName());
resultMap.put("category", dataType.getCategoryPath().getPath());
// Remove the struct
dtm.remove(dataType, null);
return null;
});
resultMap.put("message", "Struct deleted successfully");
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.result(resultMap);
builder.addLink("structs", "/structs");
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} catch (TransactionException e) {
Msg.error(this, "Transaction failed: Delete Struct", e);
sendErrorResponse(exchange, 500, "Failed to delete struct: " + e.getMessage(), "TRANSACTION_ERROR");
} catch (Exception e) {
Msg.error(this, "Error deleting struct", e);
sendErrorResponse(exchange, 400, "Error deleting struct: " + e.getMessage(), "INVALID_PARAMETER");
}
} catch (Exception e) {
Msg.error(this, "Unexpected error deleting struct", e);
sendErrorResponse(exchange, 500, "Error deleting struct: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Build a detailed information map for a struct including all fields
*/
private Map<String, Object> buildStructInfo(Structure struct) {
Map<String, Object> structInfo = new HashMap<>();
structInfo.put("name", struct.getName());
structInfo.put("path", struct.getPathName());
structInfo.put("size", struct.getLength());
structInfo.put("category", struct.getCategoryPath().getPath());
structInfo.put("description", struct.getDescription() != null ? struct.getDescription() : "");
structInfo.put("numFields", struct.getNumComponents());
// Add field details
List<Map<String, Object>> fields = new ArrayList<>();
for (DataTypeComponent component : struct.getComponents()) {
Map<String, Object> fieldInfo = new HashMap<>();
fieldInfo.put("name", component.getFieldName() != null ? component.getFieldName() : "");
fieldInfo.put("offset", component.getOffset());
fieldInfo.put("length", component.getLength());
fieldInfo.put("type", component.getDataType().getName());
fieldInfo.put("typePath", component.getDataType().getPathName());
fieldInfo.put("comment", component.getComment() != null ? component.getComment() : "");
fields.add(fieldInfo);
}
structInfo.put("fields", fields);
return structInfo;
}
/**
* Find a struct by name, searching through all data types
*/
private DataType findStructByName(DataTypeManager dtm, String structName) {
final DataType[] result = new DataType[1];
dtm.getAllDataTypes().forEachRemaining(dt -> {
if (dt instanceof Structure && dt.getName().equals(structName)) {
if (result[0] == null) {
result[0] = dt;
}
}
});
return result[0];
}
/**
* Find a data type by name, trying multiple lookup methods
*/
private DataType findDataType(DataTypeManager dtm, String typeName) {
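// Normalize to a root-relative path without doubling a leading slash
String typePath = typeName.startsWith("/") ? typeName : "/" + typeName;
// Try direct lookup by path
DataType dataType = dtm.getDataType(typePath);
// Fall back to findDataType with the same path
if (dataType == null) {
dataType = dtm.findDataType(typePath);
}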
// Try built-in primitive types
if (dataType == null) {
switch(typeName.toLowerCase()) {
case "byte":
dataType = new ByteDataType();
break;
case "char":
dataType = new CharDataType();
break;
case "word":
dataType = new WordDataType();
break;
case "dword":
dataType = new DWordDataType();
break;
case "qword":
dataType = new QWordDataType();
break;
case "float":
dataType = new FloatDataType();
break;
case "double":
dataType = new DoubleDataType();
break;
case "int":
dataType = new IntegerDataType();
break;
case "long":
dataType = new LongDataType();
break;
case "pointer":
dataType = new PointerDataType();
break;
case "string":
dataType = new StringDataType();
break;
}
}
return dataType;
}
}
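
For callers of the struct endpoints above, a request to `POST /structs/create` carries `name` plus the optional `category`, `size`, and `description` fields. The following client-side sketch builds such a request with the JDK's `java.net.http` API. The base URL and port are assumptions (the plugin's actual port is not shown in this diff), the JSON body shape is inferred from `parseJsonPostParams`, and `StructClientSketch` is a hypothetical name.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Sketch only: constructs (but does not send) a struct-creation request. */
final class StructClientSketch {
    /** Minimal escaping for quotes and backslashes; not a full JSON encoder. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the JSON body using the parameter names read by handleCreateStruct. */
    static String createStructJson(String name, String category) {
        return "{\"name\":\"" + escape(name) + "\",\"category\":\"" + escape(category) + "\"}";
    }

    /** baseUrl is an assumption, e.g. "http://localhost:8192"; adjust to the running instance. */
    static HttpRequest createStructRequest(String baseUrl, String name, String category) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/structs/create"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(createStructJson(name, category), StandardCharsets.UTF_8))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createStructRequest("http://localhost:8192", "packet_hdr", "/net");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `HttpClient.newHttpClient().send(...)` should yield a 201 response on success, per the handler above; follow-up calls to `/structs/addfield` take `struct`, `fieldName`, and `fieldType` in the same way.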


@ -0,0 +1,214 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.listing.Program;
import ghidra.program.model.symbol.Symbol;
import ghidra.program.model.symbol.SymbolIterator;
import ghidra.program.model.symbol.SymbolTable;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class SymbolEndpoints extends AbstractEndpoint {
private PluginTool tool;
// Updated constructor to accept port
public SymbolEndpoints(Program program, int port) {
super(program, port); // Call super constructor
}
public SymbolEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/symbols/imports", this::handleImports);
server.createContext("/symbols/exports", this::handleExports);
server.createContext("/symbols", this::handleSymbols);
}
public void handleSymbols(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
List<Map<String, Object>> symbols = new ArrayList<>();
SymbolTable symbolTable = program.getSymbolTable();
SymbolIterator symbolIterator = symbolTable.getAllSymbols(true);
while (symbolIterator.hasNext()) {
Symbol symbol = symbolIterator.next();
Map<String, Object> symbolInfo = new HashMap<>();
symbolInfo.put("name", symbol.getName());
symbolInfo.put("address", symbol.getAddress().toString());
symbolInfo.put("namespace", symbol.getParentNamespace() != null ? symbol.getParentNamespace().getName() : "");
symbolInfo.put("type", symbol.getSymbolType().toString());
symbolInfo.put("isPrimary", symbol.isPrimary());
// Add HATEOAS links
Map<String, Object> links = new HashMap<>();
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/symbols/" + symbol.getAddress().toString());
links.put("self", selfLink);
symbolInfo.put("_links", links);
symbols.add(symbolInfo);
}
// Build response with HATEOAS links
eu.starsong.ghidra.api.ResponseBuilder builder = new eu.starsong.ghidra.api.ResponseBuilder(exchange, port)
.success(true);
// Apply pagination and get paginated items
List<Map<String, Object>> paginatedSymbols = applyPagination(symbols, offset, limit, builder, "/symbols");
// Set the paginated result
builder.result(paginatedSymbols);
// Add program link
builder.addLink("program", "/program");
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error handling /symbols endpoint", e);
sendErrorResponse(exchange, 500, "Internal Server Error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
public void handleImports(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
List<Map<String, Object>> imports = new ArrayList<>();
for (Symbol symbol : program.getSymbolTable().getExternalSymbols()) {
Map<String, Object> imp = new HashMap<>();
imp.put("name", symbol.getName());
imp.put("address", symbol.getAddress().toString());
// Add HATEOAS links
Map<String, Object> links = new HashMap<>();
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/symbols/imports/" + symbol.getAddress().toString());
links.put("self", selfLink);
imp.put("_links", links);
imports.add(imp);
}
// Build response with HATEOAS links
eu.starsong.ghidra.api.ResponseBuilder builder = new eu.starsong.ghidra.api.ResponseBuilder(exchange, port)
.success(true);
// Apply pagination and get paginated items
List<Map<String, Object>> paginated = applyPagination(imports, offset, limit, builder, "/symbols/imports");
// Set the paginated result
builder.result(paginated);
// Add additional links
builder.addLink("program", "/program");
builder.addLink("symbols", "/symbols");
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /symbols/imports endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
public void handleExports(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
List<Map<String, Object>> exports = new ArrayList<>();
SymbolTable table = program.getSymbolTable();
SymbolIterator it = table.getAllSymbols(true);
while (it.hasNext()) {
Symbol s = it.next();
if (s.isExternalEntryPoint()) {
Map<String, Object> exp = new HashMap<>();
exp.put("name", s.getName());
exp.put("address", s.getAddress().toString());
// Add HATEOAS links
Map<String, Object> links = new HashMap<>();
Map<String, String> selfLink = new HashMap<>();
selfLink.put("href", "/symbols/exports/" + s.getAddress().toString());
links.put("self", selfLink);
exp.put("_links", links);
exports.add(exp);
}
}
// Build response with HATEOAS links
eu.starsong.ghidra.api.ResponseBuilder builder = new eu.starsong.ghidra.api.ResponseBuilder(exchange, port)
.success(true);
// Apply pagination and get paginated items
List<Map<String, Object>> paginated = applyPagination(exports, offset, limit, builder, "/symbols/exports");
// Set the paginated result
builder.result(paginated);
// Add additional links
builder.addLink("program", "/program");
builder.addLink("symbols", "/symbols");
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /symbols/exports endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
// parseIntOrDefault is inherited from AbstractEndpoint
}

@ -0,0 +1,642 @@
package eu.starsong.ghidra.endpoints;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.util.TransactionHelper;
import eu.starsong.ghidra.util.TransactionHelper.TransactionException;
import ghidra.app.decompiler.DecompInterface;
import ghidra.app.decompiler.DecompileResults;
import ghidra.program.model.address.Address;
import ghidra.program.model.data.DataType;
import ghidra.program.model.listing.Function;
import ghidra.program.model.listing.Parameter;
import ghidra.program.model.listing.Program;
import ghidra.program.model.listing.VariableStorage;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.pcode.HighFunction;
import ghidra.program.model.pcode.HighFunctionDBUtil;
import ghidra.program.model.pcode.HighSymbol;
import ghidra.program.model.pcode.LocalSymbolMap;
import ghidra.program.model.symbol.SourceType;
import ghidra.program.model.symbol.Symbol;
import ghidra.program.model.symbol.SymbolIterator;
import ghidra.program.model.symbol.SymbolTable;
import ghidra.program.model.symbol.SymbolType;
import ghidra.util.Msg;
import ghidra.util.task.ConsoleTaskMonitor;
import eu.starsong.ghidra.api.ResponseBuilder;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.*;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.swing.SwingUtilities;
import java.lang.reflect.InvocationTargetException;
public class VariableEndpoints extends AbstractEndpoint {
private PluginTool tool;
// Updated constructor to accept port
public VariableEndpoints(Program program, int port) {
super(program, port); // Call super constructor
}
public VariableEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/variables", this::handleGlobalVariables);
}
private void handleGlobalVariables(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 100);
String search = qparams.get("search"); // Renamed from 'query' for clarity
boolean globalOnly = Boolean.parseBoolean(qparams.getOrDefault("global_only", "false"));
// Always get the most current program from the tool
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
// Create ResponseBuilder for HATEOAS-compliant response.
// The "self" link is added further down, once the pagination parameters are known.
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true);
// Add common links
builder.addLink("program", "/program");
builder.addLink("search", "/variables?search={term}", "GET");
builder.addLink("globals", "/variables?global_only=true", "GET");
// Use more efficient pagination by limiting data collection up-front
PaginatedResult paginatedResult;
if (search != null && !search.isEmpty()) {
paginatedResult = searchVariablesPaginated(program, search, offset, limit, globalOnly);
} else {
paginatedResult = listVariablesPaginated(program, offset, limit, globalOnly);
}
// Add pagination links
String baseUrl = "/variables";
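String queryParams = "";
if (search != null && !search.isEmpty()) {
// URL-encode the term so spaces and reserved characters survive in the generated links
queryParams = "search=" + java.net.URLEncoder.encode(search, StandardCharsets.UTF_8);
}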
if (globalOnly) {
queryParams = queryParams.isEmpty() ? "global_only=true" : queryParams + "&global_only=true";
}
// Add metadata
Map<String, Object> metadata = new HashMap<>();
metadata.put("total_estimate", paginatedResult.getTotalEstimate());
metadata.put("offset", offset);
metadata.put("limit", limit);
builder.metadata(metadata);
// Add self link
String selfLink = baseUrl;
if (!queryParams.isEmpty()) {
selfLink += "?" + queryParams;
selfLink += "&offset=" + offset + "&limit=" + limit;
} else {
selfLink += "?offset=" + offset + "&limit=" + limit;
}
builder.addLink("self", selfLink);
// Add next link if needed
if (paginatedResult.hasMore()) {
String nextLink = baseUrl;
if (!queryParams.isEmpty()) {
nextLink += "?" + queryParams;
nextLink += "&offset=" + (offset + limit) + "&limit=" + limit;
} else {
nextLink += "?offset=" + (offset + limit) + "&limit=" + limit;
}
builder.addLink("next", nextLink);
}
// Add prev link if needed
if (offset > 0) {
int prevOffset = Math.max(0, offset - limit);
String prevLink = baseUrl;
if (!queryParams.isEmpty()) {
prevLink += "?" + queryParams;
prevLink += "&offset=" + prevOffset + "&limit=" + limit;
} else {
prevLink += "?offset=" + prevOffset + "&limit=" + limit;
}
builder.addLink("prev", prevLink);
}
// Add the result to the builder
builder.result(paginatedResult.getResults());
// Send the HATEOAS-compliant response
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /variables endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
/**
* Class to represent a paginated result with metadata
*/
private static class PaginatedResult {
private final List<Map<String, String>> results;
private final boolean hasMore;
private final int totalEstimate;
public PaginatedResult(List<Map<String, String>> results, boolean hasMore, int totalEstimate) {
this.results = results;
this.hasMore = hasMore;
this.totalEstimate = totalEstimate;
}
public List<Map<String, String>> getResults() {
return results;
}
public boolean hasMore() {
return hasMore;
}
public int getTotalEstimate() {
return totalEstimate;
}
}
/**
* Legacy method kept for backward compatibility
*/
private List<Map<String, String>> listVariables(Program program) {
PaginatedResult result = listVariablesPaginated(program, 0, Integer.MAX_VALUE, false);
return result.getResults();
}
/**
* List variables with efficient pagination - only loads what's needed
*/
private PaginatedResult listVariablesPaginated(Program program, int offset, int limit, boolean globalOnly) {
if (program == null) {
return new PaginatedResult(new ArrayList<>(), false, 0);
}
List<Map<String, String>> variables = new ArrayList<>();
int globalVarCount = 0;
int totalEstimate = 0;
boolean hasMore = false;
// Calculate range of items to fetch
int startIdx = offset;
int endIdx = offset + limit;
int currentIndex = 0;
// Get global variables - these are quick to get so we can get them all
SymbolTable symbolTable = program.getSymbolTable();
ArrayList<Symbol> globalSymbols = new ArrayList<>();
// First, collect global variables efficiently
for (Symbol symbol : symbolTable.getDefinedSymbols()) {
if (symbol.isGlobal() && !symbol.isExternal() &&
symbol.getSymbolType() != SymbolType.FUNCTION &&
symbol.getSymbolType() != SymbolType.LABEL) {
globalSymbols.add(symbol);
}
}
// Sort globals by name first
globalSymbols.sort(Comparator.comparing(Symbol::getName));
globalVarCount = globalSymbols.size();
totalEstimate = globalVarCount;
// Now extract just the global variables we need for the current page
for (Symbol symbol : globalSymbols) {
if (currentIndex >= startIdx && currentIndex < endIdx) {
Map<String, String> varInfo = new HashMap<>();
varInfo.put("name", symbol.getName());
varInfo.put("address", symbol.getAddress().toString());
varInfo.put("type", "global");
varInfo.put("dataType", getDataTypeName(program, symbol.getAddress()));
variables.add(varInfo);
}
currentIndex++;
// If we've added enough items, break
if (currentIndex >= endIdx) {
hasMore = currentIndex < globalVarCount || !globalOnly;
break;
}
}
// If we only want globals, or if we've already fetched enough for this page, return now
if (globalOnly || currentIndex >= endIdx) {
return new PaginatedResult(variables, hasMore, totalEstimate);
}
// Get local variables - only if needed (these are expensive)
// We need to perform some estimation for locals, as decompiling all functions is too slow
// First estimate the total count
int funcCount = 0;
for (Function f : program.getFunctionManager().getFunctions(true)) {
funcCount++;
}
// Roughly estimate 2 local variables per function
totalEstimate = globalVarCount + (funcCount * 2);
// If the requested page starts past the last global, it consists of locals only
if (startIdx >= globalVarCount) {
// Adjust for local variable processing
int localOffset = startIdx - globalVarCount;
int localLimit = limit;
// Process functions to get the local variables
DecompInterface decomp = null;
try {
decomp = new DecompInterface();
if (decomp.openProgram(program)) {
int localVarIndex = 0;
int functionsProcessed = 0;
int maxFunctionsToProcess = 20; // Limit how many functions we process per request
for (Function function : program.getFunctionManager().getFunctions(true)) {
try {
DecompileResults results = decomp.decompileFunction(function, 10, new ConsoleTaskMonitor());
if (results != null && results.decompileCompleted()) {
HighFunction highFunc = results.getHighFunction();
if (highFunc != null) {
List<Map<String, String>> functionVars = new ArrayList<>();
Iterator<HighSymbol> symbolIter = highFunc.getLocalSymbolMap().getSymbols();
while (symbolIter.hasNext()) {
HighSymbol symbol = symbolIter.next();
if (!symbol.isParameter()) { // Only list locals
Map<String, String> varInfo = new HashMap<>();
varInfo.put("name", symbol.getName());
varInfo.put("type", "local");
varInfo.put("function", function.getName());
Address pcAddr = symbol.getPCAddress();
varInfo.put("address", pcAddr != null ? pcAddr.toString() : "N/A");
varInfo.put("dataType", symbol.getDataType() != null ? symbol.getDataType().getName() : "unknown");
functionVars.add(varInfo);
}
}
// Sort function variables by name
functionVars.sort(Comparator.comparing(a -> a.get("name")));
// Add only the needed variables for this page
for (Map<String, String> varInfo : functionVars) {
if (localVarIndex >= localOffset && localVarIndex < localOffset + localLimit) {
variables.add(varInfo);
}
localVarIndex++;
if (localVarIndex >= localOffset + localLimit) {
break;
}
}
}
}
} catch (Exception e) {
Msg.warn(this, "listVariablesPaginated: Error processing function " + function.getName(), e);
}
functionsProcessed++;
if (functionsProcessed >= maxFunctionsToProcess || localVarIndex >= localOffset + localLimit) {
// Stop processing if we've hit our limits
break;
}
}
// Determine if we have more variables
hasMore = functionsProcessed < funcCount || localVarIndex >= localOffset + localLimit;
}
} catch (Exception e) {
Msg.error(this, "listVariablesPaginated: Error during local variable processing", e);
} finally {
if (decomp != null) {
decomp.dispose();
}
}
} else {
// This means we already have some globals and may need a few locals to complete the page
int remainingSpace = limit - variables.size();
if (remainingSpace > 0) {
// Process just enough functions to fill the page
DecompInterface decomp = null;
try {
decomp = new DecompInterface();
if (decomp.openProgram(program)) {
int functionsProcessed = 0;
int maxFunctionsToProcess = 5; // Limit how many functions we process
int localVarsAdded = 0;
for (Function function : program.getFunctionManager().getFunctions(true)) {
try {
DecompileResults results = decomp.decompileFunction(function, 10, new ConsoleTaskMonitor());
if (results != null && results.decompileCompleted()) {
HighFunction highFunc = results.getHighFunction();
if (highFunc != null) {
Iterator<HighSymbol> symbolIter = highFunc.getLocalSymbolMap().getSymbols();
while (symbolIter.hasNext() && localVarsAdded < remainingSpace) {
HighSymbol symbol = symbolIter.next();
if (!symbol.isParameter()) { // Only list locals
Map<String, String> varInfo = new HashMap<>();
varInfo.put("name", symbol.getName());
varInfo.put("type", "local");
varInfo.put("function", function.getName());
Address pcAddr = symbol.getPCAddress();
varInfo.put("address", pcAddr != null ? pcAddr.toString() : "N/A");
varInfo.put("dataType", symbol.getDataType() != null ? symbol.getDataType().getName() : "unknown");
variables.add(varInfo);
localVarsAdded++;
}
}
}
}
} catch (Exception e) {
Msg.warn(this, "listVariablesPaginated: Error processing function " + function.getName(), e);
}
functionsProcessed++;
if (functionsProcessed >= maxFunctionsToProcess || localVarsAdded >= remainingSpace) {
// Stop processing if we've hit our limits
break;
}
}
// Determine if we have more variables
hasMore = functionsProcessed < funcCount || localVarsAdded >= remainingSpace;
}
} catch (Exception e) {
Msg.error(this, "listVariablesPaginated: Error during local variable processing", e);
} finally {
if (decomp != null) {
decomp.dispose();
}
}
}
}
// Sort the combined results
variables.sort(Comparator.comparing(a -> a.get("name")));
return new PaginatedResult(variables, hasMore, totalEstimate);
}
/**
* Legacy method kept for backward compatibility
*/
private List<Map<String, String>> searchVariables(Program program, String searchTerm) {
PaginatedResult result = searchVariablesPaginated(program, searchTerm, 0, Integer.MAX_VALUE, false);
return result.getResults();
}
/**
* Search variables with efficient pagination - only loads what's needed
*/
private PaginatedResult searchVariablesPaginated(Program program, String searchTerm, int offset, int limit, boolean globalOnly) {
if (program == null || searchTerm == null || searchTerm.isEmpty()) {
return new PaginatedResult(new ArrayList<>(), false, 0);
}
List<Map<String, String>> matchedVars = new ArrayList<>();
String lowerSearchTerm = searchTerm.toLowerCase();
int totalEstimate = 0;
boolean hasMore = false;
// Calculate range of items to fetch
int startIdx = offset;
int endIdx = offset + limit;
int currentIndex = 0;
// Search global variables - these are quick to search
SymbolTable symbolTable = program.getSymbolTable();
List<Map<String, String>> globalMatches = new ArrayList<>();
SymbolIterator it = symbolTable.getSymbolIterator();
while (it.hasNext()) {
Symbol symbol = it.next();
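// Exclude external symbols so search results match the filter used by listVariablesPaginated
if (symbol.isGlobal() && !symbol.isExternal() &&
symbol.getSymbolType() != SymbolType.FUNCTION &&
symbol.getSymbolType() != SymbolType.LABEL &&
symbol.getName().toLowerCase().contains(lowerSearchTerm)) {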
Map<String, String> varInfo = new HashMap<>();
varInfo.put("name", symbol.getName());
varInfo.put("address", symbol.getAddress().toString());
varInfo.put("type", "global");
varInfo.put("dataType", getDataTypeName(program, symbol.getAddress()));
globalMatches.add(varInfo);
}
}
// Sort global matches by name
globalMatches.sort(Comparator.comparing(a -> a.get("name")));
// Extract just the global variables needed for this page
int globalCount = globalMatches.size();
totalEstimate = globalCount;
for (Map<String, String> varInfo : globalMatches) {
if (currentIndex >= startIdx && currentIndex < endIdx) {
matchedVars.add(varInfo);
}
currentIndex++;
// If we've added enough items, break
if (currentIndex >= endIdx) {
hasMore = currentIndex < globalCount || !globalOnly;
break;
}
}
// If we only want globals, or if we've already fetched enough for this page, return now
if (globalOnly || currentIndex >= endIdx) {
return new PaginatedResult(matchedVars, hasMore, totalEstimate);
}
// Search local variables - only do this if we need more results
// We need to perform some estimation for locals, as decompiling all functions is too slow
// First estimate the total count
int funcCount = 0;
for (Function f : program.getFunctionManager().getFunctions(true)) {
funcCount++;
}
// Roughly estimate 1 match per 5 functions when searching
totalEstimate = globalCount + (funcCount / 5);
// If the requested page starts past the last global match, it consists of local matches only
if (startIdx >= globalCount) {
// Adjust for local variable processing
int localOffset = startIdx - globalCount;
int localLimit = limit;
// Process functions to get the local variables
DecompInterface decomp = null;
try {
decomp = new DecompInterface();
if (decomp.openProgram(program)) {
int localVarIndex = 0;
int functionsProcessed = 0;
int maxFunctionsToProcess = 30; // Limit how many functions we process for search
for (Function function : program.getFunctionManager().getFunctions(true)) {
try {
DecompileResults results = decomp.decompileFunction(function, 5, new ConsoleTaskMonitor());
if (results != null && results.decompileCompleted()) {
HighFunction highFunc = results.getHighFunction();
if (highFunc != null) {
List<Map<String, String>> functionMatches = new ArrayList<>();
Iterator<HighSymbol> symbolIter = highFunc.getLocalSymbolMap().getSymbols();
while (symbolIter.hasNext()) {
HighSymbol symbol = symbolIter.next();
if (symbol.getName().toLowerCase().contains(lowerSearchTerm)) {
Map<String, String> varInfo = new HashMap<>();
varInfo.put("name", symbol.getName());
varInfo.put("function", function.getName());
varInfo.put("type", symbol.isParameter() ? "parameter" : "local");
Address pcAddr = symbol.getPCAddress();
varInfo.put("address", pcAddr != null ? pcAddr.toString() : "N/A");
varInfo.put("dataType", symbol.getDataType() != null ? symbol.getDataType().getName() : "unknown");
functionMatches.add(varInfo);
}
}
// Sort function matches by name
functionMatches.sort(Comparator.comparing(a -> a.get("name")));
// Add only the needed variables for this page
for (Map<String, String> varInfo : functionMatches) {
if (localVarIndex >= localOffset && localVarIndex < localOffset + localLimit) {
matchedVars.add(varInfo);
}
localVarIndex++;
if (localVarIndex >= localOffset + localLimit) {
break;
}
}
}
}
} catch (Exception e) {
Msg.warn(this, "searchVariablesPaginated: Error processing function " + function.getName(), e);
}
functionsProcessed++;
if (functionsProcessed >= maxFunctionsToProcess || localVarIndex >= localOffset + localLimit) {
// Stop processing if we've hit our limits
break;
}
}
// Determine if we have more variables
hasMore = functionsProcessed < funcCount || localVarIndex >= localOffset + localLimit;
}
} catch (Exception e) {
Msg.error(this, "searchVariablesPaginated: Error during local variable search", e);
} finally {
if (decomp != null) {
decomp.dispose();
}
}
} else {
// This means we already have some globals and may need a few locals to complete the page
int remainingSpace = limit - matchedVars.size();
if (remainingSpace > 0) {
// Process functions until we've filled the page
DecompInterface decomp = null;
try {
decomp = new DecompInterface();
if (decomp.openProgram(program)) {
int functionsProcessed = 0;
int maxFunctionsToProcess = 5; // Limit how many functions we process
int localVarsAdded = 0;
for (Function function : program.getFunctionManager().getFunctions(true)) {
try {
DecompileResults results = decomp.decompileFunction(function, 5, new ConsoleTaskMonitor());
if (results != null && results.decompileCompleted()) {
HighFunction highFunc = results.getHighFunction();
if (highFunc != null) {
Iterator<HighSymbol> symbolIter = highFunc.getLocalSymbolMap().getSymbols();
while (symbolIter.hasNext() && localVarsAdded < remainingSpace) {
HighSymbol symbol = symbolIter.next();
if (symbol.getName().toLowerCase().contains(lowerSearchTerm)) {
Map<String, String> varInfo = new HashMap<>();
varInfo.put("name", symbol.getName());
varInfo.put("function", function.getName());
varInfo.put("type", symbol.isParameter() ? "parameter" : "local");
Address pcAddr = symbol.getPCAddress();
varInfo.put("address", pcAddr != null ? pcAddr.toString() : "N/A");
varInfo.put("dataType", symbol.getDataType() != null ? symbol.getDataType().getName() : "unknown");
matchedVars.add(varInfo);
localVarsAdded++;
}
}
}
}
} catch (Exception e) {
Msg.warn(this, "searchVariablesPaginated: Error processing function " + function.getName(), e);
}
functionsProcessed++;
if (functionsProcessed >= maxFunctionsToProcess || localVarsAdded >= remainingSpace) {
// Stop processing if we've hit our limits
break;
}
}
// Determine if we have more variables
hasMore = functionsProcessed < funcCount || localVarsAdded >= remainingSpace;
}
} catch (Exception e) {
Msg.error(this, "searchVariablesPaginated: Error during local variable search", e);
} finally {
if (decomp != null) {
decomp.dispose();
}
}
}
}
// Sort the combined results
matchedVars.sort(Comparator.comparing(a -> a.get("name")));
return new PaginatedResult(matchedVars, hasMore, totalEstimate);
}
// --- Helper Methods ---
private String getDataTypeName(Program program, Address address) {
// This might be better in GhidraUtil if used elsewhere
ghidra.program.model.listing.Data data = program.getListing().getDataAt(address);
if (data == null) return "undefined";
DataType dt = data.getDataType();
return dt != null ? dt.getName() : "unknown";
}
}

@ -0,0 +1,355 @@
package eu.starsong.ghidra.endpoints;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import eu.starsong.ghidra.api.ResponseBuilder;
import ghidra.program.model.address.Address;
import ghidra.program.model.address.AddressFactory;
import ghidra.program.model.listing.Function;
import ghidra.program.model.listing.Program;
import ghidra.program.model.symbol.Reference;
import ghidra.program.model.symbol.ReferenceIterator;
import ghidra.program.model.symbol.ReferenceManager;
import ghidra.program.model.symbol.RefType;
import ghidra.program.model.listing.CodeUnit;
import ghidra.program.model.symbol.Symbol;
import ghidra.program.model.symbol.SymbolTable;
import ghidra.framework.plugintool.PluginTool;
import ghidra.util.Msg;
import java.io.IOException;
import java.util.*;
public class XrefsEndpoints extends AbstractEndpoint {
private PluginTool tool;
public XrefsEndpoints(Program program, int port) {
super(program, port);
}
public XrefsEndpoints(Program program, int port, PluginTool tool) {
super(program, port);
this.tool = tool;
}
@Override
protected PluginTool getTool() {
return tool;
}
/**
* Helper method to convert ReferenceIterator to an iterable collection
*/
private List<Reference> collectReferences(ReferenceIterator iterator) {
List<Reference> references = new ArrayList<>();
while (iterator.hasNext()) {
references.add(iterator.next());
}
return references;
}
/**
* Helper method to convert Reference[] to a list
*/
private List<Reference> collectReferences(Reference[] refs) {
return Arrays.asList(refs);
}
@Override
public void registerEndpoints(HttpServer server) {
server.createContext("/xrefs", this::handleXrefsRequest);
}
private void handleXrefsRequest(HttpExchange exchange) throws IOException {
try {
if ("GET".equals(exchange.getRequestMethod())) {
Map<String, String> qparams = parseQueryParams(exchange);
String toAddrStr = qparams.get("to_addr");
String fromAddrStr = qparams.get("from_addr");
String refTypeStr = qparams.get("type");
int offset = parseIntOrDefault(qparams.get("offset"), 0);
int limit = parseIntOrDefault(qparams.get("limit"), 50);
Program program = getCurrentProgram();
if (program == null) {
sendErrorResponse(exchange, 400, "No program loaded", "NO_PROGRAM_LOADED");
return;
}
// Create ResponseBuilder for HATEOAS-compliant response
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(true)
.addLink("self", "/xrefs" + (exchange.getRequestURI().getRawQuery() != null ?
"?" + exchange.getRequestURI().getRawQuery() : ""));
// Add common links
builder.addLink("program", "/program");
// At least one of to_addr or from_addr must be provided
if ((toAddrStr == null || toAddrStr.isEmpty()) &&
(fromAddrStr == null || fromAddrStr.isEmpty())) {
sendErrorResponse(exchange, 400, "Either to_addr or from_addr parameter is required", "MISSING_PARAMETER");
return;
}
// Parse addresses
AddressFactory addressFactory = program.getAddressFactory();
Address toAddr = null;
Address fromAddr = null;
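if (toAddrStr != null && !toAddrStr.isEmpty()) {
try {
toAddr = addressFactory.getAddress(toAddrStr);
} catch (Exception e) {
toAddr = null; // reported as invalid below
}
// getAddress() returns null for input it cannot parse, so check explicitly
if (toAddr == null) {
sendErrorResponse(exchange, 400, "Invalid to_addr format: " + toAddrStr, "INVALID_PARAMETER");
return;
}
}
if (fromAddrStr != null && !fromAddrStr.isEmpty()) {
try {
fromAddr = addressFactory.getAddress(fromAddrStr);
} catch (Exception e) {
fromAddr = null; // reported as invalid below
}
// getAddress() returns null for input it cannot parse, so check explicitly
if (fromAddr == null) {
sendErrorResponse(exchange, 400, "Invalid from_addr format: " + fromAddrStr, "INVALID_PARAMETER");
return;
}
}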
// Get reference manager
ReferenceManager refManager = program.getReferenceManager();
List<Map<String, Object>> referencesList = new ArrayList<>();
// Get references to this address
if (toAddr != null) {
// getReferencesTo returns a ReferenceIterator, so collect it into a list first
for (Reference ref : collectReferences(refManager.getReferencesTo(toAddr))) {
// An absent or empty type filter matches every reference
if (refTypeStr != null && !refTypeStr.isEmpty() && !ref.getReferenceType().getName().equalsIgnoreCase(refTypeStr)) {
continue; // Skip if type filter doesn't match
}
Map<String, Object> refMap = createReferenceMap(program, ref, "to");
referencesList.add(refMap);
}
}
// Get references from this address
if (fromAddr != null) {
// getReferencesFrom returns a Reference[], which is wrapped as a list
for (Reference ref : collectReferences(refManager.getReferencesFrom(fromAddr))) {
// An absent or empty type filter matches every reference
if (refTypeStr != null && !refTypeStr.isEmpty() && !ref.getReferenceType().getName().equalsIgnoreCase(refTypeStr)) {
continue; // Skip if type filter doesn't match
}
Map<String, Object> refMap = createReferenceMap(program, ref, "from");
referencesList.add(refMap);
}
}
// Sort by direction, then reference type, then source address
Collections.sort(referencesList, (a, b) -> {
// First sort by direction
int directionCompare = ((String)a.get("direction")).compareTo((String)b.get("direction"));
if (directionCompare != 0) return directionCompare;
// Then by reference type
int typeCompare = ((String)a.get("refType")).compareTo((String)b.get("refType"));
if (typeCompare != 0) return typeCompare;
// Finally by from_address
return ((String)a.get("from_addr")).compareTo((String)b.get("from_addr"));
});
// Apply pagination
List<Map<String, Object>> paginatedRefs =
applyPagination(referencesList, offset, limit, builder, "/xrefs",
buildQueryString(toAddrStr, fromAddrStr, refTypeStr));
// Create result object
Map<String, Object> result = new HashMap<>();
if (toAddr != null) {
result.put("to_addr", toAddrStr);
}
if (fromAddr != null) {
result.put("from_addr", fromAddrStr);
}
result.put("references", paginatedRefs);
// Add the result to the builder
builder.result(result);
// Add specific links
if (toAddr != null) {
builder.addLink("to_function", "/functions/" + toAddrStr);
}
if (fromAddr != null) {
builder.addLink("from_function", "/functions/" + fromAddrStr);
}
// Send the HATEOAS-compliant response
sendJsonResponse(exchange, builder.build(), 200);
} else {
sendErrorResponse(exchange, 405, "Method Not Allowed", "METHOD_NOT_ALLOWED");
}
} catch (Exception e) {
Msg.error(this, "Error in /xrefs endpoint", e);
sendErrorResponse(exchange, 500, "Internal server error: " + e.getMessage(), "INTERNAL_ERROR");
}
}
private Map<String, Object> createReferenceMap(Program program, Reference ref, String direction) {
Map<String, Object> refMap = new HashMap<>();
// Basic reference information
refMap.put("direction", direction);
refMap.put("from_addr", ref.getFromAddress().toString());
refMap.put("to_addr", ref.getToAddress().toString());
refMap.put("refType", ref.getReferenceType().getName());
refMap.put("isPrimary", ref.isPrimary());
// Get source function (if any)
Function fromFunc = program.getFunctionManager().getFunctionContaining(ref.getFromAddress());
if (fromFunc != null) {
Map<String, Object> fromFuncMap = new HashMap<>();
fromFuncMap.put("name", fromFunc.getName());
fromFuncMap.put("address", fromFunc.getEntryPoint().toString());
fromFuncMap.put("offset", ref.getFromAddress().subtract(fromFunc.getEntryPoint()));
refMap.put("from_function", fromFuncMap);
}
// Get target function (if any)
Function toFunc = program.getFunctionManager().getFunctionContaining(ref.getToAddress());
if (toFunc != null) {
Map<String, Object> toFuncMap = new HashMap<>();
toFuncMap.put("name", toFunc.getName());
toFuncMap.put("address", toFunc.getEntryPoint().toString());
toFuncMap.put("offset", ref.getToAddress().subtract(toFunc.getEntryPoint()));
refMap.put("to_function", toFuncMap);
}
// Get source symbol (if any)
SymbolTable symbolTable = program.getSymbolTable();
Symbol[] fromSymbols = symbolTable.getSymbols(ref.getFromAddress());
if (fromSymbols != null && fromSymbols.length > 0) {
refMap.put("from_symbol", fromSymbols[0].getName());
}
// Get target symbol (if any)
Symbol[] toSymbols = symbolTable.getSymbols(ref.getToAddress());
if (toSymbols != null && toSymbols.length > 0) {
refMap.put("to_symbol", toSymbols[0].getName());
}
// Get the instruction/data at the from address (if applicable)
try {
CodeUnit codeUnit = program.getListing().getCodeUnitAt(ref.getFromAddress());
if (codeUnit != null) {
refMap.put("from_instruction", codeUnit.toString());
}
} catch (Exception e) {
// Ignore exceptions when getting code units
}
// Get the instruction/data at the to address (if applicable)
try {
CodeUnit codeUnit = program.getListing().getCodeUnitAt(ref.getToAddress());
if (codeUnit != null) {
refMap.put("to_instruction", codeUnit.toString());
}
} catch (Exception e) {
// Ignore exceptions when getting code units
}
return refMap;
}
private String buildQueryString(String toAddr, String fromAddr, String refType) {
StringBuilder query = new StringBuilder();
if (toAddr != null && !toAddr.isEmpty()) {
query.append("to_addr=").append(toAddr);
}
if (fromAddr != null && !fromAddr.isEmpty()) {
if (query.length() > 0) query.append("&");
query.append("from_addr=").append(fromAddr);
}
if (refType != null && !refType.isEmpty()) {
if (query.length() > 0) query.append("&");
query.append("type=").append(refType);
}
return query.toString();
}
private Address getCurrentAddress(Program program) {
if (program == null) return null;
// Try to get current address from tool
PluginTool tool = getTool();
if (tool != null) {
try {
// Try to get the address from the code browser service (most reliable in Ghidra 11+)
ghidra.app.services.CodeViewerService codeViewerService =
tool.getService(ghidra.app.services.CodeViewerService.class);
if (codeViewerService != null) {
ghidra.app.nav.Navigatable navigatable = codeViewerService.getNavigatable();
if (navigatable != null && navigatable.getProgram() == program) {
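// getLocation() may return null (e.g. no cursor position yet); guard it so the
// remaining fallbacks below still run instead of jumping to the catch block
ghidra.program.util.ProgramLocation navLocation = navigatable.getLocation();
Address addr = navLocation != null ? navLocation.getAddress() : null;
if (addr != null) {
return addr;
}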
}
}
// Otherwise ask the ProgramManager for its current location
ghidra.app.services.ProgramManager programManager =
tool.getService(ghidra.app.services.ProgramManager.class);
if (programManager != null && programManager.getCurrentProgram() == program) {
// getCurrentLocation() is looked up reflectively because it may not be
// present on every ProgramManager implementation
try {
// getMethod() never returns null; it throws NoSuchMethodException instead
java.lang.reflect.Method getCurrentLocationMethod =
programManager.getClass().getMethod("getCurrentLocation");
ghidra.program.util.ProgramLocation location =
(ghidra.program.util.ProgramLocation)getCurrentLocationMethod.invoke(programManager);
if (location != null && location.getProgram() == program) {
return location.getAddress();
}
} catch (Exception e) {
// Method doesn't exist or could not be invoked, ignore and continue with other approaches
}
// Last resort for the current program: the first instruction, else the first defined data
if (program.equals(programManager.getCurrentProgram())) {
ghidra.program.model.listing.Listing listing = program.getListing();
if (listing != null) {
// Return the first instruction or defined data we can find
ghidra.program.model.listing.InstructionIterator instructions =
listing.getInstructions(true);
if (instructions.hasNext()) {
return instructions.next().getAddress();
}
// Or try defined data
ghidra.program.model.listing.DataIterator data =
listing.getDefinedData(true);
if (data.hasNext()) {
return data.next().getAddress();
}
}
}
}
} catch (Exception e) {
Msg.error(this, "Error getting current address from tool", e);
}
}
// Fallback to program's min address
return program.getMinAddress();
}
}
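
The `buildQueryString` helper above appends raw parameter values. If a value could contain reserved characters (an address such as `ram:00401000` contains a colon), a variant could percent-encode each value. The following is a minimal sketch of that idea, not the plugin's code: the class name `QueryStrings` and the map-based signature are assumptions made for illustration, and it relies on `java.net.URLEncoder.encode(String, Charset)` from Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the plugin: builds a query string from
// name/value pairs, skipping null or empty values and encoding the rest.
final class QueryStrings {
    static String build(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            String value = entry.getValue();
            if (value == null || value.isEmpty()) {
                continue; // same rule as buildQueryString: omit unset parameters
            }
            joiner.add(entry.getKey() + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so parameters appear as added
        Map<String, String> params = new LinkedHashMap<>();
        params.put("to_addr", "ram:00401000");
        params.put("from_addr", "");
        params.put("type", "UNCONDITIONAL_CALL");
        System.out.println(build(params));
    }
}
```

Whether encoding is wanted depends on how clients consume these links; a client that expects the raw `space:offset` form would need to decode the value first.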


@ -0,0 +1,395 @@
package eu.starsong.ghidra.model;
import java.util.ArrayList;
import java.util.List;
/**
* Model class representing Ghidra function information.
* This provides a structured object for function data instead of using Map<String, Object>.
*/
public class FunctionInfo {
private String name;
private String address;
private String signature;
private String returnType;
private List<ParameterInfo> parameters;
private String decompilation;
private boolean isExternal;
private String callingConvention;
private String namespace;
/**
* Default constructor for serialization frameworks
*/
public FunctionInfo() {
this.parameters = new ArrayList<>();
}
/**
* Constructor with essential fields
*/
public FunctionInfo(String name, String address, String signature) {
this.name = name;
this.address = address;
this.signature = signature;
this.parameters = new ArrayList<>();
}
/**
* Full constructor
*/
public FunctionInfo(String name, String address, String signature, String returnType,
List<ParameterInfo> parameters, String decompilation,
boolean isExternal, String callingConvention, String namespace) {
this.name = name;
this.address = address;
this.signature = signature;
this.returnType = returnType;
this.parameters = parameters != null ? parameters : new ArrayList<>();
this.decompilation = decompilation;
this.isExternal = isExternal;
this.callingConvention = callingConvention;
this.namespace = namespace;
}
/**
* @return The function name
*/
public String getName() {
return name;
}
/**
* @param name The function name
*/
public void setName(String name) {
this.name = name;
}
/**
* @return The function entry point address
*/
public String getAddress() {
return address;
}
/**
* @param address The function entry point address
*/
public void setAddress(String address) {
this.address = address;
}
/**
* @return The function signature (prototype string)
*/
public String getSignature() {
return signature;
}
/**
* @param signature The function signature
*/
public void setSignature(String signature) {
this.signature = signature;
}
/**
* @return The function return type
*/
public String getReturnType() {
return returnType;
}
/**
* @param returnType The function return type
*/
public void setReturnType(String returnType) {
this.returnType = returnType;
}
/**
* @return The function parameters
*/
public List<ParameterInfo> getParameters() {
return parameters;
}
/**
* @param parameters The function parameters
*/
public void setParameters(List<ParameterInfo> parameters) {
this.parameters = parameters != null ? parameters : new ArrayList<>();
}
/**
* @return The decompiled C code for the function
*/
public String getDecompilation() {
return decompilation;
}
/**
* @param decompilation The decompiled C code
*/
public void setDecompilation(String decompilation) {
this.decompilation = decompilation;
}
/**
* @return Whether the function is external (imported)
*/
public boolean isExternal() {
return isExternal;
}
/**
* @param external Whether the function is external
*/
public void setExternal(boolean external) {
isExternal = external;
}
/**
* @return The function's calling convention
*/
public String getCallingConvention() {
return callingConvention;
}
/**
* @param callingConvention The function's calling convention
*/
public void setCallingConvention(String callingConvention) {
this.callingConvention = callingConvention;
}
/**
* @return The function's namespace
*/
public String getNamespace() {
return namespace;
}
/**
* @param namespace The function's namespace
*/
public void setNamespace(String namespace) {
this.namespace = namespace;
}
/**
* Add a parameter to the function
* @param parameter The parameter to add
*/
public void addParameter(ParameterInfo parameter) {
if (parameter != null) {
this.parameters.add(parameter);
}
}
/**
* Builder pattern for FunctionInfo
*/
public static class Builder {
private String name;
private String address;
private String signature;
private String returnType;
private List<ParameterInfo> parameters = new ArrayList<>();
private String decompilation;
private boolean isExternal;
private String callingConvention;
private String namespace;
public Builder name(String name) {
this.name = name;
return this;
}
public Builder address(String address) {
this.address = address;
return this;
}
public Builder signature(String signature) {
this.signature = signature;
return this;
}
public Builder returnType(String returnType) {
this.returnType = returnType;
return this;
}
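public Builder parameters(List<ParameterInfo> parameters) {
// Copy defensively: a null or immutable list would otherwise break addParameter()
this.parameters = parameters != null ? new ArrayList<>(parameters) : new ArrayList<>();
return this;
}
public Builder addParameter(ParameterInfo parameter) {
// Ignore null, matching FunctionInfo.addParameter()
if (parameter != null) {
this.parameters.add(parameter);
}
return this;
}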
public Builder decompilation(String decompilation) {
this.decompilation = decompilation;
return this;
}
public Builder isExternal(boolean isExternal) {
this.isExternal = isExternal;
return this;
}
public Builder callingConvention(String callingConvention) {
this.callingConvention = callingConvention;
return this;
}
public Builder namespace(String namespace) {
this.namespace = namespace;
return this;
}
public FunctionInfo build() {
return new FunctionInfo(
name, address, signature, returnType,
parameters, decompilation, isExternal,
callingConvention, namespace
);
}
}
/**
* Create a new builder for FunctionInfo
* @return A new builder instance
*/
public static Builder builder() {
return new Builder();
}
/**
* Inner class representing function parameter information
*/
public static class ParameterInfo {
private String name;
private String dataType;
private int ordinal;
private String storage;
/**
* Default constructor for serialization frameworks
*/
public ParameterInfo() {
}
/**
* Full constructor
*/
public ParameterInfo(String name, String dataType, int ordinal, String storage) {
this.name = name;
this.dataType = dataType;
this.ordinal = ordinal;
this.storage = storage;
}
/**
* @return The parameter name
*/
public String getName() {
return name;
}
/**
* @param name The parameter name
*/
public void setName(String name) {
this.name = name;
}
/**
* @return The parameter data type
*/
public String getDataType() {
return dataType;
}
/**
* @param dataType The parameter data type
*/
public void setDataType(String dataType) {
this.dataType = dataType;
}
/**
* @return The parameter position (0-based)
*/
public int getOrdinal() {
return ordinal;
}
/**
* @param ordinal The parameter position
*/
public void setOrdinal(int ordinal) {
this.ordinal = ordinal;
}
/**
* @return The parameter storage location
*/
public String getStorage() {
return storage;
}
/**
* @param storage The parameter storage location
*/
public void setStorage(String storage) {
this.storage = storage;
}
/**
* Builder pattern for ParameterInfo
*/
public static class Builder {
private String name;
private String dataType;
private int ordinal;
private String storage;
public Builder name(String name) {
this.name = name;
return this;
}
public Builder dataType(String dataType) {
this.dataType = dataType;
return this;
}
public Builder ordinal(int ordinal) {
this.ordinal = ordinal;
return this;
}
public Builder storage(String storage) {
this.storage = storage;
return this;
}
public ParameterInfo build() {
return new ParameterInfo(name, dataType, ordinal, storage);
}
}
/**
* Create a new builder for ParameterInfo
* @return A new builder instance
*/
public static Builder builder() {
return new Builder();
}
}
}


@ -0,0 +1,175 @@
package eu.starsong.ghidra.model;
import java.util.HashMap;
import java.util.Map;
/**
* Standardized response object for API responses.
* This class follows the common response structure used throughout the API.
*/
public class JsonResponse {
private boolean success;
private Object result;
private Map<String, Object> error;
private Map<String, Object> links;
private String id;
private String instance;
// Private constructor for builder pattern
private JsonResponse() {
this.links = new HashMap<>();
}
/**
* @return Whether the request was successful
*/
public boolean isSuccess() {
return success;
}
/**
* @return The result data for successful requests
*/
public Object getResult() {
return result;
}
/**
* @return Error information for failed requests
*/
public Map<String, Object> getError() {
return error;
}
/**
* @return HATEOAS links
*/
public Map<String, Object> getLinks() {
return links;
}
/**
* @return Request ID
*/
public String getId() {
return id;
}
/**
* @return Server instance information
*/
public String getInstance() {
return instance;
}
/**
* Creates a new builder for constructing a JsonResponse
* @return A new builder instance
*/
public static Builder builder() {
return new Builder();
}
/**
* Builder class for JsonResponse
*/
public static class Builder {
private final JsonResponse response;
private Builder() {
response = new JsonResponse();
}
/**
* Set the success status
* @param success Whether the request was successful
* @return This builder
*/
public Builder success(boolean success) {
response.success = success;
return this;
}
/**
* Set the result data
* @param result The result data
* @return This builder
*/
public Builder result(Object result) {
response.result = result;
return this;
}
/**
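* Set error information. This does not change the success flag;
* call success(false) as well when building an error response.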
* @param message Error message
* @param code Error code
* @return This builder
*/
public Builder error(String message, String code) {
Map<String, Object> error = new HashMap<>();
error.put("message", message);
if (code != null && !code.isEmpty()) {
error.put("code", code);
}
response.error = error;
return this;
}
/**
* Add a link
* @param rel Relation name
* @param href Link URL
* @return This builder
*/
public Builder addLink(String rel, String href) {
Map<String, String> link = new HashMap<>();
link.put("href", href);
response.links.put(rel, link);
return this;
}
/**
* Add a link with method
* @param rel Relation name
* @param href Link URL
* @param method HTTP method
* @return This builder
*/
public Builder addLink(String rel, String href, String method) {
Map<String, String> link = new HashMap<>();
link.put("href", href);
link.put("method", method);
response.links.put(rel, link);
return this;
}
/**
* Set request ID
* @param id Request ID
* @return This builder
*/
public Builder id(String id) {
response.id = id;
return this;
}
/**
* Set instance information
* @param instance Instance information
* @return This builder
*/
public Builder instance(String instance) {
response.instance = instance;
return this;
}
/**
* Build the JsonResponse
* @return The constructed JsonResponse
*/
public JsonResponse build() {
return response;
}
}
}
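
As a rough illustration of how the builder above might be chained for an error response, the sketch below uses a cut-down stand-in class so that it compiles on its own. `JsonResponseSketch`, the `notFound` helper, the `NOT_FOUND` code, and the example path are all assumptions made for this example. Only the `success`, `error`, and `addLink` behavior is copied from the class above.

```java
import java.util.HashMap;
import java.util.Map;

// Cut-down stand-in for JsonResponse (assumption: only the members used here are reproduced).
final class JsonResponseSketch {
    boolean success;
    Map<String, Object> error;
    final Map<String, Object> links = new HashMap<>();

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private final JsonResponseSketch response = new JsonResponseSketch();

        Builder success(boolean success) {
            response.success = success;
            return this;
        }

        Builder error(String message, String code) {
            Map<String, Object> error = new HashMap<>();
            error.put("message", message);
            if (code != null && !code.isEmpty()) {
                error.put("code", code);
            }
            response.error = error;
            return this;
        }

        Builder addLink(String rel, String href) {
            Map<String, String> link = new HashMap<>();
            link.put("href", href);
            response.links.put(rel, link);
            return this;
        }

        JsonResponseSketch build() {
            return response;
        }
    }

    // Hypothetical helper: an error response carrying a "self" link.
    static JsonResponseSketch notFound(String path) {
        return builder()
            .success(false) // error() alone leaves the success flag untouched
            .error("Resource not found: " + path, "NOT_FOUND")
            .addLink("self", path)
            .build();
    }
}
```

Because `build()` returns the instance the builder has been mutating, it seems safest to treat each builder as single-use.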


@ -0,0 +1,218 @@
package eu.starsong.ghidra.model;
/**
* Model class representing Ghidra program information.
* This provides a structured object for program data instead of using Map<String, Object>.
*/
public class ProgramInfo {
private String programId;
private String name;
private String languageId;
private String compilerSpecId;
private String imageBase;
private long memorySize;
private boolean isOpen;
private boolean analysisComplete;
/**
* Default constructor for serialization frameworks
*/
public ProgramInfo() {
}
/**
* Full constructor
*/
public ProgramInfo(String programId, String name, String languageId, String compilerSpecId,
String imageBase, long memorySize, boolean isOpen, boolean analysisComplete) {
this.programId = programId;
this.name = name;
this.languageId = languageId;
this.compilerSpecId = compilerSpecId;
this.imageBase = imageBase;
this.memorySize = memorySize;
this.isOpen = isOpen;
this.analysisComplete = analysisComplete;
}
/**
* @return The program's unique identifier (typically the file pathname)
*/
public String getProgramId() {
return programId;
}
/**
* @param programId The program's unique identifier
*/
public void setProgramId(String programId) {
this.programId = programId;
}
/**
* @return The program's name
*/
public String getName() {
return name;
}
/**
* @param name The program's name
*/
public void setName(String name) {
this.name = name;
}
/**
* @return The program's language ID
*/
public String getLanguageId() {
return languageId;
}
/**
* @param languageId The program's language ID
*/
public void setLanguageId(String languageId) {
this.languageId = languageId;
}
/**
* @return The program's compiler specification ID
*/
public String getCompilerSpecId() {
return compilerSpecId;
}
/**
* @param compilerSpecId The program's compiler specification ID
*/
public void setCompilerSpecId(String compilerSpecId) {
this.compilerSpecId = compilerSpecId;
}
/**
* @return The program's image base address
*/
public String getImageBase() {
return imageBase;
}
/**
* @param imageBase The program's image base address
*/
public void setImageBase(String imageBase) {
this.imageBase = imageBase;
}
/**
* @return The program's memory size in bytes
*/
public long getMemorySize() {
return memorySize;
}
/**
* @param memorySize The program's memory size in bytes
*/
public void setMemorySize(long memorySize) {
this.memorySize = memorySize;
}
/**
* @return Whether the program is currently open
*/
public boolean isOpen() {
return isOpen;
}
/**
* @param open Whether the program is currently open
*/
public void setOpen(boolean open) {
isOpen = open;
}
/**
* @return Whether analysis has been completed on the program
*/
public boolean isAnalysisComplete() {
return analysisComplete;
}
/**
* @param analysisComplete Whether analysis has been completed on the program
*/
public void setAnalysisComplete(boolean analysisComplete) {
this.analysisComplete = analysisComplete;
}
/**
* Builder pattern for ProgramInfo
*/
public static class Builder {
private String programId;
private String name;
private String languageId;
private String compilerSpecId;
private String imageBase;
private long memorySize;
private boolean isOpen;
private boolean analysisComplete;
public Builder programId(String programId) {
this.programId = programId;
return this;
}
public Builder name(String name) {
this.name = name;
return this;
}
public Builder languageId(String languageId) {
this.languageId = languageId;
return this;
}
public Builder compilerSpecId(String compilerSpecId) {
this.compilerSpecId = compilerSpecId;
return this;
}
public Builder imageBase(String imageBase) {
this.imageBase = imageBase;
return this;
}
public Builder memorySize(long memorySize) {
this.memorySize = memorySize;
return this;
}
public Builder isOpen(boolean isOpen) {
this.isOpen = isOpen;
return this;
}
public Builder analysisComplete(boolean analysisComplete) {
this.analysisComplete = analysisComplete;
return this;
}
public ProgramInfo build() {
return new ProgramInfo(
programId, name, languageId, compilerSpecId,
imageBase, memorySize, isOpen, analysisComplete
);
}
}
/**
* Create a new builder for ProgramInfo
* @return A new builder instance
*/
public static Builder builder() {
return new Builder();
}
}


@ -0,0 +1,226 @@
package eu.starsong.ghidra.model;
/**
* Model class representing Ghidra variable information.
* This provides a structured object for variable data instead of using Map<String, Object>.
*/
public class VariableInfo {
private String name;
private String dataType;
private String address;
private String type; // "local", "parameter", "global", etc.
private String function; // Function name if local/parameter
private String storage; // Storage location
private String value; // Value if known
/**
* Default constructor for serialization frameworks
*/
public VariableInfo() {
}
/**
* Constructor with essential fields
*/
public VariableInfo(String name, String dataType, String type) {
this.name = name;
this.dataType = dataType;
this.type = type;
}
/**
* Full constructor
*/
public VariableInfo(String name, String dataType, String address, String type,
String function, String storage, String value) {
this.name = name;
this.dataType = dataType;
this.address = address;
this.type = type;
this.function = function;
this.storage = storage;
this.value = value;
}
/**
* @return The variable name
*/
public String getName() {
return name;
}
/**
* @param name The variable name
*/
public void setName(String name) {
this.name = name;
}
/**
* @return The variable data type
*/
public String getDataType() {
return dataType;
}
/**
* @param dataType The variable data type
*/
public void setDataType(String dataType) {
this.dataType = dataType;
}
/**
* @return The variable address (if applicable)
*/
public String getAddress() {
return address;
}
/**
* @param address The variable address
*/
public void setAddress(String address) {
this.address = address;
}
/**
* @return The variable type (local, parameter, global, etc.)
*/
public String getType() {
return type;
}
/**
* @param type The variable type
*/
public void setType(String type) {
this.type = type;
}
/**
* @return The function name (for local variables and parameters)
*/
public String getFunction() {
return function;
}
/**
* @param function The function name
*/
public void setFunction(String function) {
this.function = function;
}
/**
* @return The variable storage location
*/
public String getStorage() {
return storage;
}
/**
* @param storage The variable storage location
*/
public void setStorage(String storage) {
this.storage = storage;
}
/**
* @return The variable value (if known)
*/
public String getValue() {
return value;
}
/**
* @param value The variable value
*/
public void setValue(String value) {
this.value = value;
}
/**
* @return Whether this variable is a local variable
*/
public boolean isLocal() {
return "local".equals(type);
}
/**
* @return Whether this variable is a parameter
*/
public boolean isParameter() {
return "parameter".equals(type);
}
/**
* @return Whether this variable is a global variable
*/
public boolean isGlobal() {
return "global".equals(type);
}
/**
* Builder pattern for VariableInfo
*/
public static class Builder {
private String name;
private String dataType;
private String address;
private String type;
private String function;
private String storage;
private String value;
public Builder name(String name) {
this.name = name;
return this;
}
public Builder dataType(String dataType) {
this.dataType = dataType;
return this;
}
public Builder address(String address) {
this.address = address;
return this;
}
public Builder type(String type) {
this.type = type;
return this;
}
public Builder function(String function) {
this.function = function;
return this;
}
public Builder storage(String storage) {
this.storage = storage;
return this;
}
public Builder value(String value) {
this.value = value;
return this;
}
public VariableInfo build() {
return new VariableInfo(
name, dataType, address, type,
function, storage, value
);
}
}
/**
* Create a new builder for VariableInfo
* @return A new builder instance
*/
public static Builder builder() {
return new Builder();
}
}


@ -0,0 +1,6 @@
package eu.starsong.ghidra.util;
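/**
* Counterpart to java.util.function.Supplier whose get() may throw a checked exception.
* @param <T> The type of the supplied value
*/
@FunctionalInterface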
public interface GhidraSupplier<T> {
T get() throws Exception;
}
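
A supplier that declares `throws Exception` can wrap calls that `java.util.function.Supplier` cannot. The sketch below shows one way a caller might consume it, with a fallback value on failure. The interface is repeated so the example compiles on its own. `SupplierDemo` and `callOrDefault` are hypothetical names, not part of the plugin.

```java
import java.io.IOException;

// Local copy of the interface above so the example is self-contained.
@FunctionalInterface
interface GhidraSupplier<T> {
    T get() throws Exception;
}

final class SupplierDemo {
    // Hypothetical helper: run the supplier and return the fallback if it throws.
    static <T> T callOrDefault(GhidraSupplier<T> supplier, T fallback) {
        try {
            return supplier.get();
        } catch (Exception e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // A supplier that succeeds
        System.out.println(callOrDefault(() -> "ok", "fallback"));
        // A supplier that throws a checked exception, which a plain Supplier lambda could not do
        System.out.println(callOrDefault(() -> {
            throw new IOException("boom");
        }, "fallback"));
    }
}
```

Swallowing the exception is only reasonable for a demo; real callers would more likely log it or translate it into an error response.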


@ -0,0 +1,541 @@
package eu.starsong.ghidra.util;
import ghidra.app.decompiler.DecompInterface;
import ghidra.app.decompiler.DecompileOptions;
import ghidra.app.decompiler.DecompileResults;
import ghidra.app.services.GoToService;
import ghidra.app.services.ProgramManager;
import ghidra.framework.plugintool.PluginTool;
import ghidra.program.model.address.Address;
import ghidra.program.model.address.AddressFactory;
import ghidra.program.model.data.DataType;
import ghidra.program.model.data.DataTypeManager;
import ghidra.program.model.listing.Function;
import ghidra.program.model.listing.FunctionManager;
import ghidra.program.model.listing.Parameter;
import ghidra.program.model.listing.ParameterImpl;
import ghidra.program.model.listing.Program;
import ghidra.program.model.listing.Variable;
import ghidra.program.model.symbol.SourceType;
import ghidra.program.model.pcode.HighFunction;
import ghidra.program.model.pcode.HighVariable;
import ghidra.program.model.pcode.PcodeOp;
import ghidra.program.model.pcode.Varnode;
import ghidra.program.model.symbol.SymbolTable;
import ghidra.program.util.ProgramLocation;
import ghidra.util.Msg;
import ghidra.util.exception.CancelledException;
import ghidra.util.task.TaskMonitor;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class GhidraUtil {
/**
* Parse an integer from a string, or return defaultValue if null/invalid.
*/
public static int parseIntOrDefault(String val, int defaultValue) {
if (val == null) return defaultValue;
try {
return Integer.parseInt(val);
}
catch (NumberFormatException e) {
return defaultValue;
}
}
/**
* Finds a data type by name within the program's data type managers.
* @param program The current program.
* @param dataTypeName The name of the data type to find.
* @return The found DataType, or null if not found.
*/
public static DataType findDataType(Program program, String dataTypeName) {
if (program == null || dataTypeName == null || dataTypeName.isEmpty()) {
return null;
}
DataTypeManager dtm = program.getDataTypeManager();
List<DataType> foundTypes = new ArrayList<>();
dtm.findDataTypes(dataTypeName, foundTypes);
if (!foundTypes.isEmpty()) {
// Prefer the first match, might need more sophisticated logic
// if multiple types with the same name exist in different categories.
return foundTypes.get(0);
} else {
Msg.warn(GhidraUtil.class, "Data type not found: " + dataTypeName);
return null;
}
}
/**
* Gets the current address as a string from the Ghidra tool.
* @param tool The Ghidra plugin tool.
* @return The current address as a string, or null if not available.
*/
public static String getCurrentAddressString(PluginTool tool) {
if (tool == null) {
return null;
}
// Get current program
ProgramManager programManager = tool.getService(ProgramManager.class);
if (programManager == null) {
return null;
}
Program program = programManager.getCurrentProgram();
if (program == null) {
return null;
}
// Get the current cursor location using CodeViewerService
ghidra.app.services.CodeViewerService codeViewerService = tool.getService(ghidra.app.services.CodeViewerService.class);
if (codeViewerService == null) {
// Fall back to the program's image base if the service is not available
return program.getImageBase().toString();
}
ghidra.program.util.ProgramLocation currentLocation = codeViewerService.getCurrentLocation();
if (currentLocation == null) {
// Fall back to the program's image base if no location is available
return program.getImageBase().toString();
}
// Return the current address
return currentLocation.getAddress().toString();
}
/**
* Gets information about the current function in the Ghidra tool.
* @param tool The Ghidra plugin tool.
* @param program The current program.
* @return A map containing information about the current function, or an empty map if not available.
*/
public static Map<String, Object> getCurrentFunctionInfo(PluginTool tool, Program program) {
Map<String, Object> result = new HashMap<>();
if (tool == null || program == null) {
return result;
}
// Get the current cursor location using CodeViewerService
ghidra.app.services.CodeViewerService codeViewerService = tool.getService(ghidra.app.services.CodeViewerService.class);
if (codeViewerService == null) {
return result;
}
ghidra.program.util.ProgramLocation currentLocation = codeViewerService.getCurrentLocation();
if (currentLocation == null) {
return result;
}
// Get the function at the current location
Address currentAddress = currentLocation.getAddress();
FunctionManager functionManager = program.getFunctionManager();
Function function = functionManager.getFunctionContaining(currentAddress);
if (function == null) {
// If we couldn't find a function at the current address, return the first function as a fallback
for (Function f : functionManager.getFunctions(true)) {
function = f;
break;
}
if (function == null) {
return result;
}
}
// Build the function info
result.put("name", function.getName());
result.put("address", function.getEntryPoint().toString());
result.put("signature", function.getSignature().getPrototypeString());
// Add more details
if (function.getReturnType() != null) {
result.put("returnType", function.getReturnType().getName());
}
if (function.getCallingConventionName() != null) {
result.put("callingConvention", function.getCallingConventionName());
}
// Add parameters
List<Map<String, String>> parameters = new ArrayList<>();
for (Parameter param : function.getParameters()) {
Map<String, String> paramInfo = new HashMap<>();
paramInfo.put("name", param.getName());
paramInfo.put("type", param.getDataType().getName());
parameters.add(paramInfo);
}
result.put("parameters", parameters);
return result;
}
/**
* Gets information about a function by its name or address.
* @param program The current program.
* @param addressOrName The function address or name.
* @return A map containing information about the function, or null if not found.
*/
public static Map<String, Object> getFunctionInfoByAddress(Program program, String addressOrName) {
if (program == null || addressOrName == null || addressOrName.isEmpty()) {
return null;
}
Function function = null;
// First try to interpret as an address
try {
Address address = program.getAddressFactory().getAddress(addressOrName);
if (address != null) {
function = program.getFunctionManager().getFunctionAt(address);
if (function == null) {
function = program.getFunctionManager().getFunctionContaining(address);
}
}
} catch (Exception e) {
// Not a valid address, try as a name
Msg.debug(GhidraUtil.class, "Could not interpret as address: " + addressOrName);
}
// If not found by address, try by name
if (function == null) {
for (Function f : program.getFunctionManager().getFunctions(true)) {
if (f.getName().equals(addressOrName)) {
function = f;
break;
}
}
}
if (function == null) {
return null;
}
// Build the function info
Map<String, Object> result = new HashMap<>();
result.put("name", function.getName());
result.put("address", function.getEntryPoint().toString());
result.put("signature", function.getSignature().getPrototypeString());
// Add more details
if (function.getReturnType() != null) {
result.put("returnType", function.getReturnType().getName());
}
if (function.getCallingConventionName() != null) {
result.put("callingConvention", function.getCallingConventionName());
}
return result;
}
/**
* Gets information about a function at the specified address.
* @param program The current program.
* @param addressStr The address as a string.
* @return A map containing information about the function, or an empty map if not found.
*/
public static Map<String, Object> getFunctionByAddress(Program program, String addressStr) {
Map<String, Object> result = new HashMap<>();
if (program == null || addressStr == null || addressStr.isEmpty()) {
return result;
}
AddressFactory addressFactory = program.getAddressFactory();
Address address;
try {
address = addressFactory.getAddress(addressStr);
} catch (Exception e) {
Msg.error(GhidraUtil.class, "Invalid address format: " + addressStr, e);
return result;
}
if (address == null) {
return result;
}
FunctionManager functionManager = program.getFunctionManager();
Function function = functionManager.getFunctionAt(address);
if (function == null) {
function = functionManager.getFunctionContaining(address);
}
if (function == null) {
return result;
}
result.put("name", function.getName());
result.put("address", function.getEntryPoint().toString());
result.put("signature", function.getSignature().getPrototypeString());
// Add decompilation
String decompilation = decompileFunction(function);
result.put("decompilation", decompilation != null ? decompilation : "");
return result;
}
/**
* Decompiles a function at the specified address.
* @param program The current program.
* @param addressStr The address as a string.
* @return A map containing the decompilation result, or an empty map if not found.
*/
public static Map<String, Object> decompileFunction(Program program, String addressStr) {
Map<String, Object> result = new HashMap<>();
if (program == null || addressStr == null || addressStr.isEmpty()) {
return result;
}
AddressFactory addressFactory = program.getAddressFactory();
Address address;
try {
address = addressFactory.getAddress(addressStr);
} catch (Exception e) {
Msg.error(GhidraUtil.class, "Invalid address format: " + addressStr, e);
return result;
}
if (address == null) {
return result;
}
FunctionManager functionManager = program.getFunctionManager();
Function function = functionManager.getFunctionAt(address);
if (function == null) {
function = functionManager.getFunctionContaining(address);
}
if (function == null) {
return result;
}
String decompilation = decompileFunction(function);
result.put("decompilation", decompilation != null ? decompilation : "");
return result;
}
/**
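* Helper method to decompile a function with a 30-second timeout.
* @param function The function to decompile.
* @return The decompiled C code; a "// ..." comment string describing the problem if decompilation
* fails or throws; or null if the function itself is null.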
*/
public static String decompileFunction(Function function) {
if (function == null) {
return null;
}
Program program = function.getProgram();
DecompInterface decompiler = new DecompInterface();
DecompileOptions options = new DecompileOptions();
decompiler.setOptions(options);
decompiler.openProgram(program);
try {
DecompileResults results = decompiler.decompileFunction(function, 30, TaskMonitor.DUMMY);
if (results.decompileCompleted()) {
return results.getDecompiledFunction().getC();
} else {
Msg.warn(GhidraUtil.class, "Decompilation failed for function: " + function.getName());
return "// Decompilation failed for " + function.getName();
}
} catch (Exception e) {
Msg.error(GhidraUtil.class, "Error during decompilation of function: " + function.getName(), e);
return "// Error during decompilation: " + e.getMessage();
} finally {
decompiler.dispose();
}
}
/**
* Gets information about variables in a function, including decompiler variables.
* @param function The function to get variables from.
* @return A list of maps containing information about each variable.
*/
public static List<Map<String, Object>> getFunctionVariables(Function function) {
List<Map<String, Object>> variables = new ArrayList<>();
if (function == null) {
return variables;
}
// Add parameters
for (Parameter param : function.getParameters()) {
Map<String, Object> varInfo = new HashMap<>();
varInfo.put("name", param.getName());
varInfo.put("type", param.getDataType().getName());
varInfo.put("isParameter", true);
varInfo.put("storage", param.getVariableStorage().toString());
varInfo.put("source", "database");
variables.add(varInfo);
}
// Add local variables from database
for (Variable var : function.getAllVariables()) {
if (var instanceof Parameter) {
continue; // Skip parameters, already added
}
Map<String, Object> varInfo = new HashMap<>();
varInfo.put("name", var.getName());
varInfo.put("type", var.getDataType().getName());
varInfo.put("isParameter", false);
varInfo.put("storage", var.getVariableStorage().toString());
varInfo.put("source", "database");
variables.add(varInfo);
}
// Add decompiler-generated variables
DecompInterface decompiler = new DecompInterface();
try {
decompiler.openProgram(function.getProgram());
DecompileResults results = decompiler.decompileFunction(function, 30, TaskMonitor.DUMMY);
if (results.decompileCompleted()) {
HighFunction highFunc = results.getHighFunction();
if (highFunc != null) {
// Iterate over local variables from decompiler
for (java.util.Iterator<ghidra.program.model.pcode.HighSymbol> iter =
highFunc.getLocalSymbolMap().getSymbols(); iter.hasNext(); ) {
ghidra.program.model.pcode.HighSymbol highSymbol = iter.next();
// Skip if this is already a tracked variable
boolean alreadyAdded = false;
for (Map<String, Object> var : variables) {
if (var.get("name").equals(highSymbol.getName())) {
alreadyAdded = true;
break;
}
}
if (!alreadyAdded) {
Map<String, Object> varInfo = new HashMap<>();
varInfo.put("name", highSymbol.getName());
varInfo.put("type", highSymbol.getDataType() != null ?
highSymbol.getDataType().getName() : "unknown");
varInfo.put("isParameter", highSymbol.isParameter());
varInfo.put("storage", highSymbol.getStorage() != null ?
highSymbol.getStorage().toString() : "unknown");
varInfo.put("source", "decompiler");
// Add PC address if available
if (highSymbol.getPCAddress() != null) {
varInfo.put("pcAddress", highSymbol.getPCAddress().toString());
}
variables.add(varInfo);
}
}
}
}
}
catch (Exception e) {
Msg.error(GhidraUtil.class, "Error analyzing decompiler variables", e);
}
finally {
decompiler.dispose();
}
return variables;
}
/**
* Applies a function signature to an existing function.
* @param function The function to update
* @param signatureStr The C-style function signature string
* @return true if successful, false otherwise
*/
public static boolean setFunctionSignature(Function function, String signatureStr) {
if (function == null || signatureStr == null || signatureStr.isEmpty()) {
return false;
}
Program program = function.getProgram();
if (program == null) {
return false;
}
try {
// Create a function signature parser
ghidra.app.util.parser.FunctionSignatureParser parser =
new ghidra.app.util.parser.FunctionSignatureParser(
program.getDataTypeManager(), null);
// Parse the signature string
ghidra.program.model.data.FunctionDefinitionDataType functionDef =
parser.parse(function.getSignature(), signatureStr);
if (functionDef == null) {
return false;
}
// Get source type for update
ghidra.program.model.symbol.SourceType sourceType =
ghidra.program.model.symbol.SourceType.USER_DEFINED;
// Get the parameters from the function definition
ghidra.program.model.data.ParameterDefinition[] paramDefs =
functionDef.getArguments();
try {
// Get return type from the function definition
ghidra.program.model.data.DataType returnType = functionDef.getReturnType();
// Set the return type
function.setReturnType(returnType, sourceType);
// Get calling convention if available
if (functionDef.getCallingConvention() != null) {
String callingConvention = functionDef.getCallingConvention().getName();
function.setCallingConvention(callingConvention);
}
// Remove all existing parameters
while (function.getParameterCount() > 0) {
function.removeParameter(0);
}
// Add each parameter
if (paramDefs != null) {
for (int i = 0; i < paramDefs.length; i++) {
ghidra.program.model.data.ParameterDefinition paramDef = paramDefs[i];
String name = paramDef.getName();
ghidra.program.model.data.DataType dataType = paramDef.getDataType();
// Create parameter and then add it
Parameter param = new ParameterImpl(name, dataType, program);
function.addParameter(param, sourceType);
}
}
return true;
} catch (ghidra.util.exception.InvalidInputException e) {
ghidra.util.Msg.error(GhidraUtil.class,
"Error setting function parameters: " + e.getMessage(), e);
return false;
}
}
catch (Exception e) {
ghidra.util.Msg.error(GhidraUtil.class,
"Error setting function signature: " + e.getMessage(), e);
return false;
}
}
}


@ -0,0 +1,156 @@
package eu.starsong.ghidra.util;
import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.Headers;
import eu.starsong.ghidra.api.ResponseBuilder; // Use the ResponseBuilder
import ghidra.util.Msg;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
public class HttpUtil {
private static final Gson gson = new Gson();
/**
* Add CORS headers to the response
*/
public static void addCorsHeaders(HttpExchange exchange) {
Headers headers = exchange.getResponseHeaders();
headers.set("Access-Control-Allow-Origin", "http://localhost");
headers.set("Access-Control-Allow-Methods", "GET, POST, PUT, PATCH, DELETE, OPTIONS");
headers.set("Access-Control-Allow-Headers", "Content-Type, X-Request-ID");
headers.set("Access-Control-Max-Age", "3600");
}
/**
* Handle OPTIONS requests for CORS preflight
* @return true if the request was handled (OPTIONS request), false otherwise
*/
public static boolean handleOptionsRequest(HttpExchange exchange) throws IOException {
if ("OPTIONS".equals(exchange.getRequestMethod())) {
addCorsHeaders(exchange);
exchange.sendResponseHeaders(204, -1);
return true;
}
return false;
}
/**
* Sends the given JSON object as the response body with the given status code.
* CORS preflight (OPTIONS) requests are answered with 204 and no body instead.
*/
public static void sendJsonResponse(HttpExchange exchange, JsonObject jsonObj, int statusCode, int port) throws IOException {
try {
// Handle OPTIONS requests for CORS preflight
if (handleOptionsRequest(exchange)) {
return;
}
String json = gson.toJson(jsonObj);
byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
exchange.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
addCorsHeaders(exchange);
long responseLength = (statusCode == 204) ? -1 : bytes.length;
exchange.sendResponseHeaders(statusCode, responseLength);
if (responseLength != -1) {
try (OutputStream os = exchange.getResponseBody()) {
os.write(bytes);
}
} else {
exchange.getResponseBody().close(); // Important for 204
}
} catch (Exception e) {
Msg.error(HttpUtil.class, "Error sending JSON response: " + e.getMessage(), e);
// Avoid sending another error response here to prevent potential loops
if (!exchange.getResponseHeaders().containsKey("Content-Type")) {
byte[] errorBytes = ("Internal Server Error: " + e.getMessage()).getBytes(StandardCharsets.UTF_8);
exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
exchange.sendResponseHeaders(500, errorBytes.length);
try (OutputStream os = exchange.getResponseBody()) {
os.write(errorBytes);
} catch (IOException writeErr) {
Msg.error(HttpUtil.class, "Failed to send even plain text error response", writeErr);
}
}
throw new IOException("Failed to send JSON response", e);
}
}
/**
* Sends a standardized error response using ResponseBuilder.
*/
public static void sendErrorResponse(HttpExchange exchange, int statusCode, String message, String errorCode, int port) throws IOException {
ResponseBuilder builder = new ResponseBuilder(exchange, port)
.success(false)
.error(message, errorCode);
sendJsonResponse(exchange, builder.build(), statusCode, port);
}
/**
* Parses query parameters from the URL.
*/
public static Map<String, String> parseQueryParams(HttpExchange exchange) {
Map<String, String> result = new HashMap<>();
String query = exchange.getRequestURI().getQuery();
if (query != null) {
String[] pairs = query.split("&");
for (String p : pairs) {
// Split on the first '=' only, so values that themselves contain '=' are kept
String[] kv = p.split("=", 2);
if (kv.length == 2) {
try {
result.put(kv[0], java.net.URLDecoder.decode(kv[1], StandardCharsets.UTF_8));
} catch (Exception e) {
Msg.warn(HttpUtil.class, "Failed to decode query parameter: " + kv[0]);
result.put(kv[0], kv[1]);
}
} else if (kv.length == 1 && !kv[0].isEmpty()) {
result.put(kv[0], "");
}
}
}
return result;
}
/**
* Parses POST body parameters strictly as JSON.
*/
public static Map<String, String> parseJsonPostParams(HttpExchange exchange) throws IOException {
byte[] body = exchange.getRequestBody().readAllBytes();
String bodyStr = new String(body, StandardCharsets.UTF_8);
// Log the raw request body at debug level only; it may contain sensitive data
Msg.debug(HttpUtil.class, "Raw request body: " + bodyStr);
Map<String, String> params = new HashMap<>();
try {
JsonObject json = gson.fromJson(bodyStr, JsonObject.class);
if (json == null) {
return params;
}
for (Map.Entry<String, JsonElement> entry : json.entrySet()) {
String key = entry.getKey();
JsonElement value = entry.getValue();
if (value.isJsonPrimitive()) {
params.put(key, value.getAsString());
} else {
params.put(key, value.toString()); // Stringify non-primitives
}
}
} catch (Exception e) {
Msg.error(HttpUtil.class, "Failed to parse JSON request body: " + bodyStr, e);
throw new IOException("Invalid JSON request body: " + e.getMessage(), e);
}
return params;
}
}


@ -0,0 +1,62 @@
package eu.starsong.ghidra.util;
import ghidra.program.model.listing.Program;
import ghidra.util.Msg;
import javax.swing.SwingUtilities;
import java.util.concurrent.atomic.AtomicReference;
public class TransactionHelper {
@FunctionalInterface
public interface GhidraSupplier<T> {
T get() throws Exception;
}
public static <T> T executeInTransaction(Program program, String transactionName, GhidraSupplier<T> operation)
throws TransactionException {
if (program == null) {
throw new IllegalArgumentException("Program cannot be null for transaction");
}
AtomicReference<T> result = new AtomicReference<>();
AtomicReference<Exception> exception = new AtomicReference<>();
try {
SwingUtilities.invokeAndWait(() -> {
int txId = -1;
boolean success = false;
try {
txId = program.startTransaction(transactionName);
if (txId < 0) {
throw new TransactionException("Failed to start transaction: " + transactionName);
}
result.set(operation.get());
success = true;
} catch (Exception e) {
exception.set(e);
Msg.error(TransactionHelper.class, "Transaction failed: " + transactionName, e);
} finally {
if (txId >= 0) {
if (!program.endTransaction(txId, success)) {
Msg.error(TransactionHelper.class, "Failed to end transaction: " + transactionName);
exception.set(new TransactionException("Failed to end transaction: " + transactionName));
}
}
}
});
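} catch (Exception e) {
if (e instanceof InterruptedException) {
Thread.currentThread().interrupt(); // Preserve the interrupt status for callers
}
throw new TransactionException("Swing thread execution failed", e);
}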
if (exception.get() != null) {
throw new TransactionException("Operation failed", exception.get());
}
return result.get();
}
public static class TransactionException extends Exception {
public TransactionException(String message) { super(message); }
public TransactionException(String message, Throwable cause) { super(message, cause); }
}
}


@ -1,7 +1,7 @@
 Manifest-Version: 1.0
-Plugin-Class: eu.starsong.ghidra.GhydraMCP
-Plugin-Name: GhydraMCP
-Plugin-Version: 11.3.1
+Plugin-Class: eu.starsong.ghidra.MCGhidra
+Plugin-Name: MCGhidra
+Plugin-Version: 11.4.2
 Bundle-Version: dev-SNAPSHOT
 Plugin-Author: LaurieWired, Teal Bauer
 Plugin-Description: Expose multiple Ghidra tools to MCP servers with variable management


@ -1,3 +1,9 @@
-Manifest-Version: 1.0
-GHIDRA_MODULE_NAME: GhydraMCP
-GHIDRA_MODULE_DESC: A multi-headed REST interface for Ghidra for use with MCP agents.
+# MCGhidra Module Manifest
+#
+# This file lists third-party libraries bundled with this extension and their licenses.
+# Module metadata (name, description, version) is defined in extension.properties.
+#
+# Format: MODULE FILE LICENSE: lib/filename.jar License Name
+#
+# Currently, MCGhidra has no bundled third-party libraries.
+# Gson is provided by Ghidra itself.


@ -1,6 +1,6 @@
-name=GhydraMCP
+name=MCGhidra
 description=A multi-headed REST interface for Ghidra for use with MCP agents.
 author=Laurie Wired, Teal Bauer
 createdOn=2025-03-29
-version=11.3.1
-ghidraVersion=11.3.1
+version=11.4.2
+ghidraVersion=11.4.2

src/mcghidra/__init__.py

@ -0,0 +1,15 @@
"""MCGhidra - Reverse engineering bridge for Ghidra.
Multi-instance Ghidra plugin with HATEOAS REST API and MCP server
for decompilation, analysis & binary manipulation.
"""
try:
from importlib.metadata import version
__version__ = version("mcghidra")
except Exception:
__version__ = "2025.12.1"
from .server import create_server, main
__all__ = ["create_server", "main", "__version__"]

src/mcghidra/__main__.py

@ -0,0 +1,9 @@
"""MCGhidra package entry point.
Allows running with: python -m mcghidra
"""
from .server import main
if __name__ == "__main__":
main()

src/mcghidra/config.py

@ -0,0 +1,137 @@
"""Configuration management for MCGhidra.
Handles environment variables, default settings, and runtime configuration.
"""
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
@dataclass
class DockerConfig:
"""Docker-specific configuration."""
# Docker image settings
image_name: str = "mcghidra"
image_tag: str = field(default_factory=lambda: os.environ.get("MCGHIDRAMCP_VERSION", "latest"))
# Default container settings
default_port: int = field(default_factory=lambda: int(os.environ.get("MCGHIDRA_PORT", "8192")))
default_memory: str = field(default_factory=lambda: os.environ.get("MCGHIDRA_MAXMEM", "2G"))
# Project directory (for building)
project_dir: Optional[Path] = None
# Auto-start settings
auto_start_enabled: bool = field(default_factory=lambda: os.environ.get("MCGHIDRA_DOCKER_AUTO", "false").lower() == "true")
auto_start_wait: bool = True
auto_start_timeout: float = 300.0
# Docker configuration instance
_docker_config: Optional[DockerConfig] = None
def get_docker_config() -> DockerConfig:
"""Get the Docker configuration instance."""
global _docker_config
if _docker_config is None:
_docker_config = DockerConfig()
return _docker_config
def set_docker_config(config: DockerConfig) -> None:
"""Set the Docker configuration instance."""
global _docker_config
_docker_config = config
@dataclass
class MCGhidraConfig:
"""Configuration for MCGhidra server."""
# Ghidra connection settings
ghidra_host: str = field(default_factory=lambda: os.environ.get("GHIDRA_HOST", "localhost"))
default_port: Optional[int] = None
# Port scanning ranges for instance discovery
quick_discovery_range: range = field(default_factory=lambda: range(18489, 18499))
full_discovery_range: range = field(default_factory=lambda: range(18400, 18600))
# HTTP client settings
request_timeout: float = 30.0
discovery_timeout: float = 0.5
# Pagination defaults
default_page_size: int = 50
max_page_size: int = 500
# Cursor management
cursor_ttl_seconds: int = 300 # 5 minutes
max_cursors_per_session: int = 100
# Response size limits (for return_all guard)
max_response_tokens: int = 8000 # Hard budget — guard triggers above this
large_response_threshold: int = 4000 # Warn above this in normal pagination
# Expected API version
expected_api_version: int = 2
# Feedback collection
feedback_enabled: bool = field(
default_factory=lambda: os.environ.get("MCGHIDRA_FEEDBACK", "true").lower() == "true"
)
feedback_db_path: str = field(
default_factory=lambda: os.environ.get(
"MCGHIDRA_FEEDBACK_DB",
str(Path.home() / ".mcghidra" / "feedback.db"),
)
)
# Resource caps for enumeration endpoints
resource_caps: dict = field(default_factory=lambda: {
"functions": 1000,
"strings": 500,
"data": 1000,
"structs": 500,
"xrefs": 500,
"symbols": 1000,
"segments": 500,
"variables": 1000,
"namespaces": 500,
"classes": 500,
"bookmarks": 1000,
"enums": 500,
"typedefs": 500,
})
def __post_init__(self):
"""Validate configuration after initialization."""
if self.default_page_size > self.max_page_size:
self.default_page_size = self.max_page_size
# Global configuration instance (can be replaced for testing)
_config: Optional[MCGhidraConfig] = None
def get_config() -> MCGhidraConfig:
"""Get the global configuration instance."""
global _config
if _config is None:
_config = MCGhidraConfig()
return _config
def set_config(config: MCGhidraConfig) -> None:
"""Set the global configuration instance."""
global _config
_config = config
def reset_config() -> None:
"""Reset to default configuration."""
global _config
_config = None
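
The `get_config` / `set_config` / `reset_config` trio above lends itself to per-test overrides. The following is a minimal sketch of that pattern, not the project's own test code: `DemoConfig` is a hypothetical, trimmed stand-in for `MCGhidraConfig`, and the host names are made up. It illustrates two behaviours visible in the class: environment variables are read when the instance is created (via `default_factory`), and `__post_init__` clamps `default_page_size` to `max_page_size`.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DemoConfig:
    """Hypothetical, trimmed stand-in for MCGhidraConfig."""

    ghidra_host: str = field(
        default_factory=lambda: os.environ.get("GHIDRA_HOST", "localhost")
    )
    default_page_size: int = 50
    max_page_size: int = 500

    def __post_init__(self):
        # Same clamp as the real class: the default may not exceed the max.
        if self.default_page_size > self.max_page_size:
            self.default_page_size = self.max_page_size


_config: Optional[DemoConfig] = None


def get_config() -> DemoConfig:
    global _config
    if _config is None:
        _config = DemoConfig()  # the environment is read here, not at import
    return _config


def set_config(config: DemoConfig) -> None:
    global _config
    _config = config


def reset_config() -> None:
    global _config
    _config = None


# Override for one test, then fall back to environment-driven defaults.
set_config(DemoConfig(ghidra_host="ghidra.test", default_page_size=9999))
overridden = get_config()

reset_config()
os.environ["GHIDRA_HOST"] = "10.0.0.5"
fresh = get_config()  # rebuilt lazily, so it sees the new environment value
```

Because the instance is built lazily, changing an environment variable only takes effect after `reset_config()`; an already-created instance keeps the values it read at construction.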


@ -0,0 +1,67 @@
"""Core infrastructure for MCGhidra.
Contains HTTP client, pagination, progress reporting, and logging utilities.
"""
from .filtering import (
apply_grep,
estimate_and_guard,
project_fields,
)
from .http_client import (
get_instance_url,
safe_delete,
safe_get,
safe_patch,
safe_post,
safe_put,
simplify_response,
)
from .logging import (
log_debug,
log_error,
log_info,
log_warning,
)
from .pagination import (
CursorManager,
CursorState,
estimate_tokens,
get_cursor_manager,
paginate_response,
)
from .progress import (
ProgressReporter,
report_progress,
report_step,
)
__all__ = [
# HTTP client
"safe_get",
"safe_post",
"safe_put",
"safe_patch",
"safe_delete",
"simplify_response",
"get_instance_url",
# Pagination
"CursorManager",
"CursorState",
"paginate_response",
"get_cursor_manager",
"estimate_tokens",
# Progress
"ProgressReporter",
"report_progress",
"report_step",
# Filtering
"project_fields",
"apply_grep",
"estimate_and_guard",
# Logging
"log_info",
"log_debug",
"log_warning",
"log_error",
]


@ -0,0 +1,237 @@
"""Field projection and response size guard for MCGhidra.
Provides jq-style field projection, grep filtering, and token budget
enforcement to prevent oversized MCP tool results.
"""
import json
import re
import time
from typing import Any, Dict, List, Optional
from ..config import get_config
# Token estimation (same ratio as pagination.py)
TOKEN_ESTIMATION_RATIO = 4.0
def project_fields(items: List[Any], fields: List[str]) -> List[Any]:
"""Select only specified keys from each item (jq-style projection).
Works on dicts and strings. For dicts, returns only the requested
keys. For non-dict items (e.g. lines of decompiled code), returns
them unchanged.
Args:
items: List of items to project
fields: List of field names to keep
Returns:
List of projected items
"""
if not fields or not items:
return items
field_set = set(fields)
projected = []
for item in items:
if isinstance(item, dict):
projected.append({k: v for k, v in item.items() if k in field_set})
else:
projected.append(item)
return projected
def apply_grep(items: List[Any], pattern: str, ignorecase: bool = True) -> List[Any]:
"""Filter items by regex pattern across all string values.
Searches all string-coercible values in each item. For dicts,
searches all values recursively. For strings, searches directly.
Args:
items: List of items to filter
pattern: Regex pattern string
ignorecase: Case-insensitive matching (default True)
Returns:
Filtered list of matching items
"""
if not pattern or not items:
return items
flags = re.IGNORECASE if ignorecase else 0
compiled = re.compile(pattern, flags)
return [item for item in items if _matches(item, compiled)]
def _matches(item: Any, pattern: re.Pattern, depth: int = 0) -> bool:
"""Check if item matches pattern (recursive for nested structures)."""
if depth > 10:
return False
if isinstance(item, dict):
for value in item.values():
if isinstance(value, str) and pattern.search(value):
return True
elif isinstance(value, (int, float)):
if pattern.search(str(value)):
return True
elif isinstance(value, (dict, list, tuple)):
if _matches(value, pattern, depth + 1):
return True
return False
elif isinstance(item, (list, tuple)):
return any(_matches(i, pattern, depth + 1) for i in item)
elif isinstance(item, str):
return bool(pattern.search(item))
else:
return bool(pattern.search(str(item)))
def _estimate_tokens(data: Any) -> int:
"""Estimate token count from serialized JSON size.
Uses a simple heuristic: ~4 characters per token on average.
This matches the TOKEN_ESTIMATION_RATIO constant.
Args:
data: Any JSON-serializable data structure
Returns:
Estimated token count
"""
text = json.dumps(data, default=str)
return int(len(text) / TOKEN_ESTIMATION_RATIO)
def _extract_available_fields(items: List[Any]) -> List[str]:
"""Extract the set of field names from the first few dict items.
Samples up to 5 items to discover available keys, useful for
suggesting field projections to reduce response size.
Args:
items: List of items (only dicts are examined)
Returns:
Sorted list of unique field names (excludes internal _links)
"""
fields = set()
for item in items[:5]:
if isinstance(item, dict):
fields.update(item.keys())
# Remove internal/HATEOAS fields
fields.discard("_links")
return sorted(fields)
def estimate_and_guard(
data: List[Any],
tool_name: str,
budget: Optional[int] = None,
query_hints: Optional[Dict[str, Any]] = None,
) -> Optional[Dict[str, Any]]:
"""Check if data exceeds token budget; return guard response if so.
If data fits within budget, returns None (caller should proceed
normally). If data exceeds budget, returns a structured summary
with instructions for narrowing the query.
Args:
data: The full data list to check
tool_name: Name of the tool (for hint messages)
budget: Token budget override (defaults to config.max_response_tokens)
query_hints: Original query params (for building hint commands)
Returns:
None if data fits within budget, or a guard response dict
"""
config = get_config()
if budget is None:
budget = config.max_response_tokens
estimated = _estimate_tokens(data)
if estimated <= budget:
return None
# Build sample from first 3 items
sample = data[:3]
available_fields = _extract_available_fields(data)
# Build actionable hints based on the tool name
hints = _build_hints(tool_name, available_fields, query_hints)
return {
"success": True,
"guarded": True,
"total_count": len(data),
"estimated_tokens": estimated,
"budget": budget,
"sample": sample,
"available_fields": available_fields,
"message": (
"Response too large (%d items, ~%s tokens, budget: %s). "
"To read this data:\n%s"
) % (
len(data),
_format_tokens(estimated),
_format_tokens(budget),
hints,
),
"timestamp": int(time.time() * 1000),
}
def _format_tokens(n: int) -> str:
"""Format token count for human-readable display.
Large numbers are abbreviated with 'k' suffix for readability
in error messages and hints.
Args:
n: Token count
Returns:
Formatted string (e.g., 45000 -> '45k', 500 -> '500')
"""
if n >= 1000:
return "%dk" % (n // 1000)
return str(n)
def _build_hints(
tool_name: str,
available_fields: List[str],
query_hints: Optional[Dict[str, Any]] = None,
) -> str:
"""Build actionable hint text for the guard message."""
lines = []
# Pagination hint
lines.append(
" - Paginate: %s(page_size=50) then cursor_next(cursor_id='...')"
% tool_name
)
# Grep hint
grep_example = "main" if "functions" in tool_name else ".*pattern.*"
lines.append(
" - Filter: %s(grep='%s')" % (tool_name, grep_example)
)
# Fields hint (only if we have dict items with fields)
if available_fields:
short_fields = available_fields[:2]
lines.append(
" - Project: %s(fields=%s)" % (tool_name, short_fields)
)
# Combined hint
if available_fields:
lines.append(
" - Combine: %s(grep='...', fields=%s, return_all=True)"
% (tool_name, available_fields[:2])
)
return "\n".join(lines)
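
The guard's hints suggest narrowing a result with `grep` and `fields` before asking for everything. A rough, self-contained sketch of why that helps is given below. The helpers `grep`, `project`, and `estimate_tokens` are simplified re-implementations written for this illustration (they are not the module's functions and skip the recursive matching), and the function list is synthetic data.

```python
import json
import re
from typing import Any, Dict, List

CHARS_PER_TOKEN = 4.0  # same heuristic as TOKEN_ESTIMATION_RATIO above
BUDGET = 8000          # mirrors the max_response_tokens default


def grep(items: List[Dict[str, Any]], pattern: str) -> List[Dict[str, Any]]:
    """Keep items where any value, coerced to str, matches the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [i for i in items if any(rx.search(str(v)) for v in i.values())]


def project(items: List[Dict[str, Any]], fields: List[str]) -> List[Dict[str, Any]]:
    """Keep only the requested keys of each item."""
    keep = set(fields)
    return [{k: v for k, v in i.items() if k in keep} for i in items]


def estimate_tokens(data: Any) -> int:
    """Estimate tokens from the serialized JSON length."""
    return int(len(json.dumps(data, default=str)) / CHARS_PER_TOKEN)


# Synthetic listing: 2000 auto-named functions plus one named "main".
functions = [
    {
        "name": f"FUN_{0x401000 + i:08x}",
        "address": f"{0x401000 + i:08x}",
        "signature": "undefined4 f(void)",
        "_links": {"self": {"href": f"/functions/{0x401000 + i:08x}"}},
    }
    for i in range(2000)
]
functions.append(
    {"name": "main", "address": "00500000", "signature": "int main(void)", "_links": {}}
)

before = estimate_tokens(functions)                       # far above BUDGET
narrowed = project(grep(functions, r"^main$"), ["name", "address"])
after = estimate_tokens(narrowed)                         # a handful of tokens
```

Filtering first and projecting second keeps the regex able to match on fields that the projection would otherwise drop.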


@ -0,0 +1,400 @@
"""HTTP client for Ghidra REST API communication.
Provides safe request methods with error handling, HATEOAS compliance,
and response simplification for MCP tool consumption.
"""
import time
from typing import Any, Dict, Optional, Union
from urllib.parse import urlparse
import requests
from ..config import get_config
# Allowed origins for CORS-like validation
ALLOWED_ORIGINS = {
"http://localhost",
"http://127.0.0.1",
"https://localhost",
"https://127.0.0.1",
}
def validate_origin(headers: Dict[str, str]) -> bool:
"""Validate request origin against allowed origins.
Args:
headers: Request headers dict
Returns:
True if origin is allowed or not present
"""
origin = headers.get("Origin")
if not origin:
# No Origin header (non-browser client or same-origin request) - allow
return True
try:
parsed = urlparse(origin)
origin_base = f"{parsed.scheme}://{parsed.hostname}"
if parsed.port:
origin_base += f":{parsed.port}"
except Exception:
return False
return origin_base in ALLOWED_ORIGINS
def get_instance_url(port: int, host: Optional[str] = None) -> str:
"""Get URL for a Ghidra instance by port.
Args:
port: Port number
host: Optional host override (defaults to config)
Returns:
Full URL for the Ghidra instance
"""
if host is None:
host = get_config().ghidra_host
return f"http://{host}:{port}"
def _make_request(
method: str,
port: int,
endpoint: str,
params: Optional[Dict[str, Any]] = None,
json_data: Optional[Dict[str, Any]] = None,
data: Optional[str] = None,
headers: Optional[Dict[str, str]] = None,
host: Optional[str] = None,
) -> Dict[str, Any]:
"""Make HTTP request to Ghidra instance with error handling.
Args:
method: HTTP method (GET, POST, PUT, PATCH, DELETE)
port: Ghidra instance port
endpoint: API endpoint path
params: Query parameters
json_data: JSON payload for POST/PUT/PATCH
data: Raw text payload
headers: Additional headers
host: Optional host override
Returns:
Response dict with success flag and result or error
"""
config = get_config()
url = f"{get_instance_url(port, host)}/{endpoint}"
# Set up headers for HATEOAS API
request_headers = {
"Accept": "application/json",
"X-Request-ID": f"mcp-bridge-{int(time.time() * 1000)}",
}
if headers:
request_headers.update(headers)
# Validate origin for state-changing requests
is_state_changing = method.upper() in ["POST", "PUT", "PATCH", "DELETE"]
if is_state_changing:
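# safe_post/safe_put/safe_patch pop "headers" out of the JSON payload before
# calling, so check the explicit headers argument first
check_headers = headers or (
json_data.get("headers", {})
if isinstance(json_data, dict)
else {}
)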
if not validate_origin(check_headers):
return {
"success": False,
"error": {
"code": "ORIGIN_NOT_ALLOWED",
"message": "Origin not allowed for state-changing request",
},
"status_code": 403,
"timestamp": int(time.time() * 1000),
}
if json_data is not None:
request_headers["Content-Type"] = "application/json"
elif data is not None:
request_headers["Content-Type"] = "text/plain"
try:
response = requests.request(
method,
url,
params=params,
json=json_data,
data=data,
headers=request_headers,
timeout=config.request_timeout,
)
try:
parsed_json = response.json()
# Add timestamp if not present
if isinstance(parsed_json, dict) and "timestamp" not in parsed_json:
parsed_json["timestamp"] = int(time.time() * 1000)
# Normalize error format
if (
not response.ok
and isinstance(parsed_json, dict)
and "success" in parsed_json
and not parsed_json["success"]
):
if "error" in parsed_json and not isinstance(
parsed_json["error"], dict
):
error_message = parsed_json["error"]
parsed_json["error"] = {
"code": f"HTTP_{response.status_code}",
"message": error_message,
}
return parsed_json
except ValueError:
if response.ok:
return {
"success": False,
"error": {
"code": "NON_JSON_RESPONSE",
"message": "Received non-JSON success response",
},
"status_code": response.status_code,
"response_text": response.text[:500],
"timestamp": int(time.time() * 1000),
}
else:
return {
"success": False,
"error": {
"code": f"HTTP_{response.status_code}",
"message": f"Non-JSON error: {response.text[:100]}...",
},
"status_code": response.status_code,
"response_text": response.text[:500],
"timestamp": int(time.time() * 1000),
}
except requests.exceptions.Timeout:
return {
"success": False,
"error": {"code": "REQUEST_TIMEOUT", "message": "Request timed out"},
"status_code": 408,
"timestamp": int(time.time() * 1000),
}
except requests.exceptions.ConnectionError:
return {
"success": False,
"error": {
"code": "CONNECTION_ERROR",
"message": f"Failed to connect to Ghidra instance at {url}",
},
"status_code": 503,
"timestamp": int(time.time() * 1000),
}
except Exception as e:
return {
"success": False,
"error": {
"code": "UNEXPECTED_ERROR",
"message": f"Unexpected error: {str(e)}",
},
"exception": e.__class__.__name__,
"timestamp": int(time.time() * 1000),
}
def safe_get(
port: int, endpoint: str, params: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
"""Make GET request to Ghidra instance.
Args:
port: Ghidra instance port
endpoint: API endpoint path
params: Query parameters
Returns:
Response dict
"""
return _make_request("GET", port, endpoint, params=params)
def safe_post(
port: int, endpoint: str, data: Union[Dict[str, Any], str]
) -> Dict[str, Any]:
"""Make POST request to Ghidra instance.
Args:
port: Ghidra instance port
endpoint: API endpoint path
data: JSON dict or raw string payload
Returns:
Response dict
"""
headers = None
json_payload = None
text_payload = None
if isinstance(data, dict):
data = data.copy() # Don't mutate caller's dict
headers = data.pop("headers", None)
json_payload = data
else:
text_payload = data
return _make_request(
"POST", port, endpoint, json_data=json_payload, data=text_payload, headers=headers
)
def safe_put(port: int, endpoint: str, data: Dict[str, Any]) -> Dict[str, Any]:
"""Make PUT request to Ghidra instance.
Args:
port: Ghidra instance port
endpoint: API endpoint path
data: JSON payload
Returns:
Response dict
"""
if isinstance(data, dict):
data = data.copy() # Don't mutate caller's dict
headers = data.pop("headers", None)
else:
headers = None
return _make_request("PUT", port, endpoint, json_data=data, headers=headers)
def safe_patch(port: int, endpoint: str, data: Dict[str, Any]) -> Dict[str, Any]:
"""Make PATCH request to Ghidra instance.
Args:
port: Ghidra instance port
endpoint: API endpoint path
data: JSON payload
Returns:
Response dict
"""
if isinstance(data, dict):
data = data.copy() # Don't mutate caller's dict
headers = data.pop("headers", None)
else:
headers = None
return _make_request("PATCH", port, endpoint, json_data=data, headers=headers)
def safe_delete(port: int, endpoint: str) -> Dict[str, Any]:
"""Make DELETE request to Ghidra instance.
Args:
port: Ghidra instance port
endpoint: API endpoint path
Returns:
Response dict
"""
return _make_request("DELETE", port, endpoint)
def simplify_response(response: Dict[str, Any]) -> Dict[str, Any]:
"""Simplify HATEOAS response for MCP tool consumption.
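- Replaces _links on result entries with flat "<name>_url" fields
- Collects top-level _links into an "api_links" dict
- Preserves API metadata (id, instance, timestamp, size, offset, limit)
- Renders disassembly instructions as "disassembly_text" and exposes
decompiled code as "decompiled_text"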
Args:
response: Raw API response
Returns:
Simplified response dict
"""
if not isinstance(response, dict):
return response
result = response.copy()
# Store API metadata
api_metadata = {}
for key in ["id", "instance", "timestamp", "size", "offset", "limit"]:
if key in result:
api_metadata[key] = result.get(key)
# Simplify result data
if "result" in result:
if isinstance(result["result"], list):
simplified_items = []
for item in result["result"]:
if isinstance(item, dict):
item_copy = item.copy()
links = item_copy.pop("_links", None)
if isinstance(links, dict):
for link_name, link_data in links.items():
if isinstance(link_data, dict) and "href" in link_data:
item_copy[f"{link_name}_url"] = link_data["href"]
simplified_items.append(item_copy)
else:
simplified_items.append(item)
result["result"] = simplified_items
elif isinstance(result["result"], dict):
result_copy = result["result"].copy()
links = result_copy.pop("_links", None)
if isinstance(links, dict):
for link_name, link_data in links.items():
if isinstance(link_data, dict) and "href" in link_data:
result_copy[f"{link_name}_url"] = link_data["href"]
# Convert disassembly to text
if "instructions" in result_copy and isinstance(
result_copy["instructions"], list
):
disasm_text = ""
for instr in result_copy["instructions"]:
if isinstance(instr, dict):
addr = instr.get("address", "")
mnemonic = instr.get("mnemonic", "")
operands = instr.get("operands", "")
bytes_str = instr.get("bytes", "")
disasm_text += (
f"{addr}: {bytes_str.ljust(10)} {mnemonic} {operands}\n"
)
result_copy["disassembly_text"] = disasm_text
result_copy.pop("instructions", None)
# Make decompiled code accessible
if "ccode" in result_copy:
result_copy["decompiled_text"] = result_copy["ccode"]
elif "decompiled" in result_copy:
result_copy["decompiled_text"] = result_copy["decompiled"]
result["result"] = result_copy
# Simplify top-level links
links = result.pop("_links", None)
if isinstance(links, dict):
api_links = {}
for link_name, link_data in links.items():
if isinstance(link_data, dict) and "href" in link_data:
api_links[link_name] = link_data["href"]
if api_links:
result["api_links"] = api_links
# Restore metadata
for key, value in api_metadata.items():
if key not in result:
result[key] = value
return result
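
The link handling in simplify_response can be illustrated in isolation. The snippet below is a minimal sketch, not the project's code: flatten_links is a hypothetical helper that mirrors the per-item logic, and the sample payload (function name, address, href) is invented for illustration.

```python
from typing import Any, Dict


def flatten_links(item: Dict[str, Any]) -> Dict[str, Any]:
    """Replace a HATEOAS `_links` dict with flat `<name>_url` keys."""
    flat = dict(item)  # shallow copy so the caller's dict is untouched
    links = flat.pop("_links", None)
    if isinstance(links, dict):
        for name, data in links.items():
            # Only well-formed link objects are kept; anything else is dropped.
            if isinstance(data, dict) and "href" in data:
                flat[f"{name}_url"] = data["href"]
    return flat


# Hypothetical API item, for illustration only.
sample = {
    "name": "main",
    "address": "0x401000",
    "_links": {"self": {"href": "/functions/0x401000"}, "bad": "not-a-dict"},
}
flattened = flatten_links(sample)
print(flattened)
```

Malformed link entries (here the "bad" key) disappear silently, which matches the isinstance checks in the function above.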


@ -0,0 +1,88 @@
"""Logging utilities for MCP context-aware logging.
Provides async logging functions that use FastMCP's Context for
client-visible logging when available, with fallback to standard logging.
"""
import logging
from typing import TYPE_CHECKING, Optional
if TYPE_CHECKING:
from fastmcp import Context
# Standard Python logger as fallback
logger = logging.getLogger("mcghidra")
async def log_debug(ctx: Optional["Context"], message: str) -> None:
"""Log a debug message to the MCP client and/or standard logger.
Args:
ctx: FastMCP context (may be None)
message: Debug message to log
"""
logger.debug(message)
if ctx is not None:
try:
await ctx.debug(message)
except Exception:
pass # Silently ignore if context doesn't support logging
async def log_info(ctx: Optional["Context"], message: str) -> None:
"""Log an info message to the MCP client and/or standard logger.
Args:
ctx: FastMCP context (may be None)
message: Info message to log
"""
logger.info(message)
if ctx is not None:
try:
await ctx.info(message)
except Exception:
pass
async def log_warning(ctx: Optional["Context"], message: str) -> None:
"""Log a warning message to the MCP client and/or standard logger.
Args:
ctx: FastMCP context (may be None)
message: Warning message to log
"""
logger.warning(message)
if ctx is not None:
try:
await ctx.warning(message)
except Exception:
pass
async def log_error(ctx: Optional["Context"], message: str) -> None:
"""Log an error message to the MCP client and/or standard logger.
Args:
ctx: FastMCP context (may be None)
message: Error message to log
"""
logger.error(message)
if ctx is not None:
try:
await ctx.error(message)
except Exception:
pass
def configure_logging(level: int = logging.INFO) -> None:
    """Configure the standard logger for MCGhidra.

    Args:
        level: Logging level (default: INFO)
    """
    # Attach the stream handler only once so that repeated calls do not
    # duplicate every log line.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
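
The "log to the standard logger, then best-effort to the client" pattern used by the log_* helpers can be exercised without FastMCP. This is a sketch under stated assumptions: StubContext is a hypothetical stand-in for a FastMCP Context and implements only an async info method; the log_info below is a local copy of the pattern, not an import from the package.

```python
import asyncio
import logging
from typing import List, Optional

logger = logging.getLogger("example")


class StubContext:
    """Hypothetical stand-in for a FastMCP Context; records info messages."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.messages: List[str] = []

    async def info(self, message: str) -> None:
        if self.fail:
            raise RuntimeError("client went away")
        self.messages.append(message)


async def log_info(ctx: Optional[StubContext], message: str) -> None:
    logger.info(message)  # the standard logger always receives the message
    if ctx is not None:
        try:
            await ctx.info(message)
        except Exception:
            pass  # a failing client must not break the tool call


async def demo() -> List[str]:
    good, broken = StubContext(), StubContext(fail=True)
    await log_info(good, "import started")
    await log_info(broken, "import started")  # exception is swallowed
    await log_info(None, "import started")  # standard logger only
    return good.messages


delivered = asyncio.run(demo())
```

The trade-off of the broad except is that client-side delivery failures are invisible; the standard logger remains the reliable record.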


@ -0,0 +1,529 @@
"""Cursor-based pagination system for large MCP responses.
Provides efficient pagination with grep filtering, session isolation,
and TTL-based cursor expiration.
"""
import hashlib
import json
import re
import time
from collections import OrderedDict
from dataclasses import dataclass, field
from threading import Lock
from typing import Any, Dict, List, Optional, Tuple
from ..config import get_config
from .filtering import estimate_and_guard, project_fields
# ReDoS Protection Configuration
MAX_GREP_PATTERN_LENGTH = 500
MAX_GREP_REPETITION_OPS = 15
MAX_GREP_RECURSION_DEPTH = 10
# Token estimation (roughly 4 chars per token)
TOKEN_ESTIMATION_RATIO = 4.0
def compile_safe_pattern(pattern: str, flags: int = 0) -> re.Pattern:
"""Compile regex pattern with ReDoS protection.
Validates pattern to prevent catastrophic backtracking attacks.
Args:
pattern: Regex pattern string
flags: Regex compilation flags
Returns:
Compiled regex pattern
Raises:
ValueError: If pattern fails safety validation
"""
if not pattern:
raise ValueError("Empty pattern")
if len(pattern) > MAX_GREP_PATTERN_LENGTH:
raise ValueError(
f"Pattern too long ({len(pattern)} chars, max {MAX_GREP_PATTERN_LENGTH}). "
"Consider using a simpler pattern."
)
# Count repetition operators (coarse: every "?" is counted, including
# lazy quantifiers and "(?:...)" group syntax)
repetition_ops = pattern.count("*") + pattern.count("+") + pattern.count("?")
repetition_ops += len(re.findall(r"\{[0-9,]+\}", pattern))
if repetition_ops > MAX_GREP_REPETITION_OPS:
raise ValueError(
f"Pattern has too many repetition operators ({repetition_ops}, "
f"max {MAX_GREP_REPETITION_OPS}). Consider simplifying."
)
# Check for dangerous nested quantifiers
dangerous_patterns = [
r"\([^)]*[*+][^)]*\)[*+]", # (a+)+ or (a*)*
r"\([^)]*[*+][^)]*\)\{", # (a+){n,m}
]
for dangerous in dangerous_patterns:
if re.search(dangerous, pattern):
raise ValueError(
"Pattern contains nested quantifiers which could cause "
"exponential backtracking. Consider simplifying."
)
try:
return re.compile(pattern, flags)
except re.error as e:
raise ValueError(f"Invalid regex pattern: {e}") from e
@dataclass
class CursorState:
"""Represents the state of a paginated query with session isolation."""
cursor_id: str
session_id: str
tool_name: str
query_hash: str
data: List[Any]
total_count: int
filtered_count: int
current_offset: int = 0
page_size: int = 50
grep_pattern: Optional[str] = None
grep_flags: int = 0
created_at: float = field(default_factory=time.time)
last_accessed: float = field(default_factory=time.time)
@property
def is_expired(self) -> bool:
config = get_config()
return time.time() - self.last_accessed > config.cursor_ttl_seconds
@property
def has_more(self) -> bool:
return self.current_offset + self.page_size < self.filtered_count
@property
def current_page(self) -> int:
return (self.current_offset // self.page_size) + 1
@property
def total_pages(self) -> int:
return max(1, (self.filtered_count + self.page_size - 1) // self.page_size)
@property
def ttl_remaining(self) -> int:
config = get_config()
return max(0, int(config.cursor_ttl_seconds - (time.time() - self.last_accessed)))
def verify_session(self, session_id: str) -> bool:
"""Verify cursor belongs to requesting session."""
return self.session_id == session_id
class CursorManager:
"""Thread-safe cursor manager with TTL-based expiration and session isolation."""
def __init__(self):
self._cursors: OrderedDict[str, CursorState] = OrderedDict()
self._session_cursors: Dict[str, set] = {}
self._lock = Lock()
def _generate_cursor_id(self, query_hash: str, session_id: str) -> str:
"""Generate a unique cursor ID."""
unique = f"{session_id}-{query_hash}-{time.time()}-{id(self)}"
return hashlib.sha256(unique.encode()).hexdigest()[:16]
def _cleanup_expired(self) -> None:
"""Remove expired cursors (call while holding lock)."""
config = get_config()
expired = [cid for cid, state in self._cursors.items() if state.is_expired]
for cid in expired:
state = self._cursors[cid]
if state.session_id in self._session_cursors:
self._session_cursors[state.session_id].discard(cid)
del self._cursors[cid]
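# LRU eviction: max_cursors_per_session is applied here as a cap on the
# whole cache (all sessions combined), oldest-accessed cursors first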
while len(self._cursors) > config.max_cursors_per_session:
oldest_id, oldest_state = self._cursors.popitem(last=False)
if oldest_state.session_id in self._session_cursors:
self._session_cursors[oldest_state.session_id].discard(oldest_id)
def create_cursor(
self,
data: List[Any],
query_params: Dict[str, Any],
tool_name: str = "unknown",
session_id: str = "default",
grep_pattern: Optional[str] = None,
grep_flags: int = 0,
page_size: int = 50,
) -> Tuple[str, CursorState]:
"""Create a new cursor for paginated results.
Args:
data: The full result set to paginate
query_params: Original query parameters (for hashing)
tool_name: Name of tool creating cursor
session_id: Session identifier for isolation
grep_pattern: Optional regex pattern to filter results
grep_flags: Regex flags
page_size: Items per page
Returns:
Tuple of (cursor_id, cursor_state)
"""
config = get_config()
# Apply grep filtering
filtered_data = data
if grep_pattern:
pattern = compile_safe_pattern(grep_pattern, grep_flags)
filtered_data = [
item for item in data if self._matches_grep(item, pattern)
]
# Create query hash (SHA-256 for consistency with cursor ID generation)
query_hash = hashlib.sha256(
json.dumps(query_params, sort_keys=True, default=str).encode()
).hexdigest()[:12]
with self._lock:
self._cleanup_expired()
cursor_id = self._generate_cursor_id(query_hash, session_id)
state = CursorState(
cursor_id=cursor_id,
session_id=session_id,
tool_name=tool_name,
query_hash=query_hash,
data=filtered_data,
total_count=len(data),
filtered_count=len(filtered_data),
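# Clamp to [1, max_page_size]; a page size of 0 would divide by zero
# in current_page / total_pages
page_size=max(1, min(page_size, config.max_page_size)),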
grep_pattern=grep_pattern,
grep_flags=grep_flags,
)
self._cursors[cursor_id] = state
if session_id not in self._session_cursors:
self._session_cursors[session_id] = set()
self._session_cursors[session_id].add(cursor_id)
return cursor_id, state
def get_cursor(
self, cursor_id: str, session_id: Optional[str] = None
) -> Optional[CursorState]:
"""Retrieve a cursor by ID, optionally validating session."""
with self._lock:
self._cleanup_expired()
if cursor_id not in self._cursors:
return None
state = self._cursors[cursor_id]
if state.is_expired:
del self._cursors[cursor_id]
if state.session_id in self._session_cursors:
self._session_cursors[state.session_id].discard(cursor_id)
return None
if session_id and not state.verify_session(session_id):
return None
state.last_accessed = time.time()
self._cursors.move_to_end(cursor_id)
return state
def advance_cursor(
self, cursor_id: str, session_id: Optional[str] = None
) -> Optional[CursorState]:
"""Advance cursor to next page."""
with self._lock:
state = self._cursors.get(cursor_id)
if not state or state.is_expired:
return None
if session_id and not state.verify_session(session_id):
return None
state.current_offset += state.page_size
state.last_accessed = time.time()
self._cursors.move_to_end(cursor_id)
return state
def delete_cursor(
self, cursor_id: str, session_id: Optional[str] = None
) -> bool:
"""Explicitly delete a cursor."""
with self._lock:
if cursor_id not in self._cursors:
return False
state = self._cursors[cursor_id]
if session_id and not state.verify_session(session_id):
return False
if state.session_id in self._session_cursors:
self._session_cursors[state.session_id].discard(cursor_id)
del self._cursors[cursor_id]
return True
def delete_session_cursors(self, session_id: str) -> int:
"""Delete all cursors for a session."""
with self._lock:
if session_id not in self._session_cursors:
return 0
cursor_ids = list(self._session_cursors[session_id])
count = 0
for cid in cursor_ids:
if cid in self._cursors:
del self._cursors[cid]
count += 1
del self._session_cursors[session_id]
return count
def get_page(self, state: CursorState) -> List[Any]:
"""Get current page of data from cursor state."""
start = state.current_offset
end = start + state.page_size
return state.data[start:end]
def _matches_grep(
self, item: Any, pattern: re.Pattern, depth: int = 0
) -> bool:
"""Check if an item matches the grep pattern.
Searches through string representations of dict values,
list items, or the item itself.
"""
if depth > MAX_GREP_RECURSION_DEPTH:
return False
if isinstance(item, dict):
for value in item.values():
if isinstance(value, str) and pattern.search(value):
return True
elif isinstance(value, (int, float)):
if pattern.search(str(value)):
return True
elif isinstance(value, dict):
if self._matches_grep(value, pattern, depth + 1):
return True
elif isinstance(value, (list, tuple)):
if self._matches_grep(value, pattern, depth + 1):
return True
return False
elif isinstance(item, (list, tuple)):
return any(self._matches_grep(i, pattern, depth + 1) for i in item)
elif isinstance(item, str):
return bool(pattern.search(item))
else:
return bool(pattern.search(str(item)))
def list_cursors(self, session_id: Optional[str] = None) -> List[Dict[str, Any]]:
"""List active cursors, optionally filtered by session."""
with self._lock:
self._cleanup_expired()
return [
{
"cursor_id": cid,
"session_id": state.session_id,
"tool_name": state.tool_name,
"total_count": state.total_count,
"filtered_count": state.filtered_count,
"current_page": state.current_page,
"total_pages": state.total_pages,
"current_offset": state.current_offset,
"page_size": state.page_size,
"has_more": state.has_more,
"grep_pattern": state.grep_pattern,
"age_seconds": int(time.time() - state.created_at),
"ttl_remaining": state.ttl_remaining,
}
for cid, state in self._cursors.items()
if session_id is None or state.session_id == session_id
]
def get_stats(self) -> Dict[str, Any]:
"""Get cursor manager statistics."""
config = get_config()
with self._lock:
self._cleanup_expired()
return {
"total_cursors": len(self._cursors),
"total_sessions": len(self._session_cursors),
"max_cache_size": config.max_cursors_per_session,
"ttl_seconds": config.cursor_ttl_seconds,
"cursors_per_session": {
sid: len(cids) for sid, cids in self._session_cursors.items()
},
}
# Global cursor manager instance
_cursor_manager: Optional[CursorManager] = None
def get_cursor_manager() -> CursorManager:
"""Get the global cursor manager instance."""
global _cursor_manager
if _cursor_manager is None:
_cursor_manager = CursorManager()
return _cursor_manager
def estimate_tokens(data: List[Any]) -> int:
"""Estimate token count for a list of items."""
text = json.dumps(data, default=str)
return int(len(text) / TOKEN_ESTIMATION_RATIO)
def paginate_response(
data: List[Any],
query_params: Dict[str, Any],
tool_name: str = "unknown",
session_id: str = "default",
page_size: int = 50,
grep: Optional[str] = None,
grep_ignorecase: bool = True,
return_all: bool = False,
fields: Optional[List[str]] = None,
) -> Dict[str, Any]:
"""Create a paginated response with optional grep filtering and field projection.
Args:
data: Full result list to paginate
query_params: Original query parameters (for cursor creation)
tool_name: Name of the tool creating this response
session_id: Session identifier for cursor isolation
page_size: Items per page (default: 50, max: 500)
grep: Optional regex pattern to filter results
grep_ignorecase: Case-insensitive grep (default: True)
return_all: Bypass pagination and return all results (with budget guard)
fields: Optional list of field names to project (jq-style)
Returns:
dict with pagination metadata and results
"""
config = get_config()
cursor_manager = get_cursor_manager()
grep_flags = re.IGNORECASE if grep_ignorecase else 0
# Handle return_all bypass
if return_all:
filtered_data = data
if grep:
try:
pattern = compile_safe_pattern(grep, grep_flags)
filtered_data = [
item
for item in data
if cursor_manager._matches_grep(item, pattern)
]
except ValueError as e:
return {
"success": False,
"error": {"code": "INVALID_GREP_PATTERN", "message": str(e)},
"timestamp": int(time.time() * 1000),
}
# Apply field projection before size estimation
if fields:
filtered_data = project_fields(filtered_data, fields)
# Check token budget — return guard if exceeded
guard = estimate_and_guard(
data=filtered_data,
tool_name=tool_name,
query_hints=query_params,
)
if guard is not None:
return guard
estimated_tokens = estimate_tokens(filtered_data)
warning = None
if estimated_tokens > 50000:
warning = f"EXTREMELY LARGE response (~{estimated_tokens:,} tokens)"
elif estimated_tokens > 20000:
warning = f"VERY LARGE response (~{estimated_tokens:,} tokens)"
elif estimated_tokens > config.large_response_threshold:
warning = f"Large response (~{estimated_tokens:,} tokens)"
return {
"success": True,
"result": filtered_data,
"pagination": {
"bypassed": True,
"total_count": len(data),
"filtered_count": len(filtered_data),
"grep_pattern": grep,
"fields_projected": fields,
"estimated_tokens": estimated_tokens,
"warning": warning,
},
"timestamp": int(time.time() * 1000),
}
# Normal pagination flow — apply field projection before cursoring
paginated_data = project_fields(data, fields) if fields else data
try:
cursor_id, state = cursor_manager.create_cursor(
data=paginated_data,
query_params=query_params,
tool_name=tool_name,
session_id=session_id,
grep_pattern=grep,
grep_flags=grep_flags,
page_size=min(page_size, config.max_page_size),
)
except ValueError as e:
return {
"success": False,
"error": {"code": "INVALID_GREP_PATTERN", "message": str(e)},
"timestamp": int(time.time() * 1000),
}
current_page = cursor_manager.get_page(state)
response_cursor = cursor_id if state.has_more else None
response = {
"success": True,
"result": current_page,
"pagination": {
"cursor_id": response_cursor,
"session_id": session_id,
"total_count": state.total_count,
"filtered_count": state.filtered_count,
"page_size": state.page_size,
"current_page": state.current_page,
"total_pages": state.total_pages,
"has_more": state.has_more,
"grep_pattern": grep,
"items_returned": len(current_page),
},
"timestamp": int(time.time() * 1000),
}
# Add LLM-friendly continuation message
if state.has_more:
remaining = state.filtered_count - (state.current_page * state.page_size)
response["_message"] = (
f"Showing {len(current_page)} of {state.filtered_count} items "
f"(page {state.current_page}/{state.total_pages}). "
f"To get the next {min(state.page_size, remaining)} items, call: "
f"cursor_next(cursor_id='{cursor_id}')"
)
else:
response["_message"] = (
f"Complete: {len(current_page)} items returned (all results)"
)
return response
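
The page arithmetic behind has_more, current_page and total_pages can be checked by hand. The following is a minimal sketch: PageState is a hypothetical, trimmed-down mirror of the paging fields on CursorState (no session, TTL or data), and the loop imitates what advance_cursor does to the offset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PageState:
    """Hypothetical mirror of the cursor's paging fields only."""

    filtered_count: int
    page_size: int = 50
    current_offset: int = 0

    @property
    def has_more(self) -> bool:
        return self.current_offset + self.page_size < self.filtered_count

    @property
    def current_page(self) -> int:
        return (self.current_offset // self.page_size) + 1

    @property
    def total_pages(self) -> int:
        # Ceiling division; an empty result still counts as one page.
        return max(1, (self.filtered_count + self.page_size - 1) // self.page_size)


state = PageState(filtered_count=120)
pages: List[Tuple[int, bool]] = []
while True:
    pages.append((state.current_page, state.has_more))
    if not state.has_more:
        break
    state.current_offset += state.page_size  # same step as advance_cursor

print(pages, state.total_pages)
```

With 120 filtered items and the default page size of 50, the walk visits three pages and the last one holds the remaining 20 items.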


@ -0,0 +1,161 @@
"""Progress reporting utilities for long-running operations.
Provides async progress reporting using FastMCP's Context for
real-time progress notifications to MCP clients.
"""
from typing import TYPE_CHECKING, Optional
if TYPE_CHECKING:
from fastmcp import Context  # same import path as the other mcghidra modules
class ProgressReporter:
"""Helper class for reporting progress during long operations.
Provides throttled progress updates to avoid spamming the client
with too many notifications.
Example:
async def long_operation(ctx: Context):
progress = ProgressReporter(ctx, "Scanning", total=100)
for i in range(100):
await progress.update(message=f"Processing item {i}")
await progress.complete("Scan finished")
"""
def __init__(
self,
ctx: Optional["Context"],
operation: str,
total: int = 100
):
"""Initialize the progress reporter.
Args:
ctx: FastMCP context for progress reporting (may be None)
operation: Name of the operation (used in log messages)
total: Total number of steps (default: 100)
"""
self.ctx = ctx
self.operation = operation
self.total = total
self.current = 0
self._last_reported = 0
# Throttle: report at most once per 5% of total (every step when total < 20)
self._report_threshold = max(1, total // 20)
async def update(
self,
progress: Optional[int] = None,
message: Optional[str] = None
) -> None:
"""Update progress, reporting to client if threshold reached.
Args:
progress: Current progress value (if None, increments by 1)
message: Optional message to log with the progress update
"""
if progress is not None:
self.current = progress
else:
self.current += 1
# Only report if we've crossed a threshold or reached the end
should_report = (
self.current - self._last_reported >= self._report_threshold
or self.current >= self.total
)
if self.ctx and should_report:
try:
await self.ctx.report_progress(
progress=self.current,
total=self.total
)
if message:
await self.ctx.info(f"{self.operation}: {message}")
self._last_reported = self.current
except Exception:
pass # Silently ignore if context doesn't support progress
async def info(self, message: str) -> None:
"""Send an info message to the client.
Args:
message: Message to send
"""
if self.ctx:
try:
await self.ctx.info(f"{self.operation}: {message}")
except Exception:
pass
async def complete(self, message: Optional[str] = None) -> None:
"""Mark operation as complete.
Args:
message: Optional completion message (supports format placeholders:
{count}, {total}, {operation})
"""
self.current = self.total
if self.ctx:
try:
await self.ctx.report_progress(
progress=self.total,
total=self.total
)
if message:
formatted = message.format(
count=self.current,
total=self.total,
operation=self.operation
)
await self.ctx.info(formatted)
except Exception:
pass
async def report_progress(
ctx: Optional["Context"],
progress: int,
total: int,
message: Optional[str] = None
) -> None:
"""Convenience function for one-off progress updates.
Args:
ctx: FastMCP context (may be None)
progress: Current progress value
total: Total progress value
message: Optional message to log
"""
if ctx:
try:
await ctx.report_progress(progress=progress, total=total)
if message:
await ctx.info(message)
except Exception:
pass
async def report_step(
ctx: Optional["Context"],
step: int,
total_steps: int,
description: str
) -> None:
"""Report a discrete step in a multi-step operation.
Args:
ctx: FastMCP context (may be None)
step: Current step number (1-indexed)
total_steps: Total number of steps
description: Description of the current step
"""
if ctx:
try:
await ctx.report_progress(progress=step, total=total_steps)
await ctx.info(f"Step {step}/{total_steps}: {description}")
except Exception:
pass
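
The throttling rule in ProgressReporter.update can be seen by counting how many notifications reach the client. This is a sketch, assuming a hypothetical RecordingContext that implements only an async report_progress method; the loop re-implements the threshold check rather than importing the class.

```python
import asyncio
from typing import List, Tuple


class RecordingContext:
    """Hypothetical stand-in for a FastMCP Context; records progress calls."""

    def __init__(self) -> None:
        self.reports: List[Tuple[int, int]] = []

    async def report_progress(self, progress: int, total: int) -> None:
        self.reports.append((progress, total))


async def run(total: int) -> List[Tuple[int, int]]:
    ctx = RecordingContext()
    threshold = max(1, total // 20)  # same 5% rule as ProgressReporter
    current = last_reported = 0
    for _ in range(total):
        current += 1
        # Report only after crossing the threshold or on the final step.
        if current - last_reported >= threshold or current >= total:
            await ctx.report_progress(progress=current, total=total)
            last_reported = current
    return ctx.reports


reports = asyncio.run(run(100))
print(len(reports), reports[0], reports[-1])
```

For 100 steps the client receives 20 notifications instead of 100; for totals under 20 the threshold is 1, so every step is reported.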


@ -0,0 +1,41 @@
"""MCP Mixins for MCGhidra.
Domain-specific mixins that organize tools, resources, and prompts by functionality.
Uses FastMCP's contrib.mcp_mixin pattern for clean modular organization.
"""
from .analysis import AnalysisMixin
from .base import MCGhidraMixinBase
from .bookmarks import BookmarksMixin
from .cursors import CursorsMixin
from .data import DataMixin
from .datatypes import DataTypesMixin
from .docker import DockerMixin
from .functions import FunctionsMixin
from .instances import InstancesMixin
from .memory import MemoryMixin
from .namespaces import NamespacesMixin
from .segments import SegmentsMixin
from .structs import StructsMixin
from .symbols import SymbolsMixin
from .variables import VariablesMixin
from .xrefs import XrefsMixin
__all__ = [
"MCGhidraMixinBase",
"InstancesMixin",
"FunctionsMixin",
"DataMixin",
"StructsMixin",
"AnalysisMixin",
"MemoryMixin",
"XrefsMixin",
"CursorsMixin",
"DockerMixin",
"SymbolsMixin",
"SegmentsMixin",
"VariablesMixin",
"NamespacesMixin",
"BookmarksMixin",
"DataTypesMixin",
]


@ -0,0 +1,372 @@
"""Analysis mixin for MCGhidra.
Provides tools for program analysis operations.
"""
from typing import Any, Dict, List, Optional
from fastmcp import Context
from fastmcp.contrib.mcp_mixin import mcp_tool
from ..config import get_config
from ..core.logging import logger
from .base import MCGhidraMixinBase
class AnalysisMixin(MCGhidraMixinBase):
"""Mixin for analysis operations.
Provides tools for:
- Running program analysis
- Call graph analysis
- Data flow analysis
- Comment management
"""
@mcp_tool()
def analysis_run(
self,
port: Optional[int] = None,
analysis_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Run analysis on the current program.
Args:
port: Ghidra instance port (optional)
analysis_options: Analysis options to enable/disable
Returns:
Analysis operation result
"""
try:
port = self.get_instance_port(port)
except ValueError as e:
return {"success": False, "error": {"code": "NO_INSTANCE", "message": str(e)}}
response = self.safe_post(port, "analysis", analysis_options or {})
return self.simplify_response(response)
@mcp_tool()
def analysis_get_callgraph(
self,
name: Optional[str] = None,
address: Optional[str] = None,
max_depth: int = 3,
port: Optional[int] = None,
page_size: int = 50,
grep: Optional[str] = None,
grep_ignorecase: bool = True,
return_all: bool = False,
fields: Optional[List[str]] = None,
ctx: Optional[Context] = None,
) -> Dict[str, Any]:
"""Get function call graph with edge pagination.
Args:
name: Starting function name (mutually exclusive with address)
address: Starting function address
max_depth: Maximum call depth (default: 3)
port: Ghidra instance port (optional)
page_size: Edges per page (default: 50, max: 500)
grep: Regex pattern to filter edges
grep_ignorecase: Case-insensitive grep (default: True)
return_all: Return all edges without pagination
fields: Field names to keep per edge (e.g. ['from', 'to']). Reduces response size.
ctx: FastMCP context (auto-injected)
Returns:
Call graph with paginated edges
"""
try:
port = self.get_instance_port(port)
except ValueError as e:
return {"success": False, "error": {"code": "NO_INSTANCE", "message": str(e)}}
config = get_config()
params = {"max_depth": max_depth}
if address:
params["address"] = address
func_id = address
elif name:
params["name"] = name
func_id = name
else:
func_id = "entry_point"
response = self.safe_get(port, "analysis/callgraph", params)
simplified = self.simplify_response(response)
if not simplified.get("success", True):
return simplified
result = simplified.get("result", {})
edges = result.get("edges", []) if isinstance(result, dict) else []
nodes = result.get("nodes", []) if isinstance(result, dict) else []
if not edges:
return simplified
query_params = {
"tool": "analysis_get_callgraph",
"port": port,
"name": name,
"address": address,
"max_depth": max_depth,
"grep": grep,
}
session_id = self._get_session_id(ctx)
paginated = self.filtered_paginate(
data=edges,
query_params=query_params,
tool_name="analysis_get_callgraph",
session_id=session_id,
page_size=min(page_size, config.max_page_size),
grep=grep,
grep_ignorecase=grep_ignorecase,
return_all=return_all,
fields=fields,
)
if paginated.get("success") and not paginated.get("guarded"):
paginated["result"] = {
"root_function": func_id,
"max_depth": max_depth,
"nodes": nodes,
"edges": paginated.get("result", []),
"total_nodes": len(nodes),
}
return paginated
@mcp_tool()
def analysis_get_dataflow(
self,
address: str,
direction: str = "forward",
max_steps: int = 50,
port: Optional[int] = None,
page_size: int = 50,
grep: Optional[str] = None,
grep_ignorecase: bool = True,
return_all: bool = False,
fields: Optional[List[str]] = None,
ctx: Optional[Context] = None,
) -> Dict[str, Any]:
"""Perform data flow analysis with step pagination.
Args:
address: Starting address in hex format
direction: "forward" or "backward" (default: "forward")
max_steps: Maximum analysis steps (default: 50)
port: Ghidra instance port (optional)
page_size: Steps per page (default: 50, max: 500)
grep: Regex pattern to filter steps
grep_ignorecase: Case-insensitive grep (default: True)
return_all: Return all steps without pagination
fields: Field names to keep per step. Reduces response size.
ctx: FastMCP context (auto-injected)
Returns:
Data flow steps with pagination
"""
if not address:
return {
"success": False,
"error": {
"code": "MISSING_PARAMETER",
"message": "Address parameter is required",
},
}
try:
port = self.get_instance_port(port)
except ValueError as e:
return {"success": False, "error": {"code": "NO_INSTANCE", "message": str(e)}}
config = get_config()
params = {
"address": address,
"direction": direction,
"max_steps": max_steps,
}
response = self.safe_get(port, "analysis/dataflow", params)
simplified = self.simplify_response(response)
if not simplified.get("success", True):
return simplified
result = simplified.get("result", {})
steps = result.get("steps", []) if isinstance(result, dict) else []
if not steps:
return simplified
query_params = {
"tool": "analysis_get_dataflow",
"port": port,
"address": address,
"direction": direction,
"max_steps": max_steps,
"grep": grep,
}
session_id = self._get_session_id(ctx)
paginated = self.filtered_paginate(
data=steps,
query_params=query_params,
tool_name="analysis_get_dataflow",
session_id=session_id,
page_size=min(page_size, config.max_page_size),
grep=grep,
grep_ignorecase=grep_ignorecase,
return_all=return_all,
fields=fields,
)
# Merge metadata into result (skip if guarded)
if paginated.get("success") and not paginated.get("guarded"):
paginated["result"] = {
"start_address": address,
"direction": direction,
"steps": paginated.get("result", []),
}
if isinstance(result, dict):
for key in ["sources", "sinks", "total_steps"]:
if key in result:
paginated["result"][key] = result[key]
return paginated
# NOTE: ui_get_current_address and ui_get_current_function were removed
# because they require Ghidra GUI context which is never available in
# headless MCP mode. Use functions_get(address=...) or data_list(addr=...)
# with explicit addresses instead.
@mcp_tool()
def comments_get(
self,
address: str,
comment_type: str = "plate",
port: Optional[int] = None,
) -> Dict[str, Any]:
"""Get a comment at the specified address.
Args:
address: Memory address in hex format
comment_type: "plate", "pre", "post", "eol", "repeatable"
port: Ghidra instance port (optional)
Returns:
Comment text and metadata
"""
if not address:
return {
"success": False,
"error": {
"code": "MISSING_PARAMETER",
"message": "Address parameter is required",
},
}
try:
port = self.get_instance_port(port)
except ValueError as e:
return {"success": False, "error": {"code": "NO_INSTANCE", "message": str(e)}}
response = self.safe_get(port, f"memory/{address}/comments/{comment_type}")
return self.simplify_response(response)
@mcp_tool()
def comments_set(
self,
address: str,
comment: str = "",
comment_type: str = "plate",
port: Optional[int] = None,
) -> Dict[str, Any]:
"""Set a comment at the specified address.
Args:
address: Memory address in hex format
comment: Comment text (empty string removes comment)
comment_type: "plate", "pre", "post", "eol", "repeatable"
port: Ghidra instance port (optional)
Returns:
Operation result
"""
if not address:
return {
"success": False,
"error": {
"code": "MISSING_PARAMETER",
"message": "Address parameter is required",
},
}
try:
port = self.get_instance_port(port)
except ValueError as e:
return {"success": False, "error": {"code": "NO_INSTANCE", "message": str(e)}}
payload = {"comment": comment}
response = self.safe_post(port, f"memory/{address}/comments/{comment_type}", payload)
return self.simplify_response(response)
@mcp_tool()
def functions_set_comment(
self,
address: str,
comment: str = "",
port: Optional[int] = None,
) -> Dict[str, Any]:
"""Set a decompiler-friendly comment.
Tries a function comment first (PATCH functions/{address}); if that fails,
falls back to a "pre" comment at the same address.
Args:
address: Memory address (preferably function entry point)
comment: Comment text (empty string removes comment)
port: Ghidra instance port (optional)
Returns:
Operation result
"""
if not address:
return {
"success": False,
"error": {
"code": "MISSING_PARAMETER",
"message": "Address parameter is required",
},
}
try:
port = self.get_instance_port(port)
except ValueError as e:
return {"success": False, "error": {"code": "NO_INSTANCE", "message": str(e)}}
# Try setting as function comment first
payload = {"comment": comment}
response = self.safe_patch(port, f"functions/{address}", payload)
if response.get("success", False):
return self.simplify_response(response)
# Log why function comment failed before falling back
error = response.get("error", {})
logger.debug(
"Function comment at %s failed (%s), falling back to pre-comment",
address,
error.get("code", "UNKNOWN"),
)
# Fallback to pre-comment
return self.comments_set(
address=address,
comment=comment,
comment_type="pre",
port=port,
)
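
The fallback in functions_set_comment can be tried in isolation. The following is a minimal sketch, not the plugin's actual HTTP layer: `set_comment_with_fallback`, `fake_patch_function`, `fake_set_pre_comment`, the addresses, and the `FUNCTION_NOT_FOUND` code are all hypothetical stand-ins, assuming only that a failed function-level write returns a dict with `"success": False`.

```python
from typing import Any, Callable, Dict

Response = Dict[str, Any]


def set_comment_with_fallback(
    address: str,
    comment: str,
    patch_function: Callable[[str, str], Response],
    set_pre_comment: Callable[[str, str], Response],
) -> Response:
    """Try a function-level comment first, then fall back to a pre-comment."""
    response = patch_function(address, comment)
    if response.get("success", False):
        return response
    # The function-level write failed (for example, the address is not a
    # function entry point), so attach the text as a pre-comment instead.
    return set_pre_comment(address, comment)


# Hypothetical stand-ins for the HTTP calls: only 0x401000 is a function entry.
def fake_patch_function(address: str, comment: str) -> Response:
    if address == "0x401000":
        return {"success": True, "kind": "function", "comment": comment}
    return {"success": False, "error": {"code": "FUNCTION_NOT_FOUND"}}


def fake_set_pre_comment(address: str, comment: str) -> Response:
    return {"success": True, "kind": "pre", "comment": comment}


at_entry = set_comment_with_fallback(
    "0x401000", "parses header", fake_patch_function, fake_set_pre_comment
)
mid_body = set_comment_with_fallback(
    "0x401010", "loop start", fake_patch_function, fake_set_pre_comment
)
```

Passing the two writers in as callables keeps the sketch free of network access; in the mixin the same roles are played by `safe_patch` and `comments_set`.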

285
src/mcghidra/mixins/base.py Normal file

@ -0,0 +1,285 @@
"""Base mixin class for MCGhidra domain mixins.

Provides shared state and utilities for all domain mixins.
"""

import time
from threading import RLock
from typing import Any, Dict, Optional

from fastmcp import Context
from fastmcp.contrib.mcp_mixin import MCPMixin

from ..config import get_config
from ..core.http_client import (
    safe_delete,
    safe_get,
    safe_patch,
    safe_post,
    safe_put,
    simplify_response,
)
from ..core.logging import log_debug, log_error, log_info, log_warning
from ..core.pagination import paginate_response


class MCGhidraMixinBase(MCPMixin):
    """Base class for MCGhidra domain mixins.

    Provides shared instance state and common utilities.
    All domain mixins should inherit from this class.
    """

    # Shared state across all mixins
    _instances: Dict[int, Dict[str, Any]] = {}
    _instances_lock = RLock()
    _current_port: Optional[int] = None

    def __init__(self):
        """Initialize the mixin with shared state."""
        pass

    @classmethod
    def get_current_port(cls) -> Optional[int]:
        """Get the current working instance port."""
        return cls._current_port

    @classmethod
    def set_current_port(cls, port: int) -> None:
        """Set the current working instance port."""
        # Assign on the base class rather than ``cls``: when called through a
        # subclass, ``cls._current_port = port`` would create a subclass-local
        # attribute that the other mixins never see.
        MCGhidraMixinBase._current_port = port

    @classmethod
    def get_instance_port(cls, port: Optional[int] = None) -> int:
        """Get instance port, using current if not specified.

        Args:
            port: Explicit port (optional)

        Returns:
            Port number to use

        Raises:
            ValueError: If no port specified and no current instance set
        """
        if port is not None:
            return port

        if cls._current_port is not None:
            return cls._current_port

        config = get_config()
        # Try default port
        default_port = config.quick_discovery_range.start
        if default_port in cls._instances:
            return default_port

        raise ValueError(
            "No Ghidra instance specified. Use instances_use(port) to set a working instance, "
            "or pass port= parameter explicitly."
        )

    @classmethod
    def register_instance(cls, port: int, url: Optional[str] = None) -> str:
        """Register a Ghidra instance.

        Args:
            port: Port number
            url: Optional URL override

        Returns:
            Status message
        """
        config = get_config()
        if url is None:
            url = f"http://{config.ghidra_host}:{port}"

        # Verify instance is responsive
        try:
            response = safe_get(port, "")
            if not response.get("success", False):
                return f"Failed to connect to Ghidra instance on port {port}"

            # Check API version
            api_version = response.get("api_version", 0)
            if api_version < config.expected_api_version:
                return (
                    f"API version mismatch: got {api_version}, "
                    f"expected {config.expected_api_version}"
                )

            with cls._instances_lock:
                cls._instances[port] = {
                    "url": url,
                    "project": response.get("project", ""),
                    "file": response.get("file", ""),
                    "registered_at": time.time(),
                }

            return f"Registered Ghidra instance on port {port}"
        except Exception as e:
            return f"Error registering instance: {e}"

    @classmethod
    def unregister_instance(cls, port: int) -> str:
        """Unregister a Ghidra instance.

        Args:
            port: Port number

        Returns:
            Status message
        """
        with cls._instances_lock:
            if port in cls._instances:
                del cls._instances[port]
                if cls._current_port == port:
                    # Clear on the base class so every mixin sees the reset
                    MCGhidraMixinBase._current_port = None
                return f"Unregistered Ghidra instance on port {port}"
            return f"No instance registered on port {port}"

    @classmethod
    def list_instances(cls) -> Dict[int, Dict[str, Any]]:
        """Get all registered instances.

        Returns:
            Dict mapping port to instance info
        """
        with cls._instances_lock:
            return dict(cls._instances)

    @classmethod
    def get_instance_info(cls, port: int) -> Optional[Dict[str, Any]]:
        """Get info for a specific instance.

        Args:
            port: Port number

        Returns:
            Instance info dict or None
        """
        with cls._instances_lock:
            return cls._instances.get(port)

    def _get_session_id(self, ctx: Optional[Context]) -> str:
        """Extract session ID from FastMCP context.

        Args:
            ctx: FastMCP context

        Returns:
            Session identifier string
        """
        if ctx is None:
            return "default"

        # Try various context attributes
        if hasattr(ctx, "session") and ctx.session:
            return str(ctx.session)
        if hasattr(ctx, "client_id") and ctx.client_id:
            return str(ctx.client_id)
        if hasattr(ctx, "request_id") and ctx.request_id:
            return f"req-{ctx.request_id}"

        return "default"

    # Convenience methods for subclasses
    def safe_get(
        self, port: int, endpoint: str, params: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        """Make GET request to Ghidra instance."""
        return safe_get(port, endpoint, params)

    def safe_post(self, port: int, endpoint: str, data: Any) -> Dict[str, Any]:
        """Make POST request to Ghidra instance."""
        return safe_post(port, endpoint, data)

    def safe_put(
        self, port: int, endpoint: str, data: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Make PUT request to Ghidra instance."""
        return safe_put(port, endpoint, data)

    def safe_patch(
        self, port: int, endpoint: str, data: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Make PATCH request to Ghidra instance."""
        return safe_patch(port, endpoint, data)

    def safe_delete(self, port: int, endpoint: str) -> Dict[str, Any]:
        """Make DELETE request to Ghidra instance."""
        return safe_delete(port, endpoint)

    def simplify_response(self, response: Dict[str, Any]) -> Dict[str, Any]:
        """Simplify HATEOAS response."""
        return simplify_response(response)

    def paginate_response(
        self,
        data: list,
        query_params: Dict,
        tool_name: str,
        session_id: str = "default",
        page_size: int = 50,
        grep: Optional[str] = None,
        grep_ignorecase: bool = True,
        return_all: bool = False,
        fields: Optional[list] = None,
    ) -> Dict:
        """Create paginated response with optional field projection."""
        return paginate_response(
            data=data,
            query_params=query_params,
            tool_name=tool_name,
            session_id=session_id,
            page_size=page_size,
            grep=grep,
            grep_ignorecase=grep_ignorecase,
            return_all=return_all,
            fields=fields,
        )

    def filtered_paginate(
        self,
        data: list,
        query_params: Dict,
        tool_name: str,
        session_id: str = "default",
        page_size: int = 50,
        grep: Optional[str] = None,
        grep_ignorecase: bool = True,
        return_all: bool = False,
        fields: Optional[list] = None,
    ) -> Dict:
        """Paginate with field projection and budget guard.

        Convenience wrapper that forwards every argument, including
        ``fields``, to paginate_response, which performs the projection.
        This method adds no logic of its own. Prefer it over
        paginate_response for any tool that could return large result sets.
        """
        return self.paginate_response(
            data=data,
            query_params=query_params,
            tool_name=tool_name,
            session_id=session_id,
            page_size=page_size,
            grep=grep,
            grep_ignorecase=grep_ignorecase,
            return_all=return_all,
            fields=fields,
        )

    # Async logging helpers
    async def log_info(self, ctx: Optional[Context], message: str) -> None:
        """Log info message."""
        await log_info(ctx, message)

    async def log_debug(self, ctx: Optional[Context], message: str) -> None:
        """Log debug message."""
        await log_debug(ctx, message)

    async def log_warning(self, ctx: Optional[Context], message: str) -> None:
        """Log warning message."""
        await log_warning(ctx, message)

    async def log_error(self, ctx: Optional[Context], message: str) -> None:
        """Log error message."""
        await log_error(ctx, message)
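
The port lookup order in get_instance_port (explicit argument, then current port, then the default port if it is registered, else ValueError) can be sketched as a standalone function. This is an illustrative reduction, not the class method itself: `resolve_port`, the port numbers, and the registry contents are hypothetical, and the default port is passed in because the real value comes from `config.quick_discovery_range.start`, which is not shown here.

```python
from typing import Any, Dict, Optional


def resolve_port(
    explicit: Optional[int],
    current: Optional[int],
    registered: Dict[int, Dict[str, Any]],
    default_port: int,
) -> int:
    """Pick a port using the same precedence as get_instance_port."""
    if explicit is not None:
        return explicit  # 1. a caller-supplied port always wins
    if current is not None:
        return current  # 2. otherwise use the working instance
    if default_port in registered:
        return default_port  # 3. otherwise the default, if it is registered
    raise ValueError("No Ghidra instance specified.")


# Hypothetical registry with one instance on an assumed default port.
registered = {8192: {"url": "http://localhost:8192"}}

explicit_wins = resolve_port(9000, 8193, registered, default_port=8192)
current_wins = resolve_port(None, 8193, registered, default_port=8192)
default_used = resolve_port(None, None, registered, default_port=8192)

try:
    resolve_port(None, None, {}, default_port=8192)
    unresolved_error = None
except ValueError as exc:
    unresolved_error = str(exc)
```

Note that an explicit port is returned without checking the registry, matching the method above: validation of unknown ports is left to the HTTP call that follows.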

Some files were not shown because too many files have changed in this diff.