MCP Usage Guide

CodeClone MCP is a read-only, baseline-aware analysis server for AI agents and MCP-capable clients. It exposes the existing deterministic pipeline without mutating source files, baselines, cache, or on-disk report artifacts; only session-local review/run state is mutable, and only in memory. The server is not just bounded in payload shape: it actively guides agents toward low-cost, high-signal workflows.

MCP is a client integration surface, not a model-specific feature. It works with any MCP-capable client regardless of the backend model. In practice, the cheapest useful path is also the most obvious one: summary or triage first, then hotspots or focused checks, then single-finding drill-down.

Install

pip install "codeclone[mcp]"        # add MCP extra
# or
uv tool install "codeclone[mcp]"    # install as a standalone tool

Start the server

Local agents (Claude Code, Codex, Copilot Chat, Gemini CLI):

codeclone-mcp --transport stdio

MCP analysis tools require an absolute repository root. Relative roots such as . are rejected, because the server process working directory may differ from the client workspace. The same absolute-path rule applies to check_* tools when a root filter is provided.

Remote / HTTP-only clients:

codeclone-mcp --transport streamable-http --host 127.0.0.1 --port 8000

Non-loopback hosts require --allow-remote (no built-in auth). Run retention is bounded: default 4, max 10 (--history-limit). If a tool request omits processes, MCP defers process-count policy to the core CodeClone runtime.

Tool surface

analyze_repository: Full analysis → register as latest run and return a compact MCP summary; then prefer get_run_summary or get_production_triage for the first pass
analyze_changed_paths: Diff-aware analysis with changed_paths or git_diff_ref; returns a compact changed-files snapshot; then prefer get_report_section(section="changed") or get_production_triage before broader list calls
get_run_summary: Cheapest run-level snapshot: compact health/findings/baseline summary with slim inventory counts; health is an explicit available=false when metrics were skipped
get_production_triage: Compact production-first view: health, cache freshness, production hotspots, production suggestions; the best default first pass on noisy repos
compare_runs: Regressions, improvements, and run-to-run health delta between comparable runs; returns mixed for conflicting signals and incomparable when roots/settings differ, with empty comparison cards and health_delta=null in that case
list_findings: Filtered, paginated finding groups with compact summary payloads by default; use after hotspots or check_* when you need a broader filtered list
get_finding: Deep inspection of one finding by id; defaults to normal detail and accepts detail_level; use after list_hotspots, list_findings, or check_*
get_remediation: Structured remediation payload for one finding; defaults to normal detail; use when you only need the fix packet for a single finding
list_hotspots: Derived views (highest priority, production hotspots, spread, etc.) with compact summary cards; the preferred first-pass triage before broader listing
get_report_section: Read canonical report sections; prefer specific sections over section="all"; metrics is summary-only, metrics_detail is paginated/bounded
evaluate_gates: Preview CI/gating decisions without exiting
check_clones: Clone findings from a stored run; cheaper and narrower than list_findings when you only need clone debt
check_complexity: Complexity hotspots from a stored run; cheaper and narrower than list_findings when you only need complexity
check_coupling: Coupling hotspots from a stored run; cheaper and narrower than list_findings when you only need coupling
check_cohesion: Cohesion hotspots from a stored run; cheaper and narrower than list_findings when you only need cohesion
check_dead_code: Dead-code findings from a stored run; cheaper and narrower than list_findings when you only need dead code
generate_pr_summary: PR-friendly markdown or JSON summary; prefer markdown for compact LLM-facing output and json for machine post-processing
mark_finding_reviewed: Session-local review marker (in-memory only)
list_reviewed_findings: List reviewed findings for a run
clear_session_runs: Reset all in-memory runs and session caches

check_* tools query stored runs only. Call analyze_repository or analyze_changed_paths first.

check_* responses keep health.score and health.grade but slim health.dimensions down to the single dimension relevant to that tool. List-style finding responses use short MCP finding ids and compact relative locations by default; detail_level="normal" returns structured {path, line, end_line, symbol} locations, while "full" keeps the richer compatibility payload, including uri. Summary-style MCP cache payloads expose freshness (fresh, mixed, reused). Inline design-threshold parameters on analyze_repository / analyze_changed_paths become part of the canonical run: they are recorded in meta.analysis_thresholds.design_findings and define that run's canonical design findings.
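As an illustration of the normal-detail location shape, a single finding might carry a payload like the following. The id and values here are hypothetical; only the field names come from the structured shape above:

```json
{
  "id": "a1b2c3d4",
  "location": {
    "path": "src/billing/invoice.py",
    "line": 120,
    "end_line": 158,
    "symbol": "InvoiceBuilder.render"
  }
}
```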

Run ids in MCP payloads are short session handles (first 8 hex chars of the canonical digest). MCP tools and run-scoped resources accept both short and full run ids. Finding ids follow the same rule: MCP responses use compact ids, while the canonical report.json keeps full finding ids unchanged. When a short finding id would collide within a run, MCP lengthens it just enough to keep it unique.
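The shortening rule can be sketched in a few lines. This is an illustrative reimplementation of the stated behavior (minimum eight hex characters, lengthened on collision), not the server's actual code:

```python
def short_ids(full_ids, min_len=8):
    """Map each full hex id to its shortest unique prefix of at least min_len chars."""
    assigned = {}
    for fid in full_ids:
        n = min_len
        # Lengthen just enough to stay unique within this run's id set.
        while fid[:n] in assigned.values() and n < len(fid):
            n += 1
        assigned[fid] = fid[:n]
    return assigned

# Two digests that share a long common prefix force a lengthened second id:
a = "deadbeef" + "1" * 56
b = "deadbeef" + "2" * 56
print(list(short_ids([a, b]).values()))  # ['deadbeef', 'deadbeef2']
```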

Resource surface

Fixed resources:

codeclone://latest/summary → Latest run summary
codeclone://latest/triage → Latest production-first triage
codeclone://latest/report.json → Full canonical report
codeclone://latest/health → Health score and dimensions
codeclone://latest/gates → Last gate evaluation result
codeclone://latest/changed → Changed-files projection (diff-aware runs)
codeclone://schema → Canonical report shape descriptor

Run-scoped resource templates:

codeclone://runs/{run_id}/summary → Summary for a specific run
codeclone://runs/{run_id}/report.json → Report for a specific run
codeclone://runs/{run_id}/findings/{finding_id} → One finding from a specific run

Resources and URI templates are read-only views over stored runs; they do not trigger analysis.

codeclone://latest/* always resolves to the most recent run registered in the current MCP server session. A later analyze_repository or analyze_changed_paths call moves that pointer. mark_finding_reviewed and clear_session_runs mutate only in-memory session state. They never touch source files, baselines, cache, or report artifacts.
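Reading one of these resources is a standard MCP resources/read request. A minimal sketch of the envelope (the request id is arbitrary):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": { "uri": "codeclone://latest/summary" }
}
```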

Budget-aware first pass

analyze_repository → get_run_summary or get_production_triage
→ list_hotspots or check_* → get_finding → get_remediation

Full repository review

analyze_repository → get_production_triage
→ list_hotspots(kind="highest_priority") → get_finding → evaluate_gates

Changed-files review (PR / patch)

analyze_changed_paths → get_report_section(section="changed")
→ list_findings(changed_paths=..., sort_by="priority") → get_remediation → generate_pr_summary

Session-based review loop

list_findings → get_finding → mark_finding_reviewed
→ list_findings(exclude_reviewed=true) → … → clear_session_runs
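On the wire, each step in these workflows is an ordinary MCP tools/call request. A sketch of the first step, honoring the absolute-root requirement described earlier; the path is a placeholder, and the argument name root follows this guide's wording, so check the tool schema the server exposes for the exact name:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "analyze_repository",
    "arguments": { "root": "/abs/path/to/repo" }
  }
}
```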

Prompt patterns

Good prompts include scope, goal, and constraint:

Health check

Use codeclone MCP to analyze this repository. Give me a concise structural health summary
and explain which findings are worth looking at first.

Clone triage (production only)

Analyze through codeclone MCP, filter to clone findings in production code only,
and show me the top 3 clone groups worth fixing first.

Changed-files review

Use codeclone MCP in changed-files mode for my latest edits.
Focus only on findings that touch changed files and rank them by priority.

Dead-code review

Use codeclone MCP to review dead-code findings. Separate actionable items from
likely framework false positives. Do not add suppressions automatically.

Gate preview

Run codeclone through MCP and preview gating with fail_on_new plus a zero clone threshold.
Explain the exact reasons. Do not change any files.

AI-generated code check

I added code with an AI agent. Use codeclone MCP to check for new structural drift:
clone groups, dead code, duplicated branches, design hotspots.
Separate accepted baseline debt from new regressions.

Safe refactor planning

Use codeclone MCP to pick one production finding that looks safe to refactor.
Explain why it is a good candidate and outline a minimal plan.

Run comparison

Compare the latest CodeClone MCP run against the previous one.
Show regressions, resolved findings, and health delta.

Tips:

  • Use analyze_changed_paths for PRs, not full analysis.
  • Prefer get_run_summary or get_production_triage for the first pass on a new run.
  • Prefer list_hotspots or the narrow check_* tools before broad list_findings calls.
  • Use get_finding / get_remediation for one finding instead of raising detail_level on larger lists.
  • Set cache_policy="off" when you need the freshest truth from a new analysis run, not whatever older session state currently sits behind latest/*.
  • Pass an absolute root to analyze_repository / analyze_changed_paths. MCP intentionally rejects relative roots like . to avoid analyzing the wrong directory when server cwd and client workspace differ.
  • Prefer generate_pr_summary(format="markdown") for agent-facing output; use json only when another machine step needs it.
  • Avoid get_report_section(section="all") unless you truly need the full canonical report document.
  • Use get_report_section(section="metrics_detail", family=..., limit=...) for metrics drill-down; the unfiltered call is intentionally bounded.
  • Use "production-only" / source_kind filters to cut test/fixture noise.
  • Use mark_finding_reviewed + exclude_reviewed=true in long sessions.
  • Ask the agent to separate baseline debt from new regressions.

Client configuration

All clients use the same CodeClone server — only the registration differs.

Claude Code / Anthropic

{
  "mcpServers": {
    "codeclone": {
      "command": "codeclone-mcp",
      "args": [
        "--transport",
        "stdio"
      ]
    }
  }
}

Codex / OpenAI (command-based)

[mcp_servers.codeclone]
enabled = true
command = "codeclone-mcp"
args = ["--transport", "stdio"]

For the Responses API or remote-only OpenAI clients, use streamable-http.

GitHub Copilot Chat

{
  "mcpServers": {
    "codeclone": {
      "command": "codeclone-mcp",
      "args": [
        "--transport",
        "stdio"
      ]
    }
  }
}

Gemini CLI

Same stdio registration. If the client only accepts remote URLs, use streamable-http and point to the /mcp endpoint.
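A URL-based registration might look like the sketch below. The key name is an assumption (Gemini CLI accepts httpUrl for streamable HTTP servers, while other clients use url or similar), so consult your client's MCP settings reference:

```json
{
  "mcpServers": {
    "codeclone": {
      "httpUrl": "http://127.0.0.1:8000/mcp"
    }
  }
}
```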

Other clients

  • stdio for local analysis
  • streamable-http for remote/HTTP-only clients

If codeclone-mcp is not on PATH, use an absolute path to the launcher.

Security

  • Read-only by design: no source mutation, no baseline/cache writes.
  • Run history and review markers are in-memory only — lost on process stop.
  • Repository access is limited to what the server process can read locally.
  • streamable-http binds to loopback by default; --allow-remote is explicit opt-in.

Troubleshooting

"CodeClone MCP support requires the optional 'mcp' extra" → pip install "codeclone[mcp]"
Client cannot find codeclone-mcp → uv tool install "codeclone[mcp]", or use an absolute path to the launcher
Client only accepts remote MCP → use the streamable-http transport
Agent reads stale results → call analyze_repository again; latest always points to the most recent run
changed_paths rejected → pass a list[str] of repo-relative paths, not a comma-separated string
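If a wrapper script has a comma-separated string on hand, normalize it into the expected list[str] before calling the tool. A trivial sketch; the helper name is ours:

```python
def to_changed_paths(value):
    """Accept a list[str] or a comma-separated string and return list[str]."""
    if isinstance(value, str):
        return [p.strip() for p in value.split(",") if p.strip()]
    return list(value)

print(to_changed_paths("src/a.py, src/b.py"))  # ['src/a.py', 'src/b.py']
```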

See also