MCP Usage Guide

CodeClone MCP is a read-only, baseline-aware analysis server for AI agents and MCP-capable clients. It exposes the existing deterministic pipeline without mutating source files, baselines, cache, or on-disk report artifacts; only session-local review/run state is mutable, and only in memory. The server is not just bounded in payload shape: it actively guides agents toward low-cost, high-signal workflows.

MCP is a client integration surface, not a model-specific feature. It works with any MCP-capable client regardless of the backend model. In practice, the cheapest useful path is also the most obvious one: summary or triage first, then hotspots or focused checks, then single-finding drill-down.

Install

pip install "codeclone[mcp]"        # add MCP extra
# or
uv tool install "codeclone[mcp]"    # install as a standalone tool

Start the server

Local agents (Claude Code, Codex, Copilot Chat, Gemini CLI):

codeclone-mcp --transport stdio

MCP analysis tools require an absolute repository root. Relative roots such as . are rejected, because the server process working directory may differ from the client workspace. The same absolute-path rule applies to check_* tools when a root filter is provided.

Remote / HTTP-only clients:

codeclone-mcp --transport streamable-http --host 127.0.0.1 --port 8000

Non-loopback hosts require --allow-remote (no built-in auth). Run retention is bounded: default 4, max 10 (--history-limit). If a tool request omits processes, MCP defers process-count policy to the core CodeClone runtime.

Tool surface

analyze_repository: Full analysis → register as latest run and return a compact MCP summary; then prefer get_run_summary or get_production_triage for the first pass
analyze_changed_paths: Diff-aware analysis with changed_paths or git_diff_ref; returns a compact changed-files snapshot; then prefer get_report_section(section="changed") or get_production_triage before broader list calls
get_run_summary: Cheapest run-level snapshot: compact health/findings/baseline summary with slim inventory counts; health is an explicit available=false when metrics were skipped
get_production_triage: Compact production-first view: health, cache freshness, production hotspots, production suggestions; the best default first pass on noisy repos
compare_runs: Regressions, improvements, and run-to-run health delta between comparable runs; returns mixed for conflicting signals and incomparable when roots/settings differ, with empty comparison cards and health_delta=null in that case
list_findings: Filtered, paginated finding groups with compact summary payloads by default; use after hotspots or check_* when you need a broader filtered list
get_finding: Deep inspection of one finding by id; defaults to normal detail and accepts detail_level; use after list_hotspots, list_findings, or check_*
get_remediation: Structured remediation payload for one finding; defaults to normal detail; use when you only need the fix packet for a single finding
list_hotspots: Derived views (highest priority, production hotspots, spread, etc.) with compact summary cards; the preferred first-pass triage before broader listing
get_report_section: Read canonical report sections; prefer specific sections over section="all"; metrics is summary-only, metrics_detail is paginated/bounded
evaluate_gates: Preview CI/gating decisions without exiting
check_clones: Clone findings from a stored run; cheaper and narrower than list_findings when you only need clone debt
check_complexity: Complexity hotspots from a stored run; cheaper and narrower than list_findings when you only need complexity
check_coupling: Coupling hotspots from a stored run; cheaper and narrower than list_findings when you only need coupling
check_cohesion: Cohesion hotspots from a stored run; cheaper and narrower than list_findings when you only need cohesion
check_dead_code: Dead-code findings from a stored run; cheaper and narrower than list_findings when you only need dead code
generate_pr_summary: PR-friendly markdown or JSON summary; prefer markdown for compact LLM-facing output and json for machine post-processing
mark_finding_reviewed: Session-local review marker (in-memory only)
list_reviewed_findings: List reviewed findings for a run
clear_session_runs: Reset all in-memory runs and session caches

check_* tools query stored runs only. Call analyze_repository or analyze_changed_paths first.

check_* responses keep health.score and health.grade but slim health.dimensions down to the single dimension relevant to that tool. List-style finding responses use short MCP finding ids and compact relative locations by default; detail_level="normal" returns structured {path, line, end_line, symbol} locations, while "full" keeps the richer compatibility payload, including uri. Summary-style MCP cache payloads expose freshness (fresh, mixed, reused). Inline design-threshold parameters on analyze_repository / analyze_changed_paths become part of the canonical run: they are recorded in meta.analysis_thresholds.design_findings and define that run's canonical design findings.
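As an illustration of the normal-detail location shape, a single finding might carry a payload like the following. The id and values here are hypothetical; only the field names come from the structured shape above:

```json
{
  "id": "a1b2c3d4",
  "location": {
    "path": "src/billing/invoice.py",
    "line": 120,
    "end_line": 158,
    "symbol": "InvoiceBuilder.render"
  }
}
```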

Run ids in MCP payloads are short session handles (first 8 hex chars of the canonical digest). MCP tools and run-scoped resources accept both short and full run ids. Finding ids follow the same rule: MCP responses use compact ids, while the canonical report.json keeps full finding ids unchanged. When a short finding id would collide within a run, MCP lengthens it just enough to keep it unique.
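The shortening rule can be sketched in a few lines. This is an illustrative reimplementation of the stated behavior (minimum eight hex characters, lengthened on collision), not the server's actual code:

```python
def short_ids(full_ids, min_len=8):
    """Map each full hex id to its shortest unique prefix of at least min_len chars."""
    assigned = {}
    for fid in full_ids:
        n = min_len
        # Lengthen just enough to stay unique within this run's id set.
        while fid[:n] in assigned.values() and n < len(fid):
            n += 1
        assigned[fid] = fid[:n]
    return assigned

# Two digests that share a long common prefix force a lengthened second id:
a = "deadbeef" + "1" * 56
b = "deadbeef" + "2" * 56
print(list(short_ids([a, b]).values()))  # ['deadbeef', 'deadbeef2']
```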

Resource surface

Fixed resources:

codeclone://latest/summary → Latest run summary
codeclone://latest/triage → Latest production-first triage
codeclone://latest/report.json → Full canonical report
codeclone://latest/health → Health score and dimensions
codeclone://latest/gates → Last gate evaluation result
codeclone://latest/changed → Changed-files projection (diff-aware runs)
codeclone://schema → Canonical report shape descriptor

Run-scoped resource templates:

codeclone://runs/{run_id}/summary → Summary for a specific run
codeclone://runs/{run_id}/report.json → Report for a specific run
codeclone://runs/{run_id}/findings/{finding_id} → One finding from a specific run

Resources and URI templates are read-only views over stored runs; they do not trigger analysis.

codeclone://latest/* always resolves to the most recent run registered in the current MCP server session. A later analyze_repository or analyze_changed_paths call moves that pointer. mark_finding_reviewed and clear_session_runs mutate only in-memory session state. They never touch source files, baselines, cache, or report artifacts.
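Reading one of these resources is a standard MCP resources/read request. A minimal sketch of the envelope (the request id is arbitrary):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": { "uri": "codeclone://latest/summary" }
}
```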

Budget-aware first pass

analyze_repository → get_run_summary or get_production_triage
→ list_hotspots or check_* → get_finding → get_remediation

Full repository review

analyze_repository → get_production_triage
→ list_hotspots(kind="highest_priority") → get_finding → evaluate_gates

Changed-files review (PR / patch)

analyze_changed_paths → get_report_section(section="changed")
→ list_findings(changed_paths=..., sort_by="priority") → get_remediation → generate_pr_summary

Session-based review loop

list_findings → get_finding → mark_finding_reviewed
→ list_findings(exclude_reviewed=true) → … → clear_session_runs
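On the wire, each step in these workflows is an ordinary MCP tools/call request. A sketch of the first step, honoring the absolute-root requirement described earlier; the path is a placeholder, and the argument name root follows this guide's wording, so check the tool schema the server exposes for the exact name:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "analyze_repository",
    "arguments": { "root": "/abs/path/to/repo" }
  }
}
```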

Prompt patterns

Good prompts include scope, goal, and constraint:

Health check

Use codeclone MCP to analyze this repository. Give me a concise structural health summary
and explain which findings are worth looking at first.

Clone triage (production only)

Analyze through codeclone MCP, filter to clone findings in production code only,
and show me the top 3 clone groups worth fixing first.

Changed-files review

Use codeclone MCP in changed-files mode for my latest edits.
Focus only on findings that touch changed files and rank them by priority.

Dead-code review

Use codeclone MCP to review dead-code findings. Separate actionable items from
likely framework false positives. Do not add suppressions automatically.

Gate preview

Run codeclone through MCP and preview gating with fail_on_new plus a zero clone threshold.
Explain the exact reasons. Do not change any files.

AI-generated code check

I added code with an AI agent. Use codeclone MCP to check for new structural drift:
clone groups, dead code, duplicated branches, design hotspots.
Separate accepted baseline debt from new regressions.

Safe refactor planning

Use codeclone MCP to pick one production finding that looks safe to refactor.
Explain why it is a good candidate and outline a minimal plan.

Run comparison

Compare the latest CodeClone MCP run against the previous one.
Show regressions, resolved findings, and health delta.

Tips:

  • Use analyze_changed_paths for PRs, not full analysis.
  • Prefer get_run_summary or get_production_triage for the first pass on a new run.
  • Prefer list_hotspots or the narrow check_* tools before broad list_findings calls.
  • Use get_finding / get_remediation for one finding instead of raising detail_level on larger lists.
  • Set cache_policy="off" when you need the freshest truth from a new analysis run, not whatever older session state currently sits behind latest/*.
  • Pass an absolute root to analyze_repository / analyze_changed_paths. MCP intentionally rejects relative roots like . to avoid analyzing the wrong directory when server cwd and client workspace differ.
  • Prefer generate_pr_summary(format="markdown") for agent-facing output; use json only when another machine step needs it.
  • Avoid get_report_section(section="all") unless you truly need the full canonical report document.
  • Use get_report_section(section="metrics_detail", family=..., limit=...) for metrics drill-down; the unfiltered call is intentionally bounded.
  • Use "production-only" / source_kind filters to cut test/fixture noise.
  • Use mark_finding_reviewed + exclude_reviewed=true in long sessions.
  • Ask the agent to separate baseline debt from new regressions.

Client configuration

All clients use the same CodeClone server — only the registration differs.

Claude Code / Anthropic

{
  "mcpServers": {
    "codeclone": {
      "command": "codeclone-mcp",
      "args": [
        "--transport",
        "stdio"
      ]
    }
  }
}

Codex / OpenAI (command-based)

[mcp_servers.codeclone]
enabled = true
command = "codeclone-mcp"
args = ["--transport", "stdio"]

For the Responses API or remote-only OpenAI clients, use streamable-http.

GitHub Copilot Chat

{
  "mcpServers": {
    "codeclone": {
      "command": "codeclone-mcp",
      "args": [
        "--transport",
        "stdio"
      ]
    }
  }
}

Gemini CLI

Same stdio registration. If the client only accepts remote URLs, use streamable-http and point to the /mcp endpoint.
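A URL-based registration might look like the sketch below. The key name is an assumption (Gemini CLI accepts httpUrl for streamable HTTP servers, while other clients use url or similar), so consult your client's MCP settings reference:

```json
{
  "mcpServers": {
    "codeclone": {
      "httpUrl": "http://127.0.0.1:8000/mcp"
    }
  }
}
```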

Other clients

  • stdio for local analysis
  • streamable-http for remote/HTTP-only clients

If codeclone-mcp is not on PATH, use an absolute path to the launcher.

Security

  • Read-only by design: no source mutation, no baseline/cache writes.
  • Run history and review markers are in-memory only — lost on process stop.
  • Repository access is limited to what the server process can read locally.
  • streamable-http binds to loopback by default; --allow-remote is explicit opt-in.

Troubleshooting

"CodeClone MCP support requires the optional 'mcp' extra" → pip install "codeclone[mcp]"
Client cannot find codeclone-mcp → uv tool install "codeclone[mcp]", or use an absolute path to the launcher
Client only accepts remote MCP → use the streamable-http transport
Agent reads stale results → call analyze_repository again; latest always points to the most recent run
changed_paths rejected → pass a list[str] of repo-relative paths, not a comma-separated string
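If a wrapper script has a comma-separated string on hand, normalize it into the expected list[str] before calling the tool. A trivial sketch; the helper name is ours:

```python
def to_changed_paths(value):
    """Accept a list[str] or a comma-separated string and return list[str]."""
    if isinstance(value, str):
        return [p.strip() for p in value.split(",") if p.strip()]
    return list(value)

print(to_changed_paths("src/a.py, src/b.py"))  # ['src/a.py', 'src/b.py']
```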

See also