01. Architecture Map¶
Purpose¶
Document the current module boundaries and ownership in CodeClone 2.0.x.
Public surface¶
Main ownership layers:
- CLI entry and UX orchestration
- Config parsing and pyproject resolution
- Core runtime pipeline
- Analysis and clone grouping
- Metrics and findings
- Baseline/cache persistence contracts
- Canonical report document and deterministic projections
- HTML render-only surface
- Read-only MCP surface
- IDE/client surfaces over MCP
Data model¶
| Layer | Modules | Responsibility |
|---|---|---|
| Entry | codeclone/main.py |
Public CLI entrypoint only |
| CLI surface | codeclone/surfaces/cli/*, codeclone/ui_messages/* |
Parse args, resolve runtime mode, print summaries, write outputs, route exits |
| Config | codeclone/config/* |
Option specs, parser construction, pyproject loading, CLI > pyproject > defaults merge |
| Core runtime | codeclone/core/* |
Bootstrap, discovery, worker processing, project metrics, report/gate integration |
| Analysis | codeclone/analysis/*, codeclone/blocks/*, codeclone/paths/*, codeclone/qualnames/* |
Parse source, normalize AST/CFG facts, extract units, prepare deterministic analysis inputs |
| Findings | codeclone/findings/clones/*, codeclone/findings/structural/* |
Clone grouping and structural finding derivation |
| Metrics | codeclone/metrics/* |
Complexity, coupling, cohesion, dependencies, dead code, health, adoption, coverage join, API surface |
| Contracts/domain | codeclone/contracts/*, codeclone/models.py, codeclone/domain/* |
Version constants, enums, typed exceptions, shared models, domain taxonomies |
| Persistence | codeclone/baseline/*, codeclone/cache/* |
Trusted comparison state and optimization-only cache contracts |
| Canonical report | codeclone/report/document/*, codeclone/report/gates/*, codeclone/report/*.py |
Canonical report payload, derived projections, explainability, suggestions, gate reasons |
| Deterministic renderers | codeclone/report/renderers/* |
Text/Markdown/SARIF/JSON projections over the canonical report |
| HTML render layer | codeclone/report/html/* |
Render-only HTML view over canonical report/meta facts |
| MCP surface | codeclone/surfaces/mcp/* |
Read-only MCP tools/resources over the same pipeline/report contracts |
| Client surfaces | extensions/vscode-codeclone/*, extensions/claude-desktop-codeclone/*, plugins/codeclone/* |
Native clients/install surfaces over codeclone-mcp |
Refs:
codeclone/main.py:maincodeclone/surfaces/cli/workflow.py:_main_implcodeclone/core/pipeline.py:analyzecodeclone/report/document/builder.py:build_report_documentcodeclone/report/html/assemble.py:build_html_reportcodeclone/surfaces/mcp/server.py:build_mcp_server
Contracts¶
- Core produces facts; renderers present facts.
codeclone/report/document/*is the canonical report source of truth.- HTML, Markdown, SARIF, text, and MCP are projections over the same canonical report semantics.
- Baseline and cache are persistence contracts, not analysis truth.
- Cache is optimization-only and fail-open.
- MCP is read-only and must not create a second analysis truth path.
- VS Code, Claude Desktop, and Codex plugin surfaces are clients over MCP, not second analyzers.
Refs:
codeclone/report/document/builder.py:build_report_documentcodeclone/report/renderers/text.py:render_text_report_documentcodeclone/report/renderers/markdown.py:render_markdown_report_documentcodeclone/report/renderers/sarif.py:render_sarif_report_documentcodeclone/report/html/assemble.py:build_html_reportcodeclone/baseline/clone_baseline.py:Baseline.loadcodeclone/baseline/metrics_baseline.py:MetricsBaseline.loadcodeclone/cache/store.py:Cache.load
Invariants (MUST)¶
- Report serialization is deterministic and schema-versioned.
- UI is render-only and must not invent gating semantics.
- Status enums remain domain-owned in baseline/metrics-baseline/cache/contracts modules.
codeclone/main.pystays thin; orchestration lives incodeclone/surfaces/cli/*.
Refs:
codeclone/report/document/integrity.py:_build_integrity_payloadcodeclone/report/document/inventory.py:_build_inventory_payloadcodeclone/baseline/trust.py:BaselineStatuscodeclone/baseline/_metrics_baseline_contract.py:MetricsBaselineStatuscodeclone/cache/versioning.py:CacheStatuscodeclone/contracts/__init__.py:ExitCode
Failure modes¶
| Condition | Layer |
|---|---|
| Invalid CLI args / invalid output path | CLI surface (codeclone/config/*, codeclone/surfaces/cli/*) |
| Baseline schema/integrity mismatch | Baseline contract layer |
| Metrics baseline schema/integrity mismatch | Metrics-baseline contract layer |
| Cache corruption/version mismatch | Cache contract layer (fail-open) |
| HTML snippet read failure | HTML render layer fallback snippet |
| MCP invalid request / invalid root / unknown run | MCP surface |
Determinism / canonicalization¶
- File iteration and grouping order are explicit sorts.
- Canonical report integrity excludes non-canonical
derivedpayload. - Baseline and cache hashes/signatures use canonical JSON.
Refs:
codeclone/scanner.py:iter_py_filescodeclone/report/document/integrity.py:_build_integrity_payloadcodeclone/baseline/trust.py:_compute_payload_sha256codeclone/cache/integrity.py:canonical_json
Locked by tests¶
tests/test_architecture.py::test_architecture_layer_violationstests/test_report.py::test_report_json_compact_v21_contracttests/test_report_contract_coverage.py::test_report_document_rich_invariants_and_rendererstests/test_html_report.py::test_html_report_uses_core_block_group_factstests/test_cache.py::test_cache_v13_uses_relpaths_when_root_settests/test_mcp_service.py::test_mcp_service_analyze_repository_registers_latest_run
Non-guarantees¶
- Internal file splits may evolve in
2.0.xif public contracts are preserved. - Package markers and internal helper placement are not contract by themselves.
Chapter map¶
| Topic | Primary chapters |
|---|---|
| CLI behavior and failure routing | 03-contracts-exit-codes.md, 09-cli.md |
| Config precedence and defaults | 04-config-and-defaults.md |
| Core processing pipeline | 05-core-pipeline.md |
| Clone baseline trust/compat/integrity | 06-baseline.md |
| Cache trust and fail-open behavior | 07-cache.md |
| Report schema and provenance | 08-report.md, 10-html-render.md |
| MCP agent surface | 20-mcp-interface.md |
| Health score model | 15-health-score.md |
| Metrics gates and metrics baseline | 15-metrics-and-quality-gates.md |
| Dead-code liveness policy | 16-dead-code-contract.md |
| Determinism and versioning policy | 12-determinism.md, 14-compatibility-and-versioning.md |