04. Config and Defaults¶
Purpose¶
Describe effective runtime configuration and defaults that affect behavior.
Public surface¶
- CLI parser and defaults:
codeclone/_cli_args.py:build_parser - Pyproject config loader:
codeclone/_cli_config.py - Effective cache default path logic:
codeclone/cli.py:_resolve_cache_path - Metrics-mode selection logic:
codeclone/cli.py:_configure_metrics_mode - Debug mode sources:
codeclone/cli.py:_is_debug_enabled
Data model¶
Configuration sources, in precedence order:
- CLI flags (
argparse, explicit options only) pyproject.tomlsection[tool.codeclone]- Code defaults in parser and runtime
CODECLONE_DEBUG=1 affects debug diagnostics only and is not part of analysis
or gating configuration precedence.
Key defaults:
root="."--min-loc=10--min-stmt=6--processes=4--baseline=codeclone.baseline.json--max-baseline-size-mb=5--max-cache-size-mb=50--coverage-min=50- default cache path (when no cache flag is given):
<root>/.cache/codeclone/cache.json --metrics-baseline=codeclone.baseline.json(same default path as--baseline)- bare reporting flags use default report paths:
--html-><root>/.cache/codeclone/report.html--json-><root>/.cache/codeclone/report.json--md-><root>/.cache/codeclone/report.md--sarif-><root>/.cache/codeclone/report.sarif--text-><root>/.cache/codeclone/report.txt
Fragment-level admission thresholds (pyproject.toml only, advanced tuning):
block_min_loc=20— minimum function LOC for block-level sliding windowblock_min_stmt=8— minimum function statements for block-level sliding windowsegment_min_loc=20— minimum function LOC for segment-level sliding windowsegment_min_stmt=10— minimum function statements for segment-level sliding window
Example project-level config:
[tool.codeclone]
min_loc = 10
min_stmt = 6
baseline = "codeclone.baseline.json"
skip_metrics = true
quiet = true
Supported [tool.codeclone] keys in the current line:
Requires / Implies lists only runtime-enforced relationships from the current
code path. Use - when the key has no special dependency contract.
Analysis:
| Key | Type | Default | Meaning | Requires / Implies |
|---|---|---|---|---|
min_loc |
int |
10 |
Minimum function LOC for clone admission | - |
min_stmt |
int |
6 |
Minimum function statement count for clone admission | - |
block_min_loc |
int |
20 |
Minimum function LOC for block-window analysis | - |
block_min_stmt |
int |
8 |
Minimum function statements for block-window analysis | - |
segment_min_loc |
int |
20 |
Minimum function LOC for segment analysis | - |
segment_min_stmt |
int |
10 |
Minimum function statements for segment analysis | - |
processes |
int |
4 |
Worker process count | - |
cache_path |
str \| null |
<root>/.cache/codeclone/cache.json |
Cache file path | - |
max_cache_size_mb |
int |
50 |
Maximum accepted cache size before fail-open ignore | - |
skip_metrics |
bool |
false* |
Skip full metrics mode when allowed | Incompatible with metrics gates/update; auto-enabled in some runs* |
skip_dead_code |
bool |
false |
Skip dead-code analysis | Forced on by skip_metrics; overridden by fail_dead_code |
skip_dependencies |
bool |
false |
Skip dependency analysis | Forced on by skip_metrics; overridden by fail_cycles |
golden_fixture_paths |
list[str] |
[] |
Exclude clone groups fully contained in matching golden test fixtures from health/gates/active findings; keep them as suppressed report facts | Patterns must resolve under tests/ or tests/fixtures/ |
Baseline and CI:
| Key | Type | Default | Meaning | Requires / Implies |
|---|---|---|---|---|
baseline |
str |
codeclone.baseline.json |
Clone baseline path | Default target for metrics_baseline when not overridden |
max_baseline_size_mb |
int |
5 |
Maximum accepted baseline size | - |
update_baseline |
bool |
false |
Rewrite unified baseline from current run | In unified mode, auto-enables update_metrics_baseline unless skip_metrics=true |
metrics_baseline |
str |
codeclone.baseline.json |
Dedicated metrics-baseline path override | Defaults to baseline path when not overridden |
update_metrics_baseline |
bool |
false |
Rewrite metrics baseline from current run | Requires metrics analysis; may auto-enable update_baseline for missing shared baseline |
ci |
bool |
false |
Enable CI preset behavior | Implies fail_on_new, no_color, quiet; enables fail_on_new_metrics when a trusted metrics baseline loads |
Quality gates and metric collection:
| Key | Type | Default | Meaning | Requires / Implies |
|---|---|---|---|---|
fail_on_new |
bool |
false |
Fail when new clone groups appear | Requires a trusted clone baseline |
fail_threshold |
int |
-1 |
Fail when clone count exceeds threshold | - |
fail_complexity |
int |
-1 |
Fail when max cyclomatic complexity exceeds threshold | Incompatible with skip_metrics |
fail_coupling |
int |
-1 |
Fail when max CBO exceeds threshold | Incompatible with skip_metrics |
fail_cohesion |
int |
-1 |
Fail when max LCOM4 exceeds threshold | Incompatible with skip_metrics |
fail_cycles |
bool |
false |
Fail when dependency cycles are present | Incompatible with skip_metrics; forces dependency analysis |
fail_dead_code |
bool |
false |
Fail when high-confidence dead code is present | Incompatible with skip_metrics; forces dead-code analysis |
fail_health |
int |
-1 |
Fail when health score drops below threshold | Incompatible with skip_metrics |
fail_on_new_metrics |
bool |
false |
Fail on new metric hotspots vs trusted metrics baseline | Requires trusted metrics baseline; incompatible with skip_metrics; auto-enabled by ci when baseline loads |
api_surface |
bool |
false |
Collect public API inventory/diff facts | Auto-enabled by fail_on_api_break |
coverage_xml |
str \| null |
null |
Join external Cobertura XML to current-run function spans | Enables Coverage Join |
coverage_min |
int |
50 |
Coverage threshold for joined measured coverage hotspots | Used by Coverage Join; meaningful with coverage_xml |
min_typing_coverage |
int |
-1 |
Minimum allowed typing coverage percent | Incompatible with skip_metrics |
min_docstring_coverage |
int |
-1 |
Minimum allowed docstring coverage percent | Incompatible with skip_metrics |
fail_on_typing_regression |
bool |
false |
Fail on typing coverage regression vs metrics baseline | Requires trusted metrics baseline with adoption snapshot; incompatible with skip_metrics |
fail_on_docstring_regression |
bool |
false |
Fail on docstring coverage regression vs metrics baseline | Requires trusted metrics baseline with adoption snapshot; incompatible with skip_metrics |
fail_on_api_break |
bool |
false |
Fail on public API breaking changes vs metrics baseline | Requires trusted metrics baseline with API surface snapshot; incompatible with skip_metrics; implies api_surface |
fail_on_untested_hotspots |
bool |
false |
Fail when medium/high-risk functions measured by Coverage Join fall below threshold | Incompatible with skip_metrics; requires successful Coverage Join to fire |
Report outputs and local UX:
| Key | Type | Default | Meaning | Requires / Implies |
|---|---|---|---|---|
html_out |
str \| null |
null |
HTML report output path | - |
json_out |
str \| null |
null |
JSON report output path | - |
md_out |
str \| null |
null |
Markdown report output path | - |
sarif_out |
str \| null |
null |
SARIF report output path | - |
text_out |
str \| null |
null |
Plain-text report output path | - |
no_progress |
bool |
false |
Disable progress UI | Implied by quiet |
no_color |
bool |
false |
Disable colored CLI output | Enabled by ci |
quiet |
bool |
false |
Use compact CLI output | Implies no_progress; enabled by ci |
verbose |
bool |
false |
Enable more verbose CLI output | - |
debug |
bool |
false |
Enable debug diagnostics | Also enabled by CODECLONE_DEBUG=1 |
This is the exact accepted key set from codeclone/_cli_config.py; unknown
keys are contract errors.
Notes:
skip_metrics=false*: parser default isfalse, but runtime may auto-enable it when no metrics work is requested and no metrics baseline exists.- Report output keys default to
null; bare CLI flags still write to the deterministic.cache/codeclone/report.*paths listed above.
CLI always has precedence when option is explicitly provided, including boolean
overrides via --foo/--no-foo (e.g. --no-skip-metrics).
Path values loaded from pyproject.toml are normalized relative to resolved
scan root when provided as relative paths.
golden_fixture_paths is different:
- entries are repo-relative glob patterns, not filesystem paths
- they are not normalized to absolute paths
- they must target
tests/ortests/fixtures/ - a clone group is excluded only when every occurrence matches the configured golden-fixture scope
Current-run coverage join config:
coverage_xmlmay be set inpyproject.toml; relative paths resolve from the scan root like other configured paths.coverage_minandfail_on_untested_hotspotsfollow the same precedence rules as CLI flags.- Coverage join remains current-run only and does not persist to baseline.
Metrics baseline path selection contract:
- Relative
baseline/metrics_baselinepaths coming from defaults orpyproject.tomlresolve from the analysis root. - If
--metrics-baselineis explicitly set, that path is used. - If
metrics_baselineinpyproject.tomldiffers from parser default, that configured path is used even without explicit CLI flag. - Otherwise, metrics baseline defaults to the clone baseline path.
- In other words, metrics do not live in a separate file by default:
the default unified flow uses the same
codeclone.baseline.jsonpath for clone and metrics comparison state.
Refs:
codeclone/_cli_args.py:build_parsercodeclone/cli.py:_main_implcodeclone/cli.py:_configure_metrics_mode
Contracts¶
--ciis a preset: enablesfail_on_new,no_color,quiet.- In CI mode, if trusted metrics baseline is loaded, runtime also enables
fail_on_new_metrics. --quietimplies--no-progress.- Negative size limits are contract errors.
Refs:
codeclone/cli.py:_main_impl
Invariants (MUST)¶
- Detection thresholds (
min-loc,min-stmt) affect function-level extraction. - Fragment thresholds (
block_min_loc/stmt,segment_min_loc/stmt) affect block/segment extraction. - All six thresholds are part of cache compatibility (
payload.ap). - Reporting flags (
--html/--json/--md/--sarif/--text) affect output only. - Reporting flags accept optional path values; passing bare flag writes to
deterministic default path under
.cache/codeclone/. --cache-pathoverrides project-local cache default; legacy alias--cache-dirmaps to same destination.- Metrics baseline update/gating flags require metrics mode; incompatible
combinations with
--skip-metricsare contract errors. - Unknown keys or invalid value types in
[tool.codeclone]are contract errors (exit 2).
Refs:
codeclone/extractor.py:extract_units_and_stats_from_sourcecodeclone/_cli_args.py:build_parsercodeclone/cli.py:_configure_metrics_mode
Failure modes¶
| Condition | Behavior |
|---|---|
| Invalid output extension/path | Contract error (2) |
| Invalid root path | Contract error (2) |
| Negative size limits | Contract error (2) |
Refs:
codeclone/_cli_paths.py:_validate_output_pathcodeclone/cli.py:_main_impl
Determinism / canonicalization¶
- Parser help text and epilog are deterministic constants.
- Summary metric labels are deterministic constants.
Refs:
codeclone/contracts.py:cli_help_epilogcodeclone/ui_messages.py:SUMMARY_LABEL_FILES_FOUND
Locked by tests¶
tests/test_cli_unit.py::test_cli_help_text_consistencytests/test_cli_inprocess.py::test_cli_default_cache_dir_uses_roottests/test_cli_inprocess.py::test_cli_cache_dir_override_respectedtests/test_cli_inprocess.py::test_cli_negative_size_limits_fail_fast
Non-guarantees¶
- CLI help section ordering is stable today but not versioned independently from the CLI contract.