Scoring
Every crimes finding carries five numeric scores in [0, 1]:
| Score | Source | ”Higher means” |
|---|---|---|
severity | Detector — how bad the smell is in isolation | More severe |
confidence | Detector — how certain the detector is | More certain |
churn | Git log over a 90-day window | Edits more |
test_gap | Filesystem + import-graph test discovery | Less tested |
blast_radius | Import-graph transitive closure | Touches more |
agent_risk | Unified composite of all five | Riskier to AI edits |
Rank by agent_risk when the question is “which areas are dangerous to
edit”; rank by severity when the question is “which findings are
worst in isolation”.
Status by version
Section titled “Status by version”| Version | severity / confidence | churn / test_gap / blast_radius | agent_risk |
|---|---|---|---|
| 0.1–0.5 | Populated | Reserved (always absent) | Populated by detector |
| 0.6+ | Populated | Populated from the scoring context | Populated from the unified formula |
All fields are rounded to two decimal places.
The unified agent_risk formula
Section titled “The unified agent_risk formula”agent_risk is recomputed for every finding after detectors emit. The
0.6.0 formula:
agent_risk = clamp01( 0.4 * severity_numeric + 0.2 * confidence + 0.15 * churn + 0.15 * test_gap + 0.10 * blast_radius)severity_numeric maps the categorical severity to a numeric value:
severity | numeric |
|---|---|
high | 0.9 |
medium | 0.7 |
low | 0.45 |
These mappings match the convention pre-0.6.0 detectors used, so the ordering of existing high-severity findings is preserved when the new signals are zero.
How each signal is computed
Section titled “How each signal is computed”churn[file] = min(commits_touching_file_in_last_90_days / 20, 1)The same saturation curve crimes hotspots uses for the change-frequency
component of risk. The window can shift in future releases — the
contract is “higher is worse, range [0, 1]”, not “this exact number”.
When the repo isn’t a git working tree, or the git binary isn’t
available, churn is 0 for every file and the scoring context’s
internal limited flag is set. The hotspots command already documents
this case under history_limited.
test_gap
Section titled “test_gap”A three-tier signal derived from filesystem layout and the import graph:
test_gap | Condition |
|---|---|
0.0 | The file is itself a test file (e.g. foo.test.ts), or at least one test file imports it. |
0.5 | A sibling test file exists (foo.test.ts next to foo.ts) or a __tests__/ test file shares the basename, but no test file actually imports the target. |
1.0 | None of the above. |
Test files are recognised by the standard pattern: anything under
__tests__/, or any file matching .test.{ts,tsx,js,jsx,mjs,cjs} or
.spec.{…}.
blast_radius
Section titled “blast_radius”blast_radius[file] = min(transitive_importers / 50, 1)Where transitive_importers is the count of distinct repo files that
reach the target via one or more import edges. The traversal is memoised
per file so the per-scan cost stays O(F).
When the import graph isn’t available, blast_radius is 0 for every
file. The graph is built once per scan and shared across every detector
via DetectorContext.imports.
Stability guarantees
Section titled “Stability guarantees”severity and confidence continue to be the “stable” knobs detectors
calibrate against. churn, test_gap, and blast_radius are
ordinal — treat the exact numbers as advisory; the formulae may
shift between minor releases as the underlying heuristics are refined.
The contracts that don’t shift:
- Range is always [0, 1], rounded to two decimal places.
- Direction is always “higher is worse”.
agent_riskis monotonic in each of its five inputs.
The exact agent_risk weighting itself may also shift between minor
releases. Consumers that key off agent_risk should rank findings by
relative ordering rather than absolute thresholds.
Where to read the scores
Section titled “Where to read the scores”crimes scan --format json—findings[].scorescarries all five fields on every finding.crimes scan(human format) — a one-line “Risk profile” block prints alongside any finding where at least one ofchurn/test_gap/blast_radiusis > 0.5.--allalways shows it.crimes explain <id>— a “Risk profile” section explains each score alongside its raw evidence (commit count, importer count, test-file presence).crimes context <file>— every finding rendered in the deep-dive view includes the risk profile.