`crimes@0.8.1` — Calibration Patch
Draft release notes for the GitHub Release tagged
v0.8.1. The body below is what should go in the Releases page when you cut the tag, which triggers `.github/workflows/release.yml` and publishes to npm via Trusted Publishing.
`crimes@0.8.1` is a patch on top of 0.8.0 that tunes the new 0.8.0
detectors against dogfood evidence: no new detectors, no schema
change, no new commands. Three changes:

- `boolean_naming_drift` built-in allowlist expanded. Eight idiomatic state names that 0.8.0 over-flagged on real codebases (`loaded`, `found`, `settled`, `overflow`, `typeonly`, `interpolated`, `limited`, `existed`) are now in the React-state allowlist. Project-specific names still go through `detectors.options.boolean_naming_drift.allowedNames`; this is pure default-tuning, not a behaviour change for configured users.
- Self-scan signal cleanup. The crimes monorepo’s own `crimes.config.json` now excludes `evals/fixtures/**` and `examples/messy-ts-app/**` from the asset pass, so the dogfood scan no longer surfaces the intentionally bad demo assets at the top of the report. No effect on downstream users’ configs.
- `scan-assets.ts` refactored. The 80-line `runAssetDetectorsForRoot` body is broken into four named helpers (`discoverAssetFiles`, `groupDetectorsByExtension`, `runDetectorsForAssetFile`, `buildAssetContext`). Each helper now has a single responsibility and is independently testable. Behaviour is byte-identical to 0.8.0.

Schema: `schema_version` stays at `"0.1.0"`. The published
finding shape is unchanged.
What shipped
boolean_naming_drift allowlist expansion
0.8.0 ships boolean_naming_drift with a 26-name React-state
allowlist (loading, ready, active, …). The dogfood pass
surfaced eight more names where the unprefixed form is the natural
identifier:
| Name | Where it shows up |
|---|---|
| `loaded` | Post-async-load flag, symmetric to the existing `loading`. |
| `found` | Search / lookup result presence. |
| `settled` | Promise / async-state idiom. |
| `overflow` | UI dimension flag (text / box overflow). |
| `typeonly` | TypeScript `import type` discrimination. |
| `interpolated` | Template-string / fold-state flag. |
| `limited` | Truncation / cap signal (e.g. `history_limited`). |
| `existed` | Pre-condition flag in delete / cleanup flows. |
All eight are now exempt by default — adding them to a project’s
detectors.options.boolean_naming_drift.allowedNames is no longer
required. Users who had them in their config can leave them there;
the lookup is set-membership so duplicates are harmless.
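
The set-membership point above can be illustrated with a minimal sketch. It is not the crimes source: `buildAllowlist`, `isExempt`, and the project-specific name `archived` are hypothetical, and it assumes the project's `allowedNames` are simply merged with the built-in defaults.

```typescript
// Hypothetical sketch; names and structure are illustrative, not the crimes source.

// The eight names added to the default allowlist in 0.8.1 (from the table above).
const ADDED_IN_0_8_1: string[] = [
  "loaded", "found", "settled", "overflow",
  "typeonly", "interpolated", "limited", "existed",
];

// Assumed behaviour: built-in defaults and a project's allowedNames are merged into one Set.
function buildAllowlist(defaults: string[], allowedNames: string[] = []): Set<string> {
  return new Set([...defaults, ...allowedNames]);
}

// Exemption is a plain set-membership check.
function isExempt(name: string, allowlist: Set<string>): boolean {
  return allowlist.has(name);
}

// A config that still lists "loaded" and "found" adds nothing new for those two names.
const allowlist = buildAllowlist(ADDED_IN_0_8_1, ["loaded", "found", "archived"]);

console.log(allowlist.size);                 // 9: eight defaults plus "archived"
console.log(isExempt("loaded", allowlist));  // true
console.log(isExempt("paid", allowlist));    // false
```

Because a `Set` stores each value once, leaving the now-redundant names in a project config changes neither the size of the allowlist nor the result of any lookup.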
Self-scan signal cleanup
The 0.8.0 release intentionally shipped the messy-ts-app demo with
violating asset files (hero-banner.png, check-icon.png,
partner-logo.svg) so the asset detectors had fixture coverage.
But when we ran crimes scan on the crimes monorepo itself, the
demo partner-logo.svg was the #1 high-severity finding — a
fixture-driven artefact dominating production-code signal.
The fix is one config change in our own
crimes.config.json — assets.exclude
now adds evals/fixtures/** and examples/messy-ts-app/** on top
of the defaults. The default assets.exclude already excludes
**/fixtures/** (added in phase 5b) which catches most real-world
fixture trees; this monorepo’s two paths needed explicit listing
because they live alongside production code rather than under a
canonical fixtures/ folder.
No effect on downstream users — their crimes.config.json is
unchanged, and the defaults haven’t moved.
scan-assets.ts refactor
The 0.8.0 runAssetDetectorsForRoot was a single 80-line function
mixing four responsibilities: discover asset files, group detectors
by extension, build a per-file detector context, and dispatch the
run loop. It is now split into four named helpers:

- `discoverAssetFiles(root, config)` owns the `assets.include` / `assets.exclude` discipline, including the “explicitly cleared include = skip the pass entirely” semantics.
- `groupDetectorsByExtension(detectors)` builds the `Map<extension, AssetDetector[]>` used by the dispatch loop.
- `runDetectorsForAssetFile({ root, absolutePath, config, byExtension })` handles per-file orchestration: look up applicable detectors, build context, run each, swallow per-detector exceptions.
- `buildAssetContext({ root, absolutePath, extension, config })` pre-fetches `byteSize` via `fs.stat`, sets up the lazy per-file-cached `read()`, and returns the `AssetDetectorContext` or `undefined` for unreadable files.

Same behaviour, same per-file caching, same exception isolation:
the public `runAssetDetectorsForRoot` signature is unchanged. The
refactor was motivated by making each piece individually
substitutable in tests and future asset pipelines (e.g. parallel
asset scan, batch I/O via worker threads).
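
The three behaviours named above (grouping by extension, lazy per-file-cached reads, and per-detector exception isolation) can be sketched in memory as follows. This is an illustration under assumptions, not the code in `scan-assets.ts`: the `AssetContext` and `AssetDetector` shapes, and the helper names `groupByExtension`, `makeContext`, and `runForFile`, are hypothetical, and the file content is supplied by a callback rather than read from disk.

```typescript
// Hypothetical shapes for illustration; the real crimes types may differ.
interface AssetContext {
  path: string;
  byteSize: number;
  read(): string; // lazy; cached after the first call
}

interface AssetDetector {
  id: string;
  extensions: string[];
  run(ctx: AssetContext): string[]; // returns finding messages
}

// Group detectors by the extensions they declare.
function groupByExtension(detectors: AssetDetector[]): Map<string, AssetDetector[]> {
  const byExtension = new Map<string, AssetDetector[]>();
  for (const detector of detectors) {
    for (const ext of detector.extensions) {
      const bucket = byExtension.get(ext) ?? [];
      bucket.push(detector);
      byExtension.set(ext, bucket);
    }
  }
  return byExtension;
}

// Build a context whose read() invokes the loader at most once.
function makeContext(path: string, byteSize: number, load: () => string): AssetContext {
  let cached: string | undefined;
  return {
    path,
    byteSize,
    read() {
      if (cached === undefined) cached = load();
      return cached;
    },
  };
}

// Run every applicable detector; a throwing detector does not stop the others.
function runForFile(
  ctx: AssetContext,
  ext: string,
  byExtension: Map<string, AssetDetector[]>,
): string[] {
  const findings: string[] = [];
  for (const detector of byExtension.get(ext) ?? []) {
    try {
      findings.push(...detector.run(ctx));
    } catch {
      // Per-detector exception isolation: skip this detector only.
    }
  }
  return findings;
}

// Usage: three detectors share one file; one throws, two read the content.
let loads = 0;
const ctx = makeContext("logo.svg", 11, () => {
  loads += 1;
  return "<svg></svg>";
});
const detectors: AssetDetector[] = [
  { id: "throws", extensions: [".svg"], run: () => { throw new Error("boom"); } },
  { id: "reads", extensions: [".svg", ".png"], run: (c) => (c.read().includes("<svg") ? ["svg seen"] : []) },
  { id: "reads-again", extensions: [".svg"], run: (c) => [`${c.read().length} chars`] },
];
const findings = runForFile(ctx, ".svg", groupByExtension(detectors));

console.log(findings); // ["svg seen", "11 chars"]
console.log(loads);    // 1
```

Keeping each concern in its own function is what allows a test to substitute, for example, an in-memory loader for the real file read, as done here.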
Baseline at 0.8.1
Same harness, same 38 scenarios per agent, against the patched detector set:
| Agent | Structural pass rate | Δ vs 0.7.15 |
|---|---|---|
| claude | 0.82 | -3pp |
| codex | 0.76 | +2pp |
Per scenario kind:
| Kind | claude | codex | claude Δ | codex Δ |
|---|---|---|---|---|
| bugfix | 0.75 | 0.63 | +18pp | +13pp |
| context | 0.93 | 1.00 | -6pp | +0pp |
| plan | 0.71 | 0.65 | -23pp | +6pp |
| refactor | 0.93 | 0.80 | +4pp | +2pp |
| review | 0.71 | 0.74 | -10pp | -2pp |
The per-kind shifts are run-to-run variance: each kind has so few
scenarios (4-11) that a single scenario flipping moves its pass rate
by roughly 10pp or more. The allowlist change doesn’t touch the
fixtures: the eight new exempt names aren’t in the messy-ts-app
fixture, and the fixture’s drift cases like `paid` / `expired` /
`stale` continue to fire. Re-pinning the baseline is the policy and
keeps the per-version trail clean.
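
As a rough check on the variance argument, the snippet below computes how far one flipped scenario moves a pass rate at the two ends of the 4-11 range. It assumes a kind's pass rate is simply passed scenarios divided by total scenarios, equally weighted; the helper name `flipDeltaPp` is hypothetical and the actual rubric scoring may differ.

```typescript
// Assumption: pass rate = passed / total over equally weighted scenarios.
// One scenario flipping therefore moves the rate by 100 / total percentage points.
function flipDeltaPp(scenarioCount: number): number {
  return 100 / scenarioCount;
}

for (const n of [4, 11]) {
  console.log(`${n} scenarios: one flip = ±${flipDeltaPp(n).toFixed(1)}pp`);
}
// 4 scenarios: one flip = ±25.0pp
// 11 scenarios: one flip = ±9.1pp
```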
The first run of this baseline hit nine transient `claude exited 1`
failures (no stderr; this looks like subscription-side throttling).
We re-ran with `--agent claude` to fill them in; the codex side from
the first pass was clean. The summary above is the merged result.
Result transcripts and rubric scores are at
`evals/results/0.8.1/`.
What’s not in 0.8.1
- No new detectors. Detector count unchanged at 47.
- No new commands. CLI surface byte-equivalent to 0.8.0.
- No schema bump.
- No config schema changes. The `assets.exclude` extension in the bundled `crimes.config.json` is monorepo-specific, not a default-value change.
- No package dependency changes.
Upgrading

    npm install -g crimes@0.8.1
    crimes --version  # crimes@0.8.1

For users on crimes@0.8.0:

- If you previously added `loaded` / `found` / `settled` / `overflow` / `typeonly` / `interpolated` / `limited` / `existed` to your `detectors.options.boolean_naming_drift.allowedNames`, you can leave them; the default allowlist now covers them too (set-membership lookup, duplicates are harmless).
- If your CI gates on `crimes scan --fail-on high` or `crimes baseline check --fail-on`, no findings should regress: the changes are detector-quietening (allowlist additions) plus a structural refactor.
Notable links
- `docs/releases/v0.8.0.md`: the parent release these patches build on.
- `docs/finding-types/structural.md`: `boolean_naming_drift` documentation.
- `evals/results/0.8.1/`: re-pinned eval baseline.