Skip to content

`crimes` JSON output schema

Every crimes command that supports --format json emits a single JSON document. The shape varies per command, but every document carries the same two top-level discriminator keys — schema_version and report_type — so consumers can route on a single field.

This page is the stable product API. Treat it as a public contract: any breaking change to a field name, type, or required-ness will bump schema_version.

Documented as of schema_version: "0.1.0". The source of truth in code is packages/core/src/finding.ts.

For how an agent should use this output, see agent-usage.md.

Reportreport_typeEmitted by
ScanReport"scan"crimes scan, crimes scan --changed [--fail-on]
Finding(embedded)every report that lists findings
ContextReport"context"crimes context <file>
HotspotsReport"hotspots"crimes hotspots
DiffReport"diff"crimes diff <base...head>
Baseline"baseline"crimes baseline save (on-disk file)
BaselineCheckReport"baseline_check"crimes baseline check
VerdictReport"verdict"crimes verdict
ExplainReport"explain"crimes explain <id-or-fingerprint>
Suppressions"suppressions"crimes ignore / crimes unignore (on-disk file)
AuditSuppressionsReport"audit_suppressions"crimes audit-suppressions
FeedbackReport"feedback"crimes feedback list / summary / export
Gate fields(optional)crimes scan --changed --fail-on …
Suppression fields(optional)every report that lists findings
Resurface fields(optional)every report that lists findings (0.7.0+)
Stability guarantees

The default report. Emitted by every form of crimes scan — directory scans, --changed, and the --changed --fail-on gate (which adds two extra top-level fields documented below).

interface ScanReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal `"scan"`. */
report_type: "scan";
repo: RepoInfo;
summary: ScanSummary;
findings: Finding[];
/** Set only when `crimes scan --changed --fail-on <severity>` is used. */
fail_on?: "low" | "medium" | "high";
/** Set only when `fail_on` is set. True when ≥1 finding meets `fail_on`. */
failed?: boolean;
/** Set only when `crimes scan --changed` was used. See below. */
changed_files?: string[];
}

The wire format version. Always present, always a string. Bumped on any breaking change to the shape of Finding, ScanSummary, or RepoInfo.

Consumers should refuse to parse a report whose schema_version they do not recognise.

Discriminator literal. Always "scan" for crimes scan output. Every report type that crimes emits carries one — "scan", "context", "hotspots", "diff", "baseline", "baseline_check", "verdict" — so consumers can route on a single field instead of pattern-matching on the body. Always present, always a string literal.

interface RepoInfo {
/** Basename of the scanned root directory. */
name: string;
/** Absolute path to the scanned root, machine-specific. */
root: string;
/** Optional git ref the scan ran against. Not yet populated. */
git_ref?: string;
}

root is an absolute filesystem path on the machine that ran the scan. Useful to anchor findings[].file (which is repo-relative), but not stable across machines or containers.

git_ref is reserved for a future milestone that wires up git history.

interface ScanSummary {
total: number;
high: number;
medium: number;
low: number;
}

Counts by severity. total equals the sum of the three buckets.

fail_on and failed are optional, top-level, and only set when the CLI runs crimes scan --changed --fail-on <severity>. Both fields are absent from every other invocation of crimes scan — including crimes scan without --changed and crimes scan --changed without --fail-on. Adding optional fields is non-breaking under the stability guarantees below, so the existing ScanReport contract is unchanged.

fail_on?: "low" | "medium" | "high";
failed?: boolean;
  • fail_on — the threshold the CLI gated on, echoed back verbatim. Use this to confirm what the run was actually checking against.
  • failedtrue when at least one Finding in findings has severity ≥ fail_on, using the same low < medium < high ordering as crimes baseline check. false otherwise (including when findings is empty).

The corresponding CLI behaviour:

  • --fail-on is only valid in combination with --changed. Passing it on a plain crimes scan exits 2 (usage error). Pass --changed --fail-on <severity> to opt into the gate.
  • --fail-on accepts low | medium | high. low fails on any finding; medium fails on medium or high; high fails on high only — same semantics as crimes baseline check --fail-on.
  • Exit 1 when failed === true; exit 0 when failed === false; exit 2 for usage / environment errors (unknown threshold, not a git repo, etc.).
  • The default crimes scan exit-code behaviour is unchanged when --fail-on is not passed — it still always exits 0.

See docs/ci.md for the recommended CI integration.

changed_files is optional, top-level, and only set when the CLI runs crimes scan --changed (with or without --base, with or without --fail-on). Plain crimes scan omits the field. Adding an optional field is non-breaking under the stability guarantees.

changed_files?: string[];
  • Lists every file the --changed resolver returned — including files that produced zero findings (e.g. a touched README.md, package.json, or a .ts file the detectors had nothing to say about). The point is that an agent re-running crimes scan --changed after an edit can confirm which files it actually touched even when the diff is clean.
  • Paths are repo-relative POSIX (/-separated). Sorted alphabetically and deduplicated.
  • When the working tree has no changes, the array is present and empty — that’s “we looked and found nothing”, not “we didn’t look”.
  • The set is the same one crimes scan --changed resolves to drive the scan — staged + unstaged + untracked working-tree changes, plus <base>...HEAD when --base is set. Deletions are skipped (git reports them but the path no longer exists on disk).

Array of Finding objects, sorted:

  1. By severity (high → medium → low)
  2. Then by confidence descending
  3. Then by file ascending
  4. Then by lines[0] ascending

IDs (crime_00001, crime_00002, …) are assigned in this sort order, so they are stable as long as the underlying detector results are stable.


interface Finding {
/** Stable per-scan id, e.g. "crime_00001". */
id: string;
/** Machine-readable detector type, e.g. "large_function". */
type: string;
/** Human-readable charge, e.g. "God Function". */
charge: string;
severity: "low" | "medium" | "high";
/** 0–1 confidence. */
confidence: number;
/** Repo-relative path with forward slashes. */
file: string;
/** Function/class/method name when applicable. */
symbol?: string;
/** Inclusive [start, end] 1-based line range. */
lines?: [number, number];
/** One-line natural-language summary. */
summary: string;
/** Concrete evidence — short factual strings, deterministic. */
evidence: string[];
scores: FindingScores;
suggested_actions?: SuggestedAction[];
/**
* Other repo-relative files that contributed evidence to this finding.
* Populated by cross-file detectors (`missing_agent_context`,
* `route_metadata_drift`, `duplicated_navigation_source`,
* `concept_alias_drift`, `docs_code_drift`,
* `magic_domain_literal_scatter`). Absent on file-local findings.
*/
related_files?: string[];
/**
* Only set when the consumer requested `--show-suppressed`. Indicates
* the finding matched an entry in `.crimes/suppressions.json` and would
* normally be hidden. Gate evaluation always ignores findings with
* `suppressed === true` regardless of display.
*/
suppressed?: true;
/** Paired with `suppressed`. The reason recorded in the suppressions file. */
suppression_reason?: string;
/**
* Set when the finding matched a feedback-sourced suppression whose
* pinned minor differs from the current crimes minor — the 0.7.0
* auto-resurface loop. The finding is kept in `findings[]` (NOT
* counted in `suppressed_count`) so the user can re-confirm `fp` or
* mark `tp`. Manual suppressions never resurface.
*/
previously_suppressed?: true;
/** Paired with `previously_suppressed`. Carries the prior pin + reason. */
previous_suppression?: {
pinned_version: string;
reason: string;
};
}

Always present on every finding:

  • id, type, charge, severity, confidence, file, summary, evidence, scores

Populated when the detector has the data:

  • symbol — set for findings that name a specific function/class/method (e.g. large_function). Absent for file-level findings.
  • lines — set for any finding with a meaningful line range. All built-in detectors populate this in v0.1.0.
  • suggested_actions — set for every built-in detector in v0.1.0, but optional in the schema.

Populated on cross-file findings:

  • related_files — populated by the five IA detectors and the cross-file petty literal detector (missing_agent_context, route_metadata_drift, duplicated_navigation_source, concept_alias_drift, docs_code_drift, magic_domain_literal_scatter). Absent on structural and file-local petty findings.

Reserved (declared in the schema, deferred to later milestones):

  • scores.blast_radius, scores.churn, scores.test_gap — see below

A scan-local identifier in the form crime_NNNNN (5-digit zero-padded). IDs are assigned after sorting, so a given finding may get a different id between runs if the set of findings changes. Use id for citing within a single report; do not persist it across scans.

Machine identifier for the detector that produced the finding. Stable. New type values may be added without bumping schema_version; consumers should treat unknown values defensively. The currently shipped values are:

typeChargeSymbol set?What it flags
large_fileGod FilenoFiles over thresholds.largeFileLines (default 300)
large_functionGod FunctionyesFunctions/methods/arrows over thresholds.largeFunctionLines (60)
todo_densityUnfinished BusinessnoHigh TODO/FIXME/XXX/HACK density vs thresholds.todoDensityPerKLoc
direct_dateTemporal RecklessnessnoDirect Date.now() or new Date() usage
commented_out_codeCommented-Out CorpsenoComment blocks or consecutive line comments that appear to contain disabled source code
logic_in_commentsLogic in the AlibinoComments that appear to carry business rules or safety constraints not represented nearby
name_behavior_mismatchFalse IdentityyesSafe-sounding function names whose bodies appear to perform side effects
magic_domain_literal_scatterString SprinklesnoRepeated domain-looking literals spread across production files
weak_test_signalTest That Proves NothingnoTests with no assertions or only weak assertion matchers
option_bag_junk_drawerOption Bag Junk DraweryesGeneric object bags with large implicit shapes
return_shape_rouletteReturn Shape RouletteyesFunctions returning divergent object shapes without an explicit return type
negative_flag_mazeNegative Flag MazenoConditionals that combine multiple negative flag names
missing_agent_contextMissing Agent ContextnoRepo declares a bin but ships no AGENTS.md / CLAUDE.md / .claude/skills/*/SKILL.md / .agents/skills/*/SKILL.md
route_metadata_driftRoute Metadata DriftnoOne route’s path, file, component name, page title, and nav labels appear to use competing names
duplicated_navigation_sourceDuplicated Navigation SourcenoSame destination declared in multiple nav-like source files with different labels
concept_alias_driftConcept Alias DriftnoA seeded alias group (e.g. team / workspace / organisation) spans multiple directories on product surface
docs_code_driftDocs-Code DriftnoMarkdown document references a local file that does not exist on disk
orphaned_destinationOrphaned DestinationnoRoute declared in nav / IA index but no source file or route file resolves the destination
parallel_destinationParallel DestinationnoTwo nav-like surfaces declare different routes for the same canonical destination
permission_ia_driftPermission IA DriftnoThe same role / permission identifier appears with different IA categorisation across surfaces
action_label_driftAction Label DriftnoTwo surfaces label the same domain action differently (“Delete” vs “Remove”)
command_drift_docs_code_driftCommand Docs / Code DriftnoMarkdown references a bin subcommand the CLI no longer implements
layer_violationLayer Border CrossingnoAn import crosses a forbidden boundary defined by architecture.layers + architecture.rules
circular_dependencyTangled ImportsnoTwo or more files form an import cycle
deep_importDeep Import AbusenoAn import reaches more than n segments deep into another package’s source tree
high_fan_in_fan_outCrowded ModulenoA file has unusually high import fan-in and/or fan-out
design_token_escapeDesign Token EscapenoHard-coded colours / spacings / font sizes in JSX where the repo has a design-token system
accessible_interaction_riskHidden InteractionnoA JSX element handles pointer events but has no accessible label
duplicate_component_shapeDuplicate Component ShapeyesTwo or more components share an identical JSX shape (AST-hash equivalent)
responsive_fragilityResponsive FragilitynoA component mixes many breakpoint-specific utilities or hard-pixel widths
copy_ia_driftCopy / IA Drift (frontend)noA nav label and a breadcrumb / page-title disagree on the canonical name for the same destination
exact_duplicate_blockExact Duplicate BlockyesTwo or more function bodies / statement blocks share an identical AST hash
near_duplicate_blockNear-Duplicate BlockyesTwo function bodies share a high-similarity AST-hash bag with a small delta
duplicated_role_status_plan_checkDuplicated Policy LogicnoThe same domain concept (role, status, plan tier) is checked in multiple files with different shapes

Information-architecture findings (missing_agent_context, route_metadata_drift, duplicated_navigation_source, concept_alias_drift, docs_code_drift, orphaned_destination, parallel_destination, permission_ia_drift, action_label_drift, command_drift_docs_code_drift, copy_ia_drift) are cross-file: their file field anchors the finding on the most useful single path, and related_files lists the other files involved. magic_domain_literal_scatter, the duplication detectors, the dependency-graph detectors, and the frontend duplicate-component detector follow the same cross-file pattern.

Human-readable label for type. Stable for a given type. Use this in user-facing summaries; use type in code that branches on detector kind.

severity is one of "low" | "medium" | "high". It’s the headline triage signal: how bad this looks in isolation, ignoring blast radius and churn.

confidence is 0–1. It reflects how sure the detector is that the smell is real (e.g. a function that’s 2× the line threshold has higher confidence than one just barely over). Rounded to two decimals.

file is always repo-relative, with forward slashes regardless of OS. Resolve against repo.root if you need an absolute path.

lines is [startLine, endLine], both inclusive, both 1-based. When a detector reports on the whole file, lines is [1, lineCount]. For todo_density, lines spans from the first marker to the last marker.

The function, method, or accessor name when the detector pinpoints a specific declaration. May be "<anonymous>" when the detector found a function but couldn’t infer its name.

An array of short factual strings (typically 2–4 items). Every item is generated deterministically from the file/AST — no LLM. Quote evidence verbatim when explaining a finding to a user. Examples:

"lines 37–240 (204 lines)"
"3.4× the configured 60-line threshold"
"function declaration"
"5× Date.now(), 2× new Date()"
"lines: 44, 50, 79, 185, 225, 237, 260"
"6× TODO, 4× FIXME, 2× HACK, 2× XXX"
"424.2 markers per 1k LOC (threshold 10)"
interface FindingScores {
/** How bad the smell is in isolation (0–1). Always present. */
severity: number;
/** Detector certainty (0–1). Always present. */
confidence: number;
/**
* Normalised transitive-importer count (0–1). Populated by every scan
* since 0.6.0 from the repo's import graph. Ordinal — the precise
* scaling may shift between minor releases. See `docs/scoring.md`.
*/
blast_radius?: number;
/**
* Normalised commits-in-window count (0–1). Populated by every scan
* since 0.6.0 from `git log --since=90d`. Same saturation curve as
* `crimes hotspots`. Ordinal — see `docs/scoring.md`.
*/
churn?: number;
/**
* Inverted test-coverage signal (0–1). 1.0 = no nearby tests; 0.0 = a
* test file imports this file. Populated by every scan since 0.6.0.
* Ordinal — see `docs/scoring.md`.
*/
test_gap?: number;
/**
* Unified composite of severity / confidence / churn / test_gap /
* blast_radius (0–1). Computed by core's finalisation pass on every
* scan since 0.6.0; detectors no longer set this directly. The
* weighting formula is documented in `docs/scoring.md`.
*/
agent_risk?: number;
}

The severity / confidence here are the numeric versions of the top-level fields. agent_risk is the differentiating signal vs other tools: rank by agent_risk when your goal is “which areas are dangerous for me to edit”, not “which areas have the worst static smell”.

blast_radius, churn, and test_gap are populated by every scan from 0.6.0 onward. They were “reserved” in 0.1.0–0.5.0; consumers that fell back to “not computed” for absent values now see real numbers. The fields remain optional in the schema so consumers can keep tolerating absence in mixed-version environments (a crimes scan from a fixture saved before 0.6.0 still parses cleanly).

All score fields are rounded to two decimals when present.

interface SuggestedAction {
/** Stable machine-readable action id, e.g. "extract_function". */
kind: string;
/** Human-readable suggestion. */
description: string;
/** Estimated risk of doing this action: "low" | "medium" | "high". */
risk: "low" | "medium" | "high";
}

Currently shipped kind values (additive; new kinds may appear without bumping schema_version):

  • extract_function — break up a large function into named helpers
  • split_file — split a large file along responsibility boundaries
  • triage_todos — convert TODO markers into tracked issues or remove them
  • inject_clock — replace direct Date usage with an injected clock
  • centralise_domain_literal — move a repeated domain literal to a named source of truth
  • assert_observable_behaviour — replace weak/no-op tests with assertions against observable behaviour
  • name_option_shape — replace a generic object bag with a named shape or owned destructuring
  • name_return_shape — add an explicit return type or named result variants
  • rename_or_simplify_flags — prefer positive flag names or extract a readable predicate
  • add_agent_context — add AGENTS.md or a Claude skill so agents can discover repo conventions
  • align_route_metadata — align route path, file/component name, page title, and nav labels around one canonical name
  • consolidate_nav_source — make one nav file the canonical source of truth for a destination
  • consolidate_concept — pick or document the canonical term, and use aliases deliberately rather than accidentally
  • fix_doc_link — update the docs or restore the referenced file so agents do not follow stale instructions

These are deterministic — typically one per detector kind. They are suggestions, not instructions; pick the ones that match the user’s request.

Populated by cross-file detectors: the information-architecture detectors listed above and magic_domain_literal_scatter. For each cross-file finding, file is the canonical anchor (route file, nav source, doc, alias-group anchor, or first literal occurrence) and related_files lists the other repo-relative paths that contributed evidence. Paths are repo-relative POSIX strings, deduped, and sorted deterministically.

The human reporter renders related_files as an “Also touches:” block under each finding (capped at 5 entries with the rest summarised), so JSON consumers and human readers see the same set of paths without the JSON contract changing. Treat each entry as “also read this before editing” — same scope as the finding itself.

Reserved by the file-local detectors. They do not populate it today; treat absence as “no cross-file context for this finding”.



ContextReport (output of crimes context <file>)

Section titled “ContextReport (output of crimes context <file>)”

crimes context <file> --format json emits a single JSON document — the ContextReport. It shares schema_version and the Finding shape with ScanReport, but is keyed to one file:

interface ContextReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal `"context"`. */
report_type: "context";
repo: { name: string; root: string; git_ref?: string };
/** Repo-relative path of the inspected file. */
file: string;
risk: ContextRisk;
/** One short line per finding type that fired, deduped. Stable order. */
agent_guidance: string[];
/** Other files an agent should read before editing the target. */
related_files: ContextRelatedFile[];
/** Repo-relative paths of test files likely covering `file`. Sorted. */
likely_tests: string[];
/** Same Finding shape as ScanReport, filtered to `file`. */
findings: Finding[];
/** Present only when `agent_guidance` is empty. */
agent_guidance_reason?: string;
/** Present only when `related_files` is empty. */
related_files_reason?: string;
/** Present only when `likely_tests` is empty. */
likely_tests_reason?: string;
}
interface ContextRisk {
/** Worst severity present in `findings`. `"none"` when there are none. */
level: "none" | "low" | "medium" | "high";
high: number;
medium: number;
low: number;
/** findings.length */
total: number;
}
interface ContextRelatedFile {
/** Repo-relative POSIX path. Never the target file itself. */
file: string;
/** Short, human-readable rationale. Multiple reasons joined with "; ". */
reason: string;
/** Ordinal 0–1 weight used for sorting. May change between minor releases. */
score?: number;
}

JSON.stringify preserves insertion order, and the report is built with agent_guidance ahead of findings so agents reading the JSON top to bottom see the actionable summary before the more verbose finding bodies. The canonical order is:

schema_version → report_type → repo → file → risk
→ agent_guidance → related_files → likely_tests → findings
→ agent_guidance_reason? → related_files_reason? → likely_tests_reason?

Object-key order is not part of the schema contract — consumers should read by key, not by position — but the test fixtures and the CLI output follow this order so copy-paste examples stay consistent.

A ranked, capped list (max 10 entries) of repo-relative files an agent should probably read before editing the target. Discovered deterministically — no LLM, no git history. Heuristics:

  • IA finding passthrough. When a finding on the target carries related_files (the IA detectors do this — route_metadata_drift, duplicated_navigation_source, concept_alias_drift, docs_code_drift), each of those paths surfaces with reason related to <charge>.
  • Shared IA path tokens. Files whose path tokens overlap with the target’s after stop-word filtering and singularisation (using the same tokeniser the IA index uses). Generic tokens like api, route, service don’t anchor a match. Reason: shares domain token "<token>".
  • Domain-prefix filename match. Files whose basename starts with <dominant-token>- / <dominant-token>_ / <dominant-token>., ends with -<dominant-token> / _<dominant-token> (before the extension), or contains a path segment equal to the dominant token. Reason: matches domain "<token>".
  • Same-directory siblings. Other source files in the same directory as the target. Reason: same directory.

Per-entry rules:

  • The target file itself is never included.
  • Files already surfaced in likely_tests are excluded (tests live in their own block).
  • Multiple heuristic hits on the same file compound — reasons are joined with ; and scores add (capped at 1.0).
  • Sorted by score descending, then by file ascending — deterministic across runs.
  • The cap (10) is not part of the schema contract; treat it as a hint, not a guarantee.

related_files_reason is set instead of (or in addition to, when the array is empty) the array — see Empty-field reasons.

Discovered by four deterministic conventions, in this order:

  1. Sibling files with the same basename and a .test.{ts,tsx,js,jsx,mjs,cjs} or .spec.{...} extension (Jest / Vitest infix).
  2. Sibling files matching the Go-style _test.{ts,tsx,…} / _spec.{…} suffix.
  3. Files under any __tests__/ directory whose basename (with any .test / .spec / _test / _spec suffix stripped) matches the target’s basename.
  4. Test files (matching one of the above conventions) whose source contains a relative-path import that resolves to the target file.

The result is deduped and lexically sorted. No git history, no symbol resolution beyond a textual import-path match.

When agent_guidance, related_files, or likely_tests is the empty array, the corresponding *_reason field is set to a short string explaining why. The reason is omitted when the array is non-empty. Standard wordings:

FieldWording (when empty)
agent_guidance_reasonno findings on this file and no deterministic related files or findings on this file did not match any keyed guidance line
related_files_reasonno neighbourhood signal: no IA finding related_files, no shared domain tokens, no domain-prefix filenames, no same-directory siblings
likely_tests_reasonno sibling, __tests__, .test, .spec, _test, or _spec files matched the target basename

The exact wording is advisory copy — match on the array being empty or the reason field being present, not on the string itself.

Static lookup keyed on Finding.type. One line per type that appears in findings, in the order they first appear. Current keys:

Finding.typeGuidance
large_functionPrefer extracting pure helpers before adding more branches.
large_fileRead the whole file before editing — propose splits in their own change.
direct_dateAvoid adding more direct clock access; inject time where possible.
todo_densityReview TODOs before relying on comments as current intent.
commented_out_codeDo not copy disabled code from comments; verify whether it should be deleted or explained as rationale.
logic_in_commentsTreat prose-only rules as suspect; encode them in guards, tests, config, or types before relying on them.
name_behavior_mismatchSafe-sounding names may hide side effects — inspect callers before moving, caching, or duplicating them.
magic_domain_literal_scatterRepeated domain strings can be duplicated policy — find or create the source of truth before adding another copy.
weak_test_signalTreat weak tests as low confidence; assert observable behaviour before relying on them as safety net.
option_bag_junk_drawerGeneric bags hide required fields — identify the owned shape before threading more data through.
return_shape_rouletteDivergent return shapes need an explicit contract before callers or agents infer a branch-specific shape.
negative_flag_mazeSimplify negative flags before extending the condition; double negatives are easy to invert.
missing_agent_contextAgents may miss project-specific commands, architecture rules, and safety checks.
route_metadata_driftThe route path, title, breadcrumb, and component name appear to disagree — verify each before changing labels.
duplicated_navigation_sourceMultiple files declare this destination; updating only one will leave the others stale.
concept_alias_driftOther files describe this concept under a different name; read them before renaming or extending.
docs_code_driftDocs reference local files that no longer exist — update the docs in the same PR.

When the target file has no findings but related_files is non-empty, agent_guidance instead contains a single neighbourhood line:

Review related files before editing — they share domain tokens or route/navigation evidence with this target.

New detector types may add new guidance lines without bumping schema_version. The wording of any guidance line is not part of the schema contract — treat it as advisory copy, not a stable string to match on.


HotspotsReport (from crimes hotspots --format json)

Section titled “HotspotsReport (from crimes hotspots --format json)”
interface HotspotsReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal `"hotspots"`. */
report_type: "hotspots";
repo: RepoInfo;
/** Echo of the `--since` value the user passed (e.g. "90d"). */
since: string;
/** False when the directory is not a git repository or `git` is unavailable. */
git_available: boolean;
/** True when commit history is truncated (e.g. shallow clone). See below. */
history_limited?: boolean;
/** Short reason string. Only set when `history_limited` is true. */
history_limited_reason?: string;
hotspots: Hotspot[];
}
interface Hotspot {
/** Repo-relative path with forward slashes. */
file: string;
/** Commits in the `--since` window that touched this file. */
change_count: number;
/** ISO-8601 timestamp of the most recent commit. Absent when change_count is 0. */
latest_change?: string;
/** Number of `crimes scan` findings on this file. */
finding_count: number;
/** Worst severity present in findings. `"none"` when finding_count is 0. */
highest_severity: "none" | "low" | "medium" | "high";
/** Aggregate 0–1 change-risk score, rounded to 2 dp. */
risk: number;
}

hotspots is sorted:

  1. By risk descending
  2. Then change_count descending
  3. Then highest_severity descending (high → medium → low → none)
  4. Then file ascending — as a stable tie-breaker
risk = 0.6 × min(change_count / 20, 1)
+ 0.4 × { high: 1.0, medium: 0.6, low: 0.3, none: 0 }[highest_severity]

Rounded to 2 decimal places. The 20-commit cap and the 0.6 / 0.4 weights are the numeric formula — they may change between minor releases. Treat risk as an ordinal signal for ranking, not an exact measurement (same contract as the per-finding scores.* fields).

When git_available is false, every row has change_count: 0 and no latest_change. risk then collapses to the severity component only and is capped at 0.4. The command does not fail — it degrades.

When the working tree is a shallow clone (git rev-parse --is-shallow-repository returns true), older commits aren’t present locally — git log only sees the slice the clone fetched. Hotspot counts under-report churn in that case. crimes hotspots annotates this with two optional top-level fields:

history_limited?: boolean;
history_limited_reason?: string;
  • Only set when git_available is true AND the shallow probe returned true. Plain non-git directories already surface via git_available: false; the two flags are mutually exclusive in practice.
  • history_limited_reason is short, human-readable advisory copy (e.g. "repository is a shallow clone; older commits are unavailable, so churn counts only reflect history present locally"). Treat the wording as advisory — match on the boolean flag, not the string.
  • Common in CI runners that default to --depth=1 clones. Pass fetch-depth: 0 (or equivalent) in your workflow to deepen the clone and clear the flag.

The human report prints the same notice on its second line, alongside the existing “not a git repo” warning. JSON consumers should branch on history_limited and downweight the ranking accordingly.


DiffReport (output of crimes diff <base...head>)

Section titled “DiffReport (output of crimes diff <base...head>)”

crimes diff <base...head> --format json emits a single JSON document — the DiffReport. It shares schema_version and the Finding shape with ScanReport, but the body is grouped by what changed between the two refs rather than listed flat.

interface DiffReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal "diff". */
report_type: "diff";
repo: { name: string; root: string };
/** Base ref the user passed (e.g. "main", "origin/main", a SHA). */
base: string;
/** Head ref the user passed (typically "HEAD"). */
head: string;
summary: DiffSummary;
/** Findings present at `head` but not at `base`. */
new_findings: Finding[];
/** Findings present at `base` but not at `head`. */
fixed_findings: Finding[];
/**
* Findings present at both refs (matched by fingerprint).
* The Finding object comes from the `head` scan, so its
* `lines`, `evidence`, and per-scan `id` reflect HEAD.
*/
unchanged_findings: Finding[];
/**
* Set only when `crimes diff --fail-on <threshold>` is used. Mirrors
* `ScanReport.fail_on` / `failed` — see the
* [Suppression fields](#suppression-fields) and
* [Stability guarantees](#stability-guarantees) sections.
*/
fail_on?: "new-high" | "new-medium";
failed?: boolean;
/**
* See [Suppression fields](#suppression-fields). Only set when ≥1
* suppression matched on the new set.
*/
suppressed_count?: number;
}
interface DiffSummary {
new: number;
fixed: number;
unchanged: number;
}

A discriminator literal so consumers can route on the report kind when multiple shapes are piped together. Always "diff" for crimes diff output. Every crimes report carries one ("scan", "context", "hotspots", "diff", "baseline", "baseline_check", "verdict") — see the Contents table for the full mapping.

Findings are classified as new / fixed / unchanged by a stable fingerprint, not by the per-scan id. The fingerprint is:

<type>::<file>::<symbol-or-empty>
  • type — detector identity (large_function, large_file, …)
  • file — repo-relative POSIX path
  • symbol — function/method name when the detector pinpoints a declaration (e.g. large_function); empty for file-level detectors

The fingerprint deliberately excludes lines, evidence, summary, and the per-scan id. That means a function shifting from lines 37–240 to lines 42–246 after an unrelated edit above it is classified as unchanged, not as a fix + new pair.

Source of truth: packages/core/src/fingerprint.ts.

new_findings, fixed_findings, and unchanged_findings each preserve the order their underlying scan produced — i.e. the same severity-first order documented for ScanReport.findings:

  1. By severity (high → medium → low)
  2. Then by confidence descending
  3. Then by file ascending
  4. Then by lines[0] ascending

crimes diff exports each ref into a fresh temporary directory via git archive <ref> | tar -x and scans it there. The working tree is never touched: no checkout, no stash, no temporary commits. Both temp directories are cleaned up before the report is returned.

  • File renames register as a fix + new pair. A file moved from src/a.ts to src/b.ts between base and head will produce both fixed findings (from a.ts) and new findings (in b.ts), even if the underlying detector results are identical. This matches the default behaviour of git diff (without --find-renames).
  • Two findings with identical (type, file, symbol) collide on one fingerprint. Nested helpers or overloaded function declarations with the same name in the same file deduplicate to a single logical finding. Rare in practice; a future schema version may add a disambiguator if it becomes a problem.

By default, crimes diff is advisory — it exits 0 regardless of how many new findings appear. Pass --fail-on new-high | new-medium to turn the command into a hard CI gate (added in 0.5.0); the threshold matches crimes verdict’s thresholds. --fail-on new-high exits 1 when any new finding has severity: "high"; --fail-on new-medium exits 1 when any new finding is "medium" or "high". Suppressed entries never trip the gate, regardless of --show-suppressed.

For a hard CI gate you have four equivalent options sharing the same 0 pass / 1 blocked / 2 usage exit contract:

  • crimes scan --changed --fail-on <severity> (changed-set advisory)
  • crimes diff <base...head> --fail-on new-high | new-medium
  • crimes baseline check --fail-on …
  • crimes verdict --fail-on worse | new-high | new-medium

Or gate on the JSON yourself:

Terminal window
crimes diff origin/main...HEAD --format json \
| jq -e '.summary.new == 0' >/dev/null

Baseline (on-disk shape of .crimes/baseline.json)

Section titled “Baseline (on-disk shape of .crimes/baseline.json)”

crimes baseline save writes a single JSON document to <root>/.crimes/baseline.json. The file is intended to be committed — it pins the set of pre-existing findings that future crimes baseline check runs should ignore. The schema is versioned by the same schema_version as ScanReport.

interface Baseline {
schema_version: "0.1.0";
/** Discriminator. Always the literal "baseline". */
report_type: "baseline";
/** ISO-8601 timestamp at which the baseline was written. */
created_at: string;
/** Version of `crimes` that wrote the file. Optional. */
crimes_version?: string;
/** Best-effort repo identity. `root` is machine-specific — informational only. */
repo?: { name: string; root: string };
/** Severity counts at the moment the baseline was written. */
summary: ScanSummary;
/** Every finding present at capture time, trimmed to identity-only fields. */
findings: BaselineEntry[];
}
interface BaselineEntry {
/** Same `<type>::<file>::<symbol-or-empty>` as `fingerprintFinding`. */
fingerprint: string;
type: string;
charge: string;
severity: "low" | "medium" | "high";
file: string;
symbol?: string;
}

The baseline only needs enough per-finding data to (a) match a future scan via the same fingerprint logic crimes diff uses, and (b) render a useful fixed_findings list when the offending file no longer exists. Concretely this means no lines, evidence, summary, scores, suggested_actions, or per-scan id is persisted — those drift between scans or become meaningless after the underlying code is gone.

How crimes baseline check matches findings

Section titled “How crimes baseline check matches findings”

By the exact same fingerprint logic as crimes diff: the stable <type>::<file>::<symbol-or-empty> identity. Small line shifts from unrelated edits do not register as fix + new. The known limitations are the same too: file renames register as a fix + new pair, and two findings with identical (type, file, symbol) collide on one fingerprint.


BaselineCheckReport (output of crimes baseline check)

Section titled “BaselineCheckReport (output of crimes baseline check)”
interface BaselineCheckReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal "baseline_check". */
report_type: "baseline_check";
repo: { name: string; root: string };
/** Absolute path to the baseline file that was loaded. */
baseline_path: string;
/** Threshold the run used to decide `failed`. */
fail_on: "low" | "medium" | "high";
/** True when at least one new finding has severity ≥ `fail_on`. */
failed: boolean;
summary: BaselineCheckSummary;
/** Full `Finding` objects from the current scan not present in the baseline. */
new_findings: Finding[];
/** Baseline entries with no matching fingerprint in the current scan. */
fixed_findings: BaselineEntry[];
/** Current-scan findings matched by fingerprint to a baseline entry. */
unchanged_findings: Finding[];
}
interface BaselineCheckSummary {
total_baseline: number;
total_current: number;
new: number;
fixed: number;
unchanged: number;
new_by_severity: { high: number; medium: number; low: number };
}
ValueA new finding fails when its severity is …
"low"low, medium, or high
"medium"medium or high (default)
"high"high only

failed is the AND of “at least one new finding” and “the worst new severity meets the threshold”. fixed_findings and unchanged_findings never influence failed — only forward debt blocks CI.

ExitMeaning
0failed: false — no new findings at or above fail_on.
1failed: true — at least one new finding at or above fail_on.
2Usage / environment error — missing baseline, malformed baseline, bad flags.

The JSON output is produced on stdout for exit 0 and 1. Exit 2 writes a single human-readable error line to stderr and emits no JSON.

Why fixed_findings is BaselineEntry[], not Finding[]

Section titled “Why fixed_findings is BaselineEntry[], not Finding[]”

Unlike DiffReport, where both refs are scanned and the full Finding shape is available on both sides, the baseline file only stores the trimmed BaselineEntry per finding. A finding can be reported as “fixed” even when the offending code has been deleted entirely — at which point the original lines, evidence, and scores no longer make sense to serialise.


crimes verdict --format json emits a single JSON document — the VerdictReport. It is built on top of crimes diff (same archive-into-temp machinery, same fingerprint-based matching) and adds a single headline verdict plus reasons / recommended_actions strings on top.

interface VerdictReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal "verdict". */
report_type: "verdict";
repo: { name: string; root: string };
/** Base ref the verdict resolved (explicit `--base`, else default). */
base: string;
/** Head ref the verdict resolved (typically "HEAD"). */
head: string;
/** Headline judgement — one of four enum values. */
verdict: "cleaner" | "worse" | "unchanged" | "mixed";
summary: VerdictSummary;
/** Short, machine-friendly reasons that drove the verdict. */
reasons: string[];
/** Short, human-readable next-step suggestions. */
recommended_actions: string[];
/** Findings present at `head` but not at `base`. Same shape as ScanReport. */
new_findings: Finding[];
/** Findings present at `base` but not at `head`. Same shape as ScanReport. */
fixed_findings: Finding[];
}
interface VerdictSummary {
new: number;
fixed: number;
/** Findings present at both refs. Carried through from the underlying diff. */
unchanged: number;
new_by_severity: { high: number; medium: number; low: number };
fixed_by_severity: { high: number; medium: number; low: number };
/** Σ SEVERITY_WEIGHT over `new_findings`. */
new_weighted: number;
/** Σ SEVERITY_WEIGHT over `fixed_findings`. */
fixed_weighted: number;
}

When --base is omitted, crimes verdict picks the first of these refs that resolves:

  1. origin/main
  2. main

If neither resolves, the command exits 2 with a “no default base” error on stderr asking the user to pass --base <ref> explicitly. No JSON is emitted.

Severity weights: high = 3, medium = 2, low = 1. Treat the verdict as an ordinal signal — the weights may change between minor releases (same contract as the per-finding scores.* fields).

Judgement rules, in order:

  1. unchanged — no new findings AND no fixed findings.
  2. worse — any new finding has severity: "high". (A new high is not offset by fixing other highs — it still flips the verdict.)
  3. worsesummary.new_weighted > summary.fixed_weighted (no new high required).
  4. cleanersummary.fixed_weighted > summary.new_weighted AND no new high findings.
  5. mixed — both sides have at least one finding and weighted scores are equal.

reasons is a short array of human-readable strings — same content the human renderer prints on the Reason: line. recommended_actions is deterministic and keyed off the verdict (e.g. “fix new high-severity findings before merging.”, “ship it — this branch removes more crime weight than it adds.”). Treat both as advisory copy — wording may shift across minor releases.

Same stable <type>::<file>::<symbol-or-empty> fingerprint as crimes diff. Same known limitations apply — file renames register as a fix + new pair, identical-name nested helpers collide on one fingerprint.

crimes verdict is advisory by default — it always exits 0 regardless of the verdict, so agents and humans can read it without breaking automation. Opt into a blocking gate with --fail-on:

--fail-onExit 1 when …
(omitted)Never. Always exit 0.
worseverdict === "worse".
new-highAny new finding has severity: "high".
new-mediumAny new finding has severity: "medium" or "high".

Exit 2 is reserved for usage / environment errors:

  • Not a git repository.
  • No default base resolves and no --base was passed.
  • An explicit --base <ref> cannot be resolved.
  • A bad --format or --fail-on flag.

The JSON output is produced on stdout for exit 0 and 1. Exit 2 writes a single human-readable error line to stderr and emits no JSON.


Long-form rationale for a single finding. Resolves either a per-scan id (crime_00005) or a stable fingerprint (<type>::<file>::<symbol>). Deterministic — same paragraph per detector type, no LLM, no per-finding tailoring.

interface ExplainReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal `"explain"`. */
report_type: "explain";
/** The matched finding, verbatim from the scan it came from. */
finding: Finding;
detector: {
/** Same string as `finding.type`. */
type: string;
/** Same string as `finding.charge`. */
charge: string;
/** One-line description of what the detector looks for. */
description: string;
};
/** One-paragraph rationale for why this kind of finding matters. */
why_it_matters: string;
/** Ranked, human-readable remediation paths for this finding. */
likely_remedies: string[];
/**
* Verbatim shell line that would suppress this finding. Always
* starts with `crimes ignore <fingerprint> --reason ` and ends with
* the placeholder `"<one-sentence justification>"`.
*/
suggested_suppression_command: string;
}

crimes explain does not exit non-zero unless the input id/fingerprint fails to resolve (exit 2).


Suppressions (on-disk shape of .crimes/suppressions.json)

Section titled “Suppressions (on-disk shape of .crimes/suppressions.json)”

Hand-reviewable list of per-finding exceptions. Written by crimes ignore, intended to be committed. Matched findings are filtered out of every report’s default view; --show-suppressed re-surfaces them annotated.

interface Suppressions {
schema_version: "0.1.0";
/** Discriminator. Always the literal `"suppressions"`. */
report_type: "suppressions";
/** ISO-8601 timestamp at which the file was first written. */
created_at: string;
/** ISO-8601 timestamp at which the file was last modified. */
updated_at: string;
/** Version of `crimes` that wrote the file last. Informational. */
crimes_version?: string;
suppressions: SuppressionEntry[];
}
interface SuppressionEntry {
/** Stable `<type>::<file>::<symbol>` identity. Required. */
fingerprint: string;
/** Denormalised — same as the type segment of `fingerprint`. */
type: string;
/** Denormalised — same as the file segment of `fingerprint`. */
file?: string;
/** Denormalised — same as the symbol segment of `fingerprint`. */
symbol?: string;
/** Required, non-empty. The team's justification. */
reason: string;
/** ISO-8601 timestamp at which this entry was first written. */
created_at: string;
/** Optional. Default from `git config user.email` when available. */
created_by?: string;
/**
* Origin of this suppression. Defaults to `"manual"` when absent
* (i.e. the 0.5.0 / 0.6.0 file shape). Feedback entries participate
* in the 0.7.0 auto-resurface loop; manual entries never resurface.
*/
source?: "manual" | "feedback";
/**
* The crimes minor this suppression was recorded against, e.g.
* `"0.7"`. Only meaningful when `source === "feedback"`. On scans
* whose minor differs from the pinned value, the matching finding
* resurfaces tagged `previously_suppressed: true`.
*/
crimes_version_pinned?: string;
}

The denormalised type / file / symbol fields are redundant for matching (only fingerprint drives it) but are load-bearing for human review: a reviewer scanning git diff .crimes/suppressions.json reads the entry without parsing the fingerprint.

The file is rewritten in full by crimes ignore — pretty-printed with 2-space indent and a trailing newline so the diff is reviewable. Re-suppressing the same fingerprint updates reason and the top-level updated_at; the entry’s created_at is preserved. crimes unignore removes an entry by fingerprint and bumps updated_at; the file is never deleted (an empty suppressions: [] stays so the frame is visible).


AuditSuppressionsReport (output of crimes audit-suppressions)

Section titled “AuditSuppressionsReport (output of crimes audit-suppressions)”

Lists every entry in .crimes/suppressions.json with per-entry age and concerns. Sorted oldest first. A missing file is not an error — the report sets loaded: false and entries: []; a present-but- malformed file exits 2 from the CLI with no JSON output.

interface AuditSuppressionsReport {
schema_version: "0.1.0";
/** Discriminator. Always the literal `"audit_suppressions"`. */
report_type: "audit_suppressions";
/** Absolute path of the suppressions file (read or not). */
suppressions_path: string;
/** True when the file existed and was read; false on an empty/missing file. */
loaded: boolean;
/** ISO-8601 timestamp the audit ran. Drives `age_days`. */
generated_at: string;
/** Total entries (clean + flagged). */
total: number;
/** Number of entries with at least one concern. */
flagged_count: number;
/** Every entry, sorted oldest first. */
entries: AuditSuppressionEntry[];
}
interface AuditSuppressionEntry extends SuppressionEntry {
/** Whole-number days between `created_at` and `generated_at`. */
age_days: number;
/** Empty for clean entries; one or more of the concerns below. */
concerns: ("stale" | "short_reason" | "vague_reason")[];
}

Concern semantics:

ConcernMeaning
"stale"age_days > 180.
"short_reason"reason.trim().length < 16.
"vague_reason"The reason starts with a deferral keyword (tmp, todo, wip, fixme, noisy, legacy, later, skip, ignore) or matches too noisy / we know …. Only set when the reason is not already flagged as short.

The thresholds are fixed in this release. JSON consumers that want different rules can re-filter the entries array using the raw age_days and reason fields.


Every report that lists findings (ScanReport, ContextReport, BaselineCheckReport, DiffReport, VerdictReport) carries an optional suppressed_count?: number field. Present only when ≥1 entry in .crimes/suppressions.json matched a finding in this invocation. Absent otherwise — JSON consumers should treat absent as equivalent to “no suppressions configured”.

The per-finding annotations only appear when --show-suppressed is set:

  • Finding.suppressed?: true — flags an entry that would otherwise be filtered out.
  • Finding.suppression_reason?: string — the reason recorded in the suppressions file.

Gate semantics are independent of display: findings with suppressed === true never trip a --fail-on gate on any command, whether or not --show-suppressed is on.


FeedbackReport (output of crimes feedback list / summary / export)

Section titled “FeedbackReport (output of crimes feedback list / summary / export)”

Per-repo or global rollup view of the captured feedback JSONL. scope: "repo" reads .crimes/feedback.jsonl; scope: "global" reads ~/.crimes/feedback-rollup.jsonl (which carries a repo field per entry). Emitted by --format json from crimes feedback list / summary / export. recheck has its own shape (feedback_recheck) listed below.

interface FeedbackReport {
schema_version: "0.1.0";
report_type: "feedback";
scope: "repo" | "global";
/** Absolute path of the JSONL file read. */
source_file: string;
entries: FeedbackEntry[];
/** Aggregate roll-up. Always present from `summary`; optional from `list`/`export`. */
summary?: FeedbackSummary;
}
interface FeedbackEntry {
/** ISO 8601 timestamp of when the verdict was recorded. */
timestamp: string;
/** Full semver of the crimes version that produced the finding. */
crimes_version: string;
/** Stable `<type>::<file>::<symbol>` fingerprint — primary identity. */
fingerprint: string;
/** Convenience denormalisation of the detector id. */
finding_type: string;
verdict: "tp" | "fp" | "known";
/** Required when verdict is "fp" (it becomes the suppression reason). */
note: string | null;
/** sha256 of the scan JSON when `crimes feedback ... --file` was used. */
scan_hash: string | null;
/** Prior minor when this entry re-confirms / resolves a resurfaced fp. */
resurfaced_from: string | null;
/** Only present in the global rollup — absolute repo path. */
repo?: string;
}
interface FeedbackSummary {
total: number;
by_verdict: { tp: number; fp: number; known: number };
by_detector: Record<string, { tp: number; fp: number; known: number }>;
by_version: Record<string, number>;
/** Only present in global-rollup summaries. */
by_repo?: Record<string, number>;
}

crimes feedback recheck --format json emits a sibling feedback_recheck report:

interface FeedbackRecheckReport {
schema_version: "0.1.0";
report_type: "feedback_recheck";
current_version: string; // e.g. "0.7.0"
current_minor: string; // e.g. "0.7"
resurfaced: Array<{
fingerprint: string;
type: string;
file?: string;
symbol?: string;
reason: string;
crimes_version_pinned: string;
/** Per-detector release-notes hint, or the generic fallback. */
hint: string;
/** Verbatim re-feedback commands the user can copy. */
commands: {
reconfirm_fp: string;
mark_resolved: string;
};
}>;
}

Every report that lists findings (ScanReport, ContextReport, DiffReport, BaselineCheckReport) can carry per-finding resurface annotations in 0.7.0+ when a feedback-sourced suppression’s pinned minor differs from the current crimes minor:

  • Finding.previously_suppressed?: true — set on every resurfaced finding. The finding is kept in findings[] (unlike suppressed, which is only kept when --show-suppressed is on), and is not counted in suppressed_count.
  • Finding.previous_suppression?: { pinned_version, reason } — paired with previously_suppressed. Carries the prior pin + the original feedback note.

Consumers can detect resurfaced findings without reading the suppressions file by walking findings[] and filtering on previously_suppressed === true. Counting them is what powers the “5 feedback-sourced suppressions resurface because they were pinned to 0.6” stderr breadcrumb the CLI prints on every scan after a minor bump.


Within a single schema_version:

  • Field names in the JSON output never change.
  • Field types never change.
  • A required field never becomes optional, or vice versa.
  • New detector type values may be added without bumping the schema — consumers should treat unknown types defensively.
  • New kind values for suggested_actions may be added without bumping.
  • The numeric formulas behind severity, confidence, and agent_risk scores may change between minor releases. Treat scores as ordinal signals, not exact measurements.

Breaking changes bump schema_version and are called out in release notes.