
Forensic audit per-run matrix

Every campaign run that produced a phase4-analysis.json lands here with its audit_forensics block rendered: per-node chain heads, line counts, tamper detection per node, Phase 3 op-to-audit match rate, and forged-provenance detection rate. One row per run, sorted newest-first.

Each row carries:

  • Run ID: the campaign directory under runs/.
  • Per-node chain heads: SHA-256 of the latest audit entry on each node, truncated to 16 hex characters for display. An empty cell means the audit log is missing or empty for that node.
  • Per-node line counts: total audit entries on each node.
  • Tamper detection: the substrate-canary verdict from S26, applied to every node. S26 runs on node-1 only; the verdict is replicated to the other nodes because the audit substrate is uniform across the v0.6.3.1 mesh.
  • Op→audit match rate: phase3_writes_matched / phase3_writes_total across every Phase 3 record's ai_memory_ops. This is the auditor's primary forensic-reproducibility metric.
  • Forged-provenance detection rate: scenario_j_runs_detected / scenario_j_runs_total. Scenario J asks the receiver to detect a memory whose body lies about authorship; the audit log's stamped agent_id is the source of truth.
  • Legal admissibility summary: a deterministic prose summary that the meta-analyst computes from the audit-forensics block.
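
The chain-head and line-count columns can be illustrated with a minimal sketch. It assumes the audit log is a text file with one entry per non-blank line and that the head is the SHA-256 of the raw bytes of the last entry; neither detail is confirmed by this page, and the function names are hypothetical.

```python
import hashlib
from typing import Optional, Tuple


def chain_head_and_count(audit_log_text: str) -> Tuple[Optional[str], int]:
    """Return (SHA-256 hex digest of the latest entry, number of entries).

    Assumption: one audit entry per non-blank line. The real on-disk
    format of the audit log is not described on this page.
    """
    entries = [line for line in audit_log_text.splitlines() if line.strip()]
    if not entries:
        # Missing or empty log: no head to report, zero entries.
        return None, 0
    head = hashlib.sha256(entries[-1].encode("utf-8")).hexdigest()
    return head, len(entries)


def display_head(head: Optional[str]) -> str:
    """Render a head for the matrix: 16 hex characters, or an empty cell."""
    if head is None:
        return ""
    return head[:16] + "…"
```

Under these assumptions, a node whose log is missing or empty yields an empty cell and a line count of 0, which matches the empty-cell rule above.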

Rows where phase4-analysis.json is absent (older or interrupted runs) are omitted; their substrate verdict still renders on Campaign runs.
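
A rough sketch of how such a matrix could be assembled follows. It assumes that audit_forensics is a top-level key of phase4-analysis.json, that the four counters named above sit directly inside that block, and that "newest-first" can be approximated by file modification time. The helper names collect_rows and rate are hypothetical, not the generator's actual code.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def rate(numerator: int, denominator: int) -> float:
    """Ratio in [0, 1]; an empty denominator is reported as 0.0."""
    return numerator / denominator if denominator else 0.0


def collect_rows(runs_dir: Path) -> List[Dict[str, Any]]:
    """Build one row per run that has a phase4-analysis.json."""
    rows: List[Dict[str, Any]] = []
    for run_dir in runs_dir.iterdir():
        if not run_dir.is_dir():
            continue
        analysis = run_dir / "phase4-analysis.json"
        if not analysis.is_file():
            continue  # older or interrupted run: omitted from the matrix
        data = json.loads(analysis.read_text(encoding="utf-8"))
        block = data.get("audit_forensics")  # assumed top-level key
        if block is None:
            continue
        rows.append({
            "run_id": run_dir.name,
            "mtime": analysis.stat().st_mtime,  # assumed sort key
            "op_to_audit": rate(block.get("phase3_writes_matched", 0),
                                block.get("phase3_writes_total", 0)),
            "forged_prov": rate(block.get("scenario_j_runs_detected", 0),
                                block.get("scenario_j_runs_total", 0)),
        })
    rows.sort(key=lambda row: row["mtime"], reverse=True)  # newest first
    return rows
```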


Per-run forensics matrix

Run node-1 head node-2 head node-3 head node-4 head Lines (1/2/3/4) Op→audit Forged-prov Summary
a2a-hermes-v0.6.3.1-r16 b3cd8416fbce6869… 04d4557b0ca53c40… b5b2154dd1c8105d… 163/103/85/26 0% 0% (0/8) no Phase 3 write ops to correlate; chain head present on 3/4 nodes; tamper detection fired on the substrate canary (S26); append-only enforcement verified (S27); forged-provenance detection rate 0% (0/8 Scenario J runs).
a2a-hermes-v0.6.3.1-r15 803a594b0558bffb… a37fdcb1e345a043… fd60de332e481750… 163/103/85/26 0% 0% (0/8) no Phase 3 write ops to correlate; chain head present on 3/4 nodes; tamper detection fired on the substrate canary (S26); append-only enforcement verified (S27); forged-provenance detection rate 0% (0/8 Scenario J runs).
a2a-hermes-v0.6.3.1-r12 fcba21494362719e… 4c6701a4e4e5fdf4… e380e2906aeaf871… 163/103/85/26 0% 0% (0/8) no Phase 3 write ops to correlate; chain head present on 3/4 nodes; tamper detection fired on the substrate canary (S26); append-only enforcement verified (S27); forged-provenance detection rate 0% (0/8 Scenario J runs).
a2a-ironclaw-v0.6.3.1-r27 bbcc7a816cf41dfd… 0f3c813d5dce15e3… b12a6f92e3b95e4f… 161/103/85/26 0% 0% (0/8) no Phase 3 write ops to correlate; chain head present on 3/4 nodes; tamper detection fired on the substrate canary (S26); append-only enforcement verified (S27); forged-provenance detection rate 0% (0/8 Scenario J runs).
a2a-ironclaw-v0.6.3.1-r26 abccf2279b681969… c8afe9fa0eafc685… 093c3308aec0e14b… 163/103/85/26 0% 0% (0/8) no Phase 3 write ops to correlate; chain head present on 3/4 nodes; tamper detection fired on the substrate canary (S26); append-only enforcement verified (S27); forged-provenance detection rate 0% (0/8 Scenario J runs).
a2a-ironclaw-v0.6.3.1-r25 23b75b57449ea10a… acb65a8bf1d19100… fcc18579243019e7… 163/103/85/26 0% 0% (0/8) no Phase 3 write ops to correlate; chain head present on 3/4 nodes; tamper detection fired on the substrate canary (S26); append-only enforcement verified (S27); forged-provenance detection rate 0% (0/8 Scenario J runs).

Total runs with audit_forensics: 6


Reading the matrix

  • Op→audit match rate at 1.00 (displayed as a percentage in the matrix) means every Phase 3 NHI memory write has a 1:1 corresponding audit entry, the forensic-reproducibility property Scenario I tests.
  • A match rate strictly below 1.00 means at least one ai_memory_op did not land in the audit log, which is itself a high-severity finding (the audit hook silently skipped a write). The exception is a row whose summary reads "no Phase 3 write ops to correlate": there the 0% indicates that there were no writes to match, not that a write was skipped.
  • Tamper detection per node: verify_rc=0, ok=true on every node is the clean-chain baseline. Any non-zero rc on a node means the chain itself is in an unverifiable state; the test infrastructure must resolve that before the run's substrate verdict can be trusted.
  • Forged-provenance detection rate at 1.00 means every Scenario J run saw the receiver flag the body-vs-audit-log authorship discrepancy. A lower rate means the receiver agent failed to consult the audit log as the source of truth in at least one run.
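
The reading rules above can be expressed as a small check. This is a sketch only: the read_row function and the shape of node_verify (a mapping from node name to a dict holding verify_rc and ok) are assumptions made for illustration, and the finding strings are not the meta-analyst's actual wording.

```python
from typing import Any, Dict, List


def read_row(matched: int, total: int, j_detected: int, j_total: int,
             node_verify: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return human-readable findings for one matrix row."""
    findings: List[str] = []

    # Op→audit match rate: anything below 1.00 is a high-severity finding,
    # unless there were no Phase 3 writes to correlate at all.
    if total == 0:
        findings.append("no Phase 3 write ops to correlate")
    elif matched < total:
        findings.append(
            f"HIGH: {total - matched} ai_memory_op(s) missing from the audit log")

    # Forged-provenance detection: every Scenario J run should be detected.
    if j_total and j_detected < j_total:
        findings.append(
            f"forged provenance detected in only {j_detected}/{j_total} Scenario J runs")

    # Tamper detection: verify_rc=0 and ok=true is the clean-chain baseline.
    for node, result in sorted(node_verify.items()):
        if result.get("verify_rc") != 0 or result.get("ok") is not True:
            findings.append(f"{node}: chain unverifiable, substrate verdict not trusted")

    return findings
```

For example, read_row(0, 0, 0, 8, {"node-1": {"verify_rc": 0, "ok": True}}) reports the empty Phase 3 denominator and the missed Scenario J detections, but no chain problem.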

For the substrate-level property tests (S25/S26/S27), see the per-run substrate cell in Campaign runs. For the explainer on what the audit substrate guarantees, see Forensic audit trail.