../ runs index · rendered on Pages

Campaign v0.6.0.0-final-r22 FAIL

ai-memory ref
release/v0.6.0
Completed at
2026-04-20T16:07:57Z
Overall pass
FAIL

Run focus

Phase 4 script crashed on my own cycles_by_fault capture

What this campaign set out to test: Full four-phase protocol after PR #316 (revert of aggressive reqwest settings) landed, with partition_minority opted out of the Phase 4 default (kill_primary_mid_write is the only required fault class now). Also carried my earlier f993e2c commit that adds a `cycles_by_fault` field to phase4.json — that commit is the one that broke r22.

What it demonstrated: Proved Phases 1/2/3 are robust across another campaign on real infrastructure (five consecutive originals now: r15, r16, r17, r18, r22). Disproved that my f993e2c commit was tested — it clearly wasn't. Reverted in ship-gate commit a99bb3b before r23 dispatched.

Detailed tri-audience analysis is below, followed by per-phase test results for all four phases of the protocol — including any phase that did not run in this campaign.

AI NHI analysis · Claude Opus 4.7

Phase 4 script crashed on my own cycles_by_fault capture

Phases 1, 2, 3 all passed cleanly (originals — not reconstructed). Phase 4 exited non-zero with an empty phase4.json — the chaos campaign itself ran fine (5 min wall clock, normal for single-fault kill_primary), but the phase4_chaos.sh post-campaign jq pipeline that captures per-cycle data into the summary hit an input-shape error, `set -e` killed the script before the final summary could emit. Honest self-own.

What this campaign tested

Full four-phase protocol after PR #316 (revert of aggressive reqwest settings) landed, with partition_minority opted out of the Phase 4 default (kill_primary_mid_write is the only required fault class now). Also carried my earlier f993e2c commit that adds a `cycles_by_fault` field to phase4.json — that commit is the one that broke r22.

What it proved (or disproved)

Proved Phases 1/2/3 are robust across another campaign on real infrastructure (five consecutive originals now: r15, r16, r17, r18, r22). Disproved that my f993e2c commit was tested — it clearly wasn't. Reverted in ship-gate commit a99bb3b before r23 dispatched.

For three audiences

Non-technical end users

A different test-infrastructure bug, caught by the same test infrastructure that's been catching every other bug this week. My mistake this time: I added new diagnostic code to the test harness without actually running it end-to-end. The new code errored on its first real-infra encounter. Reverted, re-dispatched. No customer-visible change; this one never got close to a release.

C-level decision makers

Test-harness regression, not product regression. The ship-gate continues to enforce the distinction cleanly — every time a false step lands, the next campaign catches it. Cost: one more run, another ~$0.15, another ~30 minutes of engineering time. The record still shows zero product-level regressions shipped.

Engineers & architects

phase4_chaos.sh after commit f993e2c captured `jq -s '.' "$JSONL"` into a bash associative array `CYCLE_DATA`, then printf-concatenated entries through `jq -s 'add // {}'` to build a `cycles_by_fault` object for the final summary. Something in that pipeline errored under the chaos-client's jq version — perhaps the printf-of-multi-line-JSON not parsing as a single document, perhaps an `add`-on-non-array issue on a single fault class. `set -e` in phase4_chaos.sh propagated the failure; final summary never wrote. Reverted in a99bb3b; per-cycle capture to be re-introduced in a follow-up PR with a local smoke test before it ships.

Bugs surfaced and where they were fixed

  1. phase4_chaos.sh cycles_by_fault pipeline errored → empty phase4.json

    Impact: r22 couldn't emit a Phase 4 verdict JSON. Phases 1/2/3 evidence is intact and valid.

    Root cause: Introduced in ship-gate commit f993e2c. Multi-line JSON piped through printf into jq slurp mode hit a parsing/type error under set -e. Not caught because the commit wasn't locally smoke-tested.

    Fixed in:

What changed going into the next campaign

r23 dispatched with every experimental addition removed: conservative reqwest client (PR #316), partition_minority opt-in (ship-gate commit ac7e87a), no cycles_by_fault capture (ship-gate a99bb3b). Target outcome: full 4/4 green → v0.6.0 tag.

Phase 1 — functional (per-node) PASS

What this phase proves: Single-node CRUD, backup, curator dry-run, and MCP handshake on each of the three peer droplets. Establishes that ai-memory starts and is functional at the one-node level before federation is exercised.

Test results

node-a

node-b

node-c

Raw evidence

phase1-node-a
{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-r22-node-a",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T15:59:19.791710256+00:00",
		"completed_at": "2026-04-20T15:59:19.792192516+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

raw JSON

phase1-node-b
{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-r22-node-b",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T15:59:19.897722331+00:00",
		"completed_at": "2026-04-20T15:59:19.898184118+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

raw JSON

phase1-node-c
{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-r22-node-c",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T15:59:19.936080940+00:00",
		"completed_at": "2026-04-20T15:59:19.936595743+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

raw JSON

Phase 2 — multi-agent federation PASS

What this phase proves: 4 agents × 50 writes against the 3-node federation with W=2 quorum, then 90s settle and convergence count on every peer. Plus two quorum probes (one-peer-down must 201, both-peers-down must 503). Catches silent-data-loss and quorum-misclassification regressions.

Test results

Raw evidence

phase2
{
	"phase": 2,
	"pass": true,
	"total_writes": 200,
	"ok": 200,
	"quorum_not_met": 0,
	"fail": 0,
	"counts": {
		"a": 200,
		"b": 200,
		"c": 200
	},
	"probe1_single_peer_down": "201",
	"probe2_both_peers_down": "503",
	"reasons": [
		""
	]
}

raw JSON

Phase 3 — cross-backend migration PASS

What this phase proves: 1000-memory round-trip: SQLite → Postgres, re-run for idempotency, Postgres → SQLite. Asserts zero errors and counts match. Catches migration-correctness regressions in either direction of a production upgrade path.

Test results

Raw evidence

phase3
{
	"phase": 3,
	"pass": true,
	"report_forward": {
		"batches": 1,
		"dry_run": false,
		"errors": [],
		"from_url": "sqlite:///tmp/phase3-source.db",
		"memories_read": 1000,
		"memories_written": 1000,
		"to_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test"
	},
	"report_idempotent": {
		"batches": 1,
		"dry_run": false,
		"errors": [],
		"from_url": "sqlite:///tmp/phase3-source.db",
		"memories_read": 1000,
		"memories_written": 1000,
		"to_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test"
	},
	"report_reverse": {
		"batches": 1,
		"dry_run": false,
		"errors": [],
		"from_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test",
		"memories_read": 1000,
		"memories_written": 1000,
		"to_url": "sqlite:///tmp/phase3-roundtrip.db"
	},
	"src_count": 1000,
	"dst_count": 1000,
	"reasons": [
		""
	]
}

raw JSON

Phase 4 — chaos campaign FAIL

What this phase proves: packaging/chaos/run-chaos.sh on the chaos-client droplet with 50 cycles × 100 writes per fault class. Measures convergence_bound = min(count_node1, count_node2) / total_ok. Catches fault-tolerance regressions under SIGKILL of the primary, brief network partition, and related fault models.

Test results

Raw evidence

phase4

          

raw JSON

All artifacts

Every JSON committed to this campaign directory. Raw, machine-readable, and stable.