Campaign v0.6.0.0-final-r23 FAIL

ai-memory ref: release/v0.6.0
Completed at: 2026-04-20T17:20:10Z
Overall result: FAIL

Run focus

The v0.6.0 tag candidate — stripped back to what definitely works

What this campaign set out to test: Full four-phase protocol against release/v0.6.0 at commit 710ad76 (post PR #316 revert). Phase 4 runs `kill_primary_mid_write` only — the canonical primary-crash disaster scenario. partition_minority moved to opt-in; the aggressive federation client settings reverted; the cycles_by_fault capture reverted. Nothing new under test except the stripped-back baseline.

What it demonstrated: To be determined. Expected outcome: Phases 1, 2, 3 green (consistent since r15), Phase 4 green (the kill_primary_mid_write guarantee we already demonstrated at 1.0 in r19 and r20). Unexpected outcome: any phase red would mean a new regression on release/v0.6.0 that the r15-r22 campaigns didn't surface, which would need urgent investigation before tagging.

Detailed tri-audience analysis is below, followed by per-phase test results for all four phases of the protocol — including any phase that did not run in this campaign.

AI NHI analysis · Claude Opus 4.7

In flight at the time of this narrative. Every experimental addition from the r19-r22 arc has been removed. If this run comes back full green, v0.6.0 tags and the release pipeline fires. If it doesn't, we have scoped problems to investigate rather than compounded ones.

For three audiences

Non-technical end users

This is the release gate itself. Every fix we've landed has six weeks of engineering behind it, and this run is the final pre-release health check. If it comes back green, the release goes out today. If not, we investigate calmly and without pressure: users on the current pre-v0.6.0 release aren't affected by waiting, and we've earned the right to hold a release until we're sure.

C-level decision makers

Release gate. Clean-pass outcome: v0.6.0 tags, release pipeline auto-fires across crates.io, Homebrew, Ubuntu PPA, Fedora COPR, GHCR Docker, GitHub Release. Users pick up PR #309's silent-data-loss fix plus PR #310's chaos source allowlist plus PR #312's per-cycle harness infrastructure within a handful of hours. Customer impact of holding if this run fails: zero — existing users aren't affected by a tag delay, and shipping a regression would be worse than shipping one day later.

Engineers & architects

release/v0.6.0 tip = 710ad76. PR #316 merged (revert of aggressive client settings). Phase 4 FAULTS default narrowed to kill_primary_mid_write in ship-gate commit ac7e87a. phase4_chaos.sh cycles_by_fault capture reverted in a99bb3b. Net effect for this run vs r20 (last known-good baseline): identical ai-memory code (PR #309 + PR #310 federation detach + chaos allowlist still in place), slimmer chaos campaign (one fault class instead of two), cleaner phase4_chaos.sh with only the metric-extraction code path. Any failure is a regression we didn't see in r15-r20, not a consequence of anything we added recently.

What changed going into the next campaign

If r23 passes: tag v0.6.0, fire release pipeline, write post-release retro. If r23 fails: dispatch diagnostic r24 with `CHAOS_CYCLES=5` for fast iteration while we debug.

Phase 1 — functional (per-node) PASS

What this phase proves: Single-node CRUD, backup, curator dry-run, and MCP handshake on each of the three peer droplets. Establishes that ai-memory starts and is functional at the one-node level before federation is exercised.

Test results

node-a

✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memories
✓ Recall returned ≥ 1 hit — 1 hits
✓ Backup snapshot file emitted — 1 snapshot(s)
✓ Backup manifest file emitted — 1 manifest(s)
✓ MCP handshake advertises ≥ 30 tools — 36 tools
✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 errors
✓ Overall phase-1 pass flag

node-b

✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memories
✓ Recall returned ≥ 1 hit — 1 hits
✓ Backup snapshot file emitted — 1 snapshot(s)
✓ Backup manifest file emitted — 1 manifest(s)
✓ MCP handshake advertises ≥ 30 tools — 36 tools
✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 errors
✓ Overall phase-1 pass flag

node-c

✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memories
✓ Recall returned ≥ 1 hit — 1 hits
✓ Backup snapshot file emitted — 1 snapshot(s)
✓ Backup manifest file emitted — 1 manifest(s)
✓ MCP handshake advertises ≥ 30 tools — 36 tools
✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 errors
✓ Overall phase-1 pass flag
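
The per-node checklist above can be reproduced from a phase-1 report with a few comparisons. The following is a minimal sketch, not the harness's actual code: the function name `check_phase1` and the set of accepted curator errors are assumptions, with the accepted error string taken from the raw evidence in this report.

```python
# Assumption: the only curator error tolerated in a dry run is the one
# that appears in this campaign's raw evidence.
ACCEPTED_CURATOR_ERRORS = {"no LLM client configured"}
MIN_MCP_TOOLS = 30  # from the "advertises >= 30 tools" checklist item


def check_phase1(report):
    """Map each checklist item to True/False for one node's phase-1 report."""
    curator_errors = report["curator"]["errors"]
    return {
        "stats_total": report["stats"]["total"] >= 1,
        "recall_hits": report["recall_count"] >= 1,
        "snapshot_emitted": report["snapshot_count"] >= 1,
        "manifest_emitted": report["manifest_count"] >= 1,
        "mcp_tools": report["mcp_tool_count"] >= MIN_MCP_TOOLS,
        "curator_clean": all(e in ACCEPTED_CURATOR_ERRORS for e in curator_errors),
        "pass_flag": report["pass"] is True,
    }


# Trimmed copy of the node-a values reported in this campaign.
node_a = {
    "pass": True,
    "stats": {"total": 1},
    "curator": {"errors": ["no LLM client configured"]},
    "mcp_tool_count": 36,
    "recall_count": 1,
    "snapshot_count": 1,
    "manifest_count": 1,
}

results = check_phase1(node_a)
print(all(results.values()))
```

Under these assumptions, a curator error other than the accepted one would flip `curator_clean` to False, which matches the checklist reporting "1 errors" alongside a passing mark.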

Raw evidence

phase1-node-a

{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-r23-node-a",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T16:19:50.210158907+00:00",
		"completed_at": "2026-04-20T16:19:50.210695531+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

phase1-node-b

{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-r23-node-b",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T16:19:50.257439108+00:00",
		"completed_at": "2026-04-20T16:19:50.257873397+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

phase1-node-c

{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-r23-node-c",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T16:19:50.332318979+00:00",
		"completed_at": "2026-04-20T16:19:50.332848332+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

Phase 2 — multi-agent federation PASS

What this phase proves: 4 agents × 50 writes against the 3-node federation with W=2 quorum, then a 90s settle and a convergence count on every peer, plus two quorum probes (one-peer-down must return 201, both-peers-down must return 503). Catches silent-data-loss and quorum-misclassification regressions.

Test results

✓ Burst writes returned 201 — ok=200/200 (qnm=0, fail=0)
✓ node-A convergence ≥ 95% of ok — a=200 / threshold 190
✓ node-B convergence ≥ 95% of ok — b=200 / threshold 190
✓ node-C convergence ≥ 95% of ok — c=200 / threshold 190
✓ Probe 1: one peer down → 201 (quorum met via remaining peer) — got 201
✓ Probe 2: both peers down → 503 (quorum_not_met) — got 503
✓ Overall phase-2 pass flag
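
The convergence threshold in the results above is 95% of the acknowledged writes (190 of 200). A minimal sketch of that arithmetic and the two probe checks follows, assuming the field names from the phase-2 raw evidence. The function name `check_phase2` is hypothetical, and treating "all burst writes ok" as a pass condition is an assumption about the harness.

```python
def check_phase2(report, ratio_percent=95):
    """Evaluate the phase-2 checklist against a summary dict."""
    ok = report["ok"]
    checks = {
        # Assumption: the burst check requires every write to be acknowledged.
        "burst_all_ok": ok == report["total_writes"] and report["fail"] == 0,
    }
    for node, count in report["counts"].items():
        # Integer comparison avoids floating-point rounding on the 95% bound.
        checks[f"node_{node}_converged"] = count * 100 >= ok * ratio_percent
    checks["probe1_one_peer_down_201"] = report["probe1_single_peer_down"] == "201"
    checks["probe2_both_peers_down_503"] = report["probe2_both_peers_down"] == "503"
    return checks


# Values copied from this campaign's phase-2 summary.
phase2 = {
    "total_writes": 200,
    "ok": 200,
    "quorum_not_met": 0,
    "fail": 0,
    "counts": {"a": 200, "b": 200, "c": 200},
    "probe1_single_peer_down": "201",
    "probe2_both_peers_down": "503",
}

print(all(check_phase2(phase2).values()))
```

With ok=200, a peer holding 190 memories sits exactly on the threshold and passes, while 189 would fail.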

Raw evidence

phase2

{
	"phase": 2,
	"pass": true,
	"total_writes": 200,
	"ok": 200,
	"quorum_not_met": 0,
	"fail": 0,
	"counts": {
		"a": 200,
		"b": 200,
		"c": 200
	},
	"probe1_single_peer_down": "201",
	"probe2_both_peers_down": "503",
	"reasons": [
		""
	]
}

Phase 3 — cross-backend migration PASS

What this phase proves: 1000-memory round-trip: SQLite → Postgres, re-run for idempotency, Postgres → SQLite. Asserts zero errors and counts match. Catches migration-correctness regressions in either direction of a production upgrade path.

Test results

✓ Source SQLite has 1000 seed memories — src_count=1000
✓ Destination after reverse roundtrip has 1000 memories — dst_count=1000
✓ Forward migration SQLite → Postgres: errors=0 — errors=0
✓ Idempotent re-run is a no-op — writes=1000
✓ Reverse migration Postgres → SQLite: errors=0 — errors=0
✓ Overall phase-3 pass flag
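
The round-trip assertions above reduce to "no errors in any direction, and the counts match". Here is a minimal sketch over the phase-3 summary fields, assuming the key names from the raw evidence. The function name `check_phase3` is hypothetical, and because this report does not show how the harness decides that the re-run is a no-op, the sketch checks only the re-run's error list.

```python
def check_phase3(report, expected=1000):
    """Evaluate the migration round-trip checks against a summary dict."""
    return {
        "source_seeded": report["src_count"] == expected,
        "roundtrip_count_matches": report["dst_count"] == report["src_count"],
        "forward_clean": report["report_forward"]["errors"] == [],
        # Assumption: only the error list of the re-run is inspected here.
        "idempotent_clean": report["report_idempotent"]["errors"] == [],
        "reverse_clean": report["report_reverse"]["errors"] == [],
    }


# Trimmed copy of this campaign's phase-3 summary.
phase3 = {
    "report_forward": {"errors": [], "memories_read": 1000, "memories_written": 1000},
    "report_idempotent": {"errors": [], "memories_read": 1000, "memories_written": 1000},
    "report_reverse": {"errors": [], "memories_read": 1000, "memories_written": 1000},
    "src_count": 1000,
    "dst_count": 1000,
}

print(all(check_phase3(phase3).values()))
```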

Raw evidence

phase3

{
	"phase": 3,
	"pass": true,
	"report_forward": {
		"batches": 1,
		"dry_run": false,
		"errors": [],
		"from_url": "sqlite:///tmp/phase3-source.db",
		"memories_read": 1000,
		"memories_written": 1000,
		"to_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test"
	},
	"report_idempotent": {
		"batches": 1,
		"dry_run": false,
		"errors": [],
		"from_url": "sqlite:///tmp/phase3-source.db",
		"memories_read": 1000,
		"memories_written": 1000,
		"to_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test"
	},
	"report_reverse": {
		"batches": 1,
		"dry_run": false,
		"errors": [],
		"from_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test",
		"memories_read": 1000,
		"memories_written": 1000,
		"to_url": "sqlite:///tmp/phase3-roundtrip.db"
	},
	"src_count": 1000,
	"dst_count": 1000,
	"reasons": [
		""
	]
}

Phase 4 — chaos campaign FAIL

What this phase proves: Runs packaging/chaos/run-chaos.sh on the chaos-client droplet with 50 cycles × 100 writes per fault class. Measures convergence_bound = min(count_node1, count_node2) / total_ok. Catches fault-tolerance regressions under SIGKILL of the primary, brief network partition, and related fault models.
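
As a worked illustration of the convergence_bound formula, the sketch below computes the metric and compares it with the 0.995 gate listed in the test results. The counts are hypothetical, and the report does not state whether the harness evaluates the bound per cycle or over the whole campaign.

```python
def convergence_bound(count_node1, count_node2, total_ok):
    """min(count_node1, count_node2) / total_ok, as defined in this report."""
    if total_ok <= 0:
        raise ValueError("total_ok must be positive")
    return min(count_node1, count_node2) / total_ok


def meets_gate(bound, threshold=0.995):
    """True when the bound satisfies the per-fault gate."""
    return bound >= threshold


# Hypothetical counts for 100 acknowledged writes.
full = convergence_bound(100, 100, 100)
one_lost = convergence_bound(100, 99, 100)
print(full, meets_gate(full))          # 1.0 True
print(one_lost, meets_gate(one_lost))  # 0.99 False
```

Because the metric takes the minimum across the two nodes, a single lagging peer drags the bound down even if the other holds every write.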

Test results

✗ phase4.json did not parse as JSON: the chaos-harness summary was never written cleanly (see raw JSON below)
✗ Per-fault convergence_bound ≥ 0.995 — metric unavailable
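
The two failures above are linked: when the summary file does not parse, the convergence metric cannot be read at all. The sketch below shows one way a report generator might separate "unparseable" from "metric below threshold". The function names and the `convergence_bound` key are hypothetical, since the phase-4 summary schema does not appear in this report.

```python
import json


def load_phase4_summary(text):
    """Return the parsed summary dict, or None if text is not a JSON object."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


def phase4_verdict(text, threshold=0.995):
    """Report whether the summary parsed and, if so, whether the gate holds."""
    summary = load_phase4_summary(text)
    if summary is None:
        return {"parsed": False, "convergence_ok": None}
    # Assumption: the metric is stored under a top-level "convergence_bound" key.
    bound = summary.get("convergence_bound")
    if not isinstance(bound, (int, float)):
        return {"parsed": True, "convergence_ok": None}
    return {"parsed": True, "convergence_ok": bound >= threshold}


# An empty or truncated file yields "metric unavailable" rather than a crash.
print(phase4_verdict(""))
print(phase4_verdict('{"phase": 4, '))
```

Returning None for the metric keeps "unavailable" distinct from a genuine failing value.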

Raw evidence

phase4

raw JSON

All artifacts

Every JSON committed to this campaign directory. Raw, machine-readable, and stable.

Generated by scripts/generate_run_html.sh.
Campaign directory: alphaonedev/ai-memory-ship-gate/runs/v0.6.0.0-final-r23
Methodology: alphaonedev.github.io/ai-memory-ship-gate/methodology
Analysis data source: analysis/run-insights.json