../ runs index · rendered on Pages

Campaign v0.6.0.0-final FAIL

ai-memory ref
release/v0.6.0
Completed at
2026-04-20T03:43:41Z
Overall pass
FAIL

Run focus

First "final" attempt — baseline for regression tracking

What this campaign set out to test: Full four-phase gate on fresh DigitalOcean infrastructure against release/v0.6.0 tip. The first campaign to survive all four phases' orchestration without an infra-level abort.

What it demonstrated: The campaign workflow, Terraform, per-phase scripts, and aggregator are structurally coherent enough to execute top-to-bottom and produce JSON artifacts under the runs/ convention. It did NOT establish that any particular assertion (convergence, probe classification, migration correctness) passed under the then-current product state — several phases produced empty-or-malformed outputs that later became the subject of ship-gate harness fixes.

Detailed tri-audience analysis is below, followed by per-phase test results for all four phases of the protocol — including any phase that did not run in this campaign.

AI NHI analysis · Claude Opus 4.7

First "final" attempt — baseline for regression tracking

Ran the protocol end-to-end, produced artifacts, established the runs/ directory convention that every subsequent campaign adheres to. A working scaffold, not a passing release candidate.

What this campaign tested

Full four-phase gate on fresh DigitalOcean infrastructure against release/v0.6.0 tip. The first campaign to survive all four phases' orchestration without an infra-level abort.

What it proved (or disproved)

The campaign workflow, Terraform, per-phase scripts, and aggregator are structurally coherent enough to execute top-to-bottom and produce JSON artifacts under the runs/ convention. It did NOT establish that any particular assertion (convergence, probe classification, migration correctness) passed under the then-current product state — several phases produced empty-or-malformed outputs that later became the subject of ship-gate harness fixes.

For three audiences

Non-technical end users

This is the first time all four tests in the test battery at least managed to start and produce a result file, even if some of those results were incomplete. Think of it as proving the gym's equipment all works before measuring an athlete's performance.

C-level decision makers

A structural baseline, not a performance claim. The value this run delivers is establishing that we now have a reproducible, version-controlled protocol with stable output contracts. Subsequent runs are measured relative to this baseline, which is why the per-run dashboard rewards every attempt (red or green) with an entry.

Engineers & architects

Establishes the `runs/<campaign-id>/` convention, the per-phase JSON contract (every phase script emits `{pass: bool, reasons: [...]}` plus phase-specific fields), and the aggregator's `campaign-summary.json` shape. The outcome-derivation bug in the aggregator (historical overall_pass:true even for partial campaigns) dates from this run and was only fixed client-side by pages.yml + generate_run_html.sh on 2026-04-20 (see commit dc0f489).

What changed going into the next campaign

Runs 10+ adopt the runner-driven SSH refactor. Subsequent iteration targets Phase 2 and Phase 4 harness correctness.

Phase 1 — functional (per-node) PASS

What this phase proves: Single-node CRUD, backup, curator dry-run, and MCP handshake on each of the three peer droplets. Establishes that ai-memory starts and is functional at the one-node level before federation is exercised.

Test results

node-a

node-b

node-c

Raw evidence

phase1-node-a
{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-node-a",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T03:42:55.018060612+00:00",
		"completed_at": "2026-04-20T03:42:55.018538372+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

raw JSON

phase1-node-b
{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-node-b",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T03:42:54.917510191+00:00",
		"completed_at": "2026-04-20T03:42:54.917957231+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

raw JSON

phase1-node-c
{
	"phase": 1,
	"host": "aim-v0-6-0-0-final-node-c",
	"version": "ai-memory 0.6.0",
	"pass": true,
	"reasons": [
		""
	],
	"stats": {
		"total": 1,
		"by_tier": [
			{
				"tier": "mid",
				"count": 1
			}
		],
		"by_namespace": [
			{
				"namespace": "ship-gate-phase1",
				"count": 1
			}
		],
		"expiring_soon": 0,
		"links_count": 0,
		"db_size_bytes": 139264
	},
	"curator": {
		"started_at": "2026-04-20T03:42:55.340141172+00:00",
		"completed_at": "2026-04-20T03:42:55.340648710+00:00",
		"cycle_duration_ms": 0,
		"memories_scanned": 1,
		"memories_eligible": 1,
		"auto_tagged": 0,
		"contradictions_found": 0,
		"operations_attempted": 0,
		"operations_skipped_cap": 0,
		"autonomy": {
			"clusters_formed": 0,
			"memories_consolidated": 0,
			"memories_forgotten": 0,
			"priority_adjustments": 0,
			"rollback_entries_written": 0,
			"errors": []
		},
		"errors": [
			"no LLM client configured"
		],
		"dry_run": true
	},
	"mcp_tool_count": 36,
	"recall_count": 1,
	"snapshot_count": 1,
	"manifest_count": 1
}

raw JSON

Phase 2 — multi-agent federation FAIL

What this phase proves: 4 agents × 50 writes against the 3-node federation with W=2 quorum, then 90s settle and convergence count on every peer. Plus two quorum probes (one-peer-down must 201, both-peers-down must 503). Catches silent-data-loss and quorum-misclassification regressions.

Test results

Raw evidence

phase2

          

raw JSON

Phase 3 — cross-backend migration NOT REACHED

What this phase proves: 1000-memory round-trip: SQLite → Postgres, re-run for idempotency, Postgres → SQLite. Asserts zero errors and counts match. Catches migration-correctness regressions in either direction of a production upgrade path.

This phase did not run because an earlier phase failed and the campaign aborted. Evidence from the phases that did run is above; the protocol would have exercised this phase next if the prior step had passed.

Phase 4 — chaos campaign NOT REACHED

What this phase proves: packaging/chaos/run-chaos.sh on the chaos-client droplet with 50 cycles × 100 writes per fault class. Measures convergence_bound = min(count_node1, count_node2) / total_ok. Catches fault-tolerance regressions under SIGKILL of the primary, brief network partition, and related fault models.

This phase did not run because an earlier phase failed and the campaign aborted. Evidence from the phases that did run is above; the protocol would have exercised this phase next if the prior step had passed.

All artifacts

Every JSON committed to this campaign directory. Raw, machine-readable, and stable.