Run focus
Phase 2 crashed at line 102 — first diagnostic finding surfaces a harness bug
What this campaign set out to test: Four agent identities (`ai:agent-alice`, `-bob`, `-charlie`, `-dana`) each issuing 50 concurrent POSTs to node-a's `/api/v1/memories` endpoint under `--quorum-writes 2 --quorum-peers <other two>`, then tallying per-agent status codes.
What it demonstrated: The write burst happened: the harness got as far as tallying responses, which means traffic reached the server. The tally itself then crashed, so whatever the server did with those 200 writes was invisible to this run. The run neither proved nor disproved anything about federation.
Detailed tri-audience analysis is below, followed by per-phase test results for all four phases of the protocol — including any phase that did not run in this campaign.
AI NHI analysis · Claude Opus 4.7
The Phase 2 shell harness died on an arithmetic syntax error before it ever reached the federation assertion. Harness-only failure; no product signal yet.
What this campaign tested
Four agent identities (`ai:agent-alice`, `-bob`, `-charlie`, `-dana`) each issuing 50 concurrent POSTs to node-a's `/api/v1/memories` endpoint under `--quorum-writes 2 --quorum-peers <other two>`, then tallying per-agent status codes.
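The burst-then-tally shape described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the campaign's harness: `send_one`, `burst`, and `tally` are hypothetical names, and `send_one` is a stub that always prints `201` because the report does not give the request body or how an agent identity is attached to a POST. A real harness would presumably replace the stub with a `curl` call that prints the status code (for example via `-w '%{http_code}\n'`).

```shell
# Hypothetical sender: prints one HTTP status code per call.
# Stubbed to 201 here; the real request shape is not given in this report.
send_one() {
  echo 201
}

# burst AGENT COUNT OUTFILE: fire COUNT sends in the background, then wait.
burst() {
  agent=$1 count=$2 out=$3
  : > "$out"                      # start from an empty codes file
  i=1
  while [ "$i" -le "$count" ]; do
    send_one "$agent" "$i" >> "$out" &
    i=$((i + 1))
  done
  wait                            # block until every background send is done
}

# tally FILE: print "<code> <count>" for each distinct status code, sorted.
tally() {
  awk '{c[$0]++} END {for (k in c) print k, c[k]}' "$1" | sort
}

dir=$(mktemp -d)
for a in alice bob charlie dana; do
  burst "ai:agent-$a" 50 "$dir/$a.codes"
  printf '%s: %s\n' "ai:agent-$a" "$(tally "$dir/$a.codes")"
done
rm -rf "$dir"
```

Keeping one codes file per agent means a single malformed tally cannot hide which identity produced which responses.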
What it proved (or disproved)
The write burst happened: the harness got as far as tallying responses, which means traffic reached the server. The tally itself then crashed, so whatever the server did with those 200 writes was invisible to this run. The run neither proved nor disproved anything about federation.
For three audiences
Non-technical end users
We tried to measure the cluster's health, and the measuring device itself malfunctioned. Like a thermometer with a broken display — the patient may or may not have a fever, but we can't tell yet. The fix was to replace the broken thermometer, not to conclude anything about the patient.
C-level decision makers
A test-infrastructure defect blocked the real product assertion. This is materially different from a product bug: no customer impact, and no release blocker beyond the cost of re-running the campaign. It is notable for one reason, though: the bug was subtle (a shell scripting edge case in which `grep -c` with zero matches prints `0` on stdout and also exits non-zero) and it survived review because the fault stayed latent until a genuinely passing run hit the zero-match branch. A reminder that harness rigor matters as much as product rigor.
Engineers & architects
`scripts/phase2_multiagent.sh` line 102 computed `OK=$(grep -c '^201$' codes.txt || echo 0)`. When no line matches, `grep -c` prints `0` and exits 1, so the `|| echo 0` fallback also runs and prints a second `0`. Command substitution strips only the trailing newline, which leaves the two-line string "0\n0" in `OK`. The subsequent `FAIL=$((TOTAL - OK - QNM))` then failed with an arithmetic syntax error, because that string is not a valid integer expression. Fixed in commit 625ed66 by switching to `awk '/^201$/{n++} END{print n+0}'`, which prints exactly one integer whether or not any line matches.
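The failure mode can be reproduced in isolation. The snippet below is a minimal sketch rather than the harness itself: the sample file and the variable names `buggy` and `fixed` are made up for illustration, and only the two tally expressions are taken from the text above.

```shell
# Throwaway sample with no 201 lines, so the zero-match branch is exercised.
tmp=$(mktemp)
printf '503\n503\n' > "$tmp"

# Original form: grep -c prints 0 AND exits 1, so the fallback prints a second 0.
buggy=$(grep -c '^201$' "$tmp" || echo 0)

# awk form: prints exactly one integer, even when nothing matches.
fixed=$(awk '/^201$/{n++} END{print n+0}' "$tmp")

# Count the lines held in each variable.
printf 'buggy holds %s line(s)\n' "$(printf '%s\n' "$buggy" | awk 'END{print NR}')"
printf 'fixed holds %s line(s)\n' "$(printf '%s\n' "$fixed" | awk 'END{print NR}')"
rm -f "$tmp"
```

The first line reports 2 and the second reports 1. Any arithmetic expansion that reads `buggy` sees two tokens with no operator between them, which is the syntax error the harness hit.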
Bugs surfaced and where they were fixed
- `grep -c` + `|| echo 0` doubled stdout → arithmetic syntax error
Impact: Phase 2 harness died before producing a verdict. Workflow reported FAIL, but the failure was test-infrastructure, not product.
Root cause: The `|| echo 0` fallback fired even though `grep -c` had already printed `0`, so the captured value held two lines instead of one number. awk is robust to the empty case because it prints `n+0` exactly once, in its END block.
Fixed in: commit 625ed66
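Beyond swapping grep for awk, a harness can refuse to do arithmetic on anything that is not a plain integer. The sketch below shows one way to do that; `count_code` and `is_uint` are hypothetical helpers, not functions from the campaign's scripts, and the sample codes are invented.

```shell
# Hypothetical helper: count lines in FILE that are exactly CODE.
# The "" concatenations force a string comparison, so "0201" does not match "201".
count_code() {
  awk -v code="$1" '($0 "") == (code "") {n++} END {print n+0}' "$2"
}

# Hypothetical guard: succeed only for a non-empty string of digits.
# A two-line value such as "0<newline>0" is rejected because of the newline.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

codes=$(mktemp)
printf '201\n201\n503\n' > "$codes"   # invented sample data

ok=$(count_code 201 "$codes")
if is_uint "$ok"; then
  echo "ok=$ok"
else
  echo "tally is not an integer, refusing to do arithmetic" >&2
fi
rm -f "$codes"
```

With a guard like this, a malformed tally produces an explicit harness error instead of an arithmetic syntax error further down the script.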
What changed going into the next campaign
r12 adopts the awk-based tally. The first cycle that actually reaches the federation assertion will surface the NEXT layer of problems — and it does.
Phase 1 — functional (per-node) PASS
What this phase proves: Single-node CRUD, backup, curator dry-run, and MCP handshake on each of the three peer droplets. Establishes that ai-memory starts and is functional at the one-node level before federation is exercised.
Test results
node-a
- ✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memories
- ✓ Recall returned ≥ 1 hit — 1 hits
- ✓ Backup snapshot file emitted — 1 snapshot(s)
- ✓ Backup manifest file emitted — 1 manifest(s)
- ✓ MCP handshake advertises ≥ 30 tools — 36 tools
- ✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 errors
- ✓ Overall phase-1 pass flag
node-b
- ✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memories
- ✓ Recall returned ≥ 1 hit — 1 hits
- ✓ Backup snapshot file emitted — 1 snapshot(s)
- ✓ Backup manifest file emitted — 1 manifest(s)
- ✓ MCP handshake advertises ≥ 30 tools — 36 tools
- ✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 errors
- ✓ Overall phase-1 pass flag
node-c
- ✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memories
- ✓ Recall returned ≥ 1 hit — 1 hits
- ✓ Backup snapshot file emitted — 1 snapshot(s)
- ✓ Backup manifest file emitted — 1 manifest(s)
- ✓ MCP handshake advertises ≥ 30 tools — 36 tools
- ✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 errors
- ✓ Overall phase-1 pass flag
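The numeric thresholds in the checklist above can be expressed as a small verdict function. This is a sketch under assumptions: `phase1_verdict` is a hypothetical name, the five values are passed in as arguments rather than extracted from the phase-1 JSON, and the curator check (where an Ollama-not-configured error is accepted) is left out because it is not a simple threshold.

```shell
# phase1_verdict TOTAL RECALL SNAPSHOTS MANIFESTS MCP_TOOLS
# Prints PASS, or the first failed check, using the thresholds from the checklist.
phase1_verdict() {
  [ "$1" -ge 1 ]  || { echo "FAIL: stats total < 1"; return 1; }
  [ "$2" -ge 1 ]  || { echo "FAIL: recall returned no hits"; return 1; }
  [ "$3" -ge 1 ]  || { echo "FAIL: no backup snapshot"; return 1; }
  [ "$4" -ge 1 ]  || { echo "FAIL: no backup manifest"; return 1; }
  [ "$5" -ge 30 ] || { echo "FAIL: MCP handshake advertises < 30 tools"; return 1; }
  echo "PASS"
}

# Values reported by each node in this run: 1 memory, 1 hit, 1 snapshot,
# 1 manifest, 36 tools.
phase1_verdict 1 1 1 1 36
```

For the values reported in this run the function prints `PASS`.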
Raw evidence
phase1-node-a
{
  "phase": 1,
  "host": "aim-v0-6-0-0-final-r11-node-a",
  "version": "ai-memory 0.6.0",
  "pass": true,
  "reasons": [
    ""
  ],
  "stats": {
    "total": 1,
    "by_tier": [
      {
        "tier": "mid",
        "count": 1
      }
    ],
    "by_namespace": [
      {
        "namespace": "ship-gate-phase1",
        "count": 1
      }
    ],
    "expiring_soon": 0,
    "links_count": 0,
    "db_size_bytes": 139264
  },
  "curator": {
    "started_at": "2026-04-20T04:06:01.574280045+00:00",
    "completed_at": "2026-04-20T04:06:01.575046443+00:00",
    "cycle_duration_ms": 0,
    "memories_scanned": 1,
    "memories_eligible": 1,
    "auto_tagged": 0,
    "contradictions_found": 0,
    "operations_attempted": 0,
    "operations_skipped_cap": 0,
    "autonomy": {
      "clusters_formed": 0,
      "memories_consolidated": 0,
      "memories_forgotten": 0,
      "priority_adjustments": 0,
      "rollback_entries_written": 0,
      "errors": []
    },
    "errors": [
      "no LLM client configured"
    ],
    "dry_run": true
  },
  "mcp_tool_count": 36,
  "recall_count": 1,
  "snapshot_count": 1,
  "manifest_count": 1
}
phase1-node-b
{
  "phase": 1,
  "host": "aim-v0-6-0-0-final-r11-node-b",
  "version": "ai-memory 0.6.0",
  "pass": true,
  "reasons": [
    ""
  ],
  "stats": {
    "total": 1,
    "by_tier": [
      {
        "tier": "mid",
        "count": 1
      }
    ],
    "by_namespace": [
      {
        "namespace": "ship-gate-phase1",
        "count": 1
      }
    ],
    "expiring_soon": 0,
    "links_count": 0,
    "db_size_bytes": 139264
  },
  "curator": {
    "started_at": "2026-04-20T04:06:01.229194260+00:00",
    "completed_at": "2026-04-20T04:06:01.229714144+00:00",
    "cycle_duration_ms": 0,
    "memories_scanned": 1,
    "memories_eligible": 1,
    "auto_tagged": 0,
    "contradictions_found": 0,
    "operations_attempted": 0,
    "operations_skipped_cap": 0,
    "autonomy": {
      "clusters_formed": 0,
      "memories_consolidated": 0,
      "memories_forgotten": 0,
      "priority_adjustments": 0,
      "rollback_entries_written": 0,
      "errors": []
    },
    "errors": [
      "no LLM client configured"
    ],
    "dry_run": true
  },
  "mcp_tool_count": 36,
  "recall_count": 1,
  "snapshot_count": 1,
  "manifest_count": 1
}
phase1-node-c
{
  "phase": 1,
  "host": "aim-v0-6-0-0-final-r11-node-c",
  "version": "ai-memory 0.6.0",
  "pass": true,
  "reasons": [
    ""
  ],
  "stats": {
    "total": 1,
    "by_tier": [
      {
        "tier": "mid",
        "count": 1
      }
    ],
    "by_namespace": [
      {
        "namespace": "ship-gate-phase1",
        "count": 1
      }
    ],
    "expiring_soon": 0,
    "links_count": 0,
    "db_size_bytes": 139264
  },
  "curator": {
    "started_at": "2026-04-20T04:06:01.117530548+00:00",
    "completed_at": "2026-04-20T04:06:01.118120527+00:00",
    "cycle_duration_ms": 0,
    "memories_scanned": 1,
    "memories_eligible": 1,
    "auto_tagged": 0,
    "contradictions_found": 0,
    "operations_attempted": 0,
    "operations_skipped_cap": 0,
    "autonomy": {
      "clusters_formed": 0,
      "memories_consolidated": 0,
      "memories_forgotten": 0,
      "priority_adjustments": 0,
      "rollback_entries_written": 0,
      "errors": []
    },
    "errors": [
      "no LLM client configured"
    ],
    "dry_run": true
  },
  "mcp_tool_count": 36,
  "recall_count": 1,
  "snapshot_count": 1,
  "manifest_count": 1
}