Run focus
Phase 4 script crashed on my own cycles_by_fault capture
What this campaign set out to test: Full four-phase protocol after PR #316 (revert of aggressive reqwest settings) landed, with partition_minority opted out of the Phase 4 default (kill_primary_mid_write is the only required fault class now). Also carried my earlier f993e2c commit that adds a `cycles_by_fault` field to phase4.json — that commit is the one that broke r22.
What it demonstrated: Proved Phases 1/2/3 are robust across another campaign on real infrastructure (five consecutive originals now: r15, r16, r17, r18, r22). Disproved that my f993e2c commit was tested — it clearly wasn't. Reverted in ship-gate commit a99bb3b before r23 dispatched.
Detailed tri-audience analysis is below, followed by per-phase test results for all four phases of the protocol — including any phase that did not run in this campaign.
AI NHI analysis · Claude Opus 4.7
Phases 1, 2, and 3 all passed cleanly (originals, not reconstructed). Phase 4 exited non-zero with an empty phase4.json. The chaos campaign itself ran fine (5 min wall clock, normal for single-fault kill_primary), but the post-campaign jq pipeline in phase4_chaos.sh, which captures per-cycle data into the summary, hit an input-shape error, and `set -e` killed the script before the final summary could be emitted. Honest self-own.
For three audiences
Non-technical end users
A different test-infrastructure bug, caught by the same test infrastructure that's been catching every other bug this week. My mistake this time: I added new diagnostic code to the test harness without actually running it end-to-end. The new code errored on its first real-infra encounter. Reverted, re-dispatched. No customer-visible change; this one never got close to a release.
C-level decision makers
Test-harness regression, not product regression. The ship-gate continues to enforce the distinction cleanly — every time a false step lands, the next campaign catches it. Cost: one more run, another ~$0.15, another ~30 minutes of engineering time. The record still shows zero product-level regressions shipped.
Engineers & architects
phase4_chaos.sh after commit f993e2c captured `jq -s '.' "$JSONL"` into a bash associative array `CYCLE_DATA`, then printf-concatenated entries through `jq -s 'add // {}'` to build a `cycles_by_fault` object for the final summary. Something in that pipeline errored under the chaos-client's jq version — perhaps the printf-of-multi-line-JSON not parsing as a single document, perhaps an `add`-on-non-array issue on a single fault class. `set -e` in phase4_chaos.sh propagated the failure; final summary never wrote. Reverted in a99bb3b; per-cycle capture to be re-introduced in a follow-up PR with a local smoke test before it ships.
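One way the per-cycle capture could be rebuilt without depending on how a shell concatenates multi-line JSON is to parse the JSONL file one line at a time and group the records explicitly. The following is a minimal sketch in Python rather than jq, not the reverted pipeline. The record fields `fault`, `cycle`, and `ok` are hypothetical, since the report does not show the real per-cycle schema.

```python
import json
from collections import defaultdict


def cycles_by_fault(jsonl_text, fault_key="fault"):
    """Group per-cycle records by fault class, one JSON document per line.

    `fault_key` is an assumed field name; adjust it to the real schema.
    """
    grouped = defaultdict(list)
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a malformed line instead of aborting the summary
        if not isinstance(record, dict):
            continue  # only objects can carry a fault class
        grouped[str(record.get(fault_key, "unknown"))].append(record)
    return dict(grouped)


# Hypothetical sample with a single fault class, the case suspected above.
sample = "\n".join([
    '{"fault": "kill_primary_mid_write", "cycle": 1, "ok": true}',
    '{"fault": "kill_primary_mid_write", "cycle": 2, "ok": true}',
])
result = cycles_by_fault(sample)
print(json.dumps(result, indent=2))
```

Because each line is parsed on its own, a single fault class and an empty file both produce a well-formed object, so neither case needs special handling.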
Bugs surfaced and where they were fixed
- phase4_chaos.sh cycles_by_fault pipeline errored → empty phase4.json
Impact: r22 couldn't emit a Phase 4 verdict JSON. Phases 1/2/3 evidence is intact and valid.
Root cause: Introduced in ship-gate commit f993e2c. The multi-line JSON that printf concatenated into jq slurp mode hit a parsing or type error (the exact failure mode is not yet pinned down), and set -e aborted the script before the summary was written. Not caught because the commit wasn't locally smoke-tested.
Fixed in: ship-gate commit a99bb3b (revert of the cycles_by_fault capture).
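The failure mode above, where an optional diagnostic takes the whole verdict file down with it, can be avoided by isolating the optional step from the required one. The sketch below shows that pattern in Python under stated assumptions: `build_phase4_summary`, `broken_capture`, and the `capture_error` field are hypothetical names, and the core values are illustrative, not r22 results.

```python
import json


def build_phase4_summary(core, optional_capture):
    """Return the required summary, adding the optional capture if it works.

    A failure in `optional_capture` is recorded in the summary instead of
    propagating, so the verdict fields are always emitted.
    """
    summary = dict(core)
    try:
        summary["cycles_by_fault"] = optional_capture()
    except Exception as exc:  # deliberately broad: diagnostics must not be fatal
        summary["cycles_by_fault"] = None
        summary["capture_error"] = f"{type(exc).__name__}: {exc}"
    return summary


def broken_capture():
    # Stand-in for a capture step that chokes on unexpected input.
    raise ValueError("unexpected input shape")


# Illustrative core fields only; these are not taken from any campaign.
summary = build_phase4_summary({"phase": 4, "pass": False}, broken_capture)
text = json.dumps(summary)
print(text)
```

The same idea carries over to a shell script by running the optional step in a guarded block so that `set -e` does not apply to it.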
What changed going into the next campaign
r23 dispatched with every experimental addition removed: conservative reqwest client (PR #316), partition_minority opt-in (ship-gate commit ac7e87a), no cycles_by_fault capture (ship-gate a99bb3b). Target outcome: full 4/4 green → v0.6.0 tag.
Phase 1 — functional (per-node) PASS
What this phase proves: Single-node CRUD, backup, curator dry-run, and MCP handshake on each of the three peer droplets. Establishes that ai-memory starts and is functional at the one-node level before federation is exercised.
Test results
node-a
- ✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memory
- ✓ Recall returned ≥ 1 hit — 1 hit
- ✓ Backup snapshot file emitted — 1 snapshot(s)
- ✓ Backup manifest file emitted — 1 manifest(s)
- ✓ MCP handshake advertises ≥ 30 tools — 36 tools
- ✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 error (the accepted "no LLM client configured")
- ✓ Overall phase-1 pass flag
node-b
- ✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memory
- ✓ Recall returned ≥ 1 hit — 1 hit
- ✓ Backup snapshot file emitted — 1 snapshot(s)
- ✓ Backup manifest file emitted — 1 manifest(s)
- ✓ MCP handshake advertises ≥ 30 tools — 36 tools
- ✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 error (the accepted "no LLM client configured")
- ✓ Overall phase-1 pass flag
node-c
- ✓ Stats total ≥ 1 (store + list + stats round-trip) — 1 memory
- ✓ Recall returned ≥ 1 hit — 1 hit
- ✓ Backup snapshot file emitted — 1 snapshot(s)
- ✓ Backup manifest file emitted — 1 manifest(s)
- ✓ MCP handshake advertises ≥ 30 tools — 36 tools
- ✓ Curator dry-run clean (Ollama-not-configured is accepted) — 1 error (the accepted "no LLM client configured")
- ✓ Overall phase-1 pass flag
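The per-node checks listed above can be re-derived from a phase-1 evidence file. The sketch below is one possible reading of those checks in Python, not the harness's own logic. It assumes that the accepted curator error is exactly the string "no LLM client configured" seen in the raw evidence, and the sample report is trimmed to the fields the checks use.

```python
# Assumption: this is the only curator error the dry-run check tolerates.
ACCEPTED_CURATOR_ERRORS = {"no LLM client configured"}


def phase1_checks(report, min_tools=30):
    """Evaluate the listed phase-1 thresholds against one node's report."""
    curator_errors = report["curator"]["errors"]
    return {
        "stats_total": report["stats"]["total"] >= 1,
        "recall": report["recall_count"] >= 1,
        "snapshot": report["snapshot_count"] >= 1,
        "manifest": report["manifest_count"] >= 1,
        "mcp_tools": report["mcp_tool_count"] >= min_tools,
        "curator_clean": all(e in ACCEPTED_CURATOR_ERRORS for e in curator_errors),
        "pass_flag": report["pass"] is True,
    }


# Trimmed copy of the node-a evidence (only the fields used by the checks).
node_a = {
    "pass": True,
    "stats": {"total": 1},
    "curator": {"errors": ["no LLM client configured"], "dry_run": True},
    "mcp_tool_count": 36,
    "recall_count": 1,
    "snapshot_count": 1,
    "manifest_count": 1,
}
checks = phase1_checks(node_a)
print(checks)
```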
Raw evidence
phase1-node-a
{
  "phase": 1,
  "host": "aim-v0-6-0-0-final-r22-node-a",
  "version": "ai-memory 0.6.0",
  "pass": true,
  "reasons": [
    ""
  ],
  "stats": {
    "total": 1,
    "by_tier": [
      {
        "tier": "mid",
        "count": 1
      }
    ],
    "by_namespace": [
      {
        "namespace": "ship-gate-phase1",
        "count": 1
      }
    ],
    "expiring_soon": 0,
    "links_count": 0,
    "db_size_bytes": 139264
  },
  "curator": {
    "started_at": "2026-04-20T15:59:19.791710256+00:00",
    "completed_at": "2026-04-20T15:59:19.792192516+00:00",
    "cycle_duration_ms": 0,
    "memories_scanned": 1,
    "memories_eligible": 1,
    "auto_tagged": 0,
    "contradictions_found": 0,
    "operations_attempted": 0,
    "operations_skipped_cap": 0,
    "autonomy": {
      "clusters_formed": 0,
      "memories_consolidated": 0,
      "memories_forgotten": 0,
      "priority_adjustments": 0,
      "rollback_entries_written": 0,
      "errors": []
    },
    "errors": [
      "no LLM client configured"
    ],
    "dry_run": true
  },
  "mcp_tool_count": 36,
  "recall_count": 1,
  "snapshot_count": 1,
  "manifest_count": 1
}
phase1-node-b
{
  "phase": 1,
  "host": "aim-v0-6-0-0-final-r22-node-b",
  "version": "ai-memory 0.6.0",
  "pass": true,
  "reasons": [
    ""
  ],
  "stats": {
    "total": 1,
    "by_tier": [
      {
        "tier": "mid",
        "count": 1
      }
    ],
    "by_namespace": [
      {
        "namespace": "ship-gate-phase1",
        "count": 1
      }
    ],
    "expiring_soon": 0,
    "links_count": 0,
    "db_size_bytes": 139264
  },
  "curator": {
    "started_at": "2026-04-20T15:59:19.897722331+00:00",
    "completed_at": "2026-04-20T15:59:19.898184118+00:00",
    "cycle_duration_ms": 0,
    "memories_scanned": 1,
    "memories_eligible": 1,
    "auto_tagged": 0,
    "contradictions_found": 0,
    "operations_attempted": 0,
    "operations_skipped_cap": 0,
    "autonomy": {
      "clusters_formed": 0,
      "memories_consolidated": 0,
      "memories_forgotten": 0,
      "priority_adjustments": 0,
      "rollback_entries_written": 0,
      "errors": []
    },
    "errors": [
      "no LLM client configured"
    ],
    "dry_run": true
  },
  "mcp_tool_count": 36,
  "recall_count": 1,
  "snapshot_count": 1,
  "manifest_count": 1
}
phase1-node-c
{
  "phase": 1,
  "host": "aim-v0-6-0-0-final-r22-node-c",
  "version": "ai-memory 0.6.0",
  "pass": true,
  "reasons": [
    ""
  ],
  "stats": {
    "total": 1,
    "by_tier": [
      {
        "tier": "mid",
        "count": 1
      }
    ],
    "by_namespace": [
      {
        "namespace": "ship-gate-phase1",
        "count": 1
      }
    ],
    "expiring_soon": 0,
    "links_count": 0,
    "db_size_bytes": 139264
  },
  "curator": {
    "started_at": "2026-04-20T15:59:19.936080940+00:00",
    "completed_at": "2026-04-20T15:59:19.936595743+00:00",
    "cycle_duration_ms": 0,
    "memories_scanned": 1,
    "memories_eligible": 1,
    "auto_tagged": 0,
    "contradictions_found": 0,
    "operations_attempted": 0,
    "operations_skipped_cap": 0,
    "autonomy": {
      "clusters_formed": 0,
      "memories_consolidated": 0,
      "memories_forgotten": 0,
      "priority_adjustments": 0,
      "rollback_entries_written": 0,
      "errors": []
    },
    "errors": [
      "no LLM client configured"
    ],
    "dry_run": true
  },
  "mcp_tool_count": 36,
  "recall_count": 1,
  "snapshot_count": 1,
  "manifest_count": 1
}
Phase 2 — multi-agent federation PASS
What this phase proves: 4 agents × 50 writes against the 3-node federation with W=2 quorum, then 90s settle and convergence count on every peer. Plus two quorum probes (one-peer-down must 201, both-peers-down must 503). Catches silent-data-loss and quorum-misclassification regressions.
Test results
- ✓ Burst writes returned 201 — ok=200/200 (qnm=0, fail=0)
- ✓ node-A convergence ≥ 95% of ok — a=200 / threshold 190
- ✓ node-B convergence ≥ 95% of ok — b=200 / threshold 190
- ✓ node-C convergence ≥ 95% of ok — c=200 / threshold 190
- ✓ Probe 1: one peer down → 201 (quorum met via remaining peer) — got 201
- ✓ Probe 2: both peers down → 503 (quorum_not_met) — got 503
- ✓ Overall phase-2 pass flag
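The convergence threshold in the results above is 95% of the successful writes, which gives 190 for ok=200. The sketch below re-derives the phase-2 checks from the evidence JSON in Python. It is an illustration under assumptions, not the harness's code: rounding the threshold up and requiring every burst write to succeed are both guesses, since the report only shows the ok=200 case.

```python
def phase2_checks(report, ratio_percent=95):
    """Evaluate the listed phase-2 checks; returns (checks, threshold)."""
    ok = report["ok"]
    # Integer ceiling of ok * ratio_percent / 100 (assumed rounding direction).
    threshold = -(-ok * ratio_percent // 100)
    checks = {
        # Assumption: the burst check requires all writes to return 201.
        "burst_all_ok": ok == report["total_writes"] and report["fail"] == 0,
        "probe1_one_peer_down_201": report["probe1_single_peer_down"] == "201",
        "probe2_both_peers_down_503": report["probe2_both_peers_down"] == "503",
        "pass_flag": report["pass"] is True,
    }
    for node, count in report["counts"].items():
        checks[f"converged_{node}"] = count >= threshold
    return checks, threshold


# Values copied from the phase2 evidence in this report.
phase2 = {
    "pass": True,
    "total_writes": 200,
    "ok": 200,
    "quorum_not_met": 0,
    "fail": 0,
    "counts": {"a": 200, "b": 200, "c": 200},
    "probe1_single_peer_down": "201",
    "probe2_both_peers_down": "503",
}
checks2, threshold = phase2_checks(phase2)
print(threshold, checks2)
```

Note that the probe results are stored as strings in the evidence, so the comparison is against "201" and "503" rather than integers.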
Raw evidence
phase2
{
  "phase": 2,
  "pass": true,
  "total_writes": 200,
  "ok": 200,
  "quorum_not_met": 0,
  "fail": 0,
  "counts": {
    "a": 200,
    "b": 200,
    "c": 200
  },
  "probe1_single_peer_down": "201",
  "probe2_both_peers_down": "503",
  "reasons": [
    ""
  ]
}
Phase 3 — cross-backend migration PASS
What this phase proves: 1000-memory round-trip: SQLite → Postgres, re-run for idempotency, Postgres → SQLite. Asserts zero errors and counts match. Catches migration-correctness regressions in either direction of a production upgrade path.
Test results
- ✓ Source SQLite has 1000 seed memories — src_count=1000
- ✓ Destination after reverse roundtrip has 1000 memories — dst_count=1000
- ✓ Forward migration SQLite → Postgres: errors=0 — errors=0
- ✓ Idempotent re-run is a no-op — writes=1000
- ✓ Reverse migration Postgres → SQLite: errors=0 — errors=0
- ✓ Overall phase-3 pass flag
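The error and count assertions above can be checked mechanically against the phase-3 evidence. The sketch below does so in Python and is an illustration, not the harness's logic. It covers only the zero-error and count-match checks, and it does not try to define what "no-op" means for the idempotent re-run, because the report does not state that criterion.

```python
def phase3_checks(report):
    """Check that each migration leg reported no errors and counts match."""
    legs = ("report_forward", "report_idempotent", "report_reverse")
    checks = {f"{leg}_errors_empty": report[leg]["errors"] == [] for leg in legs}
    checks["counts_match"] = report["src_count"] == report["dst_count"]
    checks["pass_flag"] = report["pass"] is True
    return checks


# Trimmed copy of the phase3 evidence (URLs and batch fields omitted).
phase3 = {
    "pass": True,
    "report_forward": {"errors": [], "memories_read": 1000, "memories_written": 1000},
    "report_idempotent": {"errors": [], "memories_read": 1000, "memories_written": 1000},
    "report_reverse": {"errors": [], "memories_read": 1000, "memories_written": 1000},
    "src_count": 1000,
    "dst_count": 1000,
}
checks3 = phase3_checks(phase3)
print(checks3)
```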
Raw evidence
phase3
{
  "phase": 3,
  "pass": true,
  "report_forward": {
    "batches": 1,
    "dry_run": false,
    "errors": [],
    "from_url": "sqlite:///tmp/phase3-source.db",
    "memories_read": 1000,
    "memories_written": 1000,
    "to_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test"
  },
  "report_idempotent": {
    "batches": 1,
    "dry_run": false,
    "errors": [],
    "from_url": "sqlite:///tmp/phase3-source.db",
    "memories_read": 1000,
    "memories_written": 1000,
    "to_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test"
  },
  "report_reverse": {
    "batches": 1,
    "dry_run": false,
    "errors": [],
    "from_url": "postgres://ai_memory:ai_memory_test@127.0.0.1:5433/ai_memory_test",
    "memories_read": 1000,
    "memories_written": 1000,
    "to_url": "sqlite:///tmp/phase3-roundtrip.db"
  },
  "src_count": 1000,
  "dst_count": 1000,
  "reasons": [
    ""
  ]
}