Run 2026-05-05 — aggregate verdict

Verdict: GATE GREEN
Scope: OpenClaw x xAI Grok 4.3 only
Captured against: ai-memory-mcp v0.6.4 release binary

Per-tier outcomes

Tier	Pass bar	Cells	Pass rate	Meets bar	Outcome
T1 — Awareness	>=90%	1/1	100%	yes	PASS
T2 — Reactive recovery	>=80%	1/1	100%	yes	PASS
T3 — Proactive expansion	>=50%	1/1	100%	yes	PASS
T4 — Mesh recovery	>=66%	3/3	100%	yes	PASS
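
The "Meets bar" column and the aggregate verdict can be recomputed from the cell counts above. The following is a minimal sketch, assuming the gate is green only when every tier meets its pass bar; that aggregation rule is an assumption made here, and docs/methodology.md remains the authoritative source. `TierResult`, `gate_verdict`, and the "GATE RED" label are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierResult:
    name: str
    pass_bar: float  # minimum pass rate as a fraction (0.90 means >=90%)
    passed: int      # cells that passed
    total: int       # cells that ran

    @property
    def pass_rate(self) -> float:
        # A tier with no cells is treated as 0% rather than dividing by zero.
        return self.passed / self.total if self.total else 0.0

    @property
    def meets_bar(self) -> bool:
        return self.pass_rate >= self.pass_bar


def gate_verdict(tiers: list[TierResult]) -> str:
    # Assumed rule: green only if every tier meets its bar.
    # "GATE RED" is a hypothetical label for the failing case.
    return "GATE GREEN" if all(t.meets_bar for t in tiers) else "GATE RED"


# Values copied from the per-tier table for this run.
RUN_2026_05_05 = [
    TierResult("T1", 0.90, 1, 1),
    TierResult("T2", 0.80, 1, 1),
    TierResult("T3", 0.50, 1, 1),
    TierResult("T4", 0.66, 3, 3),
]

if __name__ == "__main__":
    for t in RUN_2026_05_05:
        status = "PASS" if t.meets_bar else "FAIL"
        print(f"{t.name}: {t.passed}/{t.total} = {t.pass_rate:.0%} "
              f"(bar {t.pass_bar:.0%}) -> {status}")
    print(gate_verdict(RUN_2026_05_05))
```

With a single cell per tier in T1 through T3, each of those tiers is all-or-nothing: one failed cell would drop the tier to 0% and, under the assumed rule, flip the gate.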

Cells

T1 — Awareness

Cell	Outcome	Reason	Evidence
`grok-4.3-openclaw-t1-awareness-core`	PASS	all 8 families surfaced; final answer named 8; loaded/unloaded distinguished	json md

T2 — Reactive recovery

Cell	Outcome	Reason	Evidence
`grok-4.3-openclaw-t2-reactive-core`	PASS	agent pre-checked capabilities and surfaced operator action without blind call	json md

T3 — Proactive expansion

Cell	Outcome	Reason	Evidence
`grok-4.3-openclaw-t3-proactive-core`	PASS	agent pre-checked capabilities and surfaced expansion or operator action	json md

T4 — Mesh recovery

Cell	Outcome	Reason	Evidence
`grok-4.3-openclaw-t4-mesh-recovery-alice-core`	PASS	mesh completed coordination across simulated peers	json md
`grok-4.3-openclaw-t4-mesh-recovery-bob-core-graph`	PASS	mesh completed coordination across simulated peers	json md
`grok-4.3-openclaw-t4-mesh-recovery-charlie-full`	PASS	mesh completed coordination across simulated peers	json md

Methodology

Per-cell pass criteria are documented in docs/methodology.md
Each cell starts from a fresh copy of fixtures/corpus/v0.6.3.1-baseline.db.gz (schema v19)
The v0.6.4 binary opens that database, runs the v19 -> v20 schema migration, then runs the discovery test
LLM driver: scripts/grok_cell.py (xAI Grok 4.3 via api.x.ai/v1/chat/completions)
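
The per-cell starting state described above can be sketched as a helper that unpacks the gzipped baseline into a private working directory, so no cell sees another cell's migrated database. This is a sketch under assumptions, not the project's actual harness: `fresh_cell_db` and the working-directory layout are hypothetical, and only the fixture path comes from the list above.

```python
import gzip
import shutil
from pathlib import Path


def fresh_cell_db(baseline_gz: Path, workdir: Path) -> Path:
    """Decompress the gzipped baseline into workdir and return the new file.

    Hypothetical helper: each cell would call this with its own workdir so
    the v19 -> v20 migration always starts from an untouched copy.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    # "v0.6.3.1-baseline.db.gz" -> "v0.6.3.1-baseline.db"
    target = workdir / baseline_gz.with_suffix("").name
    with gzip.open(baseline_gz, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

A caller might use `fresh_cell_db(Path("fixtures/corpus/v0.6.3.1-baseline.db.gz"), Path(tmpdir) / cell_name)`, where `tmpdir` and `cell_name` are supplied by whatever runs the cells.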

Reproducibility

# Set XAI_API_KEY in your environment, then:
DISCOVERY_GATE_BINARY=../ai-memory-mcp/target/release/ai-memory \
  bash scripts/run-llm-cells.sh 2026-05-05
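
Before spending API calls, it can help to confirm that the two inputs named in the command above are in place. The check below is a minimal sketch, assuming both `XAI_API_KEY` and `DISCOVERY_GATE_BINARY` must be set for a run; the script may apply defaults not shown here. `preflight` is a hypothetical helper, not part of the repository.

```python
import os
from pathlib import Path


def preflight(env: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if not env.get("XAI_API_KEY"):
        problems.append("XAI_API_KEY is not set")
    binary = env.get("DISCOVERY_GATE_BINARY")
    if not binary:
        problems.append("DISCOVERY_GATE_BINARY is not set")
    elif not (Path(binary).is_file() and os.access(binary, os.X_OK)):
        problems.append(f"{binary} is not an executable file")
    return problems


if __name__ == "__main__":
    # Check the real environment and report anything missing.
    for problem in preflight(dict(os.environ)):
        print(f"preflight: {problem}")
```

Passing the environment in as a plain dictionary keeps the check easy to exercise without touching the real shell environment.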