Campaign a2a-ironclaw-v0.6.3.1-r5 FAIL

Agent group: ironclaw (homogeneous)
ai-memory ref: v0.6.3.1
Completed at: 2026-05-01T18:05:30Z
Overall pass: false
Skipped reports: 0

Infrastructure

Provider: digitalocean
Region: nyc3
Droplet size: s-2vcpu-4gb
Topology: 4-node federation mesh (W=2/N=4)
Scenarios started: (empty)
Scenarios ended: (empty)
Dispatched by: alphaonedev
Harness SHA: 5b762cc14a53
Workflow run: https://github.com/alphaonedev/ai-memory-a2a-v0.6.3.1/actions/runs/25225516100

Node roster

#	Role	Agent ID	Public IP	Private IP
1	agent	`ai:alice`	`64.225.8.124`	`10.10.2.4`
2	agent	`ai:bob`	`143.198.30.38`	`10.10.2.5`
3	agent	`ai:charlie`	`138.197.10.205`	`10.10.2.3`
4	memory-only	`—`	`64.225.28.138`	`10.10.2.2`
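
The roster and the 4-node mesh topology determine how many directed edges the F4 connectivity probe (described in a2a-baseline.json) has to cover. A minimal Python sketch, assuming the private IPs listed above and the port 9077 that appears in the attested peer_urls; the helper name expected_peers is illustrative and not part of the harness:

```python
# Private IPs from the node roster; port 9077 is taken from the attested peer_urls.
ROSTER = {
    1: "10.10.2.4",  # ai:alice
    2: "10.10.2.5",  # ai:bob
    3: "10.10.2.3",  # ai:charlie
    4: "10.10.2.2",  # memory-only
}
PORT = 9077


def expected_peers(node_index: int) -> list[str]:
    """Return the N-1 outbound peer URLs for one node (every other node, in roster order)."""
    return [
        f"https://{ip}:{PORT}"
        for idx, ip in ROSTER.items()
        if idx != node_index
    ]


n = len(ROSTER)
directed_edges = n * (n - 1)  # full mesh: 4 * 3 directed edges

print(directed_edges)               # 12
print(",".join(expected_peers(2)))  # same string as the peer_urls attested by node-2
```

Each node checks its own 3 outbound edges, so all 4 nodes would have to report before the aggregator could confirm the 12 directed edges.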

Baseline attestation BASELINE VIOLATION

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.4.0 — see authoritative baseline.

Node	Agent	Framework	Authentic	MCP ai-memory	xAI cfg	xAI default	Agent ID	Federation	UFW off	iptables	dead-man	F1 xAI	F2a substrate	F2b agent (non-gating)	Config SHA	Pass
node-2	`ai:bob`	`ironclaw ironclaw 0.27.0`	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	—	`8d05834be8f2`	PASS
node-3	`ai:charlie`	`ironclaw ironclaw 0.27.0`	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	—	`4735c4a9e431`	PASS
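
The table holds attestations for node-2 and node-3 only, while the roster lists three agent nodes. The coverage rule stated above (every agent node must attest) can be sketched as follows. This assumes the memory-only node 4 is exempt because it carries no agent; missing_attestations is a hypothetical helper, not the harness's own code:

```python
# Agent nodes from the roster (node 4 is memory-only and is assumed exempt).
AGENT_NODES = {"1": "ai:alice", "2": "ai:bob", "3": "ai:charlie"}

# node_index values that appear in this run's attestation table.
ATTESTED = {"2": "ai:bob", "3": "ai:charlie"}


def missing_attestations(agents: dict[str, str], attested: dict[str, str]) -> list[str]:
    """Return agent IDs that are in the roster but have no attestation entry."""
    return [agent for idx, agent in sorted(agents.items()) if idx not in attested]


missing = missing_attestations(AGENT_NODES, ATTESTED)
print(missing)  # ['ai:alice']
```

Under this reading, the absent node-1 entry alone would be enough to fail the baseline, even though both attested nodes pass individually.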

a2a-baseline.json

{
	"baseline_pass": false,
	"per_node": [
		{
			"spec_version": "1.4.0",
			"agent_type": "ironclaw",
			"agent_id": "ai:bob",
			"node_index": "2",
			"framework_version": "ironclaw 0.27.0",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "https://10.10.2.4:9077,https://10.10.2.3:9077,https://10.10.2.2:9077",
			"config_file_sha256": "8d05834be8f29ce034db45abbfe2e7e263571742f200572701cfb45b7194a1ed",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "f079071c-4628-4f65-b3db-2bb85b73599f",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "473e6577-f064-4190-99bf-0a7f7bd1d569",
				"agent_canary_response_head": "error: unrecognized subcommand 'chat'    tip: a similar subcommand exists: 'channels'  Usage: ironclaw [OPTIONS] [COMMAND]  For more information, try '--help'. ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.10.2.4:9077:OK,10.10.2.3:9077:OK,10.10.2.2:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "mtls",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "f079071c-4628-4f65-b3db-2bb85b73599f",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "ironclaw",
			"agent_id": "ai:charlie",
			"node_index": "3",
			"framework_version": "ironclaw 0.27.0",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "https://10.10.2.4:9077,https://10.10.2.5:9077,https://10.10.2.2:9077",
			"config_file_sha256": "4735c4a9e4317f0e67d0c1cd31bfb174af5ec7f6a610b228989be0247238b568",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "db09a7a3-15fb-400e-9c07-f1d439f27f6b",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "eaf03b26-4ba3-4a81-8161-a82920f96af4",
				"agent_canary_response_head": "error: unrecognized subcommand 'chat'    tip: a similar subcommand exists: 'channels'  Usage: ironclaw [OPTIONS] [COMMAND]  For more information, try '--help'. ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.10.2.4:9077:OK,10.10.2.5:9077:OK,10.10.2.2:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "mtls",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "db09a7a3-15fb-400e-9c07-f1d439f27f6b",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		}
	],
	"failure_mode": "baseline-violation"
}
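
The _f*_note fields above describe which functional probes gate a node's baseline_pass. A minimal Python sketch of that rule, assuming the notes are the complete list of gating probes (the config_attestation and negative_invariants checks are left out); gating_failures is an illustrative name, not a harness function:

```python
def gating_failures(probes: dict) -> list[str]:
    """List gating probes that are not true, following the _f*_note fields."""
    gates = [
        "substrate_http_canary_f2a",  # F2a gates; F2b (agent_mcp_canary_f2b) does not
        "mesh_connectivity_f4",
        "ai_memory_mcp_stdio_f5",
        "embedder_loaded_f8",         # gates unconditionally
    ]
    tls_mode = probes.get("tls_mode", "off")
    if tls_mode != "off":
        gates.append("tls_handshake_f6")     # gates whenever TLS is enabled
    if tls_mode == "mtls":
        gates.append("mtls_enforcement_f7")  # gates only in mTLS mode
    return [name for name in gates if probes.get(name) is not True]


# Probe values copied from the node-2 (ai:bob) entry above.
node2 = {
    "substrate_http_canary_f2a": True,
    "agent_mcp_canary_f2b": False,
    "mesh_connectivity_f4": True,
    "ai_memory_mcp_stdio_f5": True,
    "tls_mode": "mtls",
    "tls_handshake_f6": True,
    "mtls_enforcement_f7": True,
    "embedder_loaded_f8": True,
}

print(gating_failures(node2))  # []
```

The failed F2b probe does not show up in the result, which is consistent with both attested nodes reporting baseline_pass: true.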

Run focus

Campaign run failed due to no scenario reports recovered

What this campaign tested: Intended to exercise 30 scenarios across transport protocols, framework integrations, and memory primitives in a 4-node DigitalOcean federation mesh. No scenario reports were recovered, so none of the scenarios produced a result.

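What it demonstrated: The run produced no scenario results, so it neither validates nor refutes AI memory sharing capabilities. It points to a breakdown in the testing harness or the baseline gate: a2a-baseline.json records failure_mode "baseline-violation" and holds attestations for node-2 and node-3 only.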

AI NHI analysis · Claude Opus 4.7

FAIL — no scenario reports recovered

For three audiences

Non-technical end users

This test run was supposed to check whether AI agents can reliably share memories with each other, but it produced no results at all. Because no test results came back, we could not see whether the memory sharing holds up. It's like planning a big experiment but forgetting to turn on the equipment.

C-level decision makers

This run produced no usable test results, which points to a critical failure in the CI/CD pipeline and exposes risk in our automated validation process for production readiness. Customer-facing claims about reliable agent memory sharing cannot be substantiated from this artifact. The evidence points to a regression in harness reliability relative to prior runs, not to a defect in the core ai-memory code.

Engineers & architects

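The campaign requested 30 scenarios (S1, S1b, S2, S4-S6, S9-S18, S22-S25, S28-S42) covering transport (e.g., mTLS, WebSockets), framework (IronClaw federation), and primitives (memory sync, eviction), but the scenarios array is empty with reason 'no scenario reports recovered'. The empty started_at/ended_at fields suggest the scenario phase never began. That is consistent with a2a-baseline.json, which reports failure_mode 'baseline-violation' and contains attestations for node-2 and node-3 only, with no entry for node-1 (ai:alice). The probable root cause lies in the test harness (harness_sha 5b762cc14a53cbc4c6e5bd92f841335d3d5ece0a), the CI workflow (run 25225516100), or infra provisioning in the 4-node DigitalOcean setup. Separately, the non-gating F2b probe failed on both attested nodes because ironclaw 0.27.0 rejected the 'chat' subcommand. No failure modes in the memory primitives can be assessed without reports.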
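
One possible ordering of the checks described above, as a hedged Python sketch. The field names scenarios, reason, started_at, and ended_at are the ones quoted in this analysis, and baseline_pass and failure_mode come from a2a-baseline.json; the shape of the run summary, the triage function, and its labels are assumptions for illustration only:

```python
def triage(summary: dict, baseline: dict) -> str:
    """Return a coarse label for why a campaign produced no scenario results."""
    if not baseline.get("baseline_pass", False):
        # Baseline gate failed: scenarios are not permitted to run at all.
        return f"baseline gate: {baseline.get('failure_mode', 'unknown')}"
    if not summary.get("started_at") and not summary.get("ended_at"):
        return "scenario phase never started"
    if not summary.get("scenarios"):
        return "scenarios ran but no reports were recovered"
    return "reports present"


# Values as reported for this run (summary layout is hypothetical).
summary = {
    "scenarios": [],
    "reason": "no scenario reports recovered",
    "started_at": "",
    "ended_at": "",
}
baseline = {"baseline_pass": False, "failure_mode": "baseline-violation"}

print(triage(summary, baseline))  # baseline gate: baseline-violation
```

Checking the baseline first keeps a gate failure from being reported as a generic harness fault.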

What changes going into the next campaign

Before re-running the campaign, determine why node-1 (ai:alice) is missing from the baseline attestation, and fix the test harness so that scenario reports are generated and recovered.

All artifacts

Generated by scripts/generate_run_html.sh. Methodology: alphaonedev.github.io/ai-memory-ai2ai-gate/methodology. Analysis source: analysis/run-insights.json.