../ runs index

Campaign a2a-hermes-v0.6.3.1-r7 FAIL

Agent group
hermes (homogeneous)
ai-memory ref
v0.6.3.1
Completed at
2026-05-03T12:47:07Z
Overall pass
false
Skipped reports
1

Infrastructure

Provider
digitalocean
Region
nyc3
Droplet size
s-2vcpu-4gb
Topology
4-node federation mesh (W=2/N=4)
Scenarios started
2026-05-03T12:21:59Z
Scenarios ended
Dispatched by
alphaonedev
Harness SHA
938af3d9de98
Workflow run
https://github.com/alphaonedev/ai-memory-a2a-v0.6.3.1/actions/runs/25278703880

Node roster

#RoleAgent IDPublic IPPrivate IP
1agentai:alice134.209.34.22810.11.0.3
2agentai:bob159.65.252.13810.11.0.4
3agentai:charlie142.93.71.6510.11.0.2
4memory-only104.248.0.22810.11.0.5

Baseline attestation BASELINE OK

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.4.0 — see authoritative baseline.

NodeAgentFrameworkAuthenticMCP ai-memoryxAI cfgxAI defaultAgent IDFederationUFW offiptablesdead-manF1 xAIF2a substrateF2b agent (non-gating)Config SHAPass
node-1ai:alicehermes Hermes Agent v0.12.0 (2026.4.30)12f99ec56116PASS
node-2ai:bobhermes Hermes Agent v0.12.0 (2026.4.30)889c475f0926PASS
node-3ai:charliehermes Hermes Agent v0.12.0 (2026.4.30)00622f57d71bPASS
a2a-baseline.json
{
	"baseline_pass": true,
	"per_node": [
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:alice",
			"node_index": "1",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.4:9077,http://10.11.0.2:9077,http://10.11.0.5:9077",
			"config_file_sha256": "12f99ec56116dbd03748777fabc1697dbcc89bd41e0a1470c0dae152987998de",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "0d0bbe7b-b49a-474f-a51c-1bc470919a9d",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "971cb8e2-f71c-47c7-99cc-94d11b797176",
				"agent_canary_response_head": "DONE ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.4:9077:OK,10.11.0.2:9077:OK,10.11.0.5:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "0d0bbe7b-b49a-474f-a51c-1bc470919a9d",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:bob",
			"node_index": "2",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.3:9077,http://10.11.0.2:9077,http://10.11.0.5:9077",
			"config_file_sha256": "889c475f0926686ffdc3827accbd19cba3d820a90c5ae908eb1ac33f71eb0098",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "124e9b95-dca3-4459-84fd-03cc56b0c219",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "f2a7b87f-8551-4b8e-a876-ecd5dd2f507b",
				"agent_canary_response_head": "DONE ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.3:9077:OK,10.11.0.2:9077:OK,10.11.0.5:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "124e9b95-dca3-4459-84fd-03cc56b0c219",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:charlie",
			"node_index": "3",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.3:9077,http://10.11.0.4:9077,http://10.11.0.5:9077",
			"config_file_sha256": "00622f57d71b6fd1a9018469eff1efa71f7ec334eda63f58017c4ce34744c35f",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "d9607e67-222c-45dc-b467-87f79d3b18ca",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "d8db96e7-97ac-4929-9a3e-7958b7601b57",
				"agent_canary_response_head": "I cannot use the specified tool as it is not available in my toolset. ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.3:9077:OK,10.11.0.4:9077:OK,10.11.0.5:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "d9607e67-222c-45dc-b467-87f79d3b18ca",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		}
	]
}

raw file

F3 — peer A2A via shared memory F3 OK

Workflow-level probe answering "can agents communicate through ai-memory?". Writer ai:alice posted canary UUID 11505e6d-3867-4662-a296-747260e62d6f to namespace _baseline_peer_canary via node-1's local ai-memory serve HTTP. After W=2 fanout settle, probe confirmed the canary on each of the 3 peer nodes via their local GET /api/v1/memories.

f3-peer-a2a.json
{
	"probe": "F3",
	"name": "peer-a2a-via-shared-memory",
	"description": "Writer agent posts a canary via local ai-memory HTTP on node-1; verifies the row propagates to the 3 peer nodes (W=2/N=4 quorum) before scenarios run.",
	"canary_uuid": "11505e6d-3867-4662-a296-747260e62d6f",
	"canary_namespace": "_baseline_peer_canary",
	"writer_agent": "ai:alice",
	"pass": true
}

raw file

Run focus

Campaign failed: no scenario reports recovered.

What this campaign tested: No scenarios were successfully exercised; 35 were requested covering various transport, framework, and primitive axes, but none reported.

What it demonstrated: The run demonstrated a failure in test harness reporting, proving nothing about AI memory federation reliability.

AI NHI analysis · Claude Opus 4.7

Campaign failed: no scenario reports recovered.

FAIL — no scenario reports recovered

For three audiences

Non-technical end users

This test run didn't produce any results because the system couldn't collect the reports. We have no new information on whether agents can reliably share memories with each other. It looks like a technical glitch in gathering the data.

C-level decision makers

This unsuccessful campaign provides zero insights, maintaining current risk posture with no advancement in production readiness. Customer-facing claims on memory sharing reliability remain unsupported by new evidence. No changes from prior runs; investigate harness stability before retrying.

Engineers & architects

Harness failed to recover any scenario reports, with scenario-1.json noted as unparseable and all others missing. Impacts all requested scenarios (S1, S1b, S2, S4-S6, S9-S18, S22-S25, S28-S42) across federation mesh topology. Probable root cause: CI workflow artifact collection or JSON parsing errors in harness_sha 938af3d9de9802766f884bd7752327379ede7600.

What changes going into the next campaign

Fix scenario report collection and JSON parsing in the CI harness before next run.

Tests performed in this run

Every scenario that produced a JSON report in this campaign, in testbook order. Click a row's scenario id to jump to its full report below. See the Every test performed page for the authoritative catalog.

IDTitleResultReason
S1Per-agent write + read (MCP stdio)?

Scenario 1 — Per-agent write + read (MCP stdio) UNKNOWN

scenario-1.json (report)

          

raw file

scenario-1.log (console trace)
phase A: each agent writes 10 memories via MCP
  ai:alice on 134.209.34.228
  ai:bob on 159.65.252.138

raw file

All artifacts