Campaign a2a-hermes-v0.6.3.1-r10: FAIL

Agent group: hermes (homogeneous)
ai-memory ref: v0.6.3.1
Completed at: 2026-05-03T15:38:23Z
Overall pass: false
Skipped reports: 0

Infrastructure

Provider: digitalocean
Region: nyc3
Droplet size: (not reported)
Topology: 4-node federation mesh (W=2/N=4)
Scenarios started: (not reported)
Scenarios ended: (not reported)
Dispatched by: alphaonedev
Harness SHA: 526e48e8cd22
Workflow run: https://github.com/alphaonedev/ai-memory-a2a-v0.6.3.1/actions/runs/25283101096

Node roster

# | Role | Agent ID | Public IP | Private IP
1 | agent | ai:alice | 165.22.32.243 | 10.11.0.4
2 | agent | ai:bob | 104.131.38.251 | 10.11.0.3
3 | agent | ai:charlie | 64.225.8.45 | 10.11.0.2
4 | memory-only | | 159.65.176.61 | 10.11.0.5
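
As a cross-check on the roster, each node's federation peer list should be the private IPs of the other three nodes. The sketch below is illustrative only and is not part of the harness: the port (9077) and the roster-order listing are assumptions inferred from the peer_urls values in the attestation on this page, and the function names are hypothetical.

```python
# Private IPs in roster order (nodes 1-4), copied from the node roster.
ROSTER = ["10.11.0.4", "10.11.0.3", "10.11.0.2", "10.11.0.5"]
PORT = 9077  # assumption: taken from the peer_urls values in a2a-baseline.json


def peer_urls(self_ip, roster=ROSTER, port=PORT):
    """Return comma-separated peer URLs: every roster entry except the node itself."""
    return ",".join(f"http://{ip}:{port}" for ip in roster if ip != self_ip)


def directed_edges(roster=ROSTER):
    """Count directed mesh edges, N*(N-1) for an N-node full mesh."""
    n = len(roster)
    return n * (n - 1)


if __name__ == "__main__":
    for ip in ROSTER:
        print(ip, "->", peer_urls(ip))
    print("directed edges:", directed_edges())
```

For this 4-node mesh that gives 12 directed edges, of which each attesting node checks its own 3 outbound ones (reported as mesh_edges_total).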

Baseline attestation: BASELINE VIOLATION

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.4.0 — see authoritative baseline.

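Node | Agent | Framework | Authentic | MCP ai-memory | xAI cfg | xAI default | Agent ID | Federation | UFW off | iptables | dead-man | F1 xAI | F2a substrate | F2b agent (non-gating) | Config SHA | Pass
node-1 | ai:alice | hermes Hermes Agent v0.12.0 (2026.4.30) | true | true | true | true | true | true | true | true | true | false | true | false | 12f99ec56116 | FAIL
node-2 | ai:bob | hermes Hermes Agent v0.12.0 (2026.4.30) | true | true | true | true | true | true | true | true | true | true | true | false | 889c475f0926 | PASS
node-3 | ai:charlie | hermes Hermes Agent v0.12.0 (2026.4.30) | true | true | true | true | true | true | true | true | true | true | true | false | 00622f57d71b | PASS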
a2a-baseline.json
{
	"baseline_pass": false,
	"per_node": [
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:alice",
			"node_index": "1",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.3:9077,http://10.11.0.2:9077,http://10.11.0.5:9077",
			"config_file_sha256": "12f99ec56116dbd03748777fabc1697dbcc89bd41e0a1470c0dae152987998de",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": false,
				"xai_grok_sample_reply": "",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "253ba52b-4040-47f6-8362-d06b29e5f1b2",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "a89943f4-a0f2-4e58-978b-26ca010e33b2",
				"agent_canary_response_head": "DONE ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.3:9077:OK,10.11.0.2:9077:OK,10.11.0.5:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "253ba52b-4040-47f6-8362-d06b29e5f1b2",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": false
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:bob",
			"node_index": "2",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.4:9077,http://10.11.0.2:9077,http://10.11.0.5:9077",
			"config_file_sha256": "889c475f0926686ffdc3827accbd19cba3d820a90c5ae908eb1ac33f71eb0098",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "0cac60f8-3814-42e0-b119-f9c21187d7e6",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "9fad1136-6c0c-4369-b321-ed4f2c9f88a7",
				"agent_canary_response_head": "",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.4:9077:OK,10.11.0.2:9077:OK,10.11.0.5:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "0cac60f8-3814-42e0-b119-f9c21187d7e6",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:charlie",
			"node_index": "3",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.4:9077,http://10.11.0.3:9077,http://10.11.0.5:9077",
			"config_file_sha256": "00622f57d71b6fd1a9018469eff1efa71f7ec334eda63f58017c4ce34744c35f",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "640aeeda-2597-48f0-aef3-4f88c96fe36e",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "faf8d4f5-81ab-4eae-a3a1-efe960a34ee3",
				"agent_canary_response_head": "DONE ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.4:9077:OK,10.11.0.3:9077:OK,10.11.0.5:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "640aeeda-2597-48f0-aef3-4f88c96fe36e",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": true
		}
	],
	"failure_mode": "baseline-violation"
}
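
To see why baseline_pass is false without reading the whole file, the per-node records can be compared probe by probe. The sketch below is an illustration under stated assumptions, not the harness's aggregator: the function names are hypothetical, only boolean probe fields are compared, and the data is a trimmed copy of the records above.

```python
def failing_nodes(baseline):
    """Return agent IDs whose per-node baseline_pass is false."""
    return [n["agent_id"] for n in baseline["per_node"] if not n["baseline_pass"]]


def probes_unique_to_failures(baseline):
    """For each failing agent, list probes that are false on it but true on every passing node.

    If no node passes, every false probe of a failing node is listed.
    """
    passing = [n for n in baseline["per_node"] if n["baseline_pass"]]
    result = {}
    for node in baseline["per_node"]:
        if node["baseline_pass"]:
            continue
        result[node["agent_id"]] = sorted(
            key
            for key, value in node["functional_probes"].items()
            if value is False
            and all(p["functional_probes"].get(key) is True for p in passing)
        )
    return result


def _probes(f1_reachable):
    """Trimmed boolean probes; only F1 differs between nodes in this run."""
    return {
        "xai_grok_chat_reachable": f1_reachable,
        "substrate_http_canary_f2a": True,
        "agent_mcp_canary_f2b": False,
        "hermes_peer_a2a_repro_f3b": False,
        "mesh_connectivity_f4": True,
        "ai_memory_mcp_stdio_f5": True,
        "embedder_loaded_f8": True,
    }


# Trimmed copy of the three per-node records from a2a-baseline.json.
BASELINE = {
    "per_node": [
        {"agent_id": "ai:alice", "baseline_pass": False, "functional_probes": _probes(False)},
        {"agent_id": "ai:bob", "baseline_pass": True, "functional_probes": _probes(True)},
        {"agent_id": "ai:charlie", "baseline_pass": True, "functional_probes": _probes(True)},
    ]
}

print(failing_nodes(BASELINE))              # ['ai:alice']
print(probes_unique_to_failures(BASELINE))  # {'ai:alice': ['xai_grok_chat_reachable']}
```

F2b and F3b are false on all three nodes, so they drop out of the comparison; the only boolean probe that separates ai:alice from the passing nodes is xai_grok_chat_reachable.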

Run focus

Campaign failed: no scenario reports recovered.

What this campaign tested: Nothing. 35 scenarios covering transport, framework, and primitive axes were requested, but no scenario reports were recovered, so none can be counted as exercised.

What it demonstrated: Nothing about ai-memory functionality. The only result is that no scenario reports were recovered.

AI NHI analysis · Claude Opus 4.7

FAIL — no scenarios executed

For three audiences

Non-technical end users

The test run didn't work at all because no results were collected from any of the planned checks. This means we have no new information on whether agents can reliably share memories with each other. Previous tests' findings remain unchanged.

C-level decision makers

This campaign run failed entirely because no scenario reports were recovered, so the changes in v0.6.3.1 remain unverified and carry added risk. Production readiness is stalled, and no customer-facing claims can be updated from this run. Findings from prior successful runs stand unchanged.

Engineers & architects

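Failure mode: no scenario reports were recovered for any of the 35 requested scenarios (S1, S1b, S2, S4-S6, S9-S18, S22-S25, S28-S42); overall_pass is false with reason 'no scenario reports recovered'. No primitive was tested. The baseline attestation records failure_mode 'baseline-violation': node-1 (ai:alice) reports baseline_pass false, and the only boolean probe on which it differs from both passing nodes is xai_grok_chat_reachable (false, with an empty sample reply). Because the baseline gates scenario execution, this violation may be why no reports exist; a harness or CI workflow failure (harness_sha 526e48e8cd221aa862e484799cd5469f87dba0e5) remains a possible cause. No testbook or probe identifiers are available because no reports were recovered.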
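
The count of 35 can be checked by expanding the scenario ranges listed above. This is a minimal sketch; the helper name and the range syntax it parses are assumptions based only on how the list is written here.

```python
import re


def expand_scenarios(spec):
    """Expand a list such as 'S1, S1b, S4-S6' into individual scenario IDs."""
    ids = []
    for token in (t.strip() for t in spec.split(",")):
        match = re.fullmatch(r"S(\d+)-S(\d+)", token)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            ids.extend(f"S{i}" for i in range(start, end + 1))
        else:
            ids.append(token)  # single ID, including suffixed ones such as S1b
    return ids


REQUESTED = "S1, S1b, S2, S4-S6, S9-S18, S22-S25, S28-S42"

scenarios = expand_scenarios(REQUESTED)
print(len(scenarios))  # 35
```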

What changes going into the next campaign

Fix the scenario report recovery mechanism in the CI harness so that reports are captured and aggregated.
