Campaign a2a-hermes-v0.6.3.1-r14 FAIL

Agent group: hermes (homogeneous)
ai-memory ref: v0.6.3.1
Completed at: 2026-05-03T17:40:11Z
Overall pass: false
Skipped reports: 0

Infrastructure

Provider: digitalocean
Region: nyc3
Droplet size: -
Topology: 4-node federation mesh (W=2/N=4)
Scenarios started: -
Scenarios ended: -
Dispatched by: alphaonedev
Harness SHA: f629ca836d1c
Workflow run: https://github.com/alphaonedev/ai-memory-a2a-v0.6.3.1/actions/runs/25285878847

Node roster

# | Role | Agent ID | Public IP | Private IP
1 | agent | ai:alice | 138.197.42.116 | 10.11.0.3
2 | agent | ai:bob | 167.71.184.94 | 10.11.0.5
3 | agent | ai:charlie | 104.131.19.62 | 10.11.0.4
4 | memory-only | - | 104.236.72.21 | 10.11.0.2
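
The roster lines up with the peer_urls values recorded in a2a-baseline.json: each node's peers appear to be every other node's private IP on port 9077, in roster order. A minimal sketch of that derivation follows, assuming this reading is correct. The names ROSTER, peer_urls and directed_edges are hypothetical and not part of the harness.

```python
# Hypothetical helper, not harness code. Private IPs are copied from the
# node roster; port 9077 is taken from peer_urls in a2a-baseline.json.
ROSTER = [
    ("ai:alice", "10.11.0.3"),
    ("ai:bob", "10.11.0.5"),
    ("ai:charlie", "10.11.0.4"),
    ("memory-only", "10.11.0.2"),
]
PORT = 9077


def peer_urls(self_ip, roster=ROSTER, port=PORT):
    """Return the URL of every roster node except the one at self_ip."""
    return [f"http://{ip}:{port}" for _, ip in roster if ip != self_ip]


def directed_edges(roster=ROSTER):
    """Count the N*(N-1) directed mesh edges for a roster of N nodes."""
    n = len(roster)
    return n * (n - 1)


if __name__ == "__main__":
    for name, ip in ROSTER:
        print(name, ",".join(peer_urls(ip)))
    print("directed edges:", directed_edges())
```

With four nodes, each node has three outbound edges, which matches the mesh_edges_total value of 3 reported per node in the baseline file.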

Baseline attestation BASELINE VIOLATION

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.4.0 — see authoritative baseline.

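Node | Agent | Framework | Authentic | MCP ai-memory | xAI cfg | xAI default | Agent ID | Federation | UFW off | iptables | dead-man | F1 xAI | F2a substrate | F2b agent (non-gating) | Config SHA | Pass
node-1 | ai:alice | hermes Hermes Agent v0.12.0 (2026.4.30) | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | yes | no | 12f99ec56116 | FAIL
node-2 | ai:bob | hermes Hermes Agent v0.12.0 (2026.4.30) | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | yes | no | 889c475f0926 | FAIL
node-3 | ai:charlie | hermes Hermes Agent v0.12.0 (2026.4.30) | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | yes | no | 00622f57d71b | FAIL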
a2a-baseline.json
{
	"baseline_pass": false,
	"per_node": [
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:alice",
			"node_index": "1",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.5:9077,http://10.11.0.4:9077,http://10.11.0.2:9077",
			"config_file_sha256": "12f99ec56116dbd03748777fabc1697dbcc89bd41e0a1470c0dae152987998de",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": false,
				"xai_grok_sample_reply": "",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "a49d2cd5-9df7-4313-b702-afa5ea3e9594",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "fe4cd27b-8df1-45d3-8a47-c59c27fba381",
				"agent_canary_response_head": "API call failed after 3 retries: HTTP 429: Error code: 429 - {'code': 'Some resource has been exhausted', 'error': 'Your team 406ee526-41d2-4b73-bf58-d35c12860ed4 has either used all available credits or reached its monthly spending limit. To continue making API requests, please purchase more credits or raise your spending limit.'} ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.5:9077:OK,10.11.0.4:9077:OK,10.11.0.2:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "a49d2cd5-9df7-4313-b702-afa5ea3e9594",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": false
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:bob",
			"node_index": "2",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.3:9077,http://10.11.0.4:9077,http://10.11.0.2:9077",
			"config_file_sha256": "889c475f0926686ffdc3827accbd19cba3d820a90c5ae908eb1ac33f71eb0098",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": false,
				"xai_grok_sample_reply": "",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "54c9c4a0-8154-49a1-a94a-cc38a513598f",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "6d95497e-38dd-4b05-bdfd-773d0a8157b7",
				"agent_canary_response_head": "API call failed after 3 retries: HTTP 429: Error code: 429 - {'code': 'Some resource has been exhausted', 'error': 'Your team 406ee526-41d2-4b73-bf58-d35c12860ed4 has either used all available credits or reached its monthly spending limit. To continue making API requests, please purchase more credits or raise your spending limit.'} ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.3:9077:OK,10.11.0.4:9077:OK,10.11.0.2:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "54c9c4a0-8154-49a1-a94a-cc38a513598f",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": false
		},
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:charlie",
			"node_index": "3",
			"framework_version": "Hermes Agent v0.12.0 (2026.4.30)",
			"ai_memory_version": "0.6.3.1",
			"peer_urls": "http://10.11.0.3:9077,http://10.11.0.5:9077,http://10.11.0.2:9077",
			"config_file_sha256": "00622f57d71b6fd1a9018469eff1efa71f7ec334eda63f58017c4ce34744c35f",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": false,
				"xai_grok_sample_reply": "",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "e392ce51-c026-4160-9a5b-c41ad58d5516",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "56909404-bf6d-4f19-ba7c-3c804a6ddd9a",
				"agent_canary_response_head": "API call failed after 3 retries: HTTP 429: Error code: 429 - {'code': 'Some resource has been exhausted', 'error': 'Your team 406ee526-41d2-4b73-bf58-d35c12860ed4 has either used all available credits or reached its monthly spending limit. To continue making API requests, please purchase more credits or raise your spending limit.'} ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"hermes_peer_a2a_repro_f3b": false,
				"hermes_peer_a2a_repro_uuid": "",
				"_f3b_note": "F3b is hermes-only and observed (non-blocking). Asserts the agent-driven mcp_memory_memory_store path actually lands a row through the hermes_cli tool dispatcher, distinct from the workflow-level F3 which probes substrate-only HTTP federation. False here while F2b is true => allowlist filter regression in hermes_cli/mcp_tools.py.",
				"mesh_connectivity_f4": true,
				"mesh_edges_ok": 3,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.0.3:9077:OK,10.11.0.5:9077:OK,10.11.0.2:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_check_duplicate,memory_consolidate,memory_delete,memory_detect_contradiction,memory_entity_get_by_alias,memory_entity_register,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_get_taxonomy,memory_inbox,memory_kg_invalidate,memory_kg_query,memory_kg_timeline,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "off",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"embedder_loaded_f8": true,
				"embedder_loaded_f8_reason": "",
				"_f8_note": "F8 verifies /api/v1/capabilities reports features.embedder_loaded=true — i.e. the MiniLM embedder initialised at serve startup. Gates baseline_pass unconditionally. Without this, scenario-18 silently black-holes (semantic recall returns 0 rows).",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "e392ce51-c026-4160-9a5b-c41ad58d5516",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": false
		}
	],
	"failure_mode": "baseline-violation"
}
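
To see which checks are false on each node without scanning the whole file, something like the following could be used. It is a sketch under stated assumptions, not the harness aggregator: the gating and non-blocking sets are read off the _f*_note strings in the file, the F6/F7 rule is one reading of _f6_f7_note, and every function and constant name is hypothetical. The file carries no note for xai_grok_chat_reachable, so that key is reported as unclassified.

```python
import json

# Classification read from the _f*_note strings in a2a-baseline.json.
# Assumption: keys without a note are reported as "unclassified".
GATING = {
    "substrate_http_canary_f2a",
    "mesh_connectivity_f4",
    "ai_memory_mcp_stdio_f5",
    "embedder_loaded_f8",
}
NON_BLOCKING = {"agent_mcp_canary_f2b", "hermes_peer_a2a_repro_f3b"}
SECTIONS = ("config_attestation", "negative_invariants", "functional_probes")


def classify(key, tls_mode):
    """Label one check according to the notes in the baseline file."""
    if key in GATING:
        return "gating"
    if key in NON_BLOCKING:
        return "non-blocking"
    if key == "tls_handshake_f6":
        return "gating" if tls_mode != "off" else "not gating (tls off)"
    if key == "mtls_enforcement_f7":
        return "gating" if tls_mode == "mtls" else "not gating (no mtls)"
    return "unclassified"


def failed_checks(baseline):
    """Map agent_id to a list of (section, key, label) for each False value."""
    report = {}
    for node in baseline["per_node"]:
        tls_mode = node.get("functional_probes", {}).get("tls_mode", "off")
        failures = []
        for section in SECTIONS:
            for key, value in node.get(section, {}).items():
                if value is False:  # skip strings, counts and True values
                    failures.append((section, key, classify(key, tls_mode)))
        report[node["agent_id"]] = failures
    return report


def load_baseline(path):
    """Read a baseline JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling failed_checks(load_baseline("a2a-baseline.json")) on the file above would list the same three keys for each node: xai_grok_chat_reachable (unclassified), agent_mcp_canary_f2b and hermes_peer_a2a_repro_f3b (both non-blocking). None of the probes that the notes mark as gating is false in this run.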

Run focus

0/0 scenarios passed; 0 failed, 0 skipped

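What this campaign tested: Campaign a2a-hermes-v0.6.3.1-r14 ran 0 testbook scenarios across the transport, primitives, and cross-cutting axes. The baseline attestation failed (failure_mode "baseline-violation"), and the baseline spec requires attestation before any scenario is permitted to run.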

What it demonstrated: Direct counts: pass=0, fail=0, skip=0.

AI NHI analysis · Claude Opus 4.7

PARTIAL — auto-generated (LLM unavailable: xAI HTTP 429: {"code":"Some resource has been exhausted","error":"Your team 406ee526-41d2-4b73-bf58-d35c12860ed4 has eit)

For three audiences

Non-technical end users

This run exercised 0 tests of AI-agent-to-AI-agent communication through ai-memory: 0 passed, 0 failed, and 0 were intentionally skipped for unmet prerequisites.

C-level decision makers

Run verdict: partial. Detailed narrative synthesis is unavailable (LLM call failed: xAI HTTP 429: {"code":"Some resource has been exhausted","error":"Your team 406ee526-41d2-4b73-bf58-d35c12860ed4 has eit). Counts are reliable; consult the per-scenario PASS/FAIL results and the testbook for primitive-level mapping.

Engineers & architects

Scenario outcomes: pass=0 fail=0 skip=0 of 0 total. First failure reasons are persisted on each scenario-N.json. LLM narrative unavailable: xAI HTTP 429: {"code":"Some resource has been exhausted","error":"Your team 406ee526-41d2-4b73-bf58-d35c12860ed4 has eit.

What changes going into the next campaign

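Investigate the LLM-call failure before re-running. The HTTP 429 body recorded in the baseline reports that the xAI team has used all available credits or reached its monthly spending limit, so verifying XAI_API_KEY alone is unlikely to clear it; the credits or spending limit need attention as well. Counts are unaffected.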
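
When triaging the LLM-call failure, it can help to tell a quota-exhaustion 429 from a transient rate limit, since only the latter is worth retrying. The sketch below does this by string matching on the error text quoted in agent_canary_response_head. It is illustrative only: the function name, the labels and the hint phrases are assumptions, and the phrases are simply copied from this run's error message rather than from any xAI documentation.

```python
import re

# Assumption: these phrases are taken from the error text in this report,
# not from a documented xAI error contract.
QUOTA_HINTS = ("used all available credits", "monthly spending limit")


def classify_llm_error(message):
    """Return 'quota-exhausted', 'rate-limited', or 'other' for an error string."""
    if re.search(r"\b429\b", message) is None:
        return "other"
    lowered = message.lower()
    if any(hint in lowered for hint in QUOTA_HINTS):
        return "quota-exhausted"
    return "rate-limited"


if __name__ == "__main__":
    head = (
        "API call failed after 3 retries: HTTP 429: Error code: 429 - "
        "{'code': 'Some resource has been exhausted', 'error': 'Your team "
        "has either used all available credits or reached its monthly "
        "spending limit.'}"
    )
    print(classify_llm_error(head))
```

The truncated error strings in the analysis section end before the hint phrases, so this check is only meaningful against the full message stored in the baseline file.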

All artifacts