Campaign a2a-hermes-v0.6.2-patch2-r23-mtls FAIL

Agent group
hermes (homogeneous)
ai-memory ref
release/v0.6.2
Completed at
2026-04-23T16:48:20Z
Overall pass
false
Skipped reports
0

Infrastructure

Provider
digitalocean
Region
nyc3
Droplet size
s-2vcpu-4gb
Topology
4-node federation mesh (W=2/N=4)
Scenarios started
Scenarios ended
Dispatched by
alphaonedev
Harness SHA
89ac1adba1de
Workflow run
https://github.com/alphaonedev/ai-memory-ai2ai-gate/actions/runs/24845949218

Node roster

# | Role | Agent ID | Public IP | Private IP
1 | agent | ai:alice | 138.197.21.12 | 10.11.2.4
2 | agent | ai:bob | 138.197.126.11 | 10.11.2.3
3 | agent | ai:charlie | 104.131.95.255 | 10.11.2.5
4 | memory-only | | 174.138.64.27 | 10.11.2.2

Baseline attestation BASELINE VIOLATION

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.4.0 — see authoritative baseline.

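Node | Agent | Framework | Authentic | MCP ai-memory | xAI cfg | xAI default | Agent ID | Federation | UFW off | iptables | dead-man | F1 xAI | F2a substrate | F2b agent (non-gating) | Config SHA | Pass
node-2 | ai:bob | hermes Hermes Agent v0.10.0 (2026.4.16) | true | true | true | true | true | true | true | true | true | true | true | false | 21635cf63640 | FAIL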
a2a-baseline.json
{
	"baseline_pass": false,
	"per_node": [
		{
			"spec_version": "1.4.0",
			"agent_type": "hermes",
			"agent_id": "ai:bob",
			"node_index": "2",
			"framework_version": "Hermes Agent v0.10.0 (2026.4.16)",
			"ai_memory_version": "v0.6.2",
			"peer_urls": "https://10.11.2.4:9077,https://10.11.2.5:9077,https://10.11.2.2:9077",
			"config_file_sha256": "21635cf6364057fd2a004d28aac89abf8438671d85f9fd2ed1e654d812d23ff1",
			"config_attestation": {
				"framework_is_authentic": true,
				"mcp_server_ai_memory_registered": true,
				"llm_backend_is_xai_grok": true,
				"llm_is_default_provider": true,
				"mcp_command_is_ai_memory": true,
				"agent_id_stamped": true,
				"federation_live": true,
				"ufw_disabled": true,
				"iptables_flushed": true,
				"dead_man_switch_scheduled": true
			},
			"negative_invariants": {
				"_description": "Alternative A2A channels must be OFF so a passing scenario is only passing via ai-memory shared memory. Any true here = thesis-preserving.",
				"a2a_protocol_off": true,
				"sub_agent_or_sessions_spawn_off": true,
				"alternative_channels_off": true,
				"tool_allowlist_is_memory_only": true,
				"a2a_gate_profile_locked": true
			},
			"functional_probes": {
				"xai_grok_chat_reachable": true,
				"xai_grok_sample_reply": "READY",
				"substrate_http_canary_f2a": true,
				"substrate_http_canary_uuid": "13ab798a-1777-4e7a-8a85-343d22e22c66",
				"agent_mcp_canary_f2b": false,
				"agent_mcp_canary_uuid": "3d744d16-1c2c-48e9-a50e-35d7853cacfc",
				"agent_canary_response_head": "Traceback (most recent call last):   File \"/usr/local/bin/hermes\", line 11, in <module>     main()   File \"/root/.hermes/hermes-agent/hermes_cli/main.py\", line 8859, in main     args.func(args)   File \"/root/.hermes/hermes-agent/hermes_cli/main.py\", line 1159, in cmd_chat     from cli import main as cli_main   File \"/root/.hermes/hermes-agent/cli.py\", line 43, in <module>     from prompt_toolkit.history import FileHistory ModuleNotFoundError: No module named 'prompt_toolkit' ",
				"_f2b_note": "F2b is LLM-dependent and non-blocking. F2a (deterministic HTTP substrate) gates baseline_pass.",
				"mesh_connectivity_f4": false,
				"mesh_edges_ok": 1,
				"mesh_edges_total": 3,
				"mesh_edges_detail": "10.11.2.4:9077:FAIL(health=false,sync=false),10.11.2.5:9077:FAIL(health=false,sync=false),10.11.2.2:9077:OK",
				"_f4_note": "F4 verifies this local nodes N-1 OUTBOUND mesh edges to every peer via both GET health and POST sync_push dry_run. Aggregator ANDs across N nodes to confirm full N*(N-1) bidirectional reachability. Gates baseline_pass.",
				"ai_memory_mcp_stdio_f5": true,
				"ai_memory_mcp_stdio_init_ok": true,
				"ai_memory_mcp_stdio_tools_ok": true,
				"ai_memory_mcp_stdio_tools_found": "memory_agent_list,memory_agent_register,memory_archive_list,memory_archive_purge,memory_archive_restore,memory_archive_stats,memory_auto_tag,memory_capabilities,memory_consolidate,memory_delete,memory_detect_contradiction,memory_expand_query,memory_forget,memory_gc,memory_get,memory_get_links,memory_inbox,memory_link,memory_list,memory_list_subscriptions,memory_namespace_clear_standard,memory_namespace_get_standard,memory_namespace_set_standard,memory_notify,memory_pending_approve,memory_pending_list,memory_pending_reject,memory_promote,memory_recall,memory_search,memory_session_start,memory_stats,memory_store,memory_subscribe,memory_unsubscribe,memory_update",
				"_f5_note": "F5 spawns the ai-memory stdio MCP subprocess using the framework-configured invocation and verifies initialize + tools/list return memory_store, memory_recall, memory_list. Deterministic (no LLM). Gates baseline_pass.",
				"tls_mode": "mtls",
				"tls_handshake_f6": true,
				"tls_handshake_f6_reason": "",
				"mtls_enforcement_f7": true,
				"mtls_enforcement_f7_reason": "",
				"_f6_f7_note": "F6 verifies the TLS 1.3 handshake against the local serve + CA chain. F7 verifies mTLS enforcement — anonymous client rejected, whitelisted client accepted. Both gate baseline_pass when tls_mode != off / mtls respectively.",
				"agent_mcp_ai_memory_canary": true,
				"canary_uuid": "13ab798a-1777-4e7a-8a85-343d22e22c66",
				"canary_namespace": "_baseline_canary_f2a"
			},
			"baseline_pass": false
		}
	],
	"failure_mode": "baseline-violation"
}
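
The notes embedded in the attestation say which probes gate baseline_pass (F2a, F4, F5, plus F6 and F7 depending on tls_mode) and that F2b does not. The following is a minimal sketch of how that gating could be evaluated against a trimmed excerpt of the node-2 entry. The function names and the exact gating set are assumptions made for illustration; the real harness may also gate on config_attestation or negative_invariants.

```python
import re

# Trimmed excerpt of the node-2 entry from a2a-baseline.json.
node = {
    "agent_id": "ai:bob",
    "functional_probes": {
        "substrate_http_canary_f2a": True,
        "agent_mcp_canary_f2b": False,  # non-gating per _f2b_note
        "mesh_connectivity_f4": False,
        "mesh_edges_detail": (
            "10.11.2.4:9077:FAIL(health=false,sync=false),"
            "10.11.2.5:9077:FAIL(health=false,sync=false),"
            "10.11.2.2:9077:OK"
        ),
        "ai_memory_mcp_stdio_f5": True,
        "tls_mode": "mtls",
        "tls_handshake_f6": True,
        "mtls_enforcement_f7": True,
    },
}

# ASSUMPTION: only probes whose notes say "Gates baseline_pass" are listed here.
ALWAYS_GATING = [
    "substrate_http_canary_f2a",
    "mesh_connectivity_f4",
    "ai_memory_mcp_stdio_f5",
]


def failing_gates(node):
    """Return the gating probe keys that are not true for one node."""
    probes = node["functional_probes"]
    gates = list(ALWAYS_GATING)
    tls_mode = probes.get("tls_mode", "off")
    if tls_mode != "off":   # F6 gates whenever TLS is enabled
        gates.append("tls_handshake_f6")
    if tls_mode == "mtls":  # F7 gates only in mTLS mode
        gates.append("mtls_enforcement_f7")
    return [g for g in gates if probes.get(g) is not True]


def failed_edges(detail):
    """Extract the peers marked FAIL from a mesh_edges_detail string."""
    pairs = re.findall(r"(\d+(?:\.\d+){3}:\d+):(OK|FAIL)", detail)
    return [peer for peer, status in pairs if status == "FAIL"]


print(failing_gates(node))
print(failed_edges(node["functional_probes"]["mesh_edges_detail"]))
```

Under these assumptions, the only failing gate for node-2 is mesh_connectivity_f4, and the failed outbound edges are the two agent peers, 10.11.2.4:9077 and 10.11.2.5:9077.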

Run focus

Campaign failed: no scenario reports recovered.

What this campaign tested: The campaign attempted to exercise 35 scenarios covering transport (mTLS), framework (federation mesh), and primitives such as memory sharing, but no scenario reports were recovered.

What it demonstrated: No scenario results were collected or reported. The only recorded evidence is the baseline attestation, which failed with failure_mode baseline-violation, so the run says nothing about the scenarios themselves.

AI NHI analysis · Claude Opus 4.7

FAIL — no scenario reports recovered

For three audiences

Non-technical end users

This test run didn't work because no results from any of the planned checks were collected. We couldn't determine if agents can reliably share memories with each other. The problem seems to be in how the tests were set up or run.

C-level decision makers

High risk: the campaign failed completely because no reports were recovered, which blocks validation of v0.6.2 under mTLS for production. Customer-facing claims about reliable AI memory federation remain unproven. This represents a CI reliability regression compared to prior runs.

Engineers & architects

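No per-scenario reports were recovered (harness_sha: 89ac1adba1de2a6909ed00d21d62f2c9d11ff051). All 35 requested scenarios (S1, S1b, S2, S4-S6, S9-S18, S22-S25, S28-S42) were effectively skipped. A CI workflow issue in retrieving outputs from the 4-node federation mesh is one possible cause, but the baseline attestation points to a more direct one: failure_mode is baseline-violation, and node-2 (ai:bob) failed the gating F4 mesh-connectivity probe with only 1 of 3 outbound edges OK (10.11.2.4:9077 and 10.11.2.5:9077 failed both health and sync; 10.11.2.2:9077 passed). Because the baseline spec permits no scenario to run before attestation, the scenarios most likely never started. The non-gating F2b probe also failed, with ModuleNotFoundError: No module named 'prompt_toolkit' raised by the hermes CLI. No specific primitives or scenario-level failure modes are observable.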
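
As a quick check on the scenario count, the range notation in the scenario list above can be expanded mechanically. This is a small sketch; the helper name expand_scenarios is hypothetical and is not part of the harness.

```python
import re


def expand_scenarios(spec):
    """Expand a list such as 'S1, S1b, S4-S6' into individual scenario IDs."""
    ids = []
    for token in (t.strip() for t in spec.split(",")):
        match = re.fullmatch(r"S(\d+)-S(\d+)", token)
        if match:
            # Inclusive range, e.g. S4-S6 -> S4, S5, S6
            low, high = int(match.group(1)), int(match.group(2))
            ids.extend(f"S{n}" for n in range(low, high + 1))
        else:
            # Single ID, possibly with a suffix such as S1b
            ids.append(token)
    return ids


requested = expand_scenarios("S1, S1b, S2, S4-S6, S9-S18, S22-S25, S28-S42")
print(len(requested))  # 35
```

The expansion yields 35 IDs, which matches the number of requested scenarios stated in the analysis.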

What changes going into the next campaign

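Before re-running, fix the report recovery and artifact upload logic in the CI workflow, and clear the baseline violation: node-2's outbound mesh edges to 10.11.2.4:9077 and 10.11.2.5:9077 must pass F4. The missing prompt_toolkit module behind the non-gating F2b failure is also worth installing.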

All artifacts