Campaign a2a-hermes-v3r19-mtls-release-v0.6.2 FAIL

Agent group: hermes (homogeneous)
ai-memory ref: release/v0.6.2
Completed at: 2026-04-23T00:08:23Z
Overall pass: false
Skipped reports: 0

Infrastructure

Provider: digitalocean
Region: nyc3
Droplet size: s-2vcpu-4gb
Topology: 4-node federation mesh (W=2/N=4)
Scenarios started:
Scenarios ended:
Dispatched by: alphaonedev
Harness SHA: 28875c272ce8
Workflow run: https://github.com/alphaonedev/ai-memory-ai2ai-gate/actions/runs/24808691315

Node roster

# | Role | Agent ID | Public IP | Private IP
1 | agent | ai:alice | 45.55.44.9 | 10.11.2.4
2 | agent | ai:bob | 167.71.168.132 | 10.11.2.5
3 | agent | ai:charlie | 167.71.242.41 | 10.11.2.2
4 | memory-only | | 167.172.236.26 | 10.11.2.3

Baseline attestation BASELINE VIOLATION

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.0.0 — see authoritative baseline.

Node | Agent | Framework | Authentic | MCP ai-memory | xAI cfg | xAI default | Agent ID | Federation | UFW off | iptables | dead-man | F1 xAI | F2a substrate | F2b agent (non-gating) | Config SHA | Pass
a2a-baseline.json
{
	"baseline_pass": false,
	"per_node": [],
	"failure_mode": "baseline-absent"
}

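The gating rule described above (no scenario runs until every agent node has attested) could be checked along the following lines. This is a minimal sketch, not the harness's actual implementation: the function name `baseline_allows_scenarios` and the `expected_nodes` parameter are hypothetical, and the count of 3 assumes that only the three agent nodes in the roster (not the memory-only node) are required to attest.

```python
import json
from typing import Tuple


def baseline_allows_scenarios(baseline: dict, expected_nodes: int) -> Tuple[bool, str]:
    """Return (allowed, reason) for a parsed a2a-baseline.json document.

    Hypothetical helper: field names come from the JSON shown in this report,
    but the gating logic itself is an assumption.
    """
    if not baseline.get("baseline_pass", False):
        mode = baseline.get("failure_mode", "unknown")
        return False, f"baseline_pass is false (failure_mode={mode})"
    per_node = baseline.get("per_node", [])
    if len(per_node) < expected_nodes:
        return False, f"only {len(per_node)} of {expected_nodes} attestations present"
    return True, "ok"


# The attestation recorded for this run, copied from a2a-baseline.json.
RUN_BASELINE = json.loads("""
{
    "baseline_pass": false,
    "per_node": [],
    "failure_mode": "baseline-absent"
}
""")

# Assumption: three agent nodes (ai:alice, ai:bob, ai:charlie) must attest.
allowed, reason = baseline_allows_scenarios(RUN_BASELINE, expected_nodes=3)
print(allowed, reason)
```

For this run's file the check refuses to proceed and reports the `baseline-absent` failure mode as the reason.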

Run focus

Campaign failed with no scenario reports recovered.

What this campaign tested: No scenarios were exercised. Because report recovery failed, coverage is zero across the transport, framework, and primitives axes.

What it demonstrated: The run demonstrated only that the testing harness failed completely. AI memory federation functionality could not be assessed.

AI NHI analysis · Claude Opus 4.7

FAIL — no scenarios executed or reported.

For three audiences

Non-technical end users

This test run produced no results because no scenario reports were collected. As a result, we could not determine whether the AI agents can reliably share memories with each other. The test setup must be fixed before a valid run is possible.

C-level decision makers

The campaign failed without producing any results. This exposes a significant reliability risk in the CI pipeline and blocks evaluation of production readiness. Customer-facing claims about agent memory sharing remain unvalidated until the harness issue is resolved. The run is a full regression: it produced nothing that builds on prior partial successes.

Engineers & architects

The run aborted with 'no scenario reports recovered' despite 35 scenarios being requested, leaving an empty scenarios array and overall_pass false. No specific S# or F# executed, so no primitive was assessed. The probable root cause is in the harness collection/upload process (harness_sha: 28875c272ce86d10b81466cec10ac4da7bf57b74). Infrastructure provisioned successfully (4-node mesh), but post-execution reporting failed entirely. The baseline attestation also recorded failure_mode baseline-absent with an empty per_node array, so scenarios may never have been permitted to start; the recovered data cannot distinguish the two cases.
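The mismatch described above (35 scenarios requested, empty scenarios array, overall_pass false) is the kind of condition a small consistency check on the campaign summary could surface explicitly. The sketch below is illustrative only: `diagnose_summary` and its `requested` parameter are hypothetical, and the summary is assumed to be a JSON object with the `scenarios` and `overall_pass` fields named in the analysis.

```python
from typing import List


def diagnose_summary(summary: dict, requested: int) -> List[str]:
    """List human-readable findings for a campaign summary.

    Hypothetical helper; the summary layout is assumed from the field
    names quoted in this report.
    """
    findings = []
    scenarios = summary.get("scenarios", [])
    if len(scenarios) < requested:
        findings.append(
            f"{len(scenarios)} of {requested} requested scenario reports recovered"
        )
    if summary.get("overall_pass") is not True:
        findings.append("overall_pass is not true")
    return findings


# The state this run ended in, as described in the analysis.
summary = {"overall_pass": False, "scenarios": []}
for finding in diagnose_summary(summary, requested=35):
    print(finding)
```

Reporting the recovered and requested counts separately makes an empty result visible as a collection failure instead of a silent FAIL.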

What changes going into the next campaign

Implement logging and error handling in the harness to diagnose and prevent report recovery failures.
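One possible shape for that change is sketched below, under stated assumptions: `collect_reports`, `ReportRecoveryError`, and the `fetch` callable are hypothetical names, and nothing here reflects the harness's real collection code. The idea is to log each node's outcome and to raise an error that carries the per-node causes, so that an empty result is never reported without a diagnosis.

```python
import logging
from typing import Callable, Dict, List

log = logging.getLogger("harness.collect")  # hypothetical logger name


class ReportRecoveryError(RuntimeError):
    """Raised when no scenario report could be recovered from any node."""


def collect_reports(nodes: List[str], fetch: Callable[[str], List[dict]]) -> List[dict]:
    """Fetch scenario reports from every node, logging each outcome.

    `fetch` stands in for whatever transport the harness uses
    (an assumption); it returns a node's reports or raises on failure.
    """
    reports: List[dict] = []
    errors: Dict[str, str] = {}
    for node in nodes:
        try:
            found = fetch(node)
        except Exception as exc:  # broad on purpose: record every failure cause
            log.error("report fetch failed on %s: %r", node, exc)
            errors[node] = repr(exc)
            continue
        log.info("recovered %d report(s) from %s", len(found), node)
        reports.extend(found)
    if not reports:
        # Fail loudly, with the per-node causes attached to the error.
        raise ReportRecoveryError(
            f"no scenario reports recovered; per-node errors: {errors}"
        )
    return reports
```

A caller would pass the agent IDs from the node roster as `nodes` and could catch `ReportRecoveryError` to attach the per-node errors to the campaign summary.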