Campaign a2a-ironclaw-v3r15-mtls-develop FAIL

Agent group: ironclaw (homogeneous)
ai-memory ref: develop
Completed at: 2026-04-22T19:34:02Z
Overall pass: false
Skipped reports: 0

Infrastructure

Provider: digitalocean
Region: nyc3
Droplet size: s-2vcpu-4gb
Topology: 4-node federation mesh (W=2/N=4)
Scenarios started
Scenarios ended
Dispatched by: alphaonedev
Harness SHA: 006cdf4a787b
Workflow run: https://github.com/alphaonedev/ai-memory-ai2ai-gate/actions/runs/24797477055

Node roster

#	Role	Agent ID	Public IP	Private IP
1	agent	`ai:alice`	`45.55.205.110`	`10.10.2.4`
2	agent	`ai:bob`	`159.203.164.89`	`10.10.2.3`
3	agent	`ai:charlie`	`104.131.92.7`	`10.10.2.5`
4	memory-only	`—`	`104.236.255.13`	`10.10.2.2`

Baseline attestation BASELINE VIOLATION

Per the authoritative baseline spec, every agent node must emit a self-attestation before any scenario is permitted to run. This run's attestation:

Spec version: 1.0.0 (see the authoritative baseline spec).

Node	Agent	Framework	Authentic	MCP ai-memory	xAI cfg	xAI default	Agent ID	Federation	UFW off	iptables	dead-man	F1 xAI	F2a substrate	F2b agent (non-gating)	Config SHA	Pass

a2a-baseline.json

{
	"baseline_pass": false,
	"per_node": [],
	"failure_mode": "baseline-absent"
}
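The gating rule described above (no scenario runs until the baseline attestation passes) can be sketched as follows. This is a minimal illustration, not the harness's actual logic: the function name `scenarios_may_run` and the `agent_id` field inside `per_node` entries are assumptions, since this run's `per_node` array is empty and its schema is not shown here.

```python
import json


def scenarios_may_run(baseline: dict, expected_agents: set) -> tuple:
    """Return (allowed, reason) for a parsed a2a-baseline.json document."""
    # A failed or absent baseline blocks every scenario.
    if not baseline.get("baseline_pass", False):
        return False, baseline.get("failure_mode") or "baseline_pass is false"
    # Assumption: each per_node entry carries an "agent_id" field.
    attested = {node.get("agent_id") for node in baseline.get("per_node", [])}
    missing = expected_agents - attested
    if missing:
        return False, "missing attestation: " + ", ".join(sorted(missing))
    return True, "ok"


# The attestation recorded for this run, copied from a2a-baseline.json.
baseline = json.loads("""{
    "baseline_pass": false,
    "per_node": [],
    "failure_mode": "baseline-absent"
}""")
allowed, reason = scenarios_may_run(baseline, {"ai:alice", "ai:bob", "ai:charlie"})
print(allowed, reason)  # False baseline-absent
```

Under these assumptions, this run's attestation blocks all scenarios with the reason `baseline-absent`.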

Run focus

Campaign failed with no scenario reports recovered.

What this campaign tested: The run was meant to exercise 35 scenarios, ranging from basic to advanced AI-to-AI memory sharing, across mTLS transport, various frameworks, and federation primitives. No scenario reports were recovered.

What it demonstrated: Nothing about memory sharing itself. With no reports recovered and the baseline attestation absent, the run provides no evidence for or against memory-sharing reliability or functionality. It shows only that the test run failed to produce results.

AI NHI analysis · Claude Opus 4.7

FAIL — no scenario reports recovered

For three audiences

Non-technical end users

The test that checks whether AI agents can reliably share memories with each other produced no results. Because nothing was collected, this run says nothing either way about whether agents can recall each other's memories. The system remains unproven by this run.

C-level decision makers

This run is high risk: no test outcomes were collected, so production readiness is unverified. No customer-facing claims about reliable AI memory sharing can be supported by this run. Compared to prior runs, this represents a regression in testing infrastructure stability.

Engineers & architects

The run failed entirely: no scenario reports were recovered for any of the 35 requested scenarios (1,1b,2,4-6,9-18,22-25,28-42) on the mTLS federation mesh. The probable root cause is a bug in the CI harness artifact collection (harness_sha: 006cdf4a787bfec7cfc5007fd40ae990e22e5860). The baseline attestation was also absent (failure_mode: baseline-absent, with an empty per_node list). No primitives were tested; overall_pass is false even though the skipped-report count is 0.
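As a quick cross-check of the scenario count, the requested-scenario list can be expanded programmatically. The helper name `expand_scenarios` is hypothetical and not part of the harness; the sketch assumes that a hyphen always denotes an inclusive numeric range and that suffixed IDs such as `1b` appear only as single items.

```python
def expand_scenarios(spec: str) -> list:
    """Expand a spec like '1,1b,2,4-6' into individual scenario IDs."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Inclusive numeric range, e.g. "4-6" -> 4, 5, 6.
            low, high = part.split("-", 1)
            ids.extend(str(n) for n in range(int(low), int(high) + 1))
        else:
            # Single ID, kept as a string so "1b" survives unchanged.
            ids.append(part)
    return ids


requested = expand_scenarios("1,1b,2,4-6,9-18,22-25,28-42")
print(len(requested))  # 35
```

The expanded list has 35 entries, which matches the 35 scenarios stated in the run focus.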

What changes going into the next campaign

Fix the scenario report recovery mechanism in the CI workflow to ensure artifacts are collected properly.
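One way to make a collection failure visible early is a post-collection check that compares the recovered reports against the requested scenario IDs. The sketch below is illustrative only: the function name `missing_reports` and the `scenario-<id>.json` file naming are assumptions, not the harness's actual layout.

```python
import tempfile
from pathlib import Path


def missing_reports(report_dir: Path, expected_ids: list) -> list:
    """List the scenario IDs that have no report file in report_dir."""
    # Assumption: one file per scenario, named scenario-<id>.json.
    return [
        sid for sid in expected_ids
        if not (report_dir / f"scenario-{sid}.json").is_file()
    ]


# Demo against a temporary directory holding a single recovered report.
with tempfile.TemporaryDirectory() as tmp:
    report_dir = Path(tmp)
    (report_dir / "scenario-1.json").write_text("{}")
    demo_missing = missing_reports(report_dir, ["1", "1b", "2"])

print(demo_missing)  # ['1b', '2']
```

A CI step could fail the workflow whenever this list is non-empty, so that a run with zero recovered reports is flagged at collection time and not only in the generated report.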

Generated by scripts/generate_run_html.sh. Methodology: alphaonedev.github.io/ai-memory-ai2ai-gate/methodology. Analysis source: analysis/run-insights.json.