Scenario 2 — Shared-context handoff¶
Purpose¶
Test the canonical A2A pattern: one agent hands a piece of context to another agent through shared memory within a synchronous-enough window for a request-response flow to work.
Mechanics¶
- Agent A (
ai:alice, OpenClaw on node-1) writes a memory with:namespace:scenario2-handoffscope:teamtags:["handoff-to-B"]content: a unique token + timestamp
- Agent A signals done (writes a sentinel memory).
- Agent B (
ai:bob, Hermes on node-2) pollsmemory_recallon the namespace + tag filter every 500 ms. - Agent B's recall must return A's memory within a bounded wall- clock window (default: the ship-gate Phase 2 settle of 90 s plus a 5 s grace).
- Agent B extracts the token and writes a
handoff-ackmemory. - Agent A recalls the ack to confirm the round-trip.
Pass criterion¶
- Agent B sees Agent A's handoff within the bounded window.
- Agent A sees Agent B's ack within the same bounded window.
- Token round-trips unchanged.
metadata.agent_idon the recalled rows matches the writer's.
Report shape¶
{
"scenario": 2,
"pass": true,
"handoff_ms": 734,
"ack_ms": 812,
"bound_ms": 95000,
"token_integrity": true,
"reasons": [""]
}
What a green result proves¶
- Real A2A request-response agent patterns are viable through ai-memory at the timescales operators actually care about (sub- second to sub-minute handoffs).
- The scope=
teamvisibility rule is permissive enough for A2A within a shared namespace, without leaking beyond it. - Two different agent frameworks (OpenClaw ↔ Hermes) round-trip memory without any implementation-specific divergence.
What a red result would mean¶
- Convergence time exceeds 95 s — could be federation-side (if node-4 is federated to peers we're also reading from) or product-side lag.
- Scope enforcement regression (agent B can't see the row at all).
- Framework-level MCP client difference in interpreting the recall filter — caught by the specific OpenClaw-writes / Hermes-reads pairing.
For three audiences¶
This is the scenario that proves your AI agents can pass messages to each other through the memory system. When your assistant hands off a task to your scheduler agent, this is the round-trip that must work.
This scenario directly maps to the customer-facing promise that agents can collaborate through ai-memory. A red result means the headline claim is unsupported and a release is blocked. The timing bound (95 s default) matches what customers need for typical agent-to-agent handoffs.
Specific tags + scope matrix: scope=team with namespace
scenario2-handoff tagged handoff-to-B. Polls at 500 ms
with a 95 s bound. Measured handoff time published in the
artifact so regressions in convergence latency surface
quantitatively, not just as a pass/fail.