Quality gates, aggregated.

A single entry point for everyone who asks "is this ai-memory release tested?" Aggregates the four-phase ship-gate (release testing) and the 42-scenario ai2ai-gate (A2A integration) into per-release evidence pages with cross-references to the underlying run artifacts.

The three gates

Two existing harnesses + this hub.

The gate repos do the work — provision DigitalOcean infrastructure, run scenarios, capture artifacts. This hub aggregates and presents the results so a release evidence page is one click from the ai-memory homepage.

Ship Gate

Four-phase release testing harness on DigitalOcean. Each phase has its own script and result format.

  • Phase 1 — Functional smoke (single-node, all MCP tools)
  • Phase 2 — Multi-agent coordination (federation W of N)
  • Phase 3 — Migration (cross-version upgrade path)
  • Phase 4 — Chaos (peer kill, network partition, clock skew)

A2A Gate (umbrella spec)

4-node DigitalOcean harness exercising ai-memory through IronClaw / Hermes / OpenClaw agents communicating agent-to-agent. The umbrella keeps the specification (testbook, scenario contracts, v1-GA criteria); per-release execution lives in ai-memory-a2a-v<version> repos.

  • v0.6.2 certified — 3/3 consecutive green across 6 cells
  • Matrix: ironclaw + hermes + openclaw × off / TLS / mTLS
  • Per-scenario JSON + stderr + provenance trace
  • Mutation, Byzantine peer, clock skew, identity spoofing
  • v0.6.3.1 execution in flight at ai-memory-a2a-v0.6.3.1 (24 scenarios; S9–S22 are v0.6.3.1-specific).

This Hub

Per-release evidence pages aggregating the gates above + cross-references. No tests run here — it's the presentation layer.

  • Per-release verdict (one line)
  • Phase-by-phase status
  • Cross-links to gate repos' run artifacts
  • Machine-readable summary.json per release
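
The hub's real summary.json schema is not shown on this page. As a rough sketch only, assuming hypothetical field names (`release`, `verdict`, `gates`, `artifacts`), a per-release summary could be assembled like this:

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class GateResult:
    """Outcome of one gate for one release (all field names are hypothetical)."""
    name: str                                            # e.g. "ship-gate" or "a2a-gate"
    status: str                                          # "pass", "fail", or "in-flight"
    artifacts: list[str] = field(default_factory=list)   # links to run artifacts


@dataclass
class ReleaseSummary:
    release: str
    verdict: str                                         # the one-line verdict
    gates: list[GateResult] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested GateResult dataclasses.
        return json.dumps(asdict(self), indent=2)


summary = ReleaseSummary(
    release="v0.6.3",
    verdict="shipped",
    gates=[GateResult("ship-gate", "pass"), GateResult("a2a-gate", "pass")],
)
print(summary.to_json())
```

Keeping the verdict and the per-gate status in one flat document means a dashboard or CI job can read a release's state without scraping the evidence page.
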

Per-release evidence

Pick a release.

Each release gets one evidence page with the gate-by-gate results and the verdict. The current in-flight release is at the top. The pattern going forward is one ai-memory-a2a-v<version> repo per release; the umbrella ai-memory-ai2ai-gate stays the spec.

v0.7.0-a2a
▶ Wave 4 complete · NOT GREEN · postgres SAL handler-surface gap → Wave 5 follow-up

Three-node A2A regression + net-new for v0.7.0: openclaw ↔ hermes ↔ postgres+AGE on a private DigitalOcean VPC, both agent personas driven by Grok 4.2 reasoning (grok-4.20-0309-reasoning). 76 scenarios consolidate the full prior universe (S1–S51 from ai2ai-gate v0.6.x baselines, ship-gate Phase 1–4, NHI Round 1–4) plus 25 net-new scenarios for v0.7.0-only surfaces (cross-agent attest_level signing, F8 enforce gate cross-agent, smart_load keyword veto, AI_MEMORY_TOOLS_VERBOSE env, audit chain A2A, memory_notify cross-droplet, subscription webhook A2A, postgres+AGE migration + Cypher equivalence, Grok-driven dialog loop, reasoning-trace persistence). New gate standard: two consecutive rounds at 100% GREEN before tag-cut.

Wave 4 ran 2026-05-09 against e294aa3: both daemons rebuilt + restarted with --store-url postgres://…@aimemory_w4_live (Path B operator-approved single shared disposable DB; the existing aimemory db was not modified). R1 = 46 PASS / 20 FAIL / 13 SKIP of 74 in-scope (1285s wall); R2 = 44 PASS / 22 FAIL / 13 SKIP (1027s wall). The 20-FAIL deterministic core reproduces identically across rounds.

Wave-1 (schema parity v15→v28), Wave-2 (ai-memory schema-init), Wave-3 (--store-url postgres://) all hold. Wave-4 surfaced that the SAL-trait handler-surface migration was not finished as part of Wave-3: ~13 handlers (recall, inbox/notify, subscriptions, namespace standards, pending governance, KG temporal/timeline, taxonomy, aliases, check_duplicate, agent_quotas on HTTP, permissions enforce on SAL, inheritance owner-chain, link signing observed_by) still surface 501/503 on the postgres path. The daemon explicitly logs this at startup.

Recommendation: Wave 5 / v0.7.0.1 follow-up to complete handler-surface migration. Subject under test: ai-memory v0.7.0 commit e294aa3+Wave 1-4 on round-2-fixes. See #649 + #647.
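
The gate standard (two consecutive rounds at 100% GREEN) can be expressed as a small check. This is an illustrative sketch, not the harness's own code. The `Round` type and `ready_for_tag_cut` function are hypothetical names, and it assumes "GREEN" means zero failures, since the page does not say how skips are treated:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Round:
    name: str
    passed: int
    failed: int
    skipped: int

    @property
    def is_green(self) -> bool:
        # Assumption: a round is green when nothing failed and something passed.
        # Whether skipped scenarios should count against a round is not stated.
        return self.failed == 0 and self.passed > 0


def ready_for_tag_cut(rounds: list[Round], required: int = 2) -> bool:
    """True only if the most recent `required` rounds are all green."""
    if len(rounds) < required:
        return False
    return all(r.is_green for r in rounds[-required:])


# Wave 4 figures as reported above.
wave4 = [
    Round("R1", passed=46, failed=20, skipped=13),
    Round("R2", passed=44, failed=22, skipped=13),
]
print("GREEN" if ready_for_tag_cut(wave4) else "NOT GREEN")  # prints "NOT GREEN"
```
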

live results →
v0.7.0
▶ NHI tested · ship-with-notes

"Attested-cortex" release — 69/69 epic tasks across 11 tracks (A–K) plus 15 audit-blocker fixes from issue #628. Full-spectrum NHI test (Non-Human Identity) ran the 12-phase playbook against the released v0.7.0 binary on a live MCP database. Result: 114 PASS / 0 FAIL, 3 P2 fixes shipped on PR #636 against release/v0.7.0, F1 + F5 closed as test-bugs (wrong header name).

v0.7.0 NHI evidence →
v0.6.3.1
▶ testing in flight

Per-release A2A campaign on a 4-node DigitalOcean mesh. Subject under test: ai-memory v0.6.3.1 (tag pinned 2026-04-30, schema v19). Phase ladder 0→5: pre-flight, substrate cert (S1–S24), AI orchestration, autonomous NHI playbook (4 scenarios × 4 arms × n=3 = 48 runs), meta-analysis, verdict. S23 (#507 tilde-expansion) and S24 (#318 MCP stdio fanout) are expected-RED on v0.6.3.1; expected GREEN on Patch 2. Findings funnel into Patch 2 umbrella #511.
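
The 48-run figure for the NHI playbook is the full cross product of scenarios, arms, and repetitions. A minimal sketch of that enumeration follows. The scenario and arm labels are placeholders, because the playbook's real names are not given here:

```python
from itertools import product

# Placeholder labels; only the counts (4 x 4 x 3) come from the campaign description.
scenarios = [f"scenario-{i}" for i in range(1, 5)]   # 4 scenarios
arms = [f"arm-{i}" for i in range(1, 5)]             # 4 arms
repetitions = range(1, 4)                            # n = 3

runs = [
    {"scenario": s, "arm": a, "rep": r}
    for s, a, r in product(scenarios, arms, repetitions)
]
print(len(runs))  # 48
```
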

live results →
v0.6.3
✓ shipped

Hierarchy + KG + Capabilities v2 + 93.08% coverage. Ship-gate 4 phases pass · a2a-gate 48 scenarios pass · all 5 distribution channels live (GitHub Release · Homebrew tap · ghcr.io · Fedora COPR · crates.io).

view evidence →
v0.6.2
✓ shipped

A2A-CERTIFIED. 214 passing scenarios across 6 cells. Federation fanout correctness + S40 catchup hardening.

a2a-gate runs →
v0.6.1
✓ shipped

SAL Postgres adapter + 5 pre-tag SAL blocker punchlist closed (#293).

a2a-gate runs →
v0.6.0
✓ shipped

16+ ship-gate soak runs (soak-v0.6.0-r1 .. r16+). World-class documentation sprint, SAL track-B PR1.

ship-gate runs →

Testing strategy: parallel + distributed

~6.5h parallel campaigns, not 16h sequential.

Existing harnesses already support per-run isolation. A release campaign can fan out across up to 17 agents at peak, coordinated through ai-memory itself (eating our own dogfood). Two pages explain the architecture:

Parallel testing strategy

What collapses inside each stage. Sequential vs parallel time math, line by line. Constraints, costs ($5-15/campaign), and the two execution options (sequential A vs orchestrated B). Net savings: ~9.5h per campaign.
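
The shape of that time math can be sketched as follows, assuming tasks inside a stage run concurrently while stages still run in order. The stage names and durations below are placeholders chosen for illustration, not measured campaign figures:

```python
def sequential_hours(stages: dict[str, list[float]]) -> float:
    """Every task in every stage runs one after another."""
    return sum(sum(tasks) for tasks in stages.values())


def parallel_hours(stages: dict[str, list[float]]) -> float:
    """Tasks within a stage run concurrently; each stage costs its slowest task."""
    return sum(max(tasks) for tasks in stages.values())


# Placeholder durations in hours (hypothetical, NOT real campaign data).
stages = {
    "stage-a": [0.5, 0.5, 1.0],
    "stage-b": [2.0, 2.0, 2.0, 2.0],
    "stage-c": [1.5, 1.5],
}
seq = sequential_hours(stages)
par = parallel_hours(stages)
print(f"sequential {seq}h, parallel {par}h, saved {seq - par}h")
```

The saving comes entirely from stages that contain several independent tasks; a stage with a single long task costs the same either way.
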

Distributed-agent orchestration

The architecture that makes the parallelism work. ai-memory itself as the coordination bus. Workers fan out across DigitalOcean droplets + GitHub Actions runners. Orchestrator drains memory_inbox, updates evidence in real time. Reusable for v0.7+.
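
As a rough illustration of the fan-out and drain pattern, the sketch below uses an in-process `queue.Queue` as a stand-in for the coordination bus. It does not call ai-memory or memory_inbox; the worker names, message fields, and `drain` helper are all hypothetical:

```python
import queue
import threading

# Stand-in for the coordination bus. A real campaign would route these
# messages through ai-memory rather than an in-process queue.
inbox: "queue.Queue[dict]" = queue.Queue()


def worker(worker_id: str, scenarios: list[str]) -> None:
    """Pretend to run each scenario and report the result to the inbox."""
    for scenario in scenarios:
        inbox.put({"worker": worker_id, "scenario": scenario, "status": "pass"})


def drain(evidence: dict[str, str]) -> int:
    """Move every queued message into the evidence map; return the count drained."""
    drained = 0
    while True:
        try:
            msg = inbox.get_nowait()
        except queue.Empty:
            return drained
        evidence[msg["scenario"]] = msg["status"]
        drained += 1


assignments = {"worker-1": ["S1", "S2"], "worker-2": ["S3", "S4"]}
threads = [threading.Thread(target=worker, args=item) for item in assignments.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()

evidence: dict[str, str] = {}
count = drain(evidence)
print(count, sorted(evidence))
```

Workers only ever write to the inbox and the orchestrator only ever reads from it, so adding more workers does not change the orchestrator's logic.
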

How releases get tested

The two-gate pipeline.

A release tag must pass both gates before it ships. Ship-gate runs first; a regression in any phase blocks the release. The a2a-gate runs only after ship-gate phases 1-4 are green; full certification requires the matrix cells the operator selects (the full v0.6.2 certification covered 6 cells).

  1. release-rcN tag pushed — operator declares a candidate.
  2. ship-gate Phase 1 (functional) — single-node, all MCP tools exercised. Fast (~30 min).
  3. ship-gate Phase 2 (multi-agent) — federation W of N, peer reconciliation. ~1 hour.
  4. ship-gate Phase 3 (migration) — upgrade path from prior release tag. ~30 min.
  5. ship-gate Phase 4 (chaos) — peer kill, network partition, clock skew. Production-quality releases use soak windows of 3-21 days; fast releases use an abbreviated run (~2 hours).
  6. a2a-gate certification — selected cells (full = 6 cells). ~3-5 hours per cell.
  7. final tag — release pipeline publishes to crates.io / GHCR / Homebrew / Fedora COPR.
  8. post-publish smoke — install from each channel, verify the binary serves memory_capabilities.
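
The gating in the steps above can be sketched as a fail-fast sequence: ship-gate phases run in order, and A2A cells run only if every phase is green. This is a simplified model under stated assumptions. `run_pipeline` and the stub checks are hypothetical stand-ins for the real harness scripts:

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_pipeline(ship_gate: list[Step], a2a_cells: list[Step]) -> tuple[bool, list[str]]:
    """Run ship-gate phases, then A2A cells, stopping at the first failure."""
    log: list[str] = []
    for name, check in [*ship_gate, *a2a_cells]:
        ok = check()
        log.append(f"{name}: {'green' if ok else 'RED'}")
        if not ok:
            return False, log   # later steps never run
    return True, log


# Stub checks; a real run would invoke the gate repos' scripts instead.
ship_gate = [
    ("phase-1 functional", lambda: True),
    ("phase-2 multi-agent", lambda: True),
    ("phase-3 migration", lambda: False),   # simulated regression
    ("phase-4 chaos", lambda: True),
]
a2a_cells = [(f"a2a cell {i}", lambda: True) for i in range(1, 7)]

ok, log = run_pipeline(ship_gate, a2a_cells)
print(ok, log)
```

With the simulated Phase 3 regression, the run stops after three steps and neither Phase 4 nor any A2A cell is attempted.
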