EVIDENCE HUB · v0.7.0 · live record

How the numbers were established.

The Frozen Claims page is the single source of truth for ai-memory's public numbers. This hub is the running record behind those numbers — every testing campaign, the exact baseline environment it ran on, and the full pass/fail results with commit SHAs and verbatim test-result lines. Every entry follows the prime-directive audit trail: discovery → tracker → fix → retest → re-check → close.

Reference baseline

Baseline test environment (this node).

Unless a campaign states otherwise, results on this hub were produced on the following baseline. Each campaign page restates its own environment for self-containment.

| Field | Value |
|---|---|
| Host | FROSTYi.local, Apple Silicon (arm64) |
| OS | macOS 26.5 (build 25F71) · Darwin 25.5.0 |
| Toolchain | rustc 1.96.0 (ac68faa20 2026-05-25) · cargo 1.96.0 |
| Binary | ai-memory v0.7.0 |
| Schema version | v54 (sqlite + postgres) |
| Feature tier | autonomous |
| Embedder / Reranker | nomic-embed-text-v1.5 / ms-marco-MiniLM-L-6-v2 |
| LLM backend | openrouter · google/gemma-4-26b-a4b-it |
| Branch | release/v0.7.0 |
| Test isolation | AI_MEMORY_NO_CONFIG=1 |

The four QC gates run on every campaign: cargo fmt --check · cargo clippy --all-targets -- -D warnings -D clippy::all -D clippy::pedantic · AI_MEMORY_NO_CONFIG=1 cargo test · cargo audit.
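As a rough illustration, the four gates could be driven from one small runner. This is a minimal sketch, not the project's actual tooling: the commands and the AI_MEMORY_NO_CONFIG=1 setting are copied from the line above, while the names `QC_GATES` and `run_gates`, and the choice to stop at the first failing gate, are assumptions made for the example.

```python
import os
import subprocess

# The four gates as listed on this page: (extra environment, argv).
QC_GATES = [
    ({}, ["cargo", "fmt", "--check"]),
    ({}, ["cargo", "clippy", "--all-targets", "--",
          "-D", "warnings", "-D", "clippy::all", "-D", "clippy::pedantic"]),
    ({"AI_MEMORY_NO_CONFIG": "1"}, ["cargo", "test"]),
    ({}, ["cargo", "audit"]),
]


def run_gates(run=subprocess.run):
    """Run each gate in order and stop at the first failure (an assumed policy).

    Returns the argv of the failing gate, or None when all four pass.
    `run` is injectable so the sequence can be exercised without cargo installed.
    """
    for extra_env, argv in QC_GATES:
        # Layer the gate-specific variables over the caller's environment.
        env = {**os.environ, **extra_env}
        result = run(argv, env=env)
        if result.returncode != 0:
            return argv
    return None
```

Calling `run_gates()` from the repository root runs the real commands; a return value of `None` corresponds to "four gates green" in the campaign summaries.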
Campaigns

Testing campaigns & closeouts.

Each card links to the full markdown evidence trail in the repo. Green = fixed/retested/closed; Amber = in progress.

Closed · GREEN

#1466 — TTL-leak immortal-rows fix

Mid- and short-tier rows stored with expires_at: None never expired (2,921 rows leaked). Fixed at the write-path chokepoint, with a schema backfill on both backends (sqlite and postgres). Four gates green; lib 5,105 / 0; audit clean; QUAL-10 green. Commit 91c032ce.
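The shape of that fix can be sketched as follows. This is an illustrative model only, not the code in commit 91c032ce: the TTL durations, the existence of a non-expiring "long" tier, and the names `TIER_DEFAULT_TTL`, `resolve_expires_at`, and `backfill` are all assumptions. The grounded idea is that a single write-path function substitutes a tier default whenever expires_at is missing, and a one-off backfill applies the same rule to rows already stored.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Hypothetical tier defaults; the real durations are not stated on this page.
TIER_DEFAULT_TTL = {
    "short": timedelta(hours=6),
    "mid": timedelta(days=7),
    "long": None,  # assumed: long-tier rows may legitimately never expire
}


def resolve_expires_at(tier: str, expires_at: datetime | None, now: datetime):
    """Write-path chokepoint: every store call resolves its expiry here."""
    if expires_at is not None:
        return expires_at  # an explicit expiry always wins
    ttl = TIER_DEFAULT_TTL[tier]
    return None if ttl is None else now + ttl


def backfill(rows: list[dict], now: datetime) -> int:
    """Give legacy rows with no expiry their tier default; return how many changed."""
    fixed = 0
    for row in rows:
        if row["expires_at"] is None:
            resolved = resolve_expires_at(row["tier"], None, now)
            if resolved is not None:
                row["expires_at"] = resolved
                fixed += 1
    return fixed
```

Routing both new writes and the backfill through one function keeps the two paths from drifting apart, which is the usual reason to fix a leak of this kind at a chokepoint rather than at each caller.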

Full evidence →

Closed · GREEN

#1182 Round 2 — A2A regression (reproducibility)

Independent re-run after a second pristine rig wipe. Domain 1 (sqlite) 7,458 / 0 / 16; Domain 2 (pg+AGE) 8,494 / 0 / 37. Combined 15,952 / 0 with zero new defects, confirming that the #1182 result is reproducible.

Full evidence →

Closed · GREEN

#1182 Round 1 — final-baseline regression

NO-FAIL-MISSION off a pristine rig (docker compose down -v). Combined 15,951 / 0. Four 1:1 findings filed→fixed→closed (#1444/#1445/#1446/#1447); three codegraph QC audits ZERO-DEFECTS.

Full evidence →

Green · 5/5 pass

DO swarm campaign (T4) — 2026-06-02

Scaled-down T4 swarm on native DigitalOcean droplets (no Docker): 3× 4 GB quorum peers (N=3/W=2, mTLS) plus 1× 8 GB node running Postgres 16, Apache AGE 1.5.0, and pgvector. All 5 scenarios pass; zero defects; spend ≈ $0.18. Full evidence: campaign README.

Results below →

Completed campaign · 2026-06-02

DO swarm — T4 topology (GREEN).

Scope: swarm testing only (no hive), scaled down significantly per operator directive. Native DigitalOcean droplets (no Docker) in region nyc3, VPC 10.108.0.0/24, all running Ubuntu 24.04 x64. The ai-memory v0.7.0 binary (branch release/v0.7.0) was built natively on the 8 GB node with --release --features sal-postgres, then copied to the peers with scp. Hard budget cap $75; actual spend ≈ $0.18; all droplets destroyed at close. Full audit trail: 2026-06-02-do-swarm-t4/README.md.

| Node | Size | Role |
|---|---|---|
| amx-pg | s-4vcpu-8gb | PostgreSQL 16 + Apache AGE 1.5.0 + pgvector 0.8.2 · amd64 build host |
| amx-peer-1/2/3 | s-2vcpu-4gb ×3 | quorum mesh: N=3 / W=2, mTLS fingerprint allowlist |

| Scenario | Result |
|---|---|
| S1 · Quorum-write success (W=2, all peers up) | PASS |
| S2 · Quorum not met under partition → 503 quorum_not_met | PASS |
| S3 · mTLS allowlist: rogue/no cert refused at handshake | PASS |
| S4 · Post-partition convergence (DLQ replay + catchup) | PASS |
| S5 · Postgres + Apache AGE backend (v54 migrate · CRUD · Cypher) | PASS |

Verdict: GREEN, 5/5, zero defects. Peer-mesh quorum writes, quorum-shortfall semantics, mTLS identity gating, post-partition convergence, and the Postgres + Apache AGE backbone all behave to the T4 contract. The run doubled as a live regression on #1466: the tier-default expires_at backfill fired on both the federated SQLite write path and the postgres CRUD path. No findings to file.
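For readers unfamiliar with the contract that S1 to S3 exercise, the decision logic can be modelled in a few lines. This is a simplified sketch under stated assumptions, not ai-memory's implementation: N=3, W=2, the 503 status, and the quorum_not_met code come from this page, while the 200 success status, the use of SHA-256 over the DER-encoded certificate as the fingerprint, the assumption that `acks` includes the coordinating node, and the function names are all hypothetical.

```python
from __future__ import annotations

import hashlib

N_PEERS = 3       # N=3 from the campaign topology
WRITE_QUORUM = 2  # W=2 from the campaign topology


def fingerprint(cert_der: bytes) -> str:
    """Hex SHA-256 of a DER-encoded certificate (digest choice is an assumption)."""
    return hashlib.sha256(cert_der).hexdigest()


def peer_allowed(cert_der: bytes | None, allowlist: set[str]) -> bool:
    """S3: a peer presenting no certificate, or an unlisted one, is refused."""
    return cert_der is not None and fingerprint(cert_der) in allowlist


def quorum_write(acks: int, w: int = WRITE_QUORUM) -> tuple[int, str]:
    """S1/S2: map the number of acknowledgements to (HTTP status, result code).

    `acks` is assumed to count every node that accepted the write,
    including the node that received the request.
    """
    if acks >= w:
        return 200, "ok"  # success status is an assumption
    return 503, "quorum_not_met"
```

With all three peers up, `quorum_write(3)` succeeds (S1); with a partition that leaves a single reachable node, `quorum_write(1)` yields the 503 quorum_not_met result (S2). In a real mTLS deployment the S3 refusal happens during the TLS handshake itself, so `peer_allowed` only models the allowlist decision, not the transport.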