# Capability domains in ai-memory v0.6.3.1
ai-memory v0.6.3.1 ships eight capability domains beyond the S1–S42 testbook matrix. This page is the inventory: each domain is described by what it does, why it matters in production agent systems, and which substrate canary (or canaries) probes it on the live mesh.
The naming convention `S<N>` on this page refers to substrate canaries living at `scenarios/v0.6.3.1/S<N>/` — distinct from the testbook scenarios at `docs/scenarios/`. The substrate canary for a domain probes that the documented behaviour holds on the live 4-node mesh; the testbook scenarios cover the canonical user flows.
| # | Domain | Substrate canary | Status on v0.6.3.1 |
|---|---|---|---|
| 1 | NHI / Agent identity | S28 | live, runtime-tested |
| 2 | Governance — approval gate | S29 | live, runtime-tested |
| 3 | A2A messaging | S30 | live, runtime-tested |
| 4 | Encryption at rest + in transit | S31 (SQLCipher) | live, runtime-tested (SQLCipher); transit covered by F6/F7 |
| 5 | Architecture tiers T1–T5 | — | documented; T1–T3 ship + quorum-write federation; T4 partial; T5 vision |
| 6 | Surface area (133 operations) | — | documented; covered piecewise by testbook S1–S42 |
| 7 | Knowledge graph | — | documented; covered by S38–S42 (recursive CTE, KG timeline, KG invalidate) |
| 8 | Operational modes | — | documented; covered by baseline (foreground stdio, HTTP daemon, sync daemon, curator daemon) |
## 1. NHI / Agent identity
What it does. Every memory carries `metadata.agent_id` set to the writer-of-record at the moment of admission. ai-memory enforces defence-in-depth immutability of that field across every mutation surface an attacker (or a buggy peer) could plausibly use to rewrite history:

- `memory_update` / `PUT /api/v1/memories/<id>` — a later writer cannot rewrite the original `agent_id`. `updated_at` and `update_count` MAY bump; the NHI bind stays sticky.
- `memory_consolidate` / dedup — when two writes with identical content but different `agent_id` collide, both writers appear in the consolidated row's source-agent provenance.
- Federation sync (fanout) — the relayed copy on a peer carries the origin `agent_id`, not the relaying peer's.
- `memory_import` — the import path round-trips `agent_id` byte-for-byte; restore-from-backup does not rewrite NHI.
Why it matters. Without this invariance, downstream governance (approval gate, audit trail, prime-directive enforcement) loses its identity anchor: an adversarial peer could rewrite the writer-of-record on every convergence cycle, and "ai:alice wrote this" silently becomes "node-2 wrote this" or "ai:bob wrote this". The audit trail is then chronologically intact but evidentially worthless.
What probes it. S28 — runs the four invariants on the live mesh.
## 2. Governance — approval gate
What it does. A namespace-scoped policy holds writes in `state=Pending` until a distinct NHI agent (typically `ai:operator`, `role=admin`) explicitly Approves or Denies them. The full matrix is 3 actions × 4 approval levels × 3 approver types = 36 verdict shapes:
- Actions: Allow (write into Memory), Deny (write rejected, audit-logged), Pending (write held for human review).
- Approval levels: auto-allow, soft (single approver), hard (quorum N-of-M), forensic (immutable + signed).
- Approver types: human, agent (with role=admin), system (curator daemon under policy).
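A sketch of the verdict space as plain Rust enums; the names mirror the bullets above but are hypothetical, not the shipped types. The point is that every verdict is one coordinate in the 3 × 4 × 3 space.

```rust
// Hypothetical modelling of the 36-shape verdict matrix (3 × 4 × 3).
#[derive(Debug, Clone, Copy)]
enum Action {
    Allow,   // write lands in memory
    Deny,    // write rejected, audit-logged
    Pending, // write held for review
}

#[derive(Debug, Clone, Copy)]
enum ApprovalLevel {
    AutoAllow,
    Soft,                    // single approver
    Hard { n: u8, m: u8 },   // quorum N-of-M
    Forensic,                // immutable + signed
}

#[derive(Debug, Clone, Copy)]
enum Approver {
    Human,
    AgentAdmin,    // agent with role=admin
    SystemCurator, // curator daemon under policy
}

/// One cell of the matrix: the shape a verdict takes for a given write.
struct Verdict {
    action: Action,
    level: ApprovalLevel,
    approver: Approver,
}
```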
Decisions propagate across federation: a single approve/deny on node-1 settles the queue across the entire mesh — the pending list on every peer clears the moment the decision lands.
Why it matters. The approval gate is the substrate's enforcement point for organisational policy. Without federation-aware propagation, an attacker could race the approval by writing to a peer that hasn't seen the deny yet. With it, "Deny" is a write-once boundary across the whole convergence domain.
What probes it. S29 — exercises both happy-path approve and deny across two peer nodes.
## 3. A2A messaging
What it does. Three coordinated primitives let agents address each other directly without a sidecar message bus:
- `memory_notify` — agent X sends a notification to agent Y (or to a namespace pattern). Returns a `notification_id`.
- `memory_inbox` — agent Y polls or pages its own inbox for unread notifications.
- `memory_subscribe` — agent Y registers a subscription (typically a namespace glob) so future writes/notifies that match are pushed without polling.
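A minimal flow sketch over the HTTP surface, assuming the `reqwest` (blocking + json features) and `serde_json` crates; the request and response field names here are illustrative, not the documented wire format.

```rust
use serde_json::json;

fn notify_then_poll(base: &str) -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();

    // Agent X notifies agent Y. The payload fields are hypothetical.
    let resp: serde_json::Value = client
        .post(format!("{base}/api/v1/notify"))
        .json(&json!({
            "from": "ai:alice",
            "to": "ai:bob",
            "body": "handoff: your turn",
        }))
        .send()?
        .json()?;
    println!("notification_id: {}", resp["notification_id"]);

    // Agent Y polls its inbox for unread notifications.
    let inbox: serde_json::Value = client
        .get(format!("{base}/api/v1/inbox"))
        .query(&[("agent", "ai:bob"), ("unread", "true")])
        .send()?
        .json()?;
    println!("unread: {inbox}");
    Ok(())
}
```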
On top of those primitives, HMAC-SHA256-signed webhooks support the push-notification path: every emitted notification is POSTed to a configured URL with `X-AIM-Signature: sha256=<hex>`. The receiver recomputes HMAC-SHA256(secret, body) and compares in constant time. Any mismatch ⇒ forged or tampered in transit.
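A receiver-side verification sketch, assuming the `hmac`, `sha2`, and `hex` crates; the substrate's own receiver may be structured differently.

```rust
use hmac::{Hmac, Mac};
use sha2::Sha256;

type HmacSha256 = Hmac<Sha256>;

/// Verify an X-AIM-Signature header of the form "sha256=<hex>".
fn verify_webhook(secret: &[u8], body: &[u8], header: &str) -> bool {
    let Some(hex_sig) = header.strip_prefix("sha256=") else {
        return false; // malformed header
    };
    let Ok(sig) = hex::decode(hex_sig) else {
        return false; // non-hex signature
    };
    let mut mac = HmacSha256::new_from_slice(secret)
        .expect("HMAC-SHA256 accepts keys of any length");
    mac.update(body);
    // verify_slice performs the comparison in constant time.
    mac.verify_slice(&sig).is_ok()
}
```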
The notify path is federation-aware: calling `memory_notify` on node-1 with a target whose subscription lives on node-3 fans out across the W=2/N=4 quorum without the caller having to know which peer hosts the subscription.
Why it matters. A2A messaging makes ai-memory a coordination substrate, not just a memory store. Two agents can agree on a plan, hand off a task, or signal "I'm done" — without sharing private channels, dedicated orchestration layers, or any shared code beyond the MCP interface. The HMAC-signed webhook layer ensures the push path is integrity-protected end-to-end.
What probes it. S30 — notify + inbox + subscribe + HMAC verification + federation fanout.
## 4. Encryption at rest + in transit
ai-memory v0.6.3.1 stacks five complementary layers of integrity / confidentiality:
- SQLCipher AES-256 at rest. The on-disk DB is opaque to anyone without the configured passphrase; stock `sqlite3` cannot read it. The passphrase is loaded from `/etc/ai-memory-a2a/env` or `config.toml`, never embedded in the binary.
- mTLS 1.3 with fingerprint allowlist. Every peer connection carries a client certificate; rustls rejects any handshake whose fingerprint is not on the allowlist. Probed by F6/F7 (TLS handshake + mTLS enforcement) in the baseline.
- HMAC-SHA256 webhooks. Every webhook delivery is signed with the per-subscription shared secret: the same primitive as §3, counted here as one of the five layers.
- GPG-signed release tags. `git tag -s v0.6.3.1` is signed with the AlphaOne release key; downstream consumers can verify provenance before extracting the tarball.
- SBOM + reproducible builds. The release tarball ships an SPDX SBOM and a `BUILD_REPRO.md` recipe; two independent builds of the same source tag MUST produce byte-identical binaries (verified out-of-band by the release process).
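A sketch of the kind of check the S31 "header opaque" probe implies: a plaintext SQLite file begins with the 16-byte magic string `SQLite format 3\0`, so an encrypted database must fail this test. (Illustrative; the actual canary may implement it differently.)

```rust
use std::fs::File;
use std::io::Read;

/// Returns true if `path` begins with the plaintext SQLite magic header.
/// A SQLCipher-encrypted database must NOT pass this check.
fn header_is_plain_sqlite(path: &str) -> std::io::Result<bool> {
    let mut magic = [0u8; 16];
    File::open(path)?.read_exact(&mut magic)?;
    Ok(&magic == b"SQLite format 3\0")
}
```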
Why it matters. Defence in depth: an attacker who steals the DB file gets ciphertext; an attacker on the wire gets rejected at TLS; an attacker in the supply chain gets caught by SBOM diff or reproducible-build mismatch.
What probes it. S31 — SQLCipher (header opaque, plain rejected, keyed works, passphrase not in binary). Layers 2 + 3 covered by F6/F7 + S30. Layers 4 + 5 are release-process invariants verified out-of-band.
## 5. Architecture tiers T1–T5
ai-memory documents five architecture tiers spanning single-laptop deployments to a global hive:
| Tier | Topology | Status on v0.6.3.1 |
|---|---|---|
| T1 | Single laptop, foreground stdio (MCP) | ships; baseline default |
| T2 | Single host, HTTP daemon (mTLS-aware) | ships; baseline tls=on/mtls |
| T3 | Multi-host federation, W-of-N quorum sync | ships; campaign topology (W=2/N=4) |
| T4 | Cross-region federation with cold-store tier | partial; cold-store hooks present, region-aware quorum is roadmap |
| T5 | Global hive, planet-scale eventual consistency | vision; documented in roadmap |
Deployment recipes + topology SVGs live in the ai-memory-mcp docs.
Why it matters. Tiering lets an operator pick the smallest viable footprint for their use case without forking the codebase. A T1 dev laptop, a T3 production cluster, and a T5 federation all run the same binary; only the daemon flags and config layout differ.
What probes it. Tiers T1–T3 are exercised by every campaign run; the campaign topology is itself a T3 deployment (4 nodes, W=2/N=4). T4–T5 are not runtime-tested in this campaign — documented as forward-looking in the v1.0 GA criteria.
## 6. Surface area (43 + 50 + 40 = 133 operations)
ai-memory v0.6.3.1 exposes 133 operations across three interface surfaces:
| Surface | Count | Scope |
|---|---|---|
| MCP tools | 43 | memory_store, memory_recall, memory_consolidate, memory_link, memory_kg_query, memory_subscribe, memory_inbox, memory_notify, memory_pending_*, memory_namespace_*, memory_archive_*, memory_kg_*, memory_session_start, memory_capabilities, memory_check_duplicate, memory_detect_contradiction, memory_expand_query, memory_get_taxonomy, memory_auto_tag, memory_promote, memory_forget, memory_gc, memory_get_links, memory_kg_invalidate, memory_kg_timeline, memory_entity_*, memory_agent_register, memory_agent_list, memory_search, etc. |
| HTTP endpoints | 50 | /api/v1/memories, /api/v1/memory/pending, /api/v1/notify, /api/v1/inbox, /api/v1/subscriptions, /api/v1/namespaces, /api/v1/audit, /api/v1/entities, /api/v1/kg/*, /api/v1/admin/*, etc. |
| CLI commands | 40 | ai-memory boot, ai-memory doctor, ai-memory wrap, ai-memory audit verify, ai-memory mcp, ai-memory peer add, ai-memory namespace policy, ai-memory kg traverse, etc. |
A cross-reference matrix maps every MCP tool to its HTTP and CLI analogue (where one exists) — see `tests.md` and the testbook for the canonical surface coverage.
Why it matters. The three surfaces are not independent — they all hit the same SQLite file via the same coordinator. But they expose different ergonomic profiles: MCP for in-process LLM tools, HTTP for cross-process federation + UI, CLI for ops + scripts. Coverage holes in any one surface (e.g. #318 — MCP stdio writes bypass fanout) are surface-specific bugs, not substrate bugs.
What probes it. Testbook S1–S42 covers the canonical user flows; per-tool MCP coverage is broken out in testbook Suite H. Substrate canaries S23, S24, S25, S26, S27, S28, S29, S30, S31 probe specific cross-surface invariants.
## 7. Knowledge graph
What it does. ai-memory ships a knowledge-graph layer on top of the memory store:
- Recursive CTE traversal with cycle detection — depth-bounded, visited-set pruned, so a malformed graph cannot OOM the daemon.
- Bitemporal filters — every edge carries (`valid_from`, `valid_to`) AND (`recorded_at`, `superseded_at`), so queries can reconstruct "what did the graph look like at time T as of reporting time R". Useful for forensic replays.
- Entity registry + alias resolution — `memory_entity_register` / `memory_entity_get_by_alias` give a stable canonical id even as labels drift across agents.
- KG timeline — `memory_kg_timeline` returns the ordered history of edges touching a node.
- KG invalidate — `memory_kg_invalidate` marks a subgraph superseded without deleting it (audit-preserving).
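An illustrative depth-bounded traversal of the kind described above, written as a SQLite recursive CTE embedded in Rust. The table and column names (`kg_edges`, `src`, `dst`) and parameters are hypothetical; the depth bound and the visited-path cycle check are the load-bearing parts.

```rust
// Hypothetical schema: kg_edges(src TEXT, dst TEXT, valid_from, valid_to,
// recorded_at, superseded_at). A bitemporal query would additionally
// filter on those four columns; this sketch shows only the two guards
// the docs describe: a depth bound and a visited-path cycle check.
const KG_TRAVERSE: &str = r#"
WITH RECURSIVE walk(node, depth, path) AS (
    SELECT :start, 0, ',' || :start || ','
    UNION ALL
    SELECT e.dst, w.depth + 1, w.path || e.dst || ','
    FROM kg_edges e
    JOIN walk w ON e.src = w.node
    WHERE w.depth < :max_depth                      -- depth bound
      AND instr(w.path, ',' || e.dst || ',') = 0    -- cycle detection
)
SELECT DISTINCT node FROM walk;
"#;
```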
Why it matters. The KG layer turns ai-memory from a key-value store into a queryable evidence base. An auditor can walk "who-said-what-about-X-as-of-Tuesday" without having to merge a flat memory list against an external graph database.
What probes it. Testbook S38–S42 (the `memory_kg_*` family) and the substrate-level entity / alias resolution checks in baseline v1.4.0. Not currently runtime-tested as a substrate canary in this campaign — documented for completeness.
## 8. Operational modes
ai-memory runs in four operational modes, all from the same binary:
| Mode | Command | What it does |
|---|---|---|
| Foreground stdio (MCP) | `ai-memory mcp` | speaks JSON-RPC over stdio; the canonical surface for LLM tool-use |
| HTTP daemon | `ai-memory serve` (mTLS-aware) | exposes the 50-endpoint HTTP API; takes peer connections |
| Sync daemon | `ai-memory sync` | runs the W-of-N quorum write coordinator + federation fanout |
| Curator daemon | `ai-memory curate` | auto-tag + contradiction detection + consolidation, scheduled or event-driven |
A production T3 deployment runs three of the four (`serve` + `sync` + `curate`); a T1 dev box runs only `mcp` (the other three are subprocesses spawned on demand).
Why it matters. The modes are an intentional separation of concerns — the curator can be paused for a forensic snapshot without losing recall; the sync daemon can be restarted without disrupting in-flight MCP sessions. Each is independently observable (own systemd unit, own log stream, own healthcheck).
What probes it. Each mode has at least one baseline probe (F1–F7 in `baseline.md`), and the curator contradiction-detection path is tested by Phase 3 scenario D (behavioral propagation under contradiction). Not currently runtime-tested as a single substrate canary — documented for completeness; could be promoted to S32+ if a regression motivates it.
## Cross-domain notes
- Audit trail (forensic). S25 / S26 / S27 (parallel substrate canaries) prove the audit log is hash-chained, tamper-detecting, and OS-level append-only. Every write surfaced by domains 1–3 above lands a hash-chained line keyed by `agent_id`, so the forensic surface is the enforcement point for all the invariants on this page.
- Prime Directive. Phase 3 scenarios E–H probe whether the Prime Directive (immutable in `system/governance/prime-directive`) is correctly enforced by agents reading from ai-memory. The approval gate (domain 2) is the substrate's enforcement primitive; the Prime Directive scenarios test the NHI behaviour on top of it.
- Patch 2 funnel. Bugs against any of these domains funnel into the findings tracker and become Patch 2 (`v0.6.3.2`) candidates. S23 (#507) and S24 (#318) are the two canaries currently green-RED on this campaign.
## How to add a new capability domain
- Write a `contract.md`, `expected.json`, and `runner.sh` under `scenarios/v0.6.3.1/S<N>/` (mirror the S28–S31 patterns).
- Add the runner id to `.github/workflows/a2a-gate.yml`'s "Run v0.6.3.1-specific expected-RED canaries" loop.
- Add a row to the table at the top of this file describing what it tests and why.
- Reference the new canary from at least one domain section.
- Update `docs/index.md`'s "Full-spectrum test landscape" matrix if it changes the surface count.
The substrate-canary pattern keeps capability claims honest: every domain documented here either has a runner that passes on every dispatch, or is explicitly flagged as documented-but-not-runtime-tested.