For product / PM / exec

Evaluate ai-memory.

AI agents forget everything between sessions; vector databases store text but not meaning, identity, or governance. ai-memory is the substrate layer that gives autonomous AI agents persistent typed memory, a knowledge graph, and operator-signed governance rules — in a single Apache-2.0 Rust binary, deployable on a laptop or a fleet.

The problem

AI agents have no memory and no identity.

Off-the-shelf, an LLM call is stateless. Conversation history is reconstructed token by token on every turn, which is expensive and error-prone, and it is lost again once the context window rolls over. Most teams reach for a vector database; that buys text similarity search but nothing else: no typed memory, no temporal validity, no signed audit trail, no operator policy, no identity for the AI itself.

Autonomous AI agents with a Non-Human Identity (NHI) act on behalf of an org, persist across sessions, direct multiple subordinate agents, and carry policy that must survive a model swap. For those agents you need a substrate, not a similarity index.

The comparison

ai-memory vs. vector-DB-only.

| Capability | Vector DB only | ai-memory |
| --- | --- | --- |
| Semantic similarity search | Yes | Yes (hybrid: FTS5 + cosine) |
| Typed memory kinds | No | 10 governed kinds (Form-6 vocabulary) |
| Knowledge graph with temporal validity | No | Apache AGE, valid_from / valid_until |
| Ed25519-signed memory links | No | Per-link attest_level |
| Operator-signed substrate rules | No | L1–L6, key on disk |
| HMAC-required event subscriptions | No | SSRF gate by default |
| NHI agent_id semantics | No | Resolution ladder, preservation invariants |
| Autonomous tier (consolidate / contradict / auto-tag) | No | LLM-backed, vendor-agnostic: Ollama local OR 15+ cloud vendors (xAI Grok, OpenAI, Anthropic, Gemini, DeepSeek, Kimi, Qwen, Mistral, Groq, Together, Cerebras, OpenRouter, Fireworks, LMStudio, vLLM) |
| Single binary, zero cloud dependency | Often hosted | Single Rust binary, sqlite default |
| Apache-2.0 | Mixed | Yes |

The comparison is not "ai-memory replaces your vector DB"; it is "ai-memory gives you the substrate that holds a vector DB if you want one, plus 9 things a vector DB cannot give you alone."
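To make the "temporal validity" row concrete, here is a minimal sketch of what a valid_from / valid_until window means for a query. Only the two field names come from the table; the Edge shape, the helper names, the sample data, and the half-open window convention are assumptions for illustration, not ai-memory's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    """One knowledge-graph fact with a validity window (hypothetical shape)."""
    subject: str
    relation: str
    obj: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None means "still valid"


def valid_at(edge: Edge, when: datetime) -> bool:
    # Assumption: the window is half-open, [valid_from, valid_until).
    if when < edge.valid_from:
        return False
    return edge.valid_until is None or when < edge.valid_until


def facts_at(edges: List[Edge], when: datetime) -> List[Edge]:
    """Return only the facts that held at the given instant."""
    return [e for e in edges if valid_at(e, when)]


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Illustrative data: the same agent changes teams on 2025-06-01.
EDGES = [
    Edge("agent-7", "reports_to", "team-alpha", utc(2025, 1, 1), utc(2025, 6, 1)),
    Edge("agent-7", "reports_to", "team-beta", utc(2025, 6, 1)),
]

current_team = [e.obj for e in facts_at(EDGES, utc(2025, 7, 1))]
```

The point for an evaluator: a similarity index returns both facts as equally relevant text, while a temporally valid graph can answer "what was true on this date" without the application filtering by hand.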

Risk profile

What you are signing up for.

License risk

Apache-2.0. No copyleft. No CLA. Permits commercial embedding.

Vendor lock-in

Storage layer is SQLite (the world's most-deployed database). Export to JSONL or PostgreSQL+AGE is first-class. No proprietary format.

Supply chain

Written in Rust; a clean cargo audit is a release gate. The binary can be statically linked; reproducible builds are on the v1.0 roadmap.

Security posture

v0.7.0 secure-default: permissions enforced by default, SSRF gate on webhooks, signed audit chain, HMAC subscriptions, optional sqlcipher for at-rest encryption.
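For readers unfamiliar with HMAC-signed subscriptions, the general technique can be sketched as follows: the sender signs each event body with a shared secret, and the receiver recomputes the signature and rejects anything that does not match. This is a generic HMAC-SHA256 sketch using Python's standard library; the function names, the hex encoding, and the sample payload are assumptions, and ai-memory's actual header names and signature format are not specified here.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw event body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_payload(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode(), received_signature.encode())


# Illustrative values only.
SECRET = b"shared-subscription-secret"
BODY = b'{"event": "example"}'
SIGNATURE = sign_payload(SECRET, BODY)
```

A receiver that calls verify_payload before acting on an event will ignore deliveries that were tampered with in transit or sent by a party that does not hold the secret.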

Operational risk

One process, one DB file. Failure modes are SQLite failure modes (well-understood). Migrations are dry-run-tested in the maintainer's dogfood loop before every release tag.

Project maturity

v0.7.0 is the third major release. 74 MCP entries at --profile full, 89 HTTP routes (75 unique paths), 80 CLI subcommands (82 with --features sal or --features sal-postgres). CI gates on Linux x64/arm64, macOS x64/arm64, Windows x64, plus iOS + Android cross-compile (Posture 1a, #1068). Test campaigns are reproducible and publicly logged under docs/v0.7.0/test-campaign-*.

Total cost

What it actually costs to run.

There is no SaaS billing surface in ai-memory itself. The only cost you incur is your own hosting plus (optionally) your own LLM call budget, paid directly to whichever vendor you choose.

Deployment matrix

10 deployment postures — from cellphone to private DC.

v0.7.0 unlocks an unusually wide deployment surface because the LLM substrate (provider-agnostic, #1067) and the mobile cross-compile gates (#1068) decouple ai-memory from any single OS, vendor, or hardware floor.

| Posture | Where it runs | CPU / RAM / GPU floor | LLM source | Cost shape |
| --- | --- | --- | --- | --- |
| 1a. Cellphone / tablet | iOS (arm64) + Android (arm64-v8a, armeabi-v7a, x86, x86_64), in-app embed via FFI | 1 core / 256 MB / none | Cloud (xAI / OpenAI / Anthropic / Gemini) | Vendor-metered |
| 1b. Laptop / workstation | macOS arm64/x64, Linux arm64/x64, Windows x64 | 2 core / 4 GB / optional | Local Ollama or cloud | $0 local / vendor-metered cloud |
| 2. CPU-only cloud VPS | Any $5/mo VPS (Linode, Hetzner, DigitalOcean droplet) | 1 vCPU / 1 GB / none | Cloud LLM only (no local inference floor) | ~$5/mo host + LLM-metered |
| 3. CPU-only container (Plan C) | GHCR Docker image, K8s, ECS, Cloud Run | 1 vCPU / 512 MB / none | Cloud LLM (env-injected) | Container-metered + LLM-metered |
| 4. CPU-only sidecar (in-pod) | Sidecar to an existing app pod | 0.25 vCPU / 256 MB / none | Cloud LLM via vendor API | Negligible host + LLM-metered |
| 5. GPU workstation | Dev box with NVIDIA / Apple-Silicon NPU | 4 core / 8 GB / 8 GB VRAM | Local Ollama (gemma3:4b, llama3, qwen3) | $0 marginal once hardware owned |
| 6. GPU server (single-node) | Bare-metal or cloud GPU instance | 8 core / 32 GB / 24 GB VRAM | Local vLLM / Ollama / llama.cpp-server (OpenAI-compatible) | $0.5-3/hr GPU instance |
| 7. Private DC vLLM cluster | On-prem K8s + vLLM autoscaler | Cluster-scale | Self-hosted vLLM (OpenAI-compatible endpoint) | Capex + power; no per-token fees |
| 8. Multi-region federation (T4-T5) | Multi-region quorum sync, per-region LLM choice | 3+ nodes | Mixed: each region picks Ollama / cloud / vLLM independently | Region-aggregated |
| 9. Air-gapped / SCIF | No-internet enclave | Bring your own | Local Ollama or self-hosted vLLM only (no cloud egress) | $0 marginal post-deploy |
| 10. Edge / IoT | arm64 SBC (Raspberry Pi 5, Jetson Nano) | 2 core / 2 GB / optional NPU | Cloud LLM (default) or tiny local model | Hardware + LLM-metered |

The operator picks the LLM source via the universal AI_MEMORY_LLM_BACKEND precedence ladder (CLI flag > env var > config.toml > compiled default). No code changes between postures: same Rust binary, same MCP / HTTP / CLI surfaces.
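The precedence ladder can be sketched as a first-match lookup across the four sources. This is an illustration of the ordering only, assuming AI_MEMORY_LLM_BACKEND is the environment variable name; the config key llm_backend, the compiled default value, and the function name are hypothetical and not taken from ai-memory's code.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical compiled-in fallback; the real default is not stated here.
COMPILED_DEFAULT = "ollama"


def resolve_backend(
    cli_flag: Optional[str],
    env: Mapping[str, str],
    config: Mapping[str, str],
) -> Tuple[str, str]:
    """Return (backend, source) using CLI flag > env var > config.toml > default."""
    candidates = (
        ("cli flag", cli_flag),
        ("env var", env.get("AI_MEMORY_LLM_BACKEND")),
        ("config.toml", config.get("llm_backend")),  # hypothetical key name
    )
    for source, value in candidates:
        if value:  # skip unset or empty values and fall through
            return value, source
    return COMPILED_DEFAULT, "compiled default"


# Example: the env var wins over config.toml when no CLI flag is given.
backend, source = resolve_backend(
    cli_flag=None,
    env={"AI_MEMORY_LLM_BACKEND": "openai"},
    config={"llm_backend": "vllm"},
)
```

Returning the winning source alongside the value is a design choice in this sketch: it makes it easy to log which rung of the ladder decided the backend when the same binary moves between postures.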

Enterprise & federal architecture

The federated story is proven live, not promised.

If your question is "can this run a secure, multi-region fleet of AI agents — teams, swarms, hives — under our compliance regime?", the answer is a public, reproducible artifact rather than a slide deck. The Grand Slam reference architecture is a 15-node, 3-region federated hive (do-1461) with W=2 quorum replication and three independently encrypted legs — each leg proven both positive (traffic flows when keys are right) and negative (traffic refused when they are not).

The same fleet was destroyed and rebuilt from nothing in two independent clean-room rounds; both rounds returned 119/119 verify checks green, and the round-2 fleet passed the 150/150 full-spectrum suite (regression, crypto, federation, zero-touch trust, A2A, AI-NHI, NSA-gap, curator groups). Identity is Ed25519 end to end; enrollment at fleet scale is CA-rooted Zero-Touch Trust; the security posture maps control-by-control onto NSA CSI MCP guidance with live-fleet test citations.
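For readers who want the intuition behind "W=2 quorum replication", the general idea is that a write counts as committed only once at least W replicas acknowledge it. The sketch below shows that generic rule with in-memory stand-ins; the function names and replica model are assumptions for illustration and do not describe ai-memory's federation protocol.

```python
from typing import Callable, Iterable, List

# A replica is modelled as a callable that stores a record or raises on failure.
Replica = Callable[[dict], None]


def quorum_write(replicas: Iterable[Replica], record: dict, w: int = 2) -> bool:
    """Send the record to every replica; succeed if at least w acknowledge."""
    acks = 0
    for send in replicas:
        try:
            send(record)
        except ConnectionError:
            continue  # an unreachable replica simply does not count as an ack
        acks += 1
    return acks >= w


def healthy(store: List[dict]) -> Replica:
    """Build a replica that appends to the given in-memory store."""
    def send(record: dict) -> None:
        store.append(record)
    return send


def unreachable(record: dict) -> None:
    """A replica that is down."""
    raise ConnectionError("replica down")


region_a: List[dict] = []
region_b: List[dict] = []
committed = quorum_write([healthy(region_a), unreachable, healthy(region_b)], {"id": 1})
```

With three replicas and W=2, the write above still commits when one region is unreachable, and fails once two are.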

Roadmap

Where this is going.

v0.7.0 — attested-cortex (shipping)

Ed25519 attestation chain + 25-event hook pipeline + Apache AGE acceleration.

Per-agent signed events, programmable hooks, operator-signed substrate rules L1–L6, capabilities v3, real permission system. 6/6 PASS on the NHI Discovery Gate vs. live xAI Grok 4.3.

v0.8.x — planned

Federated quorum sync hardening, SDK polish, multi-tenant HTTP scaling.

Cross-region replication discipline, signed-link verifiers, first-class Python and TypeScript SDKs, lock-contention work on the shared mutex.

v1.0 — targeted

Wire-format stability, reproducible builds, SOC 2-friendly audit packaging.

Frozen MCP tool + HTTP API contract, reproducible-build verification, evidence-pack tooling for compliance auditors.

Who is using it

The dogfooders.

ai-memory is dogfooded by the maintainer's own multi-agent Claude Code workflow (the same workflow that builds every release). Every release/v0.7.x.y branch runs in real use for at least 24 hours against the operator's live MCP database before a tag is cut. Migration round-trips (now through schema v57) are tested against the operator's own DB on every commit. The project is also exercised by the IronClaw A2A 4-domain campaign, run on Docker against xAI Grok 4.3.

External adoption is still early, and this page says so plainly. If you are evaluating for production, the recommended path is to dogfood it against a non-critical agent first and report any gaps via GitHub issues.

Test results

The evidence behind SHIP.

Every release ships with a public test campaign directory: pinned binary SHA, every test in expected-vs-actual form, every issue closed with retest evidence. The release-gate-final campaign (2026-05-22) returned SHIP-RECOMMENDED on 7,321 PASS / 0 FAIL across 269 test binaries with 22 issues (#1120-#1141) fixed in-campaign — no deferrals to v0.8.0. The subsequent final-baseline regression (2026-05-31, off a pristine volume-wiped rig) ran both backends end to end: 15,951 PASS / 0 FAIL (sqlite 7,458 + Postgres/AGE 8,493), reproduced at 15,952 / 0 on an independent round-2 re-run — full provenance on the frozen-claims evidence page.

Next

If you want to dig deeper.