A single Rust binary that gives Claude, ChatGPT, Cursor, Windsurf, Gemini, Hermes — every MCP-compatible AI — durable, shared memory across sessions, projects, and machines. Local-first. Zero cloud dependencies. Already running on your hardware in 60 seconds.
ai-memory is a self-contained Rust daemon. Every AI tool you use plugs into it via MCP and gets the same memory. Stop losing context every conversation. Stop pasting "remember that I…" into every model. Stop paying SaaS fees for what runs locally on your laptop.
projects/alpha/decisions. clients/acme/contracts. research/quantum/papers. Recall scopes to a subtree, never bleeds across contexts. Your finance memories don't leak into your code reviews.
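Subtree scoping of this kind can be pictured as a segment-wise prefix match on the namespace path. The sketch below is illustrative only: `in_subtree` and `recall_scoped` are hypothetical names, not ai-memory's API, and the real recall path may work differently.

```rust
/// Returns true when `namespace` is `scope` itself or lies beneath it.
/// Hypothetical helper: comparison is per path segment, so the scope
/// `projects/alpha` does not match the namespace `projects/alphabet`.
pub fn in_subtree(namespace: &str, scope: &str) -> bool {
    let mut ns = namespace.split('/').filter(|s| !s.is_empty());
    scope
        .split('/')
        .filter(|s| !s.is_empty())
        .all(|seg| ns.next() == Some(seg))
}

/// Keeps only the memory bodies whose namespace falls inside `scope`.
/// Each input pair is (namespace, body); this shape is an assumption.
pub fn recall_scoped<'a>(memories: &[(&'a str, &'a str)], scope: &str) -> Vec<&'a str> {
    memories
        .iter()
        .filter(|(ns, _)| in_subtree(ns, scope))
        .map(|(_, body)| *body)
        .collect()
}
```

Matching on whole segments rather than raw string prefixes is what keeps sibling namespaces with similar names apart.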
Every link between memories carries valid_from and valid_until. Ask "what was true on Feb 15?" and the system reconstructs the world as you knew it. Supersession is recorded, not destroyed.
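An "as of" query over validity windows might look like the sketch below. Only `valid_from` and `valid_until` come from the text above; the `Link` struct, its other fields, the Unix-second timestamps, and the half-open window are assumptions made for illustration.

```rust
/// A link between two memories with a validity window.
/// Timestamps are Unix seconds (an assumption); `valid_until == None`
/// means the link has not been superseded.
#[derive(Debug, Clone, PartialEq)]
pub struct Link {
    pub from: String,
    pub relation: String,
    pub to: String,
    pub valid_from: i64,
    pub valid_until: Option<i64>,
}

impl Link {
    /// True when `t` falls in the half-open window [valid_from, valid_until).
    pub fn valid_at(&self, t: i64) -> bool {
        self.valid_from <= t && self.valid_until.map_or(true, |end| t < end)
    }
}

/// Returns the links that were valid at time `t`. Superseded links stay in
/// the input slice; they are filtered out of the view, not deleted.
pub fn as_of(links: &[Link], t: i64) -> Vec<&Link> {
    links.iter().filter(|l| l.valid_at(t)).collect()
}
```

Because a superseded link only gains a `valid_until`, the same slice can answer both "what is true now" and "what was true then".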
Every hot path has a published p95 budget. CI fails any pull request that breaks them by more than 10%. memory_session_start < 100ms. memory_recall < 50ms. No silence-by-default.
PERFORMANCE.md table and a CI gate that enforces it.
Each layer has one job. The Surface layer talks to AI clients. The Core layer reasons about memory. The Safety layer enforces what's allowed. The State layer persists everything to disk. Cross-layer dependencies flow downward only.
cp memory.db is a backup.
Four tiers, each adding capability and dependency. Start at keyword for laptop-grade text search with zero install. Climb to autonomous for self-curating memory with neural reranking. Switch tiers per-deployment.
The memory_recall tool returns better results at higher tiers, but always returns results. You can demote at any time; your data is the same on disk.
Every memory follows the same path. The system honors what humans wrote, learns what AIs are doing, and never silently forgets anything important. Compaction is opt-in, archive is reversible, hard delete is your call.
memory_archive_purge (CLI: ai-memory archive purge --older-than-days <N>) — scoped by the older_than_days safety knob and audit-logged.
Numbers below are real measurements, not aspirational. ai-memory bench runs the canonical 1,000-memory workload and reports p50/p95/p99. CI fails any PR that exceeds budget by more than 10%. Hardware baseline: Apple M4 / 32 GB / NVMe SSD.
ai-memory bench on ubuntu-latest and posts a workflow summary with the table above. A regression of more than 10% on any p95 fails the build. There is no "we'll fix the latency later" path.
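The gate logic described here reduces to "compute p95, compare against budget plus 10%". The following is a minimal sketch under stated assumptions: `percentile` and `gate` are hypothetical names, the nearest-rank method is one of several percentile definitions, and nothing here reflects how ai-memory bench is actually implemented.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// Returns None for an empty sample set.
pub fn percentile(samples: &[f64], pct: f64) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let rank = ((pct / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.clamp(1, sorted.len()) - 1])
}

/// Passes with the measured p95, or fails with a message when the p95
/// exceeds `budget_ms` by more than `tolerance` (0.10 means 10%).
pub fn gate(op: &str, samples: &[f64], budget_ms: f64, tolerance: f64) -> Result<f64, String> {
    let p95 = percentile(samples, 95.0).ok_or_else(|| format!("{op}: no samples"))?;
    if p95 > budget_ms * (1.0 + tolerance) {
        Err(format!(
            "{op}: p95 {p95:.1}ms exceeds budget {budget_ms:.1}ms by more than {:.0}%",
            tolerance * 100.0
        ))
    } else {
        Ok(p95)
    }
}
```

With the budgets quoted on this page, `gate("memory_recall", &samples, 50.0, 0.10)` would fail once the measured p95 rises above 55 ms.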
The v0.6.3 coverage campaign took ai-memory from 56.7% line coverage to 93.08% across 9 waves of parallel agent work. 26 closers shipped ~1,200 net new tests over the ~30K-line Rust codebase. Full report: CAMPAIGN-FINAL-METRICS.md
Run ai-memory on N machines. Writes propagate as W-of-N quorum — by default 2 of 3. Reads stay local; writes acknowledge after quorum. Every peer authenticates via mTLS with fingerprint allowlist. Catchup loop closes partition windows automatically.
Both sides verify each other's certificate. Fingerprint allowlist prevents accidental joins. No central PKI required.
The per-peer sync-state cursor advances with each successful pull. A re-joining peer fast-catches up to the current epoch.
Caller sees structured error: which peers responded, which timed out. No silent partial writes.
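The W-of-N decision and the structured failure can be sketched as below. All names (`PeerReply`, `QuorumError`, `decide_write`) are hypothetical, and the sketch assumes the reply list already includes the local node; the actual federation code may count acknowledgements differently.

```rust
/// Outcome of asking one peer to accept a write (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PeerReply {
    Acked,
    TimedOut,
}

/// Structured failure: which peers responded and which timed out.
#[derive(Debug, PartialEq, Eq)]
pub struct QuorumError {
    pub needed: usize,
    pub responded: Vec<String>,
    pub timed_out: Vec<String>,
}

/// Decides a write from per-peer replies. `needed` is W; N is `replies.len()`.
/// Returns the acknowledging peers on success, or a QuorumError otherwise.
pub fn decide_write(
    replies: &[(&str, PeerReply)],
    needed: usize,
) -> Result<Vec<String>, QuorumError> {
    let mut responded = Vec::new();
    let mut timed_out = Vec::new();
    for (peer, reply) in replies {
        match reply {
            PeerReply::Acked => responded.push(peer.to_string()),
            PeerReply::TimedOut => timed_out.push(peer.to_string()),
        }
    }
    if responded.len() >= needed {
        Ok(responded)
    } else {
        Err(QuorumError { needed, responded, timed_out })
    }
}
```

With the default of 2 of 3, one timed-out peer still yields `Ok`, while two timed-out peers yield an error that names them.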
ai-memory runs the same way on a developer's laptop as it does on a federation of state-government data centers. The differentiator is configuration, not code path: federation peers, mTLS allowlists, governance policies, autonomous-tier resources.
ai-memory mcp in the MCP config. Memory survives restarts, updates, machine swaps.
Public release sequence. Each release ships one demoable headline plus operational substrate the next release builds on. No version skipping. No quiet feature drift. Public ROADMAP.md
Honest comparison against the practical alternatives. Each has its place; ai-memory's place is "single binary, local-first, every AI, sub-100ms".
| Capability | ai-memory | Vector DB (Chroma, Qdrant, etc.) | SaaS memory (ChatGPT memory, etc.) | mempalace | Raw text (notes, READMEs) |
|---|---|---|---|---|---|
| AI-agnostic (works with any MCP client) | ✓ | ✗ | ✗ | ✓ | ✓ |
| Cross-session persistence | ✓ | ✓ | ✓ | ✓ | ✓ |
| Hierarchical namespaces | ✓ | ✗ | ✗ | ✓ | ~ |
| Temporal-validity knowledge graph | ✓ | ✗ | ✗ | ~ | ✗ |
| Published latency budgets + CI guard | ✓ | ✗ | ✗ | ✓ | ✗ |
| Hybrid recall (FTS + vector + reranker) | ✓ | ~ | ✗ | ✓ | ✗ |
| Federation across machines | ✓ | ~ | N/A | ✗ | ✗ |
| Local-first · zero cloud deps | ✓ | ~ | ✗ | ✓ | ✓ |
| Single binary install | ✓ | ✗ | N/A | ✗ | N/A |
| mTLS federation | ✓ | ✗ | N/A | ✗ | ✗ |
| Self-curating background daemon | ✓ | ✗ | ✗ | ✗ | ✗ |
| Apache 2.0 OSS · auditable source | ✓ | ~ | ✗ | ✓ | N/A |
| Air-gap deployable | ✓ | ~ | ✗ | ✓ | ✓ |
| Per-namespace governance | ✓ | ✗ | ✗ | ✗ | ✗ |
| Webhook subscriptions for SIEM | ✓ | ✗ | ✗ | ✗ | ✗ |
| Sub-100ms session-start budget | ✓ 42ms | ~ | ~ | ✓ | N/A |
Each step strengthens the trust boundary without breaking the layer below it. The OSS binary is operable at every step. The AgenticMem commercial tiers add managed services on top of what's already shipped.
Every quantitative claim ai-memory makes, sourced from the post-v0.6.3 codebase and the public CAMPAIGN-FINAL-METRICS document.
You have ~30 conversations a day with one or more AIs. Each starts cold. Each ends with knowledge that vanishes. Over a year that's roughly 11,000 lost contexts — a year's worth of relationship-building with the most powerful tool you've ever owned, evaporated every 4 hours.
ai-memory turns those 11,000 cold-starts into one continuous conversation that learns about you over time.
A 25-person engineering team using Claude collectively burns ~600 cold-start latencies per day. At 200ms each, that's 2 minutes/day of pure latency — but the bigger cost is the re-paste: explaining the same project context, repeatedly, to AIs that can't share what they learned.
A federated ai-memory cluster shares understood context across the team. New hires walk into the conversation already in progress.
Every GenAI vendor wants your data. Every compliance officer wants it on your premises. Every architect wants it durable. Every CFO wants it predictable. The only stack that satisfies all four constraints is a local-first memory layer with a published latency contract, sitting underneath whatever AI vendor you happen to use today.
ai-memory is that layer. Apache 2.0. Single binary. mTLS federation. CI-guarded budgets. Auditable from git clone to deployed binary in 60 seconds.
AI mandates are real. Cloud bans are real. Foreign-vendor concerns are real. An OSS Rust binary that runs entirely on your hardware, requires zero outbound traffic, and ships with auditable source code is the only AI-memory primitive that works for federal, state, local, and municipal deployments.
v0.7's attested-identity work targets FIPS-grade key handling. v1.0's federation maturity work targets multi-region resilience. Today's v0.7.0 already runs air-gapped with no compromise.
This page is the hub. Three concentric rings: six audience-facing pages (release spotlight, feature matrix, data flow, integrations, audiences, release pipeline), twelve feature deep-dives (tiers, rules, TTLs, archival, encryption, hierarchies, KG, autonomous, A2A, lifecycle, performance, credits), and five SME-detail references (schema, types, validators, governance, tracing). Pick what your audience needs.
v0.7.x substrate-level coverage of the National Security Agency Cybersecurity Information document on MCP security (U/OO/6030316-26, May 2026). 10 of 10 NSA concerns structurally addressed + 7 of 7 NSA recommendations implemented. Every claim codegraph-verified, file-anchored, independently reproducible by federal procurement reviewers. 47 dedicated tests pin the #1154 daemon-serverInfo signing contract.
74 MCP tools at full / 7 at core, schema v33 → v57, Ed25519 link attestation with V-4 cross-row hash chain, 25-event hook pipeline, sidechain transcripts + memory_replay, postgres + Apache AGE first-class backend, schema-v2 sectioned config (#1146), per-namespace K8 quota dimension (#1156), federation_nonces persistence (#1255), transcript_line_dedup idempotency (#1389), tier-default expiry backfill (#1466), sargable federation-catchup (#1476), 7-level provenance framework. The release for substrate-native attested cortex.
Every MCP tool, every HTTP route, every CLI command — categorized, cross-referenced. 74 MCP tools + 89 HTTP routes (75 unique paths) + 80 CLI subcommands at v0.7.x. The SME's full reference.
Animated write path, read path with adaptive hybrid fusion + 70/30 context blend, federation W=2 quorum diagram. Where every byte goes.
14 AI clients tested, each with a one-block setup snippet, including Claude · ChatGPT · Cursor · Windsurf · Continue · Codex · Gemini · Grok · OpenClaw · Hermes · Llama.
Solo dev → Startup → Mid-market → Enterprise → Government. Pain → Fix → ROI per audience. Deployment patterns. Procurement-ready specs.
Tag → 5 platforms → 5 distribution channels → all signed. CI gates. SBOM. Reproducible builds. Procurement-ready operational spec.
The grand-slam features. Tiers + TTLs (the retention story). Rules (per-namespace policy stack). Archival (two-stage soft-then-hard delete). Encryption (SQLCipher, mTLS, HMAC-mandatory webhooks). Hierarchies (8-level memory trees). Knowledge Graph (6 relation variants incl. reflects_on + derives_from + temporal validity). Autonomous (provider-agnostic LLM substrate — Ollama local OR 15+ cloud vendors via #1067). A2A messaging. Full lifecycle. Performance + bench tool. Credits (Google for Gemma, Nomic, Hugging Face, Ollama, xAI / OpenAI / Anthropic / DeepSeek / Kimi / Qwen / Mistral / Groq / Together / Cerebras / OpenRouter / Fireworks / LMStudio, SQLite, Rust ecosystem).
Short (6h) / Mid (7d) / Long (no TTL). Mirrors human memory architecture. Promotion path with governance gates. The default that lets agents forget noise.
Validation → scope → governance → namespace standard → parent inheritance. Five rule layers, every refusal named with a reason. Multi-tenant isolation, compliance retention, AI-supervisor patterns.
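One way to picture "five rule layers, every refusal named with a reason" is an ordered chain of checks that stops at the first failure and reports which layer refused. This is a sketch only: the layer names follow the list above, but `Write`, `Check`, `Refusal`, and `evaluate` are hypothetical and the checks themselves are left to the caller.

```rust
/// The five rule layers, in the order listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    Validation,
    Scope,
    Governance,
    NamespaceStandard,
    ParentInheritance,
}

/// A refusal always carries the layer that refused and a reason.
#[derive(Debug, PartialEq, Eq)]
pub struct Refusal {
    pub layer: Layer,
    pub reason: String,
}

/// Hypothetical write request; the real request type is not shown here.
pub struct Write {
    pub namespace: String,
    pub content: String,
}

/// A check returns Ok(()) to allow, or Err(reason) to refuse.
pub type Check = fn(&Write) -> Result<(), String>;

/// Runs the layers in order and stops at the first refusal.
pub fn evaluate(write: &Write, layers: &[(Layer, Check)]) -> Result<(), Refusal> {
    for (layer, check) in layers {
        check(write).map_err(|reason| Refusal { layer: *layer, reason })?;
    }
    Ok(())
}
```

Stopping at the first refusal means the caller always sees a single layer and a single reason.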
Per-write expires_at + ttl_secs. Per-tier defaults. Daemon-config overrides. Access-driven extension. archive_on_gc. Every dial that controls memory lifetime.
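How a per-write `ttl_secs` might combine with per-tier defaults to produce `expires_at` can be sketched as follows. The 6h / 7d / no-TTL defaults come from the Tiers description on this page; the `Tier` enum, the function names, and the rule that an explicit TTL wins over the tier default are assumptions, and daemon-config overrides and access-driven extension are left out.

```rust
use std::time::Duration;

/// Memory tiers with the default TTLs quoted on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Short,
    Mid,
    Long,
}

impl Tier {
    /// Short = 6 hours, Mid = 7 days, Long = no TTL.
    pub fn default_ttl(self) -> Option<Duration> {
        match self {
            Tier::Short => Some(Duration::from_secs(6 * 60 * 60)),
            Tier::Mid => Some(Duration::from_secs(7 * 24 * 60 * 60)),
            Tier::Long => None,
        }
    }
}

/// Computes expires_at in Unix seconds. Assumption: an explicit per-write
/// `ttl_secs` takes precedence over the tier default.
pub fn expires_at(tier: Tier, created_at: u64, ttl_secs: Option<u64>) -> Option<u64> {
    ttl_secs
        .or_else(|| tier.default_ttl().map(|d| d.as_secs()))
        .map(|ttl| created_at.saturating_add(ttl))
}
```

A `None` result means the memory never expires on its own, which matches the Long tier when no per-write TTL is given.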
archive → restore or purge. Five archive MCP tools. archive_on_gc soft-delete. auto_purge retention windows. Compliance patterns for GDPR, retention SLAs, forensics.
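The hard-delete stage, scoped by an `older_than_days` cutoff, could be modelled like this. The `older_than_days` knob is named elsewhere on this page; the `Archived` struct, the `purge_older_than` function, and the Unix-second timestamps are assumptions, and audit logging is omitted.

```rust
const SECS_PER_DAY: u64 = 86_400;

/// An archived (soft-deleted) memory. Hypothetical shape for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Archived {
    pub id: String,
    pub archived_at: u64, // Unix seconds
}

/// Splits the archive into (kept, purged). Only entries archived strictly
/// more than `older_than_days` days before `now` are purged.
pub fn purge_older_than(
    archive: Vec<Archived>,
    now: u64,
    older_than_days: u64,
) -> (Vec<Archived>, Vec<Archived>) {
    let cutoff = now.saturating_sub(older_than_days.saturating_mul(SECS_PER_DAY));
    archive.into_iter().partition(|m| m.archived_at >= cutoff)
}
```

Returning the purged entries instead of dropping them silently gives the caller something to write to an audit trail, and everything in the kept set remains restorable.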
SQLCipher AES-256 at-rest. mTLS + fingerprint allowlist for federation. HMAC-SHA256 webhooks. Signed git tags + SBOM. v0.7 Ed25519 attested identity roadmap.
8-level deep namespace paths. 5 visibility scopes (private/team/unit/org/collective). Namespace standards inheritance. memory_get_taxonomy tree walker. v0.6.3 Stream A.
6 relation variants (related_to / supersedes / contradicts / derived_from / reflects_on / derives_from). Entity registry with alias resolution. Temporal validity columns + Ed25519 signature column populated. 11 graph-family MCP tools: memory_kg_query / memory_kg_timeline / memory_kg_invalidate / memory_find_paths / memory_link / memory_get_links / memory_entity_register / memory_entity_get_by_alias / memory_get_taxonomy / memory_replay / memory_verify. Apache AGE acceleration on Postgres backend.
Auto-tag, consolidate, expand-query, contradiction detection, memory reflection, session-start. Backed by any LLM via a [llm] section in ~/.config/ai-memory/config.toml (recommended, post-#1146) or AI_MEMORY_LLM_BACKEND env override — Ollama (default; Gemma local) or xAI Grok / OpenAI / Anthropic / Gemini / DeepSeek / Kimi / Qwen / Mistral / Groq / Together / Cerebras / OpenRouter / Fireworks / LMStudio / vLLM / llama.cpp-server. Local-first by default, cloud-flexible by config. Canonical schema: CONFIG_SCHEMA.md. Per-vendor recipes: integrations/llm-backends.md.
memory_notify pushes to inbox (federation-aware). memory_subscribe webhooks fan out events. HMAC-SHA256 signed dispatch. Two patterns, one toolkit.
store → access → consolidate → promote → archive → restore or purge. Six stages, eleven transitions, every transition leaves an audit trail. Timeline visualization.
Public p95 budgets per operation. bench tool with --baseline / --history / --update-performance-md. CI bench gate that fails on regressions. v0.6.3 Streams E + F.
Open-source acknowledgements. Google for Gemma. Nomic AI for embeddings. Hugging Face for tokenizers + reranker. Ollama for the local-LLM runtime; xAI / OpenAI / Anthropic / Gemini / DeepSeek / Kimi / Qwen / Mistral / Groq / Together / Cerebras / OpenRouter / Fireworks / LMStudio / vLLM for the cloud + self-hosted LLM substrate (#1067). SQLite. Rust ecosystem.
When the audience-facing pages are not enough — when an evaluating engineer needs to see every SQL column, every Rust type, every validator, every governance verdict, every log line. These pages are the reference contract for AI clients integrating against ai-memory.
Every SQLite table column-by-column. Schema v57 (was v33 at v0.6.4). 26-field Memory struct (was 15) incl. reflection_depth, memory_kind, entity_id, persona_version, citations, source_uri, source_span, confidence_source, confidence_signals, confidence_decayed_at, version. 6 MemoryLink variants with Ed25519 attestation columns. v55 current (#1476 sargable list_memories_updated_since federation-catchup rewrite + idx_memories_updated_at); v54 (#1466) backfilled tier-default expiry onto legacy NULL-expiry rows; v53 (#1418) scoped the memories_au FTS5 trigger; v52 (#1389) added the transcript_line_dedup idempotency table; v51 (#1255) added federation_nonces persistence; v50 (#1156) added per-namespace K8 quota dimension on agent_quotas. Postgres mirror via SAL with Apache AGE Cypher graph backend. The persistence contract.
Every public struct and enum from src/models/. 16 enums, 50 structs at v0.7.0. Field-level types, defaults, serde tags. The wire-format contract.
42 validate_* functions, every limit explicit, every closed set enumerated. Defense in depth — same checks on HTTP, MCP, CLI, federation receive.
The decision tree. 4 levels × 3 actions matrix. Three approver flavors (Human, Agent, Consensus). Pending-action lifecycle. Federation propagation.
600+ tracing call sites across the codebase. Setup, level taxonomy, canonical phrases AI clients can grep for (incl. the #1562 store::postgres targets). Incident-review recipes.
@media print stylesheet that strips chrome, switches to white background, and applies page-break-protection. Print as a PDF for board decks, procurement reviews, or a "give this to your security team" packet.
No signup. No telemetry. No SaaS. brew install ai-memory or cargo install ai-memory — your laptop, your data, your AI.