A single Rust binary that gives Claude, ChatGPT, Cursor, Windsurf, Gemini, Hermes — every MCP-compatible AI — durable, shared memory across sessions, projects, and machines. Local-first. Zero cloud dependencies. Already running on your hardware in 60 seconds.
ai-memory is a self-contained Rust daemon. Every AI tool you use plugs into it via MCP and gets the same memory. Stop losing context every conversation. Stop pasting "remember that I…" into every model. Stop paying SaaS fees for what runs locally on your laptop.
projects/alpha/decisions. clients/acme/contracts. research/quantum/papers. Recall scopes to a subtree and never bleeds across contexts. Your finance memories don't leak into your code reviews.
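The subtree rule above is just prefix matching on namespace paths. A minimal sketch (the real matching lives inside the Rust daemon; the helper and record shape here are invented for illustration):

```python
# Hypothetical sketch of subtree-scoped recall: a memory is visible iff its
# namespace equals the recall scope or sits strictly below it in the tree.
def in_scope(memory_ns: str, recall_scope: str) -> bool:
    return memory_ns == recall_scope or memory_ns.startswith(recall_scope + "/")

memories = {
    "projects/alpha/decisions": "use the Postgres mirror",
    "projects/alpha/notes": "standup moved to 10am",
    "clients/acme/contracts": "renewal due in Q3",
}

# Recall scoped to projects/alpha never sees clients/acme.
hits = [ns for ns in memories if in_scope(ns, "projects/alpha")]
```

Note that the `+ "/"` guard matters: without it, a scope of `projects/alpha` would also match a sibling like `projects/alphabet`.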
Every link between memories carries valid_from and valid_until. Ask "what was true on Feb 15?" and the system reconstructs the world as you knew it. Supersession is recorded, not destroyed.
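An as-of query over those validity windows can be sketched like this (the column names valid_from / valid_until come from this page; the record shape and helper are illustrative assumptions):

```python
from datetime import date

# Each edge carries a validity window; supersession closes the old window
# instead of deleting the edge.
edges = [
    {"fact": "alpha uses REST", "valid_from": date(2024, 1, 1), "valid_until": date(2024, 2, 1)},
    {"fact": "alpha uses gRPC", "valid_from": date(2024, 2, 1), "valid_until": None},
]

def as_of(edges, day):
    """Reconstruct the world as known on `day`: an edge is live if `day`
    falls inside its half-open [valid_from, valid_until) window."""
    return [e["fact"] for e in edges
            if e["valid_from"] <= day
            and (e["valid_until"] is None or day < e["valid_until"])]
```

Asking "what was true on Feb 15?" returns the gRPC fact, while Jan 15 returns the superseded REST fact; neither version is destroyed.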
Every hot path has a published p95 budget. CI fails any pull request that exceeds a budget by more than 10%. memory_session_start < 100ms. memory_recall < 50ms. Regressions are never silent.
The budgets live in a PERFORMANCE.md table, and a CI gate enforces them.
Each layer has one job. The Surface layer talks to AI clients. The Core layer reasons about memory. The Safety layer enforces what's allowed. The State layer persists everything to disk. Cross-layer dependencies flow downward only.
cp memory.db is a backup.
Four tiers, each adding capability and dependency. Start at keyword for laptop-grade text search with zero install. Climb to autonomous for self-curating memory with neural reranking. Switch tiers per-deployment.
The memory_recall tool returns better results at higher tiers, but it always returns results. You can demote at any time — your data is the same on disk.
Every memory follows the same path. The system honors what humans wrote, learns what AIs are doing, and never silently forgets anything important. Compaction is opt-in, archive is reversible, hard delete is your call.
Hard delete is memory_archive_purge --older-than-days <N>, and that fires governance approval.
Numbers below are real measurements, not aspirational. ai-memory bench runs the canonical 1,000-memory workload and reports p50/p95/p99. CI fails any PR that exceeds budget by more than 10%. Hardware baseline: Apple M4 / 32 GB / NVMe SSD.
CI runs ai-memory bench on ubuntu-latest and posts a workflow summary with the table above. A regression of more than 10% on any p95 fails the build. There is no "we'll fix the latency later" path.
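The gate logic described above is simple to state: compute p95 over the bench samples and fail if it exceeds the published budget by more than 10%. A sketch under those stated rules (the budget numbers come from this page; the function names and percentile method are illustrative, not the gate's actual implementation):

```python
# Sketch of the CI bench gate: p95 over a sample set, checked against a
# published budget with a 10% tolerance.
def p95(samples_ms):
    s = sorted(samples_ms)
    return s[int(0.95 * (len(s) - 1))]  # nearest-rank style percentile

def gate(op, samples_ms, budget_ms, tolerance=0.10):
    measured = p95(samples_ms)
    if measured > budget_ms * (1 + tolerance):
        raise SystemExit(f"{op}: p95 {measured}ms exceeds {budget_ms}ms budget by >10%")
    return measured
```

A memory_recall run whose p95 lands at 40ms passes the 50ms budget; a memory_session_start run at 200ms against the 100ms budget fails the build.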
The v0.6.3 coverage campaign took ai-memory from 56.7% line coverage to 93.08% across 9 waves of parallel agent work. 26 closers shipped ~1,200 net new tests over the ~30K-line Rust codebase. Full report: CAMPAIGN-FINAL-METRICS.md
Run ai-memory on N machines. Writes propagate as W-of-N quorum — by default 2 of 3. Reads stay local; writes acknowledge after quorum. Every peer authenticates via mTLS with fingerprint allowlist. Catchup loop closes partition windows automatically.
Both sides verify each other's certificate. Fingerprint allowlist prevents accidental joins. No central PKI required.
A per-peer sync-state cursor advances with each successful pull. A re-joining peer fast-catches-up to the current epoch.
The caller sees a structured error: which peers responded, which timed out. No silent partial writes.
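The W-of-N acknowledgement rule above reduces to counting acks. A toy model (W=2, N=3 defaults come from this page; peer transport, mTLS, and retries are elided, and the function shape is invented for illustration):

```python
# Toy model of W-of-N quorum acknowledgement: a write succeeds once W peers
# ack, and the caller always learns who responded and who timed out.
def quorum_write(peer_results, w=2):
    """peer_results maps peer name -> True (ack) or False (timeout).
    Returns (ok, acked, timed_out) -- the structured detail a caller sees."""
    acked = [p for p, ok in peer_results.items() if ok]
    timed_out = [p for p, ok in peer_results.items() if not ok]
    return (len(acked) >= w, acked, timed_out)

# One peer partitioned away: 2 of 3 still acknowledges, so the write lands.
ok, acked, timed_out = quorum_write({"peer-a": True, "peer-b": True, "peer-c": False})
```

With two peers down, the same call returns ok=False and names both timeouts, which is the "no silent partial writes" contract in miniature.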
ai-memory runs the same way on a developer's laptop as it does on a federation of state-government data centers. The differentiator is configuration, not code path: federation peers, mTLS allowlists, governance policies, autonomous-tier resources.
One entry in the MCP config: ai-memory mcp. Memory survives restarts, updates, machine swaps.

Public release sequence. Each release ships one demoable headline plus the operational substrate the next release builds on. No version skipping. No quiet feature drift. Public ROADMAP.md →
Honest comparison against the practical alternatives. Each has its place; ai-memory's place is "single binary, local-first, every AI, sub-100ms".
| Capability | ai-memory | Vector DB (Chroma, Qdrant, etc.) | SaaS memory (ChatGPT memory, etc.) | mempalace | Raw text (notes, READMEs) |
|---|---|---|---|---|---|
| AI-agnostic (works with any MCP client) | ✓ | ✗ | ✗ | ✓ | ✓ |
| Cross-session persistence | ✓ | ✓ | ✓ | ✓ | ✓ |
| Hierarchical namespaces | ✓ | ✗ | ✗ | ✓ | ~ |
| Temporal-validity knowledge graph | ✓ | ✗ | ✗ | ~ | ✗ |
| Published latency budgets + CI guard | ✓ | ✗ | ✗ | ✓ | ✗ |
| Hybrid recall (FTS + vector + reranker) | ✓ | ~ | ✗ | ✓ | ✗ |
| Federation across machines | ✓ | ~ | N/A | ✗ | ✗ |
| Local-first · zero cloud deps | ✓ | ~ | ✗ | ✓ | ✓ |
| Single binary install | ✓ | ✗ | N/A | ✗ | N/A |
| mTLS federation | ✓ | ✗ | N/A | ✗ | ✗ |
| Self-curating background daemon | ✓ | ✗ | ✗ | ✗ | ✗ |
| Apache 2.0 OSS · auditable source | ✓ | ~ | ✗ | ✓ | N/A |
| Air-gap deployable | ✓ | ~ | ✗ | ✓ | ✓ |
| Per-namespace governance | ✓ | ✗ | ✗ | ✗ | ✗ |
| Webhook subscriptions for SIEM | ✓ | ✗ | ✗ | ✗ | ✗ |
| Sub-100ms session-start budget | ✓ 42ms | ~ | ~ | ✓ | N/A |
Each step strengthens the trust boundary without breaking the layer below it. The OSS binary is operable at every step. The AgenticMem commercial tiers add managed services on top of what's already shipped.
Every quantitative claim ai-memory makes, sourced from the post-v0.6.3 codebase and the public CAMPAIGN-FINAL-METRICS document.
You have ~30 conversations a day with one or more AIs. Each starts cold. Each ends with knowledge that vanishes. Over a year that's roughly 11,000 lost contexts — a year's worth of relationship-building with the most powerful tool you've ever owned, evaporated every 4 hours.
ai-memory turns those 11,000 cold-starts into one continuous conversation that learns about you over time.
A 25-person engineering team using Claude collectively burns ~600 cold-start latencies per day. At 200ms each, that's 2 minutes/day of pure latency — but the bigger cost is the re-paste: explaining the same project context, repeatedly, to AIs that can't share what they learned.
A federated ai-memory cluster shares understood context across the team. New hires walk into the conversation already in progress.
Every GenAI vendor wants your data. Every compliance officer wants it on your premise. Every architect wants it durable. Every CFO wants it predictable. The only stack that satisfies all four constraints is a local-first memory layer with a published latency contract — sitting underneath whatever AI vendor you happen to use today.
ai-memory is that layer. Apache 2.0. Single binary. mTLS federation. CI-guarded budgets. Auditable from git clone to deployed binary in 60 seconds.
AI mandates are real. Cloud bans are real. Foreign-vendor concerns are real. An OSS Rust binary that runs entirely on your hardware, requires zero outbound traffic, and ships with auditable source code is the only AI-memory primitive that works for federal, state, local, and municipal deployments.
v0.7's attested-identity work targets FIPS-grade key handling. v1.0's federation maturity work targets multi-region resilience. Today's v0.6.3 already runs air-gapped with no compromise.
This page is the hub. Three concentric rings: six audience-facing pages (release spotlight, feature matrix, data flow, integrations, audiences, release pipeline), twelve feature deep-dives (tiers, rules, TTLs, archival, encryption, hierarchies, KG, autonomous, A2A, lifecycle, performance, credits), and five SME-detail references (schema, types, validators, governance, tracing). Pick what your audience needs.
The 6 streams shipped, the 9 coverage waves, the 2 SSRF defects fixed, the upgrade path. Animated coverage timeline + before/after security diff.
Every MCP tool, every HTTP endpoint, every CLI command — categorized, cross-referenced. 26 + 39 + 28 = 93 operations. The SME's full reference.
Animated write path, read path with hybrid 70/30 fusion, federation W=2 of N=3 quorum diagram. Where every byte goes.
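The 70/30 fusion on the read path can be sketched as a weighted blend of two candidate score maps. The weights come from this page; the normalization, candidate shape, and function name are simplifying assumptions, not the daemon's actual implementation:

```python
# Sketch of hybrid recall fusion: 70% vector similarity, 30% full-text (FTS)
# score, over the union of both candidate sets.
def fuse(vector_scores, fts_scores, w_vec=0.7, w_fts=0.3):
    ids = set(vector_scores) | set(fts_scores)
    fused = {i: w_vec * vector_scores.get(i, 0.0) + w_fts * fts_scores.get(i, 0.0)
             for i in ids}
    return sorted(fused, key=fused.get, reverse=True)

# A memory strong in vector space outranks one found only by keyword match.
ranked = fuse({"m1": 0.9, "m2": 0.2}, {"m2": 1.0, "m3": 0.8})
```

Candidates missing from one index simply contribute zero on that side, so a keyword-only hit still surfaces, just lower.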
13 AI clients tested with one-block setup snippets each. Claude · ChatGPT · Cursor · Windsurf · Continue · Codex · Gemini · Grok · OpenClaw · Hermes · Llama.
Solo dev → Startup → Mid-market → Enterprise → Government. Pain → Fix → ROI per audience. Deployment patterns. Procurement-ready specs.
Tag → 5 platforms → 5 distribution channels → all signed. CI gates. SBOM. Reproducible builds. Procurement-ready operational spec.
The grand-slam features. Tiers + TTLs (the retention story). Rules (per-namespace policy stack). Archival (two-stage soft-then-hard delete). Encryption (SQLCipher, mTLS, HMAC). Hierarchies (8-level memory trees). Knowledge Graph (4 relations + temporal validity). Autonomous (Gemma 4-powered). A2A messaging. Full lifecycle. Performance + bench tool. Credits (Google Gemma 4, Nomic, Hugging Face, Ollama, SQLite, Rust ecosystem).
Short (6h) / Mid (7d) / Long (no TTL). Mirrors human memory architecture. Promotion path with governance gates. The default that lets agents forget noise.
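The tier defaults above map directly to an expiry computation. A minimal sketch, assuming the 6h / 7d / no-TTL values from this page (the lookup table and function are invented for illustration):

```python
from datetime import datetime, timedelta

# Tier defaults from this page: short-term memories expire in 6 hours,
# mid-term in 7 days, long-term never (promotion is gated by governance).
TIER_TTL = {"short": timedelta(hours=6), "mid": timedelta(days=7), "long": None}

def expires_at(tier, stored_at):
    """Compute the expiry timestamp for a memory, or None for no TTL."""
    ttl = TIER_TTL[tier]
    return None if ttl is None else stored_at + ttl
```

A short-tier memory stored at midnight is gone by 6am unless it gets accessed or promoted; a long-tier memory has no expiry at all.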
Validation → scope → governance → namespace standard → parent inheritance. Five rule layers, every refusal named with a reason. Multi-tenant isolation, compliance retention, AI-supervisor patterns.
Per-write expires_at + ttl_secs. Per-tier defaults. Daemon-config overrides. Access-driven extension. archive_on_gc. Every dial that controls memory lifetime.
archive → restore or purge. Five archive MCP tools. archive_on_gc soft-delete. auto_purge retention windows. Compliance patterns for GDPR, retention SLAs, forensics.
SQLCipher AES-256 at-rest. mTLS + fingerprint allowlist for federation. HMAC-SHA256 webhooks. Signed git tags + SBOM. v0.7 Ed25519 attested identity roadmap.
8-level deep namespace paths. 5 visibility scopes (private/team/unit/org/collective). Namespace standards inheritance. memory_get_taxonomy tree walker. v0.6.3 Stream A.
4 relation types. Entity registry with alias resolution. Temporal validity columns. memory_kg_query / kg_timeline / kg_invalidate. v0.6.3 Streams B + C.
Auto-tag, consolidate, expand-query, contradiction detection, memory reflection, session-start. All powered by Google's open-source Gemma 4 via Ollama. Local-first.
memory_notify pushes to inbox (federation-aware). memory_subscribe webhooks fan out events. HMAC-SHA256 signed dispatch. Two patterns, one toolkit.
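Verifying an HMAC-SHA256 signed dispatch on the receiving end looks like this. Only the algorithm comes from this page; the hex encoding and function names are assumptions about a typical webhook receiver, not ai-memory's exact wire format:

```python
import hashlib
import hmac

# Sketch of HMAC-SHA256 webhook signing and verification.
def sign(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, signature: str) -> bool:
    # compare_digest is constant-time, which avoids timing side channels.
    return hmac.compare_digest(sign(secret, body), signature)
```

A SIEM consumer recomputes the signature over the raw request body with the shared secret and drops anything that does not match.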
store → access → consolidate → promote → archive → restore or purge. Six stages, eleven transitions, every transition leaves an audit trail. Timeline visualization.
Public p95 budgets per operation. bench tool with --baseline / --history / --update-performance-md. CI bench gate that fails on regressions. v0.6.3 Streams E + F.
Open-source acknowledgements. Google for Gemma 4. Nomic AI for embeddings. Hugging Face for tokenizers + reranker. Ollama, SQLite, the Rust ecosystem.
When the audience-facing pages are not enough — when an evaluating engineer needs to see every SQL column, every Rust type, every validator, every governance verdict, every log line. These pages are the reference contract for AI clients integrating against ai-memory.
Every SQLite table column-by-column. 9 tables, 15+ indexes, 6 foreign keys, v15 migration timeline. Postgres mirror via SAL. The persistence contract.
Every public struct and enum from src/models.rs. 5 enums, 22 structs, 5 consts. Field-level types, defaults, serde tags. The wire-format contract.
23 validate_* functions, every limit explicit, every closed set enumerated. Defense in depth — same checks on HTTP, MCP, CLI, federation receive.
The decision tree. 4 levels × 3 actions matrix. Three approver flavors (Human, Agent, Consensus). Pending-action lifecycle. Federation propagation.
176 log call sites across 11 modules. Setup, level taxonomy, canonical phrases AI clients can grep for. Incident-review recipes.
@media print stylesheet that strips chrome, switches to white background, and applies page-break-protection. Print as a PDF for board decks, procurement reviews, or a "give this to your security team" packet.
No signup. No telemetry. No SaaS. brew install ai-memory or cargo install ai-memory — your laptop, your data, your AI.