v0.8.0 · distributed-coordination · Apache-2.0

Memory for AI agents that runs on your hardware — and proves what they did.

ai-memory is the endpoint-resident, model-neutral memory layer for AI agents — local-first Rust + SQLite on your own hardware, from servers to mobile to fully air-gapped, with no cloud dependency and no phone-home. Every state-changing operation lands in an append-only, tamper-evident audit chain — enable attestation and each write is Ed25519-signed at the source — so you can reconstruct what any agent knew, and when. Neutral by construction: any MCP client (Claude, GPT, Grok, Gemini, Llama, local models) reads and writes the same store. It doesn't just retrieve — it governs: typed, deterministic refusals at the endpoint, so the substrate can be stopped and audited without corrupting the record.

Local-first · air-gappable
Model-neutral (MCP)
101 MCP tools
92 HTTP routes (78 unique paths)
87 CLI subcommands
Ed25519 attested audit chain
SQLite + Postgres / AGE
NSA CSI MCP 10/10 · 7/7
15,951 / 0 regression suite

Install in 60 seconds · What's new in v0.8.0 · Why ai-memory?

brew install alphaonedev/tap/ai-memory   # or: cargo install ai-memory

Append-only signed_events chain with a SHA-256 cross-row hash — verify any window with ai-memory verify-chain. Local-first; no telemetry, no phone-home. Complements your identity plane (Entra / Okta / SailPoint) — it doesn't replace it. Security is self-attested; no third-party audit yet — read the threat model.
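To make the cross-row hash idea concrete, here is a minimal sketch in Python of an append-only chain where each row's SHA-256 covers the previous row's hash. The row layout, the GENESIS value, and the function names are assumptions for illustration; they are not the signed_events schema, and Ed25519 signing is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first row


def row_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous row's hash together with this row's canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append a row whose hash covers the row before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": row_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every hash in order; an edited, reordered, or dropped interior row fails."""
    prev = GENESIS
    for row in chain:
        if row["prev"] != prev or row["hash"] != row_hash(prev, row["payload"]):
            return False
        prev = row["hash"]
    return True


chain = []
append(chain, {"op": "store", "agent": "a1"})
append(chain, {"op": "link", "agent": "a1"})
print(verify(chain))                  # intact chain
chain[0]["payload"]["op"] = "delete"  # tamper with an early row
print(verify(chain))                  # verification now fails
```

A plain hash chain like this cannot detect truncation of the newest rows on its own; that is one reason a signed or externally anchored head is useful.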

📱 Endpoint-class deployment · phone to planet

ai-memory runs on your cellphone. And on your IoT.

Most "AI memory" stacks assume a cloud server, a vector database, and a network. ai-memory ships a Rust core that cross-compiles to iOS, Android, and embedded Linux — the same substrate, the same typed memory + KG + signed reflections, in your pocket. No cloud round-trip. No data leaving the device unless you say so.

The lib target produces an xcframework for iOS (device + sim arm64 + sim x86_64) and a jniLibs/<abi> layout for Android (4 ABIs). FTS5 + WAL + HNSW vector recall + embedder all run on the device CPU. CI runtime-tests on iOS Simulator and Android emulator gate every release.

iOS

aarch64-apple-ios
device + sim

Android

aarch64-linux-android
4 ABIs / jniLibs

Embedded Linux

arm64 / armv7
Raspberry Pi, Jetson

Edge IoT

offline-first
opt-in federation

Mobile & IoT deployment guide →

Why ai-memory

One substrate. Five deployment scales. Identical semantics.

Most AI tooling forces a tier choice up front: a hobby SDK, a cloud SaaS, or a bespoke enterprise rollout. ai-memory is one substrate — memory_store, memory_recall, memory_link, memory_reflect — that scales from a single laptop with one AI assistant to a global hive of attesting NHI agents. Same tools. Same wire format. Same governance.

T1 · SINGLETON

Laptop / phone

One agent, one device, one DB. Offline-first.

T2 · HOUSEHOLD

LAN / server

Multi-agent, HTTP daemon, shared SQLite.

T3 · ENTERPRISE

Rack / DC

Postgres + Apache AGE, mTLS, governance L1–L6.

T4 · REGION

Swarm

Quorum federation, peer attestation, HMAC webhooks.

T5 · PLANET

Global hive

Cross-region Ed25519-signed sync, forensic export.

Reference architectures T1→T5 →

Features

What ships in the binary.

One ai-memory Rust binary exposes three interfaces over a shared storage layer: MCP stdio JSON-RPC, an HTTP API on port 9077, and a clap-based CLI. Pick one or use all three.

Storage

Typed memory, three tiers

Short-, mid-, and long-lived rows with auto-promotion on access, sliding-window TTL, FTS5 keyword + HNSW semantic recall, and hybrid blended scoring.
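A rough Python sketch of how a sliding-window TTL and promotion on access could interact. Every number here (TTLs, promotion thresholds) and every name (Row, touch, is_expired) is an assumption chosen for illustration, not ai-memory's actual tuning or API.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration only.
TTL_SECONDS = {"short": 3600, "mid": 7 * 86400, "long": None}  # None = no expiry
PROMOTE_AFTER = {"short": 3, "mid": 10}  # accesses needed to move up a tier
NEXT_TIER = {"short": "mid", "mid": "long"}


@dataclass
class Row:
    tier: str
    expires_at: Optional[float]
    hits: int = 0


def touch(row: Row, now: float) -> Row:
    """Record an access: promote the row if it is hot enough, then slide its TTL."""
    row.hits += 1
    threshold = PROMOTE_AFTER.get(row.tier)
    if threshold is not None and row.hits >= threshold:
        row.tier = NEXT_TIER[row.tier]
        row.hits = 0
    ttl = TTL_SECONDS[row.tier]
    row.expires_at = None if ttl is None else now + ttl
    return row


def is_expired(row: Row, now: float) -> bool:
    """Rows without an expiry (the long tier in this sketch) never expire."""
    return row.expires_at is not None and now >= row.expires_at
```

Each access pushes the expiry forward from the time of access, so rows that keep getting recalled stay alive while untouched rows age out.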

Knowledge graph

Typed links, temporal validity

Nine relation kinds (related_to, supersedes, contradicts, derived_from, reflects_on, derives_from, plus the v0.8.0 typed-cognition trio decomposes_into, depends_on, advances) with valid_from / valid_until. Apache AGE acceleration on Postgres.
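As an illustration of temporal validity, the sketch below stores a few typed links in an in-memory SQLite table and asks which ones were valid at a given time. The table layout and the half-open interval convention are assumptions; only the relation names and the valid_from / valid_until columns come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout for illustration.
conn.execute(
    """CREATE TABLE links (
           src TEXT, relation TEXT, dst TEXT,
           valid_from INTEGER NOT NULL,
           valid_until INTEGER          -- NULL means still valid
       )"""
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?, ?)",
    [
        ("m2", "supersedes", "m1", 100, None),
        ("m1", "related_to", "m0", 0, 100),
        ("p1", "decomposes_into", "s1", 50, 200),
    ],
)


def links_at(conn, t):
    """Return links whose window contains t, treating it as [valid_from, valid_until)."""
    cur = conn.execute(
        "SELECT src, relation, dst FROM links "
        "WHERE valid_from <= ? AND (valid_until IS NULL OR ? < valid_until) "
        "ORDER BY src",
        (t, t),
    )
    return cur.fetchall()


print(links_at(conn, 75))   # before m2 superseded m1
print(links_at(conn, 150))  # after the supersedes link became valid
```

Keeping closed-out links in the table, rather than deleting them, is what lets a query reconstruct the graph as it stood at an earlier time.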

Attestation

Ed25519 signed reflections

Per-agent keypair signs every link. Append-only signed_events audit chain with cross-row hash chaining. Forensic-bundle export & verify.

Governance

Substrate rules L1–L6

Operator-signed rule layer enforces policy on every write. Fail-CLOSED by default. A real permission system replaces the v0.6.x advisory governance.

Hooks

Programmable 25-event pipeline

Subprocess JSON-stdio handlers fire on store, recall, reflect, link, GC, federation push. HMAC-signed webhooks with DLQ + replay.

Federation

Quorum sync, peer attestation

mTLS allowlist, per-message nonce freshness, signature bound to body+nonce. Push DLQ. Operator-controlled trust posture per peer.
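The sketch below shows, in Python, what "signature bound to body+nonce" with replay protection can look like. It is not the federation wire format: HMAC-SHA256 stands in for the real signature scheme because it is in the standard library, and the timestamp field, the 30-second window, and all names are assumptions.

```python
import hashlib
import hmac

MAX_SKEW_SECONDS = 30  # assumed freshness window


def sign(key: bytes, body: bytes, nonce: str, ts: int) -> str:
    """MAC over nonce, timestamp, and body, so none can be swapped independently."""
    msg = nonce.encode() + b"\n" + str(ts).encode() + b"\n" + body
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


class Receiver:
    def __init__(self, key: bytes):
        self.key = key
        self.seen = set()  # accepted nonces (unbounded here; prune by age in practice)

    def accept(self, body: bytes, nonce: str, ts: int, sig: str, now: int) -> bool:
        if abs(now - ts) > MAX_SKEW_SECONDS:
            return False  # stale or future-dated message
        if nonce in self.seen:
            return False  # replay of an already accepted message
        if not hmac.compare_digest(sig, sign(self.key, body, nonce, ts)):
            return False  # body, nonce, or timestamp was altered
        self.seen.add(nonce)
        return True
```

Recording the nonce only after the signature check passes keeps an attacker from burning valid nonces with forged messages.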

Provider-agnostic LLM

15+ vendor backends

Ollama, OpenAI, xAI Grok, Anthropic, Gemini, DeepSeek, Kimi, Qwen, Mistral, Groq, Together, Cerebras, OpenRouter, Fireworks, LMStudio. One env var selects.

Capabilities v3

Pre-computed calibration

memory_capabilities returns pre-computed fields (summary, to_describe_to_user, callable_now, agent_permitted_families) so that LLMs converge on accurate first-answer descriptions of what they can do.

Sidechain transcripts

Compressed conversation replay

zstd-3 BLOBs + memory_replay. Full conversation context preserved out-of-band from the memory row.

Recursive learning

Reflect, atomise, consolidate

memory_reflect with a hard depth cap, atomisation into linked atom rows, consolidation, persona generation, and a skills registry. Agents that improve their own memory — under governance.

Coordination

Distributed-coordination primitives

v0.8.0 Pillar-1: a typed action DAG with state machine, leases + heartbeats, Ed25519-signed inter-agent signals, attested checkpoints, and scheduled routines — the substrate for multi-agent orchestration.
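Leases with heartbeats are the piece of this list that is easiest to get subtly wrong, so here is a small Python sketch of the general pattern. The class, method names, and expiry rule are assumptions for illustration; they do not describe ai-memory's own lease API or storage.

```python
from typing import Optional


class Lease:
    """A time-bounded claim on one action; the holder must heartbeat to keep it."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def acquire(self, agent: str, now: float) -> bool:
        live = self.holder is not None and now < self.expires_at
        if live and self.holder != agent:
            return False  # another agent holds a live lease
        self.holder, self.expires_at = agent, now + self.ttl
        return True

    def heartbeat(self, agent: str, now: float) -> bool:
        if self.holder != agent or now >= self.expires_at:
            return False  # lease was lost; the agent must re-acquire
        self.expires_at = now + self.ttl
        return True


lease = Lease(ttl=10.0)
print(lease.acquire("agent-a", now=0.0))    # agent-a claims the action
print(lease.acquire("agent-b", now=5.0))    # refused while agent-a's lease is live
print(lease.heartbeat("agent-a", now=8.0))  # extends the lease from the heartbeat time
```

If a holder stops heartbeating, the lease lapses on its own and another agent can pick up the action without manual cleanup.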

Typed cognition

Goal / Plan / Step lifecycle

v0.8.0 Pillar-2: a typed-cognition vocabulary (Goal/Plan/Step + lifecycle_state) wired into the knowledge graph via the decomposes_into / depends_on / advances relations — plan trees agents can reason over.
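One thing a plan tree makes possible is deriving a safe execution order. The Python sketch below does that for a made-up plan using the standard library's graphlib. The node names, the tuple format, and the reading of "A depends_on B" as "B must finish before A" are assumptions, not the stored representation.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each tuple is (source, relation, target).
edges = [
    ("goal:ship", "decomposes_into", "plan:release"),
    ("plan:release", "decomposes_into", "step:build"),
    ("plan:release", "decomposes_into", "step:test"),
    ("plan:release", "decomposes_into", "step:publish"),
    ("step:test", "depends_on", "step:build"),
    ("step:publish", "depends_on", "step:test"),
]


def execution_order(edges):
    """Order the steps so that each one runs after the steps it depends_on."""
    steps = {
        dst for _, rel, dst in edges
        if rel == "decomposes_into" and dst.startswith("step:")
    }
    graph = {step: set() for step in steps}
    for src, rel, dst in edges:
        if rel == "depends_on":
            graph.setdefault(src, set()).add(dst)  # src waits for dst
    # static_order() raises graphlib.CycleError if the dependencies loop.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(edges))
```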

Observability

Tracing, metrics, forensics

tracing structured logs, a bare /metrics Prometheus surface, the append-only signed-events audit chain, and verifiable forensic-bundle export for incident response.

See what's new in v0.8.0 → · Distributed coordination → · Atlas: everything on one page → · Heterogeneous AI NHI assessment →

Install

Pick your platform. Three minutes to first recall.

A single static Rust binary — no daemons required to get started. The full install matrix (Windows, Docker, .deb, .rpm, source, mobile FFI) lives in the install reference.

macOS · Linux

Homebrew

brew install alphaonedev/tap/ai-memory

Fedora · RHEL

COPR

sudo dnf copr enable alpha-one-ai/ai-memory
sudo dnf install ai-memory

Any platform · Rust

cargo

cargo install ai-memory

Container · Plan C

Docker

docker run --rm -it \
  ghcr.io/alphaonedev/ai-memory:0.8.0

iOS · Android

Mobile FFI

# release artifacts
ai-memory-ios.xcframework.tar.gz
ai-memory-android.tar.gz

Source

Build from git

git clone https://github.com/alphaonedev/ai-memory-mcp
cd ai-memory-mcp && cargo build --release

Quickstart for non-technical users → · 3-minute engineer quickstart

Configure the LLM backend

Plug ai-memory's `smart` / `autonomous` tier into any LLM.

ai-memory ships a provider-agnostic LLM client. Any of 16+ backends (local Ollama, LMStudio, vLLM, llama.cpp server, xAI Grok, OpenAI, Anthropic, Google Gemini, DeepSeek, Kimi (Moonshot), Qwen (DashScope), Mistral, Groq, Together, Cerebras, OpenRouter, Fireworks) can be configured via a [llm] section in ~/.config/ai-memory/config.toml (recommended, post-#1146) or via AI_MEMORY_LLM_BACKEND env vars (override path).

No GPU required. Nothing here is hard-wired to a GPU, to Ollama, or to Gemma; those are the local-first defaults, not requirements. If you run on hosts with no GPUs, have API access, and want autonomous mode with --profile full, point [llm] at a remote cloud API (e.g. OpenRouter as a low-cost example) or an internal air-gapped HA inference endpoint. See the "No GPU required: any LLM backend" walkthrough on the autonomous page.

Single source of truth. config.toml is consumed by every surface — the MCP server, the HTTP daemon, ai-memory atomise, ai-memory curator, the boot banner, and the ai-memory doctor reachability probe — so they all report the same backend. Example for xAI Grok 4.3:

schema_version = 2

[llm]
backend     = "xai"
model       = "grok-4.3"
base_url    = "https://api.x.ai/v1"
api_key_env = "XAI_API_KEY"            # env-var name (inline keys rejected at parse time)

Export XAI_API_KEY in your shell rc; the MCP config stays minimal (no env: block needed). The override path, an env: block on the MCP server config, still works and takes precedence. It is useful for CI and per-session tweaks, but note that shell exports do NOT reach the MCP-spawned subprocess.

Running under launchd / systemd? A serve or curator --daemon started by a service manager does NOT inherit a shell-rc export either — its env comes only from the unit/plist. Prefer api_key_file = "…/api.key" (mode 0400, env-independent) or declare the key in the unit's Environment= / plist EnvironmentVariables dict.

Canonical config schema → · Per-vendor recipes → · Standalone CLI / HTTP daemon setup

Migrating to v0.8.0

Already running v0.6.4 or v0.7.x? The upgrade is one boot.

v0.8.0 migrates your existing database in place. The SQLite schema steps up to v70 on the first ai-memory serve after the upgrade (a v0.7.x DB steps v57 → v70; a v0.6.4 DB walks the full ladder). It is non-destructive: every existing memory carries forward unchanged, archive → restore is lossless, and new columns default to safe values for legacy rows. Typical downtime is 30 seconds to 2 minutes on a laptop-sized database.

Step 1 · Back up

Copy the DB + sidecars

Stop the daemon, then copy ai-memory.db and any -wal / -shm sidecars to a .bak.pre-v08 file. Verify it with PRAGMA integrity_check.

Step 2 · Upgrade

Install v0.8.0 & boot once

Install the new binary, then run ai-memory serve. The schema ladder migrates automatically and you are back online when it finishes.

Step 3 · Roll back if needed

File restore is the escape hatch

The ladder is idempotent on replay but not reversible in place. To roll back, restore the .bak.pre-v08 file and reinstall the prior binary.
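Steps 1 and 3 above can be scripted. The Python sketch below copies the database and its sidecars, checks the copy with PRAGMA integrity_check, and restores it on rollback. The exact backup file names are an assumption (the sidecars are named after the copy so SQLite pairs them with it), and the daemon must already be stopped before either function runs.

```python
import shutil
import sqlite3
from pathlib import Path

SUFFIX = ".bak.pre-v08"
SIDECARS = ("-wal", "-shm")


def backup(db: Path) -> Path:
    """Copy the DB plus any -wal / -shm sidecars. Run only with the daemon stopped."""
    dst = Path(str(db) + SUFFIX)
    shutil.copy2(db, dst)
    for ext in SIDECARS:
        side = Path(str(db) + ext)
        if side.exists():
            # Name the sidecar after the copy so SQLite associates the two.
            shutil.copy2(side, Path(str(dst) + ext))
    return dst


def integrity_ok(db: Path) -> bool:
    """True when PRAGMA integrity_check reports a single 'ok' row."""
    conn = sqlite3.connect(str(db))
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return rows == [("ok",)]


def restore(db: Path) -> None:
    """Rollback: put the pre-upgrade files back, then reinstall the prior binary."""
    src = Path(str(db) + SUFFIX)
    shutil.copy2(src, db)
    for ext in SIDECARS:
        side, target = Path(str(src) + ext), Path(str(db) + ext)
        if side.exists():
            shutil.copy2(side, target)
        elif target.exists():
            target.unlink()  # drop a stale sidecar left by the newer schema
```

Typical use would be `bak = backup(Path("ai-memory.db"))` followed by `integrity_ok(bak)` before installing the new binary; the path shown is a placeholder for wherever your database lives.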

Walk-me-through-it migration guide → · Deep technical companion · Postgres migration

Integrate

Talk to your AI assistant of choice.

ai-memory speaks the Model Context Protocol over stdio JSON-RPC, plus an HTTP REST API for any client that prefers HTTP. Out of the box it pairs with Claude Code, Cursor, ChatGPT desktop, and any MCP-compliant harness.

Anthropic

Claude Code

Cursor-style coding agent with MCP. SessionStart hook auto-loads relevant memory.

Configure →

Anysphere

Cursor

MCP server config. Persistent memory across editor sessions and projects.

Configure →

OpenAI

ChatGPT desktop

MCP integration. Cross-session recall for the desktop client.

Configure →

Standard

Generic MCP

stdio JSON-RPC 2.0. Any harness that implements MCP works unchanged.
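For a sense of what travels over stdio, this Python sketch builds one JSON-RPC 2.0 request using MCP's tools/call method. The tool name memory_store comes from this page, but the argument names are placeholders rather than the documented schema, and a real session starts with MCP's initialize handshake before any tool call.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def tools_call(tool: str, arguments: dict) -> str:
    """Build one newline-terminated JSON-RPC 2.0 request for MCP's tools/call."""
    msg = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(msg) + "\n"


# Placeholder argument names, not the documented schema.
frame = tools_call("memory_store", {"title": "deploy notes", "content": "use port 9077"})
print(frame, end="")
```

A harness would write this frame to the server's stdin and read the matching response, identified by the same id, from its stdout.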

Configure →

HTTP

REST API

92 routes (78 unique paths) at /api/v1/. Axum daemon on port 9077. mTLS optional.

API reference →

CLI

Command line

87 subcommands (89 with --features sal), optional --json. For shell scripts, cron, and humans.

CLI reference →

Full integration guide — any AI, any harness →

Docs by audience

Pick your role. We will get to the point.

Four pathways, each answering one question in the first paragraph. Reference material lives in the linked docs; the audience pages stay short.

Everyone · non-technical

I want to talk to my AI and have it remember.

Install ai-memory in 60 seconds. Plug it into Claude Code or ChatGPT desktop. Your assistant remembers across sessions. No databases, no servers, no jargon.

Plain-language tour →

Decision maker · C-level / procurement

I need the risk, cost, and compliance picture.

Apache-2.0, no SaaS billing surface, vendor-independent LLM layer, NSA CSI MCP mapping, and a federated reference architecture proven live on a 15-node, 3-region fleet.

Decision-maker brief →

Developer · integration

I am building with ai-memory.

Three interfaces: MCP, HTTP, CLI. Persistent typed memory for your AI app, agent, or harness. Generic MCP works out of the box; Claude Code, Cursor, ChatGPT desktop, and others have first-class recipes.

Developer deep-dive →

Operator · SME engineer / architect

I am deploying across racks, DCs, regions.

Postgres + Apache AGE backend, mTLS daemon, federated quorum, governance L1–L6, Ed25519 attestation, swarm + hive topologies. Reproducible reference architectures T3–T5.

Operator deep-dive →

Working refs: install-quickstart · integration-guide · enterprise-deployment. AI-NHI design essays: What · How · Why. A2A federation: see A2A messaging for the federation peer-to-peer recall walkthrough.

Reference architectures

Five topologies. One playbook each.

Each tier has a reproducible reference deployment: which interfaces are enabled, which backend is selected, which governance posture, which federation mode. Pick the tier that fits today; the upgrade path to the next tier is documented.

Singleton

One agent · one DB · offline

Household

LAN · multi-agent · SQLite

Enterprise

Postgres+AGE · mTLS · rack

Region

Swarm · quorum · webhooks

Planet

Global hive · cross-region sync

GRAND SLAM

3-Region Hive

15 nodes · W=2 quorum · proven live (do-1461)

Deep dives

Topology diagrams
Architecture diagrams, capacity guidance, failure-mode notes for each tier.

T1→T5 narrative
Walks the continuum: which capability lights up at which tier.

Grand Slam reference architecture
The live-proven 3-region Batman-mode AI Agent Hive: 15 nodes, 9 federated peers, three encryption legs, each proven positive + negative (do-1461).

Agent hierarchies
Hive / swarm / mesh patterns and the NHI roles they assume.

Federation
Quorum sync, peer attestation, mTLS allowlist, nonce + signature posture.

Distributed coordination
v0.8.0 Pillar-1 + Pillar-2: action DAG, leases, signed signals, attested checkpoints, routines, and the Goal/Plan/Step typed-cognition lifecycle.

Zero-Touch Trust
CA-rooted federation identity at scale: O(1) enrollment, short-lived auto-rotating credentials, hierarchical trust. OS-agnostic.

Reproducible baselines
Three reproducibility contracts: two-round clean-room 0→60 fleet reproduction (do-1461, 119/119 verify checks green both rounds), bench p95 baselines + regression gates, the release-gate full-suite contract.

Admin & operator

Run it in production.

Operator-grade documentation for SRE, platform, and security teams. Hardening, observability, upgrade paths, governance rule signing, forensic export — everything you would expect for a substrate that holds your agents' identities.

Production deployment
Daemon, systemd unit, TLS, mTLS, hardening, observability, Plan C container.

Admin guide
Backup/restore, schema migrations, GC tuning, archive policy.

Governance L1–L6
Substrate rules, operator-signed policy, fail-closed posture.

Encryption at rest
sqlcipher build, passphrase-file discipline, key directory layout.

Forensic export
Verifiable bundle export for audit, compliance, incident response.

Tracing & metrics
Structured tracing, Prometheus /metrics, signed-event audit chain.

Troubleshooting
Common errors, recovery procedures, diagnostic CLI flags.

Migration v0.6.4 → v0.7.0
Step-by-step migration, schema changes, breaking-change list.

v0.8.0 release notes
What changed in the current release: distributed coordination, typed cognition, federation hardening. Track-by-track ship status.

Reference

Dig deeper.

The full docs surface — concept pages, API, guides, ADRs — lives alongside the source.

Compliance & Procurement
NSA CSI MCP mapping · honest limitations · Memory Portability Spec v1

Evidence: frozen claims
Every public number, pinned to a commit + test artifact. 15,951/0 final baseline.

User guide
End-to-end MCP tool reference

Developer guide
Build from source, extend, add tools

HTTP API
92 routes (78 unique paths)

CLI reference
87 subcommands

Glossary
Every concept, one place

Knowledge graph
Typed links, temporal validity

Agent identity (NHI)
agent_id semantics

Hook pipeline
27 events, JSON-stdio handlers

Confidence calibration
Auto-confidence, shadow, decay

Atomisation (WT-1)
Atom rows, parent links

Engineering standards
Code, test, security, release

Engineering discipline
The #1558 literal burn-down: 497 → 28 baseline entries (−94%), ~2,700 sites routed to SSOT consts, CI ratchet + committed census

Atlas
Everything, one page

hive-1461 DO baseline
7-node federated DigitalOcean baseline · 40/40 validate · 22/22 full-spectrum

docker-1461 local baseline
2-peer federated local-Docker baseline · 20/20 validate · 25/25 full-spectrum
