SHIPS TODAY · v0.6.3

Tier 1 — Single node, single agent

The bedrock primitive. One process. One consumer. Zero network.

SQLite WAL · FTS5 + HNSW · ~10⁶ memories · sub-ms recall

This is what runs on every developer's laptop, every Claude Code session, every solo agent that needs to remember things between turns. SQLite under the hood, WAL-mode for atomic writes, FTS5 for keyword recall, an in-process HNSW vector index for semantic recall. No replication, no peers, no governance to coordinate.

Architecture diagram

Memory data flow — single agent.

[Diagram: an AI agent (Claude · GPT · local LLM) talks to the ai-memory process over MCP stdio JSON-RPC, HTTP :9077 /api/v1/* (Axum), or CLI (clap subcommands), all within a single namespace. Inside the process, a single Arc<Mutex<Connection>> fronts the recall pipeline (FTS5 keyword + HNSW semantic, adaptive blend, touch + auto-promote), the governance gate (policy / pending, scope filter, v0.6.2+), and the knowledge graph (links, taxonomy, temporal validity, v0.6.3). Persistent I/O lands in SQLite (WAL) — memories, memory_links, FTS5 virtual table, archive — with the HNSW index held in memory.]
Agent talks to ai-memory over MCP, HTTP, or CLI. ai-memory holds a single SQLite connection behind a mutex. Every recall is a hot loop — FTS5 + HNSW + scoring + access-count touch — all in-process.
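The single-connection-behind-a-mutex shape is easy to sketch in Python (the real implementation is Rust; the class and method names here are illustrative, not ai-memory's API):

```python
import sqlite3
import threading

class MemoryDb:
    # One connection, one lock: every caller serializes, mirroring the
    # Arc<Mutex<Connection>> that ai-memory holds (T1 ceiling: 1 writer).
    def __init__(self, path=":memory:"):
        self._lock = threading.Lock()
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute("PRAGMA journal_mode=WAL")  # atomic, crash-safe writes

    def execute(self, sql, params=()):
        # The whole recall hot loop runs under this one lock in-process.
        with self._lock:
            cur = self._conn.execute(sql, params)
            self._conn.commit()
            return cur.fetchall()
```

The lock is what makes "concurrent writers: 1" an honest ceiling rather than a bug: a second writer doesn't corrupt anything, it just waits.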
Walkthrough

What's actually happening.

A write — memory_store

  1. The agent calls memory_store over MCP (or POST /api/v1/memory over HTTP, or ai-memory store ... from the CLI).
  2. The handler validates the input (src/validate.rs), then acquires the connection mutex.
  3. Governance gate runs (governance::check() in src/db.rs). For T1 with a single agent, the default policy returns Allow immediately. With a stricter policy, the write may be queued as a PendingAction (see Tier 2 for that flow).
  4. Memory row is INSERTed (or UPDATEd on (title, namespace) collision). FTS5 is kept in sync via triggers. The scope_idx generated column gets recomputed automatically.
  5. If the feature tier is semantic or above, the embedding is generated and added to the in-memory HNSW index.
  6. WAL is fsynced; the call returns.

A recall — memory_recall

  1. Agent calls memory_recall with a context string.
  2. FTS5 keyword pass — fuzzy OR query, scored by fts.rank + priority*0.5 + access_count*0.1 + confidence*2.0 + tier_bonus + recency_factor.
  3. Semantic pass — cosine similarity via HNSW (or linear scan fallback in the keyword tier).
  4. Adaptive blend — semantic weight varies 0.50 (short content) → 0.15 (long content) because embeddings lose information on long text. Final score is semantic_weight * cosine + (1 - semantic_weight) * norm_fts.
  5. Touch — every returned memory gets its access_count incremented and its TTL extended; mid-tier memories are auto-promoted to long-tier on the 5th access. Priority bumps every 10 accesses.
  6. Result formatted as toon_compact (40-60% smaller than JSON), toon, or json.
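The scoring in steps 2 and 4 can be written out as plain functions. Only the formulas and the 0.50 → 0.15 endpoints come from the text above; the content-length breakpoints for the weight ramp are assumptions:

```python
def fts_score(rank, priority, access_count, confidence, tier_bonus, recency_factor):
    # Step 2: keyword-pass score.
    return (rank + priority * 0.5 + access_count * 0.1
            + confidence * 2.0 + tier_bonus + recency_factor)

def semantic_weight(content_len, short=200, long=2000):
    # Step 4: weight slides from 0.50 (short content) to 0.15 (long content),
    # since embeddings lose information on long text. The 200/2000-char
    # breakpoints are illustrative, not from the source.
    if content_len <= short:
        return 0.50
    if content_len >= long:
        return 0.15
    frac = (content_len - short) / (long - short)
    return 0.50 - frac * (0.50 - 0.15)

def blended_score(cosine, norm_fts, content_len):
    # Final score: semantic_weight * cosine + (1 - semantic_weight) * norm_fts.
    w = semantic_weight(content_len)
    return w * cosine + (1 - w) * norm_fts
```

For a short memory, a perfect cosine match contributes half the final score; for a long one, the keyword signal dominates at 85%.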

That's it. No network, no peers, no consensus.
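The touch semantics in step 5, as a sketch (the field names and the fixed TTL extension are assumptions):

```python
def touch(mem, ttl_extension=86_400):
    # Every returned memory is touched: bump access_count, extend TTL.
    mem["access_count"] += 1
    mem["ttl"] += ttl_extension  # hypothetical fixed extension in seconds
    # Mid-tier memories auto-promote to long-tier on the 5th access.
    if mem["tier"] == "mid" and mem["access_count"] >= 5:
        mem["tier"] = "long"
    # Priority bumps every 10 accesses.
    if mem["access_count"] % 10 == 0:
        mem["priority"] += 1
    return mem
```

The effect is that recall is not read-only: memories that keep getting used climb the tiers and survive longer, without any explicit curation by the agent.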

Deployment recipe

Install, run, verify.

# install
cargo install ai-memory --version 0.6.3

# run as MCP server (for Claude Code, etc.)
ai-memory --db ~/.claude/ai-memory.db mcp --tier semantic

# OR run as HTTP daemon
ai-memory --db ~/.claude/ai-memory.db serve --bind 127.0.0.1:9077 --tier semantic

# verify
ai-memory --db ~/.claude/ai-memory.db stats

A single config file at ~/.config/ai-memory/config.toml covers tier selection, embedding model, Ollama URL (for the smart/autonomous tiers), and DB path.
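A config covering those four concerns might look like the following — every key name here is a guess from the list above, not the shipped schema, so check the generated defaults rather than copying this:

```toml
# ~/.config/ai-memory/config.toml — illustrative only; key names are assumptions
db_path = "~/.claude/ai-memory.db"
tier = "semantic"                      # e.g. keyword | semantic | smart | autonomous

[embedding]
model = "all-MiniLM-L6-v2"             # hypothetical 384-dim default
ollama_url = "http://127.0.0.1:11434"  # used by the smart/autonomous tiers
```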

Wiring

Governance, skills, and attestations at T1.

Limits

Honest ceilings.

Dimension          | T1 ceiling                                        | When it bites
Concurrent writers | 1 (mutex-serialized)                              | If a second writer joins, you've graduated to T2
Total memories     | ~10⁶ before HNSW RAM pinches (~1.5 GB at 384-dim) | Vector index lives in-process
FTS5 query latency | sub-millisecond at 10⁵ rows                       | Stays fast through 10⁶
Crash recovery     | WAL replay on next open                           | Atomic — never lose committed writes
Network exposure   | None unless you bind HTTP to non-loopback         | Default is 127.0.0.1

When you hit any of these, walk to Tier 2 (more agents on this same node) or Tier 3 (more nodes).
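The ~1.5 GB ceiling is easy to sanity-check: raw f32 vectors alone cost n × dim × 4 bytes. The graph-link term below is an assumption (HNSW's real overhead depends on its M parameter and layer structure):

```python
def hnsw_ram_bytes(n_vectors, dim, bytes_per_float=4, neighbors_per_node=16):
    vectors = n_vectors * dim * bytes_per_float   # raw f32 embeddings
    links = n_vectors * neighbors_per_node * 4    # u32 graph links, assumed M=16
    return vectors + links

# 10^6 memories at 384 dims: ~1.54 GB of vectors plus ~0.06 GB of links
print(hnsw_ram_bytes(1_000_000, 384))
```

That is why the ceiling is a RAM number, not a disk number: SQLite pages in from disk, but the vector index must fit in the process.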

Source

Source-of-truth references.