SHIPS TODAY · v0.6.3

Tier 1 — Single node, single agent

The bedrock primitive. One process. One consumer. Zero network.

SQLite WAL · FTS5 + HNSW · ~10⁶ memories · sub-ms recall

This is what runs on every developer's laptop, every Claude Code session, every solo agent that needs to remember things between turns. SQLite under the hood, WAL-mode for atomic writes, FTS5 for keyword recall, an in-process HNSW vector index for semantic recall. No replication, no peers, no governance to coordinate.

Architecture diagram

Memory data flow — single agent.

[Diagram: an AI agent (Claude · GPT · local LLM) talks to the ai-memory process over MCP stdio JSON-RPC, HTTP :9077 /api/v1/* (Axum), or CLI (clap subcommands), all within a single namespace. Inside the process, a single Arc<Mutex<Connection>> fronts the recall pipeline (FTS5 keyword + HNSW semantic, adaptive blend, touch + auto-promote), the governance gate (policy / pending, scope filter, v0.6.2+), and the knowledge graph (links, taxonomy, temporal validity, v0.6.3). Persistent I/O lands in SQLite (WAL) — memories, memory_links, FTS5 virtual table, archive — with the HNSW index held in memory.]
Agent talks to ai-memory over MCP, HTTP, or CLI. ai-memory holds a single SQLite connection behind a mutex. Every recall is a hot loop — FTS5 + HNSW + scoring + access-count touch — all in-process.
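The single-connection-behind-a-mutex shape is easy to sketch in Python (the real implementation is Rust; the class and method names here are illustrative, not ai-memory's API):

```python
import sqlite3
import threading

class MemoryDb:
    # One connection, one lock: every caller serializes, mirroring the
    # Arc<Mutex<Connection>> that ai-memory holds (T1 ceiling: 1 writer).
    def __init__(self, path=":memory:"):
        self._lock = threading.Lock()
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute("PRAGMA journal_mode=WAL")  # atomic, crash-safe writes

    def execute(self, sql, params=()):
        # The whole recall hot loop runs under this one lock in-process.
        with self._lock:
            cur = self._conn.execute(sql, params)
            self._conn.commit()
            return cur.fetchall()
```

The lock is what makes "concurrent writers: 1" an honest ceiling rather than a bug: a second writer doesn't corrupt anything, it just waits.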
Walkthrough

What's actually happening.

A write — memory_store

  1. The agent calls memory_store over MCP (or POST /api/v1/memory over HTTP, or ai-memory store ... from the CLI).
  2. The handler validates the input (src/validate.rs), then acquires the connection mutex.
  3. Governance gate runs (governance::check() in src/db.rs). For T1 with a single agent, the default policy returns Allow immediately. With a stricter policy, the write may be queued as a PendingAction (see Tier 2 for that flow).
  4. Memory row is INSERTed (or UPDATEd on (title, namespace) collision). FTS5 is kept in sync via triggers. The scope_idx generated column gets recomputed automatically.
  5. If the feature tier is semantic or above, the embedding is generated and added to the in-memory HNSW index.
  6. WAL is fsynced; the call returns.

A recall — memory_recall

  1. Agent calls memory_recall with a context string.
  2. FTS5 keyword pass — fuzzy OR query, scored by fts.rank + priority*0.5 + access_count*0.1 + confidence*2.0 + tier_bonus + recency_factor.
  3. Semantic pass — cosine similarity via HNSW (or linear scan fallback in the keyword tier).
  4. Adaptive blend — semantic weight varies 0.50 (short content) → 0.15 (long content) because embeddings lose information on long text. Final score is semantic_weight * cosine + (1 - semantic_weight) * norm_fts.
  5. Touch — every returned memory gets its access_count incremented and its TTL extended; mid-tier memories are auto-promoted to long-tier on the 5th access. Priority bumps every 10 accesses.
  6. Result formatted as toon_compact (40-60% smaller than JSON), toon, or json.
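The scoring in steps 2 and 4 can be written out as plain functions. Only the formulas and the 0.50 → 0.15 endpoints come from the text above; the content-length breakpoints for the weight ramp are assumptions:

```python
def fts_score(rank, priority, access_count, confidence, tier_bonus, recency_factor):
    # Step 2: keyword-pass score.
    return (rank + priority * 0.5 + access_count * 0.1
            + confidence * 2.0 + tier_bonus + recency_factor)

def semantic_weight(content_len, short=200, long=2000):
    # Step 4: weight slides from 0.50 (short content) to 0.15 (long content),
    # since embeddings lose information on long text. The 200/2000-char
    # breakpoints are illustrative, not from the source.
    if content_len <= short:
        return 0.50
    if content_len >= long:
        return 0.15
    frac = (content_len - short) / (long - short)
    return 0.50 - frac * (0.50 - 0.15)

def blended_score(cosine, norm_fts, content_len):
    # Final score: semantic_weight * cosine + (1 - semantic_weight) * norm_fts.
    w = semantic_weight(content_len)
    return w * cosine + (1 - w) * norm_fts
```

For a short memory, a perfect cosine match contributes half the final score; for a long one, the keyword signal dominates at 85%.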

That's it. No network, no peers, no consensus.
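The touch semantics in step 5, as a sketch (the field names and the fixed TTL extension are assumptions):

```python
def touch(mem, ttl_extension=86_400):
    # Every returned memory is touched: bump access_count, extend TTL.
    mem["access_count"] += 1
    mem["ttl"] += ttl_extension  # hypothetical fixed extension in seconds
    # Mid-tier memories auto-promote to long-tier on the 5th access.
    if mem["tier"] == "mid" and mem["access_count"] >= 5:
        mem["tier"] = "long"
    # Priority bumps every 10 accesses.
    if mem["access_count"] % 10 == 0:
        mem["priority"] += 1
    return mem
```

The effect is that recall is not read-only: memories that keep getting used climb the tiers and survive longer, without any explicit curation by the agent.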

Deployment recipe

Install, run, verify.

# install
cargo install ai-memory --version 0.6.3

# run as MCP server (for Claude Code, etc.)
ai-memory --db ~/.claude/ai-memory.db mcp --tier semantic

# OR run as HTTP daemon
ai-memory --db ~/.claude/ai-memory.db serve --bind 127.0.0.1:9077 --tier semantic

# verify
ai-memory --db ~/.claude/ai-memory.db stats

A single config file at ~/.config/ai-memory/config.toml covers tier selection, embedding model, Ollama URL (for the smart/autonomous tiers), and DB path.
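A config covering those four concerns might look like the following — every key name here is a guess from the list above, not the shipped schema, so check the generated defaults rather than copying this:

```toml
# ~/.config/ai-memory/config.toml — illustrative only; key names are assumptions
db_path = "~/.claude/ai-memory.db"
tier = "semantic"                      # e.g. keyword | semantic | smart | autonomous

[embedding]
model = "all-MiniLM-L6-v2"             # hypothetical 384-dim default
ollama_url = "http://127.0.0.1:11434"  # used by the smart/autonomous tiers
```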

Wiring

Governance, skills, and attestations at T1.

Limits

Honest ceilings.

Dimension          | T1 ceiling                                        | When it bites
Concurrent writers | 1 (mutex-serialized)                              | If a second writer joins, you've graduated to T2
Total memories     | ~10⁶ before HNSW RAM pinches (~1.5 GB at 384-dim) | Vector index lives in-process
FTS5 query latency | sub-millisecond at 10⁵ rows                       | Stays fast through 10⁶
Crash recovery     | WAL replay on next open                           | Atomic — never lose committed writes
Network exposure   | None unless you bind HTTP to non-loopback         | Default is 127.0.0.1

When you hit any of these, walk to Tier 2 (more agents on this same node) or Tier 3 (more nodes).
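The ~1.5 GB ceiling is easy to sanity-check: raw f32 vectors alone cost n × dim × 4 bytes. The graph-link term below is an assumption (HNSW's real overhead depends on its M parameter and layer structure):

```python
def hnsw_ram_bytes(n_vectors, dim, bytes_per_float=4, neighbors_per_node=16):
    vectors = n_vectors * dim * bytes_per_float   # raw f32 embeddings
    links = n_vectors * neighbors_per_node * 4    # u32 graph links, assumed M=16
    return vectors + links

# 10^6 memories at 384 dims: ~1.54 GB of vectors plus ~0.06 GB of links
print(hnsw_ram_bytes(1_000_000, 384))
```

That is why the ceiling is a RAM number, not a disk number: SQLite pages in from disk, but the vector index must fit in the process.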

Source

Source-of-truth references.