# Developer Guide

## Architecture Overview

`ai-memory` is an AI-agnostic memory management system built as a single Rust binary that serves three roles:

1. **MCP tool server** -- stdio JSON-RPC server exposing 74 advertised entries at `--profile full` (73 callable memory tools + the always-on `memory_capabilities` bootstrap) + 2 MCP prompts for any MCP-compatible AI client (Claude AI, OpenAI ChatGPT, xAI Grok, Meta Llama, and others)
2. **CLI tool** -- direct SQLite operations for store, recall, search, list, etc. (completely AI-agnostic)
3. **HTTP daemon** -- an Axum web server exposing the same operations as a REST API with 89 route registrations / 75 unique URL paths at v0.7.0 (completely AI-agnostic)

**Key architectural features:** Zero token cost (no context loaded until recall), TOON compact default response format (79% smaller than JSON), MCP prompts capability (`recall-first` behavioral rules + `memory-workflow` reference card), 4 feature tiers with optional local LLMs via Ollama, true dedup on title+namespace, 6-factor recall scoring with score field in responses.

All three interfaces share the same storage layer (`src/storage/`, exposed as the `db` alias) and validation layer (`validate.rs`). The daemon adds automatic garbage collection (every 30 minutes) and graceful shutdown with WAL checkpointing.

```
main.rs            -- Thin CLI shim (W6 refactor); top-level Command enum now lives in daemon_runtime.rs (82 subcommands at v0.7.x with --features sal, 80 in the default build)
daemon_runtime.rs  -- HTTP daemon `serve` bootstrap, MCP `mcp` dispatch, top-level clap Command enum
models/            -- Data structures: Memory (26 fields at v0.7.0), MemoryLink (6 relations at v0.7.0), MemoryKind (Batman Form-6 vocab), Citation/SourceSpan (Form-4), query types, constants
handlers/          -- HTTP request handlers split per domain (http.rs, federation_receive.rs, hook_subscribers.rs, transport.rs, plus per-surface modules: recall.rs, memories.rs, admin.rs, kg.rs, …); Axum extractors + JSON responses; error sanitization. Route-path SSOT in handlers/routes.rs (#1558 batch 4 — one const per production route path; lib.rs registers them, the postgres gate / federation receiver / doctor match on them)
storage/           -- sqlite SQL primitives; CRUD, FTS5, recall scoring, GC, migration (CURRENT_SCHEMA_VERSION = 57)
store/             -- SAL `MemoryStore` trait + adapter implementations (sqlite + postgres + AGE feature gates); new DB operations land here FIRST (post-#961)
mcp/               -- MCP server over stdio JSON-RPC; tool registry (registry.rs incl. the tool_names const module), per-tool handlers under tools/, JSON-RPC wire-constant SSOT (mcp/jsonrpc.rs, #1558 batch 3 — version tag, reserved error codes, method names), tool-call param-name SSOT (mcp/param_names.rs), notification handling
identity/          -- NHI identity: keypair storage (keypair.rs — DAEMON_KEYPAIR_LABEL), reserved-principal sentinel SSOT (sentinels.rs, #1558 batch 2 — DAEMON_PRINCIPAL, ANONYMOUS_INVALID, …; validate::RESERVED_AGENT_IDS is built from these), attestation (attest.rs), signing/verification (sign.rs/verify.rs), replay protection (replay.rs)
models/field_names.rs -- wire-field-name SSOT (one const per JSON response key shared across handlers/tools)
validate.rs        -- Input validation for all write paths (RequestValidator + single-field free fns)
errors.rs          -- Structured error types (ApiError, MemoryError), error_codes consts, the errors::msg wire-message const module, error sanitization for HTTP responses
color.rs           -- ANSI color output for CLI (zero dependencies, auto-detects terminal)
config.rs          -- Tier configuration system (keyword, semantic, smart, autonomous), feature gating, TtlConfig, archive_on_gc
embeddings.rs      -- Embedding pipeline: HuggingFace model loading, vector generation, cosine similarity
llm.rs             -- Provider-agnostic LLM client (Ollama-native or OpenAI-compatible backends) for query expansion, auto-tagging, contradiction detection
mine.rs            -- Retroactive conversation import from Claude, ChatGPT, and Slack exports
reranker.rs        -- Hybrid recall algorithm: blends semantic (embedding) and keyword (FTS5) scores
hnsw.rs            -- In-memory HNSW vector index for approximate nearest-neighbor search
governance/        -- Rule engine, agent-action evaluator, signed rule storage (L1-6 substrate rules)
atomisation/       -- WT-1 atomiser engine + LlmCurator
multistep_ingest/  -- Form 3 multi-step ingest orchestrator (two-phase deterministic + LLM)
synthesis/         -- Form 1 online dedup-and-synthesis
confidence/        -- Form 5 auto-confidence + shadow + decay
persona/           -- QW-2 persona-as-artifact generator
offload/           -- QW-3 context-offload primitive + TTL sweep
forensic/          -- L2-5 forensic bundle export/verify
federation/        -- Quorum sync, peer attestation, mTLS allowlist
kg/                -- Knowledge-graph traversal (recursive-CTE + AGE Cypher)
subscriptions.rs   -- HMAC-signed webhook dispatch (mandatory at v0.7.0 post R3-S1.HMAC; unsigned dispatch DISABLED), DLQ, replay
signed_events.rs   -- Append-only audit chain with V-4 cross-row hash chain
```

### Embedding Pipeline (semantic tier and above)

When running at the `semantic` tier or higher, ai-memory loads a HuggingFace embedding model at startup and generates dense vector embeddings for each memory. The pipeline:

1. **Model loading** (`embeddings.rs`) -- downloads and caches a sentence-transformer model from HuggingFace on first run
2. **Embedding generation** -- new memories are embedded at insert time; existing memories are backfilled on first startup with embeddings enabled
3. **Storage** -- each embedding is stored in the `embedding` BLOB column of the `memories` table (schema migration v3)
4. **Hybrid recall** (`reranker.rs`) -- at recall time, the query is embedded and compared against stored embeddings via cosine similarity, then blended with FTS5 keyword scores to produce a final ranking

**Embedding models:**
- `all-MiniLM-L6-v2` (384 dimensions, ~90 MB) -- used at the `semantic` tier
- `nomic-embed-text-v1.5` (768 dimensions, ~270 MB) -- used at the `smart` and `autonomous` tiers
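
The cosine comparison in step 4 of the pipeline can be sketched in a few lines. This is a minimal, std-only illustration rather than the code in `embeddings.rs`; the function name and the choice to return `None` for mismatched or zero-length vectors are assumptions made for the example.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns `None` when the lengths differ, the input is empty,
/// or either vector has zero norm (the similarity is undefined there).
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Rejecting mismatched lengths matters here because the two supported models produce 384- and 768-dimensional vectors, so rows embedded under a different tier cannot be compared directly.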

## Code Structure

### `src/main.rs`

- `Cli` struct with `clap` derive -- defines all CLI commands and global flags (`--db`, `--json`). Lives in `src/daemon_runtime.rs` (W6 refactor moved it off `src/main.rs`).
- `Command` enum (in `src/daemon_runtime.rs`) -- at v0.7.x the enum carries **82 unique variants** under `--features sal` and 80 in the default build; the gap is the two sal-gated variants `Migrate` + `SchemaInit`. SSOT: `EXPECTED_CLI_SUBCOMMANDS_DEFAULT=80` / `EXPECTED_CLI_SUBCOMMANDS_SAL=82` in `src/lib.rs`, pinned by `tests/cli_subcommand_count_invariant.rs`. Run `ai-memory --help` for the live list. The variants group as follows:
  - **v0.6.x core:** `Serve`, `Mcp`, `Store`, `Update`, `Recall`, `Search`, `Get`, `List`, `Delete`, `Promote`, `Forget`, `Link`, `Consolidate`, `Resolve`, `Shell`, `Sync`, `SyncDaemon`, `AutoConsolidate`, `Gc`, `Stats`, `Namespaces`, `Namespace`, `Config`, `Export`, `Import`, `Completions`, `Man`, `Mine`, `Archive`, `Agents`, `Pending`, `Backup`, `Restore`, `Curator`, `Bench`, `Migrate` (gated `--features sal`), `SchemaInit` (gated `--features sal`), `Doctor`, `Boot`, `Install`, `Wrap`, `Logs`, `Audit`
  - **v0.7 additions:** `Identity`, `Offload`, `Deref`, `Rules`, `Governance`, `VerifyReflectionChain`, `VerifySignedEventsChain`, `ExportForensicBundle`, `VerifyForensicBundle`, `ExportReflections`, `Atomise`, `Persona`, `Calibrate`, `Skill`, `Share`, `Expand`, `Reembed` (#1598)
  - **FX-12/FX-C3 MCP↔CLI parity batch:** `KgQuery`, `FindPaths`, `RecallObservations`, `CheckDuplicate`, `Replay`, `Reflect`, `Subscribe`, `Unsubscribe`, `ListSubscriptions`, `SubscriptionReplay`, `SubscriptionDlqList`, `Notify`, `Inbox`, `IngestMultistep`, `KgInvalidate`, `KgTimeline`, `EntityRegister`, `EntityGetByAlias`, `DependentsOfInvalidated`, `ReflectionOrigin`, `QuotaStatus`
  - **#1389 L2:** `RecoverPreviousSession`
- `StoreArgs` includes `--expires-at` and `--ttl-secs` flags for custom expiration
- `UpdateArgs` includes `--expires-at` flag for setting expiration on existing memories
- `ListArgs` includes `--offset` flag for pagination
- `auto_namespace()` -- detects namespace from git remote URL or directory name
- `human_age()` -- formats ISO timestamps as "2h ago", "3d ago" for CLI output
- `serve()` -- starts the Axum server with all routes (**89 production `.route(...)` registrations / 75 unique URL paths at v0.7.0** — includes `POST /memories/{id}/promote`, the 4 archive endpoints, namespace-standard endpoints, webhook subscription endpoints, KG endpoints, approval-SSE, quota status, link-verify, capture_turn, share, skills, the 14 #1111 MCP-parity paths, federation sync), spawns GC task, handles graceful shutdown via SIGINT with WAL checkpoint
- `cmd_*()` functions -- one per CLI command, each opens the DB directly
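
As an illustration of the relative-age formatting that `human_age()` produces, here is a minimal sketch. It takes an already-computed elapsed time in seconds instead of an ISO timestamp so that it stays std-only; the function name, the `"just now"` bucket, and the minute bucket are assumptions for the example, since the guide only shows the `"2h ago"` and `"3d ago"` forms.

```rust
/// Formats an elapsed duration (in seconds) as a short relative age.
/// Hypothetical helper: the real `human_age()` starts from an ISO timestamp.
pub fn human_age_from_secs(elapsed_secs: u64) -> String {
    const MINUTE: u64 = 60;
    const HOUR: u64 = 60 * MINUTE;
    const DAY: u64 = 24 * HOUR;
    match elapsed_secs {
        s if s < MINUTE => "just now".to_string(), // assumed bucket
        s if s < HOUR => format!("{}m ago", s / MINUTE), // assumed bucket
        s if s < DAY => format!("{}h ago", s / HOUR),
        s => format!("{}d ago", s / DAY),
    }
}
```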

### `src/models/`

- `Tier` enum (`Short`, `Mid`, `Long`) with TTL defaults: 6h, 7d, none
- `Memory` struct -- the core data type with **26 fields at v0.7.0** (was 15 at v0.6.x): adds `reflection_depth` (Task 1/8), `memory_kind` (Batman Form-6 vocabulary), `entity_id` + `persona_version` (QW-2), `citations` + `source_uri` + `source_span` (Form-4 fact provenance), `confidence_source` + `confidence_signals` + `confidence_decayed_at` (Form-5 calibration), and `version` (schema v45 — Gap-1 optimistic concurrency for `memory_update`). Extensible `metadata` JSON column still present. Canonical truth in `src/models/memory.rs`.
- `MemoryLink` struct -- typed directional links between memories. **Six relation variants at v0.7.0** (was four at v0.6.x): `related_to`, `supersedes`, `contradicts`, `derived_from`, `reflects_on`, `derives_from`. Carries v0.7 temporal-validity (`valid_from`, `valid_until`, `observed_by`) and attestation (`signature`, `attest_level`, `signed_at`) columns.
- Request types: `CreateMemory`, `UpdateMemory`, `SearchQuery`, `ListQuery`, `RecallQuery`, `RecallBody`, `LinkBody`, `ForgetQuery`, `ConsolidateBody`, `ImportBody`
- Response types: `Stats`, `TierCount`, `NamespaceCount`
- `TtlConfig` struct -- per-tier TTL overrides loaded from `config.toml` (`short_ttl_secs`, `mid_ttl_secs`, `long_ttl_secs`, `short_extend_secs`, `mid_extend_secs`)
- `ResolvedTtl` struct -- resolved TTL values after merging config defaults with per-tier overrides
- Constants: `MAX_CONTENT_SIZE` (65536), `PROMOTION_THRESHOLD` (5), `SHORT_TTL_EXTEND_SECS` (3600), `MID_TTL_EXTEND_SECS` (86400)
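
The tier defaults and the config-override merge described above can be sketched as follows. The `Tier` variants and the 6h / 7d / none defaults come from this section; `TtlOverrides` and `resolve_ttl_secs` are hypothetical stand-ins for `TtlConfig` and `ResolvedTtl`, whose real shapes live in `src/models/`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Short,
    Mid,
    Long,
}

impl Tier {
    /// Built-in TTL default in seconds; `None` means the memory never expires.
    pub fn default_ttl_secs(self) -> Option<u64> {
        match self {
            Tier::Short => Some(6 * 3600),    // 6h
            Tier::Mid => Some(7 * 24 * 3600), // 7d
            Tier::Long => None,
        }
    }
}

/// Hypothetical mirror of the per-tier TTL overrides read from `config.toml`.
#[derive(Debug, Default, Clone, Copy)]
pub struct TtlOverrides {
    pub short_ttl_secs: Option<u64>,
    pub mid_ttl_secs: Option<u64>,
    pub long_ttl_secs: Option<u64>,
}

/// Uses the configured override when present, else the built-in default.
pub fn resolve_ttl_secs(tier: Tier, overrides: &TtlOverrides) -> Option<u64> {
    let configured = match tier {
        Tier::Short => overrides.short_ttl_secs,
        Tier::Mid => overrides.mid_ttl_secs,
        Tier::Long => overrides.long_ttl_secs,
    };
    configured.or(tier.default_ttl_secs())
}
```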

### `src/mcp/`

The MCP (Model Context Protocol) server implementation. MCP is an open standard -- this server works with any MCP-compatible AI client. Runs over stdio, processing one JSON-RPC message per line. **At v0.7.0 the registry exposes 74 advertised entries at `--profile full`** (73 callable "memory tools" + the always-on `memory_capabilities` bootstrap; both numbers are intentional, see issue [#862](https://github.com/alphaonedev/ai-memory-mcp/issues/862)). Default `--profile core` ships 7 tools (the original 5 + `memory_load_family` + `memory_smart_load`) plus the always-on bootstrap.

The pre-#1066 monolithic `src/mcp.rs` is GONE — the module is split: `src/mcp/registry.rs` owns the canonical `registered_tools()` iterator + `tool_definitions()` view + the `tool_names` const module (74 canonical tool-name consts at v0.7.0 — extracted per #1187 / Wave-1 PR1, then +1 for `memory_capture_turn`; `tool_names::ALL.len()` is pinned against `Profile::full().expected_tool_count()`); `src/mcp/tools/*.rs` host per-tool handlers AND each tool's `<ToolName>Request` schemars struct + `McpTool` impl; `src/mcp/mod.rs` wires the JSON-RPC dispatch loop; `src/mcp/jsonrpc.rs` is the JSON-RPC wire-constant SSOT (#1558 batch 3) and `src/mcp/param_names.rs` the tool-call param-name SSOT.

Post-v0.7.0 #987 (D1.6) the source-of-truth lives in `registered_tools()` — a single `Vec<RegisteredTool>` with one entry per `McpTool` impl. `tool_definitions()` is now a thin four-line view that iterates the vec and projects each row to the wire shape (`name`/`description`/`docs`/`inputSchema`). The hand-coded `json!({...})` macro that previously held every tool's schema verbatim is GONE.
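
The registry pattern described above (one zero-sized type per tool, one vec as the source of truth, views projected from it) can be sketched like this. The trait here is a simplified, hypothetical stand-in for the crate's `McpTool` (the real one also carries `docs()`, `family()`, and `input_schema()`), and `RecallTool` / `UpdateTool` plus their descriptions are placeholders for illustration.

```rust
/// Hypothetical, reduced version of the `McpTool` trait.
pub trait McpTool {
    fn name() -> &'static str;
    fn description() -> &'static str;
}

/// One registry row, derived entirely from a tool type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegisteredTool {
    pub name: &'static str,
    pub description: &'static str,
}

impl RegisteredTool {
    pub fn of<T: McpTool>() -> Self {
        RegisteredTool { name: T::name(), description: T::description() }
    }
}

// Zero-sized marker types, one per tool (illustrative subset).
pub struct RecallTool;
impl McpTool for RecallTool {
    fn name() -> &'static str { "memory_recall" }
    fn description() -> &'static str { "placeholder description" }
}

pub struct UpdateTool;
impl McpTool for UpdateTool {
    fn name() -> &'static str { "memory_update" }
    fn description() -> &'static str { "placeholder description" }
}

/// Single source of truth: adding a tool is one line here plus its impl.
pub fn registered_tools() -> Vec<RegisteredTool> {
    vec![RegisteredTool::of::<RecallTool>(), RegisteredTool::of::<UpdateTool>()]
}

/// A thin view over the registry, analogous in spirit to `tool_definitions()`.
pub fn tool_names() -> Vec<&'static str> {
    registered_tools().iter().map(|tool| tool.name).collect()
}
```

Because every view is derived from the same vec, a count invariant such as the `tool_names::ALL.len()` pin can be checked against a single list.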

- `RpcRequest` / `RpcResponse` / `RpcError` -- JSON-RPC 2.0 types
- `registered_tools()` (`src/mcp/registry.rs`) -- canonical iterator over every per-tool `McpTool` impl. Each entry is a `RegisteredTool` derived from `T::name()` / `T::description()` / `T::docs()` / `T::family()` / `T::input_schema()` where `T` is a zero-sized type defined in the per-tool module (e.g. `crate::mcp::store::StoreTool`). Adding a new MCP tool = ONE line here + the impl.
- `tool_definitions()` (`src/mcp/registry.rs`) -- thin view over `registered_tools()` that returns the full-surface tool schemas for `tools/list`, covering every Family. Filtered to the active `--profile` by `tool_definitions_for_profile()`.
  - The surface includes v0.7 additions `memory_reflect`, `memory_atomise`, `memory_ingest_multistep`, `memory_persona`, `memory_persona_generate`, `memory_offload`, `memory_deref`, `memory_calibrate_confidence`, the 7 L1-5 Agent Skills tools, `memory_check_agent_action`, `memory_rule_list`, `memory_export_reflection`, `memory_dependents_of_invalidated`, `memory_find_paths`, `memory_verify`, `memory_quota_status`, the original `memory_capabilities`, `memory_expand_query`, `memory_auto_tag`, `memory_detect_contradiction`, the 4 archive tools, etc.
  - **Per issue [#864](https://github.com/alphaonedev/ai-memory-mcp/issues/864): "Family" in this codebase always refers to the MCP tool-family enum (eight variants at v0.7.0: `Family::Core` / `Lifecycle` / `Graph` / `Governance` / `Power` / `Meta` / `Archive` / `Other` in `src/profile.rs`), NEVER to the `MemoryKind` Batman vocabulary; those are unrelated taxonomies.**
  - `memory_recall` schema includes `until` parameter and `format` parameter (enum: `"json"`, `"toon"`, `"toon_compact"`, default: `"toon_compact"`)
  - `memory_search` schema includes `format` parameter (enum: `"json"`, `"toon"`, `"toon_compact"`, default: `"toon_compact"`) and enforces `maximum: 200` on limit
  - `memory_list` schema includes `format` parameter (enum: `"json"`, `"toon"`, `"toon_compact"`, default: `"toon_compact"`) and enforces `maximum: 200` on limit
  - `memory_consolidate` schema enforces `minItems: 2, maxItems: 100` on IDs
  - `memory_update` schema includes `expires_at` parameter
- `handle_store()`, `handle_recall()`, `handle_search()`, `handle_list()`, `handle_delete()`, `handle_promote()`, `handle_forget()`, `handle_stats()`, `handle_update()`, `handle_get()`, `handle_link()`, `handle_get_links()`, `handle_consolidate()`, `handle_archive_list()`, `handle_archive_restore()`, `handle_archive_purge()`, `handle_archive_stats()` -- one handler per tool
- `handle_request()` -- routes JSON-RPC methods: `initialize`, `notifications/initialized`, `tools/list`, `tools/call`, `ping`
- Notification handling: all JSON-RPC notifications (requests without an `id` field) are correctly skipped without sending a response, per the JSON-RPC 2.0 specification
- `run_mcp_server()` -- main loop: reads lines from stdin, parses JSON-RPC, dispatches, writes responses to stdout

Protocol version: `2024-11-05`. All tool responses are wrapped in MCP content blocks (`{"content": [{"type": "text", "text": "..."}]}`). The protocol is AI-agnostic -- any MCP client can connect.
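
To make the content-block wrapping concrete, here is a minimal std-only sketch that builds the envelope by hand. The server itself presumably uses a JSON library for this; `json_escape` and `content_block` are hypothetical helper names, and the `isError` flag follows the convention described under MCP Error Codes.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be \u-escaped in JSON.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Wraps tool output in an MCP content block; errors add `"isError": true`.
pub fn content_block(text: &str, is_error: bool) -> String {
    let escaped = json_escape(text);
    if is_error {
        format!(r#"{{"content":[{{"type":"text","text":"{escaped}"}}],"isError":true}}"#)
    } else {
        format!(r#"{{"content":[{{"type":"text","text":"{escaped}"}}]}}"#)
    }
}
```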

**MCP Prompts:** The server exposes 2 prompts via `prompts/list`:
- **recall-first** -- System prompt with 8 behavioral rules for proactive memory use. Supports an optional `namespace` argument for scoped recall.
- **memory-workflow** -- Quick reference card for the full tool surface.

**MCP Error Codes:** The server uses standard JSON-RPC 2.0 error codes:
- `-32700` -- Parse error (malformed JSON)
- `-32600` -- Invalid request (missing required fields)
- `-32601` -- Method not found (unknown JSON-RPC method)
- `-32602` -- Invalid params (bad tool arguments)
- Application-level errors are returned as text in the MCP content block with `"isError": true`, not as JSON-RPC error codes.
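
The mapping from failure class to JSON-RPC code can be sketched as a small enum. The numeric codes are the standard JSON-RPC 2.0 values listed above; the constant and enum names are hypothetical and are not the ones defined in `src/mcp/jsonrpc.rs`.

```rust
// Standard JSON-RPC 2.0 error codes (names are illustrative).
pub const PARSE_ERROR: i32 = -32700;
pub const INVALID_REQUEST: i32 = -32600;
pub const METHOD_NOT_FOUND: i32 = -32601;
pub const INVALID_PARAMS: i32 = -32602;

/// Protocol-level failures a dispatcher can hit before a tool runs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RpcFailure {
    MalformedJson,
    MissingRequiredField,
    UnknownMethod,
    BadToolArguments,
}

/// Maps a protocol-level failure to its JSON-RPC error code.
pub fn error_code(failure: RpcFailure) -> i32 {
    match failure {
        RpcFailure::MalformedJson => PARSE_ERROR,
        RpcFailure::MissingRequiredField => INVALID_REQUEST,
        RpcFailure::UnknownMethod => METHOD_NOT_FOUND,
        RpcFailure::BadToolArguments => INVALID_PARAMS,
    }
}
```

Application-level failures deliberately have no variant here, since they travel inside the content block with `"isError": true` instead of as a JSON-RPC error.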

### `src/validate.rs`

Input validation for every write path. Called by CLI, HTTP handlers, and MCP handlers.

| Function | Validates |
|----------|-----------|
| `validate_title()` | Non-empty, max 512 chars (`MAX_TITLE_LEN`), no control chars |
| `validate_content()` | Non-empty, max 64KB, no null bytes |
| `validate_namespace()` | Non-empty, max 512 chars (`MAX_NAMESPACE_LEN`); `/` allowed as hierarchy delimiter (no leading/trailing/empty segments); no backslashes/spaces/nulls |
| `validate_source()` | Must be one of `VALID_SOURCES`: user, nhi, claude (deprecated), hook, api, cli, import, consolidation, system, chaos, notify |
| `validate_tags()` | Max 50 tags, each max 128 bytes, no empty strings |
| `validate_id()` | Non-empty, max 128 bytes, no null bytes |
| `validate_expires_at()` | Valid RFC3339, not in the past |
| `validate_ttl_secs()` | Positive, max 1 year |
| `validate_relation()` | Must be one of `VALID_RELATIONS` (six at v0.7.0): related_to, supersedes, contradicts, derived_from, reflects_on, derives_from |
| `validate_confidence()` | Finite number, 0.0 to 1.0 |
| `validate_priority()` | Integer, 1 to 10 |
| `validate_create()` | Full validation for CreateMemory |
| `validate_memory()` | Full validation for Memory (import) |
| `validate_update()` | Validates only present fields |
| `validate_link()` | Validates both IDs, relation, and rejects self-links |
| `validate_consolidate()` | 2-100 IDs, validates title, summary, namespace |
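
A few of the rules in the table can be sketched as single-field functions. This is a simplified illustration that returns `String` errors; the real functions in `validate.rs` may use different signatures and error types, and whether limits are counted in chars or bytes follows the table wording only.

```rust
pub const MAX_TITLE_LEN: usize = 512;
pub const MAX_NAMESPACE_LEN: usize = 512;

/// Non-empty, at most 512 chars, no control characters.
pub fn validate_title(title: &str) -> Result<(), String> {
    if title.is_empty() {
        return Err("title must not be empty".into());
    }
    if title.chars().count() > MAX_TITLE_LEN {
        return Err("title exceeds 512 chars".into());
    }
    if title.chars().any(char::is_control) {
        return Err("title contains control characters".into());
    }
    Ok(())
}

/// Non-empty, at most 512 chars, `/` as hierarchy delimiter only.
pub fn validate_namespace(ns: &str) -> Result<(), String> {
    if ns.is_empty() {
        return Err("namespace must not be empty".into());
    }
    if ns.chars().count() > MAX_NAMESPACE_LEN {
        return Err("namespace exceeds 512 chars".into());
    }
    if ns.chars().any(|c| c == '\\' || c == ' ' || c == '\0') {
        return Err("namespace contains a backslash, space, or null".into());
    }
    // An empty segment covers leading '/', trailing '/', and '//'.
    if ns.split('/').any(str::is_empty) {
        return Err("namespace has an empty '/' segment".into());
    }
    Ok(())
}

/// Finite number in 0.0..=1.0.
pub fn validate_confidence(value: f64) -> Result<(), String> {
    if value.is_finite() && (0.0..=1.0).contains(&value) {
        Ok(())
    } else {
        Err("confidence must be a finite number between 0.0 and 1.0".into())
    }
}

/// Integer in 1..=10.
pub fn validate_priority(priority: i64) -> Result<(), String> {
    if (1..=10).contains(&priority) {
        Ok(())
    } else {
        Err("priority must be between 1 and 10".into())
    }
}
```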

### `src/color.rs`

ANSI color output for CLI -- zero external dependencies. Auto-detects terminal via `std::io::IsTerminal`.

- `init()` -- sets global color flag based on terminal detection
- `short()`, `mid()`, `long()` -- tier-specific colors (red, yellow, green)
- `dim()`, `bold()`, `cyan()` -- semantic colors
- `tier_color()` -- dispatches to tier color by string name
- `priority_bar()` -- renders a 10-character bar (`█████░░░░░`) colored by priority level (green for 8+, yellow for 5-7, red for 1-4)

Colors are suppressed when stdout is not a terminal (e.g., piping to file). The `--json` flag bypasses color output entirely.
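
A minimal sketch of the priority bar follows. It assumes the number of filled cells equals the priority and that color is passed in as a flag; the real `priority_bar()` reads the global color state set by `init()`, and its exact signature is not shown in this guide.

```rust
/// Renders a 10-cell bar for a priority in 1..=10.
/// Green for 8+, yellow for 5-7, red for 1-4 (standard ANSI SGR codes).
pub fn priority_bar(priority: u8, use_color: bool) -> String {
    let level = priority.clamp(1, 10) as usize;
    let bar = format!("{}{}", "█".repeat(level), "░".repeat(10 - level));
    if !use_color {
        return bar;
    }
    let ansi = match level {
        8..=10 => "32", // green
        5..=7 => "33",  // yellow
        _ => "31",      // red
    };
    format!("\x1b[{ansi}m{bar}\x1b[0m")
}
```

Passing `use_color = false` corresponds to the piped-output case, where no escape sequences should reach the file.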

### `src/errors.rs`

Structured error types for the HTTP API:

- `ApiError` -- serializable error with `code` and `message` fields
- `MemoryError` enum -- `NotFound`, `ValidationFailed`, `DatabaseError`, `Conflict`
- Implements `IntoResponse` for Axum, mapping to appropriate HTTP status codes
- Implements `From<anyhow::Error>` and `From<rusqlite::Error>`
- **Error sanitization**: `DatabaseError` responses return a generic `"Internal server error"` message to clients, never leaking internal database error details. Detailed errors are logged server-side.
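
The sanitization rule can be sketched without Axum by mapping each variant to a status code and a client-facing body. The variant payloads, the `code` strings, and the specific status codes other than the generic 500 message are assumptions for illustration; the real `IntoResponse` impl may differ.

```rust
/// Simplified stand-in for the crate's error enum (payloads are assumed).
#[derive(Debug)]
pub enum MemoryError {
    NotFound(String),
    ValidationFailed(String),
    DatabaseError(String),
    Conflict(String),
}

#[derive(Debug, PartialEq, Eq)]
pub struct ApiError {
    pub code: &'static str,
    pub message: String,
}

/// Maps an error to (HTTP status, client-facing body).
pub fn to_response(err: &MemoryError) -> (u16, ApiError) {
    match err {
        MemoryError::NotFound(msg) => {
            (404, ApiError { code: "not_found", message: msg.clone() })
        }
        MemoryError::ValidationFailed(msg) => {
            (400, ApiError { code: "validation_failed", message: msg.clone() })
        }
        MemoryError::Conflict(msg) => {
            (409, ApiError { code: "conflict", message: msg.clone() })
        }
        MemoryError::DatabaseError(detail) => {
            // Detail stays in the server log; the client sees a generic message.
            eprintln!("database error: {detail}");
            (500, ApiError {
                code: "database_error",
                message: "Internal server error".to_string(),
            })
        }
    }
}
```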

### `src/handlers/`

All HTTP handlers for the **89 production `.route(...)` registrations / 75 unique URL paths** at v0.7.0 (canonical count from CLAUDE.md §Architecture; counted via `codegraph_search kind=route limit=100`). The pre-Wave-1 monolithic `src/handlers.rs` (~17.8k LOC) is GONE — split into `src/handlers/{mod,http,transport,federation_receive,hook_subscribers}.rs`. State is the `Db = Arc<Mutex<(Connection, PathBuf, ResolvedTtl, bool)>>` extractor defined in `src/handlers/transport.rs`. Each handler acquires the lock, validates input via `crate::validate::RequestValidator` (#966 Wave-2 Tier-C1), performs DB operations through the SAL `MemoryStore` trait (`src/store/`), and returns JSON.

Key handlers:
- `create_memory` / `bulk_create` -- memory creation with deduplication (bulk limited to 1,000 items)
- `get_memory` / `list_memories` / `update_memory` / `delete_memory` -- standard CRUD
- `promote_memory` -- `POST /memories/{id}/promote` endpoint for promoting to long-term
- `search` / `recall` -- FTS-powered search with sanitized queries
- `forget` / `consolidate` -- bulk operations
- `import_memories` -- import with 1,000 item limit
- `archive_list` / `archive_restore` / `archive_purge` / `archive_stats` -- archive management endpoints
- All ID path parameters are validated before database access

> **Note:** HTTP handlers are tested via integration tests (`tests/integration.rs`), not unit tests.

### `src/storage/` (alias `db` via `pub use storage as db` in `src/lib.rs:52`)

The sqlite storage layer. The pre-#961 monolithic `src/db.rs` is GONE — split into the per-domain modules under `src/storage/` (CRUD, FTS5, recall scoring, GC, schema migrations at `src/storage/migrations.rs`, `reflect` at `src/storage/reflect.rs`). The legacy free-function names are preserved as the `db::*` alias for call-site backward compat. Post-#961 SAL boundary cleanup: new DB operations land on the `MemoryStore` trait in `src/store/mod.rs` FIRST; `storage/*` hosts primitives the sqlite adapter delegates to. Key functions:

| Function | Description |
|----------|-------------|
| `open()` | Opens DB, sets WAL mode, creates schema, runs migrations |
| `insert()` | Upsert on `(title, namespace)` -- never downgrades tier, keeps max priority |
| `get()` | Fetch by ID |
| `touch()` | Bump access count, extend TTL, auto-promote mid->long at 5 accesses, reinforce priority every 10 accesses. **Uses BEGIN IMMEDIATE/COMMIT transaction** for atomicity. |
| `update()` | Partial update of any fields |
| `delete()` | Delete by ID (links cascade) |
| `forget()` | Bulk delete by namespace + FTS pattern + tier |
| `list()` | List with filters: namespace, tier, priority, date range, tags, offset |
| `search()` | FTS5 AND search with 6-factor composite scoring |
| `recall()` | FTS5 OR search + touch + auto-promote + TTL extension |
| `find_contradictions()` | Find memories in same namespace with similar titles |
| `consolidate()` | Merge multiple memories, delete originals, aggregate tags and max priority. **Uses BEGIN IMMEDIATE/COMMIT transaction** for atomicity. |
| `sanitize_fts_query()` | Strips special characters and quotes tokens to prevent FTS injection |
| `create_link()` / `get_links()` / `delete_link()` | Memory linking (ON DELETE CASCADE) |
| `gc()` | Delete expired memories |
| `stats()` | Aggregate statistics (totals, by tier, by namespace, expiring soon, links, DB size) |
| `list_namespaces()` | List namespaces with memory counts |
| `export_all()` / `export_links()` | Full data export |
| `checkpoint()` | WAL checkpoint (TRUNCATE) for clean shutdown |
| `archive_memory()` | Move a memory to the archive table |
| `list_archived()` | List all archived memories |
| `restore_archived()` | Restore an archived memory to the active table |
| `purge_archive()` | Permanently delete all archived memories |
| `archive_stats()` | Archive statistics (count, size, date range) |
| `health_check()` | Verifies DB accessibility and FTS5 integrity |
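
The upsert rule for `insert()` ("never downgrades tier, keeps max priority") can be sketched as a pure merge function. The `Row` struct is a hypothetical reduction of `Memory`, and replacing the content with the incoming value is an assumption; the real upsert happens in SQL on the `(title, namespace)` key.

```rust
/// Variant order matters: the derived `Ord` gives Short < Mid < Long.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Short,
    Mid,
    Long,
}

/// Hypothetical reduction of a memory row to the fields the rule touches.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Row {
    pub tier: Tier,
    pub priority: u8,
    pub content: String,
}

/// Merges an incoming write into an existing row with the same
/// (title, namespace): the tier never goes down, the priority keeps the max.
pub fn merge_on_upsert(existing: &Row, incoming: &Row) -> Row {
    Row {
        tier: existing.tier.max(incoming.tier),
        priority: existing.priority.max(incoming.priority),
        content: incoming.content.clone(), // assumed: newest content wins
    }
}
```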

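**Transaction safety**: `touch()` and `consolidate()` open their transactions with `BEGIN IMMEDIATE`, which takes SQLite's write lock at the start instead of upgrading from a read lock partway through. This avoids the lock-upgrade conflict (surfaced as `SQLITE_BUSY`) that deferred transactions can hit when two connections try to write, and it keeps the entire read-modify-write cycle atomic. That matters most for `touch()`, which reads the current access count, computes promotion and priority reinforcement, and writes the result back; all three steps must happen as one unit under concurrent access.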
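
The read-modify-write step that `touch()` performs inside its transaction can be sketched as a pure function. The thresholds come from the constants and table in this guide; the struct, the "+1 capped at 10" reinforcement, and the "never shorten an existing expiry" rule are assumptions made for the example, and timestamps are plain Unix seconds to keep it std-only.

```rust
pub const PROMOTION_THRESHOLD: u32 = 5;
pub const SHORT_TTL_EXTEND_SECS: i64 = 3600;
pub const MID_TTL_EXTEND_SECS: i64 = 86400;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Short,
    Mid,
    Long,
}

/// Hypothetical snapshot of the columns `touch()` reads and writes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TouchState {
    pub tier: Tier,
    pub access_count: u32,
    pub priority: u8,
    pub expires_at: Option<i64>, // Unix seconds; None = never expires
}

/// Computes the row state after one access at time `now`.
pub fn apply_touch(state: TouchState, now: i64) -> TouchState {
    let access_count = state.access_count + 1;
    let extend = match state.tier {
        Tier::Short => Some(SHORT_TTL_EXTEND_SECS),
        Tier::Mid => Some(MID_TTL_EXTEND_SECS),
        Tier::Long => None,
    };
    // Assumed rule: push the expiry out, never pull it in.
    let mut expires_at = match (state.expires_at, extend) {
        (Some(current), Some(secs)) => Some(current.max(now + secs)),
        (current, _) => current,
    };
    let mut tier = state.tier;
    if tier == Tier::Mid && access_count >= PROMOTION_THRESHOLD {
        tier = Tier::Long; // auto-promote mid -> long
        expires_at = None;
    }
    let mut priority = state.priority;
    if access_count % 10 == 0 {
        priority = (priority + 1).min(10); // assumed: +1, capped at 10
    }
    TouchState { tier, access_count, priority, expires_at }
}
```

In the storage layer this computation sits between the `SELECT` and the `UPDATE` of a single `BEGIN IMMEDIATE` transaction, so no other writer can observe or change the row between the two.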

**FTS query sanitization**: The `sanitize_fts_query()` function strips all FTS5 special characters (`*`, `"`, `(`, `)`, `:`, `+`, `-`, `~`, `^`, `{`, `}`, `[`, `]`, `|`, `\`) from user input and wraps each remaining token in double quotes. This prevents injection of FTS query syntax that could cause unexpected results or errors.
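
A minimal sketch of this sanitization follows, using the character list from the paragraph above. It removes special characters outright and joins the quoted tokens with spaces; whether the real `sanitize_fts_query()` removes characters or replaces them with spaces, and how it joins tokens for AND versus OR queries, is not specified here, so treat those details as assumptions.

```rust
/// FTS5 special characters listed in this guide.
const FTS5_SPECIAL: &[char] = &[
    '*', '"', '(', ')', ':', '+', '-', '~', '^', '{', '}', '[', ']', '|', '\\',
];

/// Strips FTS5 syntax characters, then wraps each remaining token in quotes.
pub fn sanitize_fts_query(input: &str) -> String {
    let stripped: String = input
        .chars()
        .filter(|c| !FTS5_SPECIAL.contains(c))
        .collect();
    stripped
        .split_whitespace()
        .map(|token| format!("\"{token}\""))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Quoting every token also neutralizes bare keywords: a user-supplied `OR` becomes the string `"OR"` and is matched as a literal term rather than parsed as an operator.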

**Migration error handling**: The migration logic only ignores "duplicate column" errors (indicating the migration already ran). All other errors are propagated, ensuring real failures are caught early.

### `src/hnsw.rs`

In-memory HNSW (Hierarchical Navigable Small World) vector index for approximate nearest-neighbor search. The `VectorIndex` struct provides `insert`, `search`, and `remove` operations on dense embeddings. When the index is small (below the HNSW threshold), it falls back to linear scan. The index has no persistence -- it is rebuilt from the database on startup. This keeps the on-disk format simple (embeddings stored as BLOBs in SQLite) while providing fast in-memory ANN search during runtime.
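
The linear-scan fallback can be sketched as a small brute-force index with the same three operations. This is not the HNSW implementation and does not model the threshold switch; the struct layout and method signatures are assumptions chosen to mirror the `insert` / `search` / `remove` surface named above.

```rust
/// Brute-force stand-in for the small-index path of `VectorIndex`.
pub struct VectorIndex {
    entries: Vec<(String, Vec<f32>)>,
}

impl VectorIndex {
    pub fn new() -> Self {
        VectorIndex { entries: Vec::new() }
    }

    /// Inserts or replaces the embedding stored under `id`.
    pub fn insert(&mut self, id: &str, embedding: Vec<f32>) {
        self.remove(id);
        self.entries.push((id.to_string(), embedding));
    }

    /// Removes `id`; returns whether an entry was present.
    pub fn remove(&mut self, id: &str) -> bool {
        let before = self.entries.len();
        self.entries.retain(|entry| entry.0 != id);
        self.entries.len() != before
    }

    /// Returns the `k` most similar entries by cosine similarity.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)> {
        let mut scored: Vec<(String, f32)> = self
            .entries
            .iter()
            .filter(|entry| entry.1.len() == query.len())
            .map(|entry| (entry.0.clone(), cosine(query, &entry.1)))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = a.iter().map(|x| x * x).sum::<f32>().sqrt()
        * b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 { 0.0 } else { dot / norm }
}
```

A scan like this is exact and costs one similarity computation per stored vector, which is why it is a reasonable choice only while the index is small.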

### `src/toon.rs`

TOON (Token-Oriented Object Notation) serializer. Converts JSON recall/search/list responses into the compact TOON wire format. The format spec is documented in the [TOON Format Specification](#toon-format-specification) section below. Public API: `memories_to_toon()`, `search_to_toon()`. Compact mode emits a 6-field projection (id/tier/title/namespace/score/created_at); full mode emits the complete record.

### `src/config.rs`

Tier configuration system + global runtime config. Parses `~/.config/ai-memory/config.toml`, applies environment-variable overrides (`AI_MEMORY_*`), validates tier capabilities (`keyword`, `semantic`, `smart`, `autonomous`), and emits the immutable `Config` consumed by every other module. Includes `TtlConfig` (per-tier TTL + extension windows), `archive_on_gc`, embedding-model selection, Ollama URL, and feature gating that disables higher-tier code paths when the configured tier doesn't permit them.

### `src/embeddings.rs`

Embedding pipeline for `semantic+` tiers. Loads HuggingFace sentence-transformer models (`all-MiniLM-L6-v2` 384-dim or `nomic-embed-text-v1.5` 768-dim) on first run via `hf-hub`, runs inference via Candle, generates dense vectors at insert time, and backfills missing embeddings on first startup. Vectors are stored as BLOBs in the `memories.embedding` column. Consumed by `reranker.rs` for hybrid recall and `hnsw.rs` for approximate nearest-neighbour indexing.

### `src/llm.rs`

Provider-agnostic LLM client (#1067) for query expansion, auto-tagging, and contradiction detection. Two wire shapes — Ollama-native (`/api/chat` + `/api/embed`, no auth) and OpenAI-compatible (`/v1/chat/completions` + `/v1/embeddings`, Bearer auth). Backend selected by `AI_MEMORY_LLM_BACKEND` env var with 15 vendor aliases (xai, openai, anthropic, gemini, deepseek, kimi, qwen, mistral, groq, together, cerebras, openrouter, fireworks, lmstudio, plus the generic `openai-compatible` escape hatch). The struct name `OllamaClient` is preserved post-#1066 for call-site backward compat (rename to `LlmClient` is non-breaking and tracked separately). Vendor identifiers in this module are legitimate per the substrate-canonical-discipline carve-out enforced by `scripts/check-vendor-literals.sh` (#1200). Supplies the production implementation of the `AutonomyLlm` trait (see `src/autonomy.rs`). Prompts are kept short and structured to minimize token cost; failures are non-fatal — the curator and autonomy passes log and continue.

### `src/mine.rs`

Retroactive conversation import — bulk-imports historical Claude / ChatGPT / Slack export files into ai-memory as backfilled memories. Each conversation becomes a single memory; metadata captures `source`, `agent_id`, and timestamps from the export. Used to seed memory before live capture is available.

### `src/reranker.rs`

Hybrid recall algorithm. Blends the FTS5 keyword score and the embedding cosine similarity into a single ranking, applying configurable weighting and a 6-factor scoring formula (recency, priority, access count, tier weight, content match, namespace match). Returns a score field in every recall response so callers can audit ranking decisions.
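
The blending step can be sketched as a weighted sum of the two signals. This covers only the keyword/semantic blend, not the 6-factor formula; the function names, the single `semantic_weight` parameter, and the assumption that both inputs are already normalized to `0.0..=1.0` are illustrative, since the actual weighting in `reranker.rs` is configurable and not spelled out here.

```rust
/// Linear blend of a keyword score and a semantic score.
/// Assumes both inputs are normalized to 0.0..=1.0.
pub fn blend_scores(keyword: f32, semantic: f32, semantic_weight: f32) -> f32 {
    let w = semantic_weight.clamp(0.0, 1.0);
    w * semantic + (1.0 - w) * keyword
}

/// Ranks `(id, keyword_score, semantic_score)` candidates, best first.
pub fn rank(candidates: &[(&str, f32, f32)], semantic_weight: f32) -> Vec<(String, f32)> {
    let mut ranked: Vec<(String, f32)> = candidates
        .iter()
        .map(|(id, keyword, semantic)| {
            (id.to_string(), blend_scores(*keyword, *semantic, semantic_weight))
        })
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

Returning the blended score next to each id mirrors the score field in recall responses, which lets a caller see why one memory outranked another.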

### `src/identity/`

Non-Human Identity (NHI) resolution for `agent_id` (split from the former `src/identity.rs` into per-domain modules: `mod.rs`, `attest.rs`, `sign.rs`, `verify.rs`, `replay.rs`, plus the #1558 additions `sentinels.rs` — reserved caller identities / `RESERVED_AGENT_IDS` — and `keypair.rs` — `DAEMON_KEYPAIR_LABEL` + daemon signing-keypair load). Centralises the precedence chain across CLI, MCP, and HTTP entry points so `metadata.agent_id` is uniformly populated. Public API: `resolve_agent_id()` (CLI/MCP), `resolve_http_agent_id()` (HTTP body + `X-Agent-Id` header), `preserve_agent_id()` (round-trip), `process_discriminator()` (stable per-process identifier). Default-id formats: `ai:<client>@<hostname>:pid-<pid>` (MCP), `host:<hostname>:pid-<pid>-<uuid8>` (CLI), `anonymous:req-<uuid8>` (HTTP per-request fallback). By default `agent_id` is a *claimed* identity, not attested; a write that presents a valid Ed25519 `signature` is upgraded to `agent_attested` (#626 Layer-3 — see `identity::attest::stamp_attestation`).
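
The three default-id formats can be sketched as plain formatting helpers. The format strings come from the paragraph above; the function names are hypothetical, and the 8-character UUID fragment is passed in as a string because generating one would need a crate outside the standard library.

```rust
/// `ai:<client>@<hostname>:pid-<pid>` (MCP default).
pub fn mcp_default_id(client: &str, hostname: &str, pid: u32) -> String {
    format!("ai:{client}@{hostname}:pid-{pid}")
}

/// `host:<hostname>:pid-<pid>-<uuid8>` (CLI default).
pub fn cli_default_id(hostname: &str, pid: u32, uuid8: &str) -> String {
    format!("host:{hostname}:pid-{pid}-{uuid8}")
}

/// `anonymous:req-<uuid8>` (HTTP per-request fallback).
pub fn http_fallback_id(uuid8: &str) -> String {
    format!("anonymous:req-{uuid8}")
}
```

In a running process the `pid` argument would typically come from `std::process::id()`.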

### `src/curator/`

Autonomous curator daemon (v0.6.1; split from the former `src/curator.rs` into `mod.rs` + `pipeline.rs`, `candidates.rs`, `cluster.rs`, `compaction.rs`, `persist.rs`, `reflection_pass.rs`). Runs a periodic sweep over stored memories, invoking `auto_tag` and `detect_contradiction` via the configured LLM and persisting results into each memory's metadata. Complements the synchronous post-store hooks (#265). Hard cap on operations per cycle (default 100); skips internal `_`-prefixed namespaces; honours include/exclude lists; dry-run mode emits a report without touching rows; LLM errors are logged but never abort a cycle. Public API: `CuratorConfig`, `CuratorReport`, `run_once()`, `run_daemon()`.

### `src/autonomy.rs`

Full-autonomy loop — stacks on the curator daemon. Four passes beyond auto-tag:

1. **Consolidation** — find near-duplicate memories in the same namespace (Jaccard ≥ 0.55, max cluster size 8), LLM-summarise into a single canonical memory, archive originals.
2. **Forgetting of superseded memories** — when `metadata.confirmed_contradictions` is set, demote/forget the contradicted entry.
3. **Priority feedback** — nudge `priority` up for hot memories, down for cold ones (purely arithmetic, no LLM call).
4. **Rollback log + self-report** — every autonomous action lands in `_curator/rollback/<ts>` (reversible) and every cycle in `_curator/reports/<ts>`.

Defines the `AutonomyLlm` trait so the curator can be unit-tested without a live Ollama instance. Public API: `run_autonomy_passes()`, `persist_self_report()`, `reverse_rollback_entry()`, `RollbackEntry`, `AutonomyPassReport`.
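
The near-duplicate test behind the consolidation pass can be sketched with a token-set Jaccard similarity. The 0.55 threshold is the one stated above; the tokenization (lowercased, whitespace-split) and the function names are assumptions, since the guide does not say how `autonomy.rs` builds its sets.

```rust
use std::collections::HashSet;

pub const JACCARD_THRESHOLD: f64 = 0.55;

/// Assumed tokenization: lowercase, split on whitespace.
fn tokens(text: &str) -> HashSet<String> {
    text.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B| over the two token sets.
pub fn jaccard(a: &str, b: &str) -> f64 {
    let (set_a, set_b) = (tokens(a), tokens(b));
    let union = set_a.union(&set_b).count();
    if union == 0 {
        return 0.0; // two empty texts: treat as not similar
    }
    set_a.intersection(&set_b).count() as f64 / union as f64
}

/// True when two texts are similar enough to be consolidation candidates.
pub fn is_near_duplicate(a: &str, b: &str) -> bool {
    jaccard(a, b) >= JACCARD_THRESHOLD
}
```

For example, "deploy the api server" and "deploy the api gateway" share 3 of 5 distinct tokens, giving 0.6, which clears the 0.55 threshold.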

### `src/replication.rs`

W-of-N quorum-write layer for the peer-mesh sync (v0.7 track C). Scaffolds the contract described in [`ADR-0001-quorum-replication.md`](ADR-0001-quorum-replication.md). The `QuorumWriter` sits ABOVE the existing sync-daemon — deployments without `--quorum-writes` keep the v0.6.0 one-way push behaviour byte-for-byte. Public API: `QuorumPolicy`, `QuorumWriter::commit`, `AckTracker`. Emits metrics: `replication_quorum_ack_total{result}`, `replication_quorum_failures_total{reason}`, `replication_clock_skew_seconds`.

### `src/federation/`

Federation autonomy (split from the former `src/federation.rs` into `mod.rs` + `quorum.rs`, `peer.rs`, `peer_attestation.rs`, `receive.rs`, `sync.rs`, `signing.rs`, `push_dlq.rs`, `reflection_bookkeeping.rs`, `identity/`) — wires the quorum primitives from `replication` into the HTTP write path (v0.7 track C, PR 2 of N). When `ai-memory serve` is started with `--quorum-writes N --quorum-peers <urls>`, every successful HTTP write fans out a 1-memory `/api/v1/sync/push` POST to each peer; the write returns OK only once `W-1` peer acks land within `--quorum-timeout-ms`. Fewer acks → `503 quorum_not_met`. Public API: `FederationConfig`, `broadcast_store_quorum()`.
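
The ack-counting rule can be sketched as a single check. The `W-1` requirement and the `503 quorum_not_met` outcome come from the paragraph above; the function name and return shape are hypothetical, and the comment about the local write counting toward `W` is an inference from the `W-1` wording rather than a statement from the source.

```rust
/// Checks whether enough peer acks arrived for a quorum of `w`.
/// Presumably the local write counts as one of the `w`, so `w - 1`
/// peer acks are required.
pub fn check_quorum(w: usize, peer_acks: usize) -> Result<(), (u16, &'static str)> {
    let needed = w.saturating_sub(1);
    if peer_acks >= needed {
        Ok(())
    } else {
        Err((503, "quorum_not_met"))
    }
}
```

In the daemon this check would run once the `--quorum-timeout-ms` window closes or the required acks have landed, whichever happens first.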

### `src/subscriptions.rs`

v0.6.0.0 webhook subscriptions. Subscribers register a URL + shared secret + event/namespace/agent filters; matching events POST an HMAC-SHA256-signed JSON payload (header `X-Ai-Memory-Signature: sha256=<hex>`) over a fire-and-forget thread. SSRF hardening: `http://` only to `127.0.0.0/8` or `localhost`; everywhere else requires `https://`; RFC1918 / RFC4193 / link-local hosts rejected unless `allow_private_networks=true`. Stored secret is SHA-256 of the plaintext (plaintext returned once at registration). Public API: `Subscription`, `NewSubscription`, `insert()`, `delete()`, `list()`, `dispatch_event()`, `validate_url()`.
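
The SSRF rules can be sketched with the standard library's IP types. This is a simplified illustration, not the crate's `validate_url()`: the hand-rolled URL splitting, the error strings, and the parameter shape are assumptions, and it only inspects literal hosts, so a DNS name that resolves to a private address is not caught here.

```rust
use std::net::IpAddr;

/// Splits a URL into (scheme, host). Minimal parser for illustration only.
fn split_url(url: &str) -> Option<(&str, &str)> {
    let (scheme, rest) = url.split_once("://")?;
    let authority = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
    let hostport = authority.rsplit('@').next()?; // drop any userinfo
    let host = if let Some(stripped) = hostport.strip_prefix('[') {
        stripped.split(']').next()? // bracketed IPv6 literal
    } else {
        hostport.split(':').next()? // strip an optional port
    };
    if host.is_empty() { None } else { Some((scheme, host)) }
}

/// RFC1918, RFC4193 (fc00::/7), and link-local (169.254/16, fe80::/10).
fn is_private(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_private() || v4.is_link_local(),
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            (first & 0xfe00) == 0xfc00 || (first & 0xffc0) == 0xfe80
        }
    }
}

pub fn validate_url(url: &str, allow_private_networks: bool) -> Result<(), String> {
    let (scheme, host) = split_url(url).ok_or("malformed URL")?;
    let ip: Option<IpAddr> = host.parse().ok();
    let is_loopback = host.eq_ignore_ascii_case("localhost")
        || matches!(ip, Some(IpAddr::V4(v4)) if v4.is_loopback());
    match scheme {
        "https" => {}
        "http" if is_loopback => return Ok(()),
        "http" => return Err("http:// is only allowed for 127.0.0.0/8 or localhost".into()),
        other => return Err(format!("unsupported scheme: {other}")),
    }
    if let Some(ip) = ip {
        if is_private(&ip) && !allow_private_networks {
            return Err("private or link-local host rejected".into());
        }
    }
    Ok(())
}
```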

### `src/migrate.rs`

Cross-backend migration tool — streams memories from one SAL backend to another (v0.7 track B, PR 2 of N). Gated behind `--features sal`; extended transparently by `--features sal-postgres`. Supported URLs: `sqlite:///abs/path.db`, `sqlite://./relative.db`, `postgres://user:pass@host:port/db`. CLI: `ai-memory migrate --from <url> --to <url> [--batch 1000] [--dry-run] [--namespace foo]`. Reads via `MemoryStore::list`, writes via `MemoryStore::store` with the source memory's id verbatim — adapter upsert-on-id semantics make repeated migration idempotent.

### `src/metrics.rs`

v0.6.0.0 Prometheus metrics, exposed at `GET /metrics` by the daemon. Minimal, non-invasive instrumentation — single global `Registry`, a handful of `IntCounter` / `IntCounterVec` / `IntGauge` / `HistogramVec` handles. Callers increment via typed helpers (`record_store(tier, ok)`, `record_recall(mode, latency_seconds)`, `record_autonomy_hook(kind, ok)`, `curator_cycle_completed(...)`) rather than poking the registry directly so a future metrics-backend swap stays internal. Public API: `Metrics` (struct), `registry()`, `render()`.

## Database Schema

### `memories` table

```sql
CREATE TABLE memories (
    id               TEXT PRIMARY KEY,
    tier             TEXT NOT NULL,           -- 'short', 'mid', 'long'
    namespace        TEXT NOT NULL DEFAULT 'global',
    title            TEXT NOT NULL,
    content          TEXT NOT NULL,
    tags             TEXT NOT NULL DEFAULT '[]',  -- JSON array
    priority         INTEGER NOT NULL DEFAULT 5,  -- 1-10
    confidence       REAL NOT NULL DEFAULT 1.0,   -- 0.0-1.0
    source           TEXT NOT NULL DEFAULT 'api', -- 'user', 'claude', 'hook', 'api', 'cli', etc.
    access_count     INTEGER NOT NULL DEFAULT 0,
    created_at       TEXT NOT NULL,           -- ISO 8601 / RFC3339
    updated_at       TEXT NOT NULL,
    last_accessed_at TEXT,
    expires_at       TEXT,                    -- NULL for long-term
    embedding        BLOB                     -- dense vector (v3 migration, NULL if keyword tier)
);

-- Indexes
CREATE INDEX idx_memories_tier ON memories(tier);
CREATE INDEX idx_memories_namespace ON memories(namespace);
CREATE INDEX idx_memories_priority ON memories(priority DESC);
CREATE INDEX idx_memories_expires ON memories(expires_at);

-- Unique constraint enables upsert/deduplication behavior
CREATE UNIQUE INDEX idx_memories_title_ns ON memories(title, namespace);
```

### `memories_fts` virtual table

```sql
CREATE VIRTUAL TABLE memories_fts USING fts5(
    title, content, tags,
    content=memories, content_rowid=rowid
);
```

Kept in sync via `AFTER INSERT`, `AFTER DELETE`, and `AFTER UPDATE` triggers on `memories`.

### `memory_links` table

```sql
CREATE TABLE memory_links (
    source_id   TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    target_id   TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    relation    TEXT NOT NULL DEFAULT 'related_to',
    created_at  TEXT NOT NULL,
    PRIMARY KEY (source_id, target_id, relation)
);
```

Relation types (six at v0.7.0): `related_to`, `supersedes`,
`contradicts`, `derived_from`, `reflects_on`, `derives_from`. The
table shown above is the original core shape — at v0.7.0 each link row
also carries the temporal-validity columns (`valid_from`,
`valid_until`, `observed_by`) and attestation columns (`signature`,
`attest_level`, `signed_at`).

### `archived_memories` table

```sql
CREATE TABLE archived_memories (
    id               TEXT PRIMARY KEY,
    tier             TEXT NOT NULL,
    namespace        TEXT NOT NULL DEFAULT 'global',
    title            TEXT NOT NULL,
    content          TEXT NOT NULL,
    tags             TEXT NOT NULL DEFAULT '[]',
    priority         INTEGER NOT NULL DEFAULT 5,
    confidence       REAL NOT NULL DEFAULT 1.0,
    source           TEXT NOT NULL DEFAULT 'api',
    access_count     INTEGER NOT NULL DEFAULT 0,
    created_at       TEXT NOT NULL,
    updated_at       TEXT NOT NULL,
    last_accessed_at TEXT,
    expires_at       TEXT,
    archived_at      TEXT NOT NULL,
    archive_reason   TEXT NOT NULL DEFAULT 'gc'
);

-- Indexes
CREATE INDEX idx_archived_memories_namespace ON archived_memories(namespace);
CREATE INDEX idx_archived_memories_archived_at ON archived_memories(archived_at);
```

Added in schema migration v3 -> v4 (shown in its original 16-column shape). Stores memories archived by GC before deletion; the columns mirror the `memories` table with additions including `archived_at` (timestamp of archival) and `archive_reason` (e.g., `'gc'`). Schema v49 (#1025) added 14 more nullable columns (`reflection_depth`, `memory_kind`, `citations`, `version`, …) so archive → restore is lossless for the full v0.7.0 26-field Memory shape; v49+ also carries `original_tier` / `original_expires_at`, re-applied on restore.

> The `CREATE TABLE` blocks in this section show the original core
> columns for orientation — the canonical current DDL is the `SCHEMA`
> const + migration ladder in `src/storage/mod.rs` /
> `src/storage/migrations.rs`.

### `schema_version` table

Tracks migration state. Current version: **57** (`CURRENT_SCHEMA_VERSION` in `src/storage/migrations.rs`).

## Recall Scoring Formula

The recall function uses a 6-factor composite score to rank results:

```
score = (fts_rank * -1)                                                   -- FTS5 relevance (negated: lower = better in SQLite)
      + (priority * 0.5)                                                  -- Priority weight (1-10 -> 0.5-5.0)
      + (MIN(access_count, 50) * 0.1)                                     -- Frequency bonus
      + (confidence * 2.0)                                                -- Certainty weight (0.0-1.0 -> 0.0-2.0)
      + tier_boost                                                        -- long=3.0, mid=1.0, short=0.0
      + (1.0 / (1.0 + (julianday('now') - julianday(updated_at)) * 0.1))  -- Recency decay
```

The `search` function uses the same formula minus the tier boost.
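
For readers who prefer code to SQL, the same arithmetic can be written as a plain function. This is a sketch for experimentation, not the storage layer's implementation: `ScoreInput`, `recall_score`, and `search_score` are hypothetical names, and `age_days` stands in for `julianday('now') - julianday(updated_at)`.

```rust
/// Inputs to the composite score; field names mirror the formula above.
struct ScoreInput {
    fts_rank: f64,      // raw FTS5 rank (more negative = better match)
    priority: i64,      // 1-10
    access_count: i64,
    confidence: f64,    // 0.0-1.0
    tier: &'static str, // "short" | "mid" | "long"
    age_days: f64,      // days since `updated_at`
}

fn tier_boost(tier: &str) -> f64 {
    match tier {
        "long" => 3.0,
        "mid" => 1.0,
        _ => 0.0,
    }
}

/// Recency decay: 1.0 today, 0.5 after 10 days, about 0.09 after 100 days.
fn recency(age_days: f64) -> f64 {
    1.0 / (1.0 + age_days * 0.1)
}

fn recall_score(m: &ScoreInput) -> f64 {
    -m.fts_rank
        + m.priority as f64 * 0.5
        + m.access_count.min(50) as f64 * 0.1
        + m.confidence * 2.0
        + tier_boost(m.tier)
        + recency(m.age_days)
}

/// `search` uses the same formula minus the tier boost.
fn search_score(m: &ScoreInput) -> f64 {
    recall_score(m) - tier_boost(m.tier)
}
```

Note how the `min(50)` cap bounds the frequency bonus at 5.0, so a heavily accessed memory cannot outrank everything else on access count alone.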

### Hybrid Recall Algorithm (semantic tier and above)

At the `semantic` tier and above, the `reranker.rs` module blends two scoring signals:

1. **Semantic score** -- cosine similarity between the query embedding and each memory's stored embedding (0.0 to 1.0)
2. **Keyword score** -- the existing 6-factor FTS5 composite score, normalized to 0.0-1.0

The final score is a weighted blend: `final = (semantic_weight * semantic_score) + ((1 - semantic_weight) * keyword_score)`. The semantic weight is **adaptive by content length** — 0.50 for short content (≤ 500 chars) sliding to 0.15 for long content (≥ 5000 chars) — because embeddings lose information on long text. Results from both pipelines are merged, deduplicated by memory ID, and sorted by the blended score.
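
A minimal sketch of the blend follows. The two endpoints (0.50 at 500 chars, 0.15 at 5000 chars) come from the text above; the linear interpolation between them is an assumption, since the guide only says the weight "slides". `semantic_weight` and `blend` are hypothetical names, not the `reranker.rs` API.

```rust
/// Assumed shape: linear interpolation between the documented endpoints.
fn semantic_weight(content_len: usize) -> f64 {
    if content_len <= 500 {
        return 0.50;
    }
    if content_len >= 5000 {
        return 0.15;
    }
    let t = (content_len - 500) as f64 / 4500.0;
    0.50 + t * (0.15 - 0.50)
}

/// final = w * semantic + (1 - w) * keyword, both inputs in 0.0-1.0.
fn blend(semantic_score: f64, keyword_score: f64, content_len: usize) -> f64 {
    let w = semantic_weight(content_len);
    w * semantic_score + (1.0 - w) * keyword_score
}
```

For a 100-character memory with a semantic score of 0.8 and a keyword score of 0.4, this gives an even split between the two signals.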

### Tier Configuration System

The `config.rs` module defines 4 feature tiers that gate functionality:

| Tier | Embeddings | LLM | Capability gating |
|------|-----------|-----|-----------------|
| `keyword` | No | No | FTS5-only recall; LLM-backed tools return a tier-requirement notice |
| `semantic` | Yes | No | Hybrid (semantic + keyword) recall; embedding-backed tools (e.g. `memory_check_duplicate`) active |
| `smart` | Yes | Yes | Adds LLM-backed expansion / auto-tag / contradiction detection |
| `autonomous` | Yes | Yes | Adds cross-encoder reranking + autonomous behaviors |

The tier gates **capabilities** (embedder / LLM / reranker), not the advertised tool count — the tool surface is selected separately by `--profile` (7 entries at `core`, 74 at `full`). Tier is set at startup via `ai-memory mcp --tier <tier>` and cannot be changed at runtime. Post-#1067 the LLM is provider-agnostic (`AI_MEMORY_LLM_BACKEND`), not Ollama-only. The `memory_capabilities` tool reports the active tier and which features are available, allowing AI clients to adapt their behavior.
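
The capability table can be modelled as an ordered enum. This is a sketch of the gating logic only; the `Tier` type and its methods here are hypothetical and not necessarily how `config.rs` is structured.

```rust
/// Variants are declared in ascending capability order so the derived
/// `PartialOrd` can express "this tier and above".
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    Keyword,
    Semantic,
    Smart,
    Autonomous,
}

impl Tier {
    fn parse(s: &str) -> Option<Tier> {
        match s {
            "keyword" => Some(Tier::Keyword),
            "semantic" => Some(Tier::Semantic),
            "smart" => Some(Tier::Smart),
            "autonomous" => Some(Tier::Autonomous),
            _ => None,
        }
    }

    fn has_embeddings(self) -> bool {
        self >= Tier::Semantic
    }

    fn has_llm(self) -> bool {
        self >= Tier::Smart
    }

    fn has_reranker(self) -> bool {
        self == Tier::Autonomous
    }
}
```

A client-side helper like this can be driven by the tier string that `memory_capabilities` reports.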

> **Note:** Configuration is loaded once at process startup. Changes to `config.toml` require restarting the ai-memory process (MCP server, HTTP daemon, or CLI) to take effect.

In the recall scoring formula, the recency decay factor ensures that recent memories rank higher when other factors are similar: a memory updated today scores ~1.0 on this factor, one updated 10 days ago ~0.5, and one updated 100 days ago ~0.09.

### TOON Format Specification

TOON (Token-Oriented Object Notation) is a token-efficient serialization format designed for LLM communication. It replaces JSON for recall, search, and list responses, reducing output size by 40-60% by declaring field names once as a header and listing values row by row with pipe delimiters.

The implementation is in `src/toon.rs`.

#### Structure Overview

A TOON response consists of three parts in order:

1. **Metadata line** (optional) -- key:value pairs for scalar fields
2. **Header line** -- declares field names once
3. **Data rows** -- one per object, values matching header column order

#### Metadata Line Syntax

Scalar (non-array) response fields are serialized as pipe-delimited `key:value` pairs on the first line:

```
count:3|mode:hybrid
```

If there are no metadata fields, this line is omitted entirely.

#### Header Line Syntax

The header declares the array name followed by field names in square brackets, pipe-delimited, ending with a colon:

```
memories[id|title|tier|namespace|priority|confidence|score|access_count|tags|source|created_at|updated_at]:
```

Field names appear exactly once in the entire output regardless of how many data rows follow. This is the primary source of token savings over JSON.

#### Data Row Syntax

Each data row contains values pipe-delimited in the same order as the header fields:

```
abc-123|PostgreSQL 16 config|long|infra|9|1.0|0.763|2|postgres,database|claude|2026-04-03T15:00:00+00:00|2026-04-03T15:00:00+00:00
```

- **Strings** are output as-is (unless they require escaping)
- **Numbers** (integers and floats) are output as their string representation
- **Booleans** are output as `1` (true) or `0` (false)
- **Arrays** (e.g., tags) are joined with commas: `postgres,database`
- **Objects** are output as the literal `[object]`
- **Null/missing values** are represented as an empty string (zero characters between the delimiters), e.g., `abc||mid` means the second field is null

#### Escaping Rules

Two characters require escaping in TOON values:

| Character | Escaped As | Reason |
|-----------|-----------|--------|
| `\|` (pipe) | `\\|` | Pipe is the field delimiter |
| `\n` (newline) | `\n` (a backslash followed by the letter `n`) | Newline is the row delimiter |

Escaping is only applied when the value actually contains a pipe or newline character. Values without these characters are output verbatim with no additional escaping.

Example: a title containing a pipe like `A|B` is serialized as `A\|B` in the data row.
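
The value and escaping rules can be sketched as a small encoder. The `Value` enum and the function names below are hypothetical stand-ins rather than the types used in `src/toon.rs`, and escaping array items is an assumption the guide does not spell out.

```rust
/// Hypothetical value model covering the cases listed under Data Row Syntax.
enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(String),
    Array(Vec<String>),
    Object,
}

/// Escape the two reserved characters: pipe and newline.
fn escape_value(raw: &str) -> String {
    raw.replace('|', "\\|").replace('\n', "\\n")
}

fn encode(value: &Value) -> String {
    match value {
        Value::Null => String::new(),
        Value::Bool(b) => String::from(if *b { "1" } else { "0" }),
        Value::Int(i) => i.to_string(),
        // `{:?}` keeps a trailing `.0` on whole floats (1.0 stays "1.0").
        Value::Float(f) => format!("{:?}", f),
        Value::Str(s) => escape_value(s),
        // Assumption: each array item is escaped before joining.
        Value::Array(items) => items
            .iter()
            .map(|s| escape_value(s))
            .collect::<Vec<_>>()
            .join(","),
        Value::Object => String::from("[object]"),
    }
}

/// Join encoded values with the pipe delimiter to form one data row.
fn encode_row(values: &[Value]) -> String {
    values.iter().map(encode).collect::<Vec<_>>().join("|")
}
```

Encoding the values `abc`, null, `mid` yields `abc||mid`, matching the null example in the value rules.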

#### Compact vs Full Mode

TOON supports two modes that differ only in which fields are included:

**Full mode** (12 fields):
```
memories[id|title|tier|namespace|priority|confidence|score|access_count|tags|source|created_at|updated_at]:
```

**Compact mode** (7 fields) -- omits timestamps, confidence, access_count, and source for tighter output:
```
memories[id|title|tier|namespace|priority|score|tags]:
```

The MCP server defaults to compact mode (`toon_compact`). Clients can request `"toon"` for full mode or `"json"` for standard JSON via the `format` parameter on recall, search, and list tools.

#### Search Response Normalization

Search responses use a `"results"` key instead of `"memories"`. The TOON serializer normalizes this internally -- the output always uses the `memories[...]` header regardless of the source key.

#### Complete Parsing Example

Given this JSON response:

```json
{
  "memories": [
    {"id": "abc-123", "title": "PostgreSQL config", "tier": "long", "namespace": "infra", "priority": 9, "score": 0.763, "tags": ["postgres", "db"]},
    {"id": "def-456", "title": "Redis cache", "tier": "long", "namespace": "infra", "priority": 8, "score": 0.541, "tags": ["redis"]},
    {"id": "ghi-789", "title": "Deploy notes", "tier": "mid", "namespace": "infra", "priority": 5, "score": 0.320, "tags": []}
  ],
  "count": 3,
  "mode": "hybrid"
}
```

TOON compact output:

```
count:3|mode:hybrid
memories[id|title|tier|namespace|priority|score|tags]:
abc-123|PostgreSQL config|long|infra|9|0.763|postgres,db
def-456|Redis cache|long|infra|8|0.541|redis
ghi-789|Deploy notes|mid|infra|5|0.32|
```

To parse TOON:

1. Read the first line. If it is not a header line (an array name followed by a bracketed field list and a trailing colon, e.g., `memories[...]:`), parse it as metadata: split on `|`, then split each segment on its first `:` to get a key-value pair.
2. Read the header line. Extract the array name and field list: strip the trailing `:`, extract the portion inside `[...]`, and split on `|` to get the ordered field names.
3. Read each subsequent non-empty line as a data row. Split on `|` (respecting `\|` escapes), mapping each positional value to the corresponding header field name.
4. Unescape `\|` to `|` and `\n` to newline in each value. Empty values represent null/missing fields.
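
The four steps above can be sketched as a small parser. This is a reader-side illustration, not code from `src/toon.rs`: `Toon`, `split_row`, and `parse_toon` are hypothetical names, every value is kept as a string, and a line is treated as the header when it ends with `]:`.

```rust
use std::collections::HashMap;

struct Toon {
    meta: HashMap<String, String>,
    fields: Vec<String>,
    rows: Vec<HashMap<String, Option<String>>>,
}

/// Split a data row on unescaped `|`, unescaping `\|` and `\n` along the way.
fn split_row(line: &str) -> Vec<String> {
    let mut fields = Vec::new();
    let mut current = String::new();
    let mut chars = line.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.peek().copied() {
                Some('|') => { chars.next(); current.push('|'); continue; }
                Some('n') => { chars.next(); current.push('\n'); continue; }
                _ => {}
            }
        }
        if c == '|' {
            fields.push(std::mem::take(&mut current));
        } else {
            current.push(c);
        }
    }
    fields.push(current);
    fields
}

fn parse_toon(text: &str) -> Option<Toon> {
    let mut lines = text.lines().filter(|l| !l.is_empty()).peekable();
    let mut meta = HashMap::new();
    // Step 1: an optional metadata line precedes the header.
    let first = *lines.peek()?;
    if !first.ends_with("]:") {
        for pair in first.split('|') {
            if let Some((k, v)) = pair.split_once(':') {
                meta.insert(k.to_string(), v.to_string());
            }
        }
        lines.next();
    }
    // Step 2: header, e.g. `memories[id|title]:`.
    let header = lines.next()?;
    let inner = header.strip_suffix(':')?.split_once('[')?.1.strip_suffix(']')?;
    let fields: Vec<String> = inner.split('|').map(str::to_string).collect();
    // Steps 3-4: map positional values to field names; empty means null.
    let rows = lines
        .map(|line| {
            fields
                .iter()
                .cloned()
                .zip(split_row(line).into_iter().map(|v| if v.is_empty() { None } else { Some(v) }))
                .collect::<HashMap<_, _>>()
        })
        .collect();
    Some(Toon { meta, fields, rows })
}
```

Array-valued fields such as `tags` come back as the comma-joined string; splitting them further is left to the caller.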

## API Reference

Base URL: `http://127.0.0.1:9077/api/v1`

All responses are JSON. Error responses include `{"error": "message"}`. Database errors are sanitized -- clients receive `"Internal server error"` instead of raw SQLite error details.

The HTTP API exposes **89 production `.route(...)` registrations / 75 unique URL paths** at v0.7.0 (canonical count via codegraph `codegraph_search kind=route limit=100` filtered to `src/lib.rs` excluding the `#[cfg(test)]`-gated `/slow` route at line 996; multi-line-aware path extraction via `awk '/\.route\(/{in=1}in&&/"\/[^"]*"/{match($0,/"\/[^"]*"/);print substr($0,RSTART,RLENGTH);in=0}' src/lib.rs | sort -u`; v0.6.3.1 baseline of 50 and v0.6.3 baseline of 42 are frozen on the [evidence page](https://alphaonedev.github.io/ai-memory-mcp/evidence.html)).

### Health Check

```
GET /health
```

Deep health check: verifies DB is readable and FTS5 integrity-check passes.

- Response (200): `{"status": "ok", "service": "ai-memory"}`
- Response (503): `{"status": "error", "service": "ai-memory"}`

### Create Memory

```
POST /memories
Content-Type: application/json

{
  "title": "Project uses Axum",
  "content": "The HTTP server is built with Axum 0.8.",
  "tier": "mid",
  "namespace": "ai-memory",
  "tags": ["rust", "web"],
  "priority": 6,
  "confidence": 1.0,
  "source": "api",
  "expires_at": "2026-04-06T00:00:00Z",
  "ttl_secs": 86400
}
```

Response (201):
```json
{
  "id": "a1b2c3d4-...",
  "tier": "mid",
  "namespace": "ai-memory",
  "title": "Project uses Axum",
  "potential_contradictions": ["id1", "id2"]
}
```

Defaults: `tier=mid`, `namespace=global`, `priority=5`, `confidence=1.0`, `source=api`.

Optional: `expires_at` (RFC3339), `ttl_secs` (overrides tier default). Deduplicates on title+namespace (upsert).

### Bulk Create

```
POST /memories/bulk
Content-Type: application/json

[
  {"title": "Memory 1", "content": "..."},
  {"title": "Memory 2", "content": "..."}
]
```

Response: `{"created": 2, "errors": []}`

Limited to **1,000 items per request**.

### Get Memory

```
GET /memories/{id}
```

Response:
```json
{
  "memory": { ... },
  "links": [ ... ]
}
```

### Update Memory

```
PUT /memories/{id}
Content-Type: application/json

{
  "content": "Updated content",
  "priority": 8,
  "expires_at": "2026-06-01T00:00:00Z"
}
```

All fields are optional. Only provided fields are updated. Validated before write.

### Delete Memory

```
DELETE /memories/{id}
```

Response: `{"deleted": true}`. Links are cascade-deleted.

### Promote Memory

```
POST /memories/{id}/promote
```

Promotes a memory to long-term tier and clears its expiry.

Response: `{"promoted": true}`

### List Memories

```
GET /memories?namespace=my-app&tier=long&limit=20&offset=0&min_priority=5&since=2026-01-01T00:00:00Z&until=2026-12-31T23:59:59Z&tags=rust
```

All query parameters are optional. `limit` is capped at
`max_page_size`, which has a compiled default of 1000 and is
configurable via `[limits].max_page_size` / `AI_MEMORY_MAX_PAGE_SIZE`.

Response: `{"memories": [...], "count": 5}`

### Search (AND semantics)

```
GET /search?q=database+migration&namespace=my-app&tier=mid&limit=10&since=...&until=...&tags=...
```

Response: `{"results": [...], "count": 3, "query": "database migration"}`

Uses the recall scoring formula without the tier boost. Queries are sanitized to prevent FTS injection.

### Recall (OR semantics + touch)

```
GET /recall?context=auth+flow+jwt&namespace=my-app&limit=10&tags=auth&since=2026-01-01T00:00:00Z&until=2026-12-31T23:59:59Z
```

Or via POST:

```
POST /recall
Content-Type: application/json

{"context": "auth flow jwt", "namespace": "my-app", "limit": 10}
```

Response: `{"memories": [...], "count": 5}`

Recall automatically: bumps `access_count`, extends TTL, and auto-promotes mid-tier memories with 5+ accesses to long-term. The touch operation is transactional.
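
The touch step can be modelled as a pure function over one memory's state. This is a hedged sketch, not the storage layer's code: `Touch` and `touch` are hypothetical names, `ttl_secs` is an assumed parameter (the guide does not state by how much the TTL is extended), and the 5-access threshold is assumed to include the current recall.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Touch {
    tier: &'static str,
    access_count: u32,
    expires_at: Option<u64>, // Unix seconds; None means no expiry
}

/// Hypothetical model of what recall does to each returned memory.
fn touch(mut m: Touch, now: u64, ttl_secs: u64) -> Touch {
    m.access_count += 1;
    if m.tier == "mid" && m.access_count >= 5 {
        // Auto-promotion: long-term memories carry no expiry.
        m.tier = "long";
        m.expires_at = None;
    } else if let Some(exp) = m.expires_at {
        // Push the expiry out, never pull it in.
        m.expires_at = Some(exp.max(now + ttl_secs));
    }
    m
}
```

In the real daemon these updates run inside one transaction, so a memory is never left with a bumped count but a stale expiry.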

### Forget (Bulk Delete)

```
POST /forget
Content-Type: application/json

{"namespace": "my-app", "pattern": "deprecated API", "tier": "short"}
```

At least one field is required. Pattern uses FTS matching (sanitized). Response: `{"deleted": 3}`

### Consolidate

```
POST /consolidate
Content-Type: application/json

{
  "ids": ["id1", "id2", "id3"],
  "title": "Auth system summary",
  "summary": "JWT with refresh tokens, RBAC middleware, Redis sessions.",
  "namespace": "my-app",
  "tier": "long"
}
```

Requires 2-100 IDs. Deletes source memories, creates new with aggregated tags and max priority. The entire operation is transactional. Response (201): `{"id": "new-id", "consolidated": 3}`
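
How the new memory's metadata could be derived from its sources is sketched below. `Source` and `consolidated_meta` are hypothetical names, and "aggregated tags" is assumed to mean the deduplicated union (sorted here only to make the output stable).

```rust
use std::collections::BTreeSet;

struct Source {
    tags: Vec<String>,
    priority: u8,
}

/// Hypothetical helper: merged tags and priority for the consolidated memory.
fn consolidated_meta(sources: &[Source]) -> Result<(Vec<String>, u8), String> {
    // The endpoint accepts between 2 and 100 source IDs.
    if !(2..=100).contains(&sources.len()) {
        return Err(format!("expected 2-100 ids, got {}", sources.len()));
    }
    let tags: BTreeSet<String> = sources
        .iter()
        .flat_map(|s| s.tags.iter().cloned())
        .collect();
    // The new memory inherits the highest priority among its sources.
    let priority = sources.iter().map(|s| s.priority).max().unwrap_or(0);
    Ok((tags.into_iter().collect(), priority))
}
```

Two sources tagged `auth, jwt` (priority 6) and `auth, redis` (priority 8) would consolidate to tags `auth, jwt, redis` at priority 8.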

### Create Link

```
POST /links
Content-Type: application/json

{"source_id": "id1", "target_id": "id2", "relation": "related_to"}
```

Relations: `related_to`, `supersedes`, `contradicts`, `derived_from` (the `memory_links` table section lists the full v0.7.0 set of six). Self-links are rejected. Response (201): `{"linked": true}`

### Get Links

```
GET /links/{id}
```

Response: `{"links": [{"source_id": "...", "target_id": "...", "relation": "...", "created_at": "..."}]}`

### Namespaces

```
GET /namespaces
```

Response: `{"namespaces": [{"namespace": "my-app", "count": 42}]}`

This endpoint is admin-gated at v0.7.0 (#945), as are `GET /stats`,
`POST /gc`, `GET /export`, `POST /import`, `POST /forget`,
`GET /agents`, `GET /taxonomy`, `GET /archive`, `GET /archive/stats`,
and the `/skill/*` routes. See `docs/API_REFERENCE.md` §"Admin-gated
endpoints".

### Stats

```
GET /stats
```

Response:
```json
{
  "total": 150,
  "by_tier": [{"tier": "long", "count": 80}, ...],
  "by_namespace": [{"namespace": "my-app", "count": 42}, ...],
  "expiring_soon": 5,
  "links_count": 12,
  "db_size_bytes": 524288
}
```

### Garbage Collection

```
POST /gc
```

Response: `{"expired_deleted": 3}`

### Export

```
GET /export
```

Response: full JSON dump of all memories and links with `exported_at` timestamp.

### Import

```
POST /import
Content-Type: application/json

{"memories": [...], "links": [...]}
```

Validates each memory before import. Limited to **1,000 memories per request**. Response: `{"imported": 50, "errors": []}`

## Error Code Reference

Structured error codes returned by the HTTP API and MCP server:

| Code | HTTP Status | Description |
|------|-------------|-------------|
| `NOT_FOUND` | 404 | Memory or resource not found |
| `VALIDATION_FAILED` | 400 | Invalid input parameters |
| `DATABASE_ERROR` | 500 | SQLite or internal error |
| `CONFLICT` | 409 | Duplicate or conflicting operation |
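
The four baseline codes in the table can be mirrored as a small enum, which is handy in client code and tests. The `ErrorCode` type below is a hypothetical sketch, not the definition in `src/errors.rs`; the sanitized message string is the one quoted in the API Reference introduction.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorCode {
    NotFound,
    ValidationFailed,
    DatabaseError,
    Conflict,
}

impl ErrorCode {
    /// Wire name as it appears in the `code` field.
    fn as_str(self) -> &'static str {
        match self {
            ErrorCode::NotFound => "NOT_FOUND",
            ErrorCode::ValidationFailed => "VALIDATION_FAILED",
            ErrorCode::DatabaseError => "DATABASE_ERROR",
            ErrorCode::Conflict => "CONFLICT",
        }
    }

    fn http_status(self) -> u16 {
        match self {
            ErrorCode::NotFound => 404,
            ErrorCode::ValidationFailed => 400,
            ErrorCode::DatabaseError => 500,
            ErrorCode::Conflict => 409,
        }
    }

    /// Message that is safe to return: database details never reach the client.
    fn client_message(self, detail: &str) -> String {
        match self {
            ErrorCode::DatabaseError => "Internal server error".to_string(),
            _ => detail.to_string(),
        }
    }
}
```

Keeping the status mapping and the sanitization in one place makes it harder for a new handler to leak raw SQLite errors by accident.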

The baseline HTTP error envelope is `{"error": "<message>"}`, with
message strings centralised in `src/errors.rs::msg`. Typed error
classes additionally carry a `code` field, e.g.
`{"code": "ATTESTATION_FAILED", "error": …}`. The full code vocabulary
lives in `src/errors.rs::error_codes`, which also includes v0.7.0
additions such as `REFLECTION_DEPTH_EXCEEDED`, `GOVERNANCE_REFUSED`,
and `QUOTA_EXCEEDED`. `DATABASE_ERROR`-class responses are sanitized:
clients receive a generic internal-server-error message, and detailed
errors are logged server-side only.

## CLI Reference

Global flags:
- `--db <path>` -- database path (default: `ai-memory.db`, env: `AI_MEMORY_DB`)
- `--json` -- output as machine-parseable JSON

### `serve`

Start the HTTP daemon (89 route registrations / 75 unique URL paths at v0.7.0).

```bash
ai-memory serve --host 127.0.0.1 --port 9077
```

### `mcp`

Run as an MCP tool server over stdio. This is the primary integration path for any MCP-compatible AI client. At v0.7.0, the `--profile full` surface advertises 74 entries (73 callable memory tools + the always-on `memory_capabilities` bootstrap); the default `--profile core` ships 7 + the bootstrap.

```bash
ai-memory mcp
ai-memory mcp --tier semantic   # default
ai-memory mcp --tier smart      # enables LLM-powered tools (any backend via AI_MEMORY_LLM_BACKEND, #1067)
```

Reads JSON-RPC from stdin, writes responses to stdout. Logs to stderr. Correctly handles notifications (no response sent). Works with any MCP-compatible client (Claude AI, OpenAI ChatGPT, xAI Grok, META Llama, etc.).

### `store`

```bash
ai-memory store \
  -T "Title" \
  -c "Content" \
  --tier mid \
  --namespace my-app \
  --tags "tag1,tag2" \
  --priority 7 \
  --confidence 0.9 \
  --source claude \
  --expires-at "2026-04-15T00:00:00Z" \
  --ttl-secs 86400
```

Use `-c -` to read content from stdin. Validates all fields before writing. `--expires-at` sets an explicit expiration timestamp (RFC3339). `--ttl-secs` sets a TTL in seconds (overrides tier default).

### `update`

```bash
ai-memory update <id> -T "New title" -c "New content" --priority 8 --expires-at "2026-06-01T00:00:00Z"
```

The `--expires-at` flag sets or changes the expiration on an existing memory.

### `recall`

```bash
ai-memory recall "search context" --namespace my-app --limit 10 --tags auth --since 2026-01-01T00:00:00Z
```

### `search`

```bash
ai-memory search "exact terms" --namespace my-app --tier long --limit 20 --since 2026-01-01 --until 2026-12-31 --tags rust
```

### `get`

```bash
ai-memory get <id>
```

Shows the memory plus all its links.

### `list`

```bash
ai-memory list --namespace my-app --tier mid --limit 50 --offset 0 --since 2026-01-01 --until 2026-12-31 --tags devops
```

The `--offset` flag enables pagination. Use with `--limit` to page through results.

### `delete`

```bash
ai-memory delete <id>
```

### `promote`

```bash
ai-memory promote <id>
```

Promotes to long-term and clears expiry.

### `forget`

```bash
ai-memory forget --namespace my-app --pattern "old stuff" --tier short
```

At least one filter is required.

### `link`

```bash
ai-memory link <source-id> <target-id> --relation supersedes
```

Relation types: `related_to` (default), `supersedes`, `contradicts`, `derived_from` (the `memory_links` table section lists the full v0.7.0 set of six). Self-links are rejected.

### `consolidate`

```bash
ai-memory consolidate "id1,id2,id3" -T "Summary title" -s "Consolidated content" --namespace my-app
```

### `gc`

```bash
ai-memory gc
```

### `stats`

```bash
ai-memory stats
```

### `namespaces`

```bash
ai-memory namespaces
```

### `export` / `import`

```bash
ai-memory export > backup.json
ai-memory import < backup.json
```

Export includes memories and links. Import validates each memory and skips invalid ones.

### `resolve`

Resolve a contradiction by marking one memory as superseding another.

```bash
ai-memory resolve <winner_id> <loser_id>
```

Creates a "supersedes" link from winner to loser. Demotes the loser (priority=1, confidence=0.1). Touches the winner (bumps access count).

### `shell`

Interactive REPL for browsing and managing memories.

```bash
ai-memory shell
```

REPL commands: `recall <ctx>`, `search <q>`, `list [ns]`, `get <id>`, `stats`, `namespaces`, `delete <id>`, `help`, `quit`. Color output with tier labels and priority bars.

### `sync`

Sync memories between two database files.

```bash
ai-memory sync <remote.db> --direction pull|push|merge
```

- `pull` -- import all memories from remote into local
- `push` -- export all local memories to remote
- `merge` -- bidirectional sync (both databases get all memories)

Uses dedup-safe upsert (title+namespace). Links are synced alongside memories.

### `auto-consolidate`

Automatically group and consolidate memories.

```bash
ai-memory auto-consolidate [--namespace <ns>] [--short-only] [--min-count 3] [--dry-run]
```

Groups memories by namespace+primary tag. Groups with >= min_count members are consolidated into one long-term memory. Use `--dry-run` to preview.
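
The grouping rule can be sketched as follows. `Mem`, `mem`, and `consolidation_groups` are hypothetical names, and two details are assumptions the guide does not confirm: the "primary tag" is taken to be the first tag, and untagged memories are skipped.

```rust
use std::collections::BTreeMap;

struct Mem {
    id: String,
    namespace: String,
    tags: Vec<String>,
}

/// Convenience constructor for the examples.
fn mem(id: &str, namespace: &str, tags: &[&str]) -> Mem {
    Mem {
        id: id.to_string(),
        namespace: namespace.to_string(),
        tags: tags.iter().map(|t| t.to_string()).collect(),
    }
}

/// Return the (namespace, primary tag) groups large enough to consolidate,
/// each with the IDs of its members.
fn consolidation_groups(mems: &[Mem], min_count: usize) -> Vec<((String, String), Vec<String>)> {
    let mut groups: BTreeMap<(String, String), Vec<String>> = BTreeMap::new();
    for m in mems {
        // Assumption: the primary tag is the first tag.
        if let Some(tag) = m.tags.first() {
            groups
                .entry((m.namespace.clone(), tag.clone()))
                .or_default()
                .push(m.id.clone());
        }
    }
    groups
        .into_iter()
        .filter(|(_, ids)| ids.len() >= min_count)
        .collect()
}
```

Listing the candidate groups without writing anything is roughly what a `--dry-run` preview would show.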

### `mine`

Import memories from historical conversations (Claude, ChatGPT, Slack exports).

```bash
ai-memory mine --format claude <path-to-export>
ai-memory mine --format chatgpt <path-to-export>
ai-memory mine --format slack <path-to-export>
```

Takes `--format` to specify the input file format (`claude`, `chatgpt`, `slack`) and a path to the export file or directory.

### `man`

Generate a roff man page on stdout.

```bash
ai-memory man           # print roff to stdout
ai-memory man | man -l -  # view immediately
```

### `completions`

```bash
ai-memory completions bash
ai-memory completions zsh
ai-memory completions fish
```

## Adding New Features

1. **Add the model** in `src/models/` -- new struct or new fields on existing structs (wire keys via `src/models/field_names.rs` consts)
2. **Add validation** in `validate.rs` -- new validation function (or a `RequestValidator` method)
3. **Add the DB operation** on the SAL `MemoryStore` trait in `src/store/mod.rs` FIRST (post-#961), implemented on `SqliteStore` (usually delegating to a `crate::storage::*` primitive) AND `PostgresStore` — a sqlite-only free function will 501 on the postgres route gate
4. **Add the HTTP handler** in the matching per-domain module under `src/handlers/`
5. **Add the route**: one path const in `src/handlers/routes.rs` (the route-path SSOT) + one `.route(handlers::routes::<CONST>, …)` registration in `src/lib.rs::build_router_with_timeout`, then bump `EXPECTED_PRODUCTION_ROUTES_COUNT` / `EXPECTED_PRODUCTION_UNIQUE_PATHS_COUNT` and the postgres-gate allowlist (`src/handlers/postgres_gate.rs`)
6. **Add the CLI command** -- new variant in the `Command` enum in `src/daemon_runtime.rs`, an `Args` struct (typically under `src/cli/commands/`), a dispatch arm, and bump `EXPECTED_CLI_SUBCOMMANDS_*`
7. **Add the MCP tool** (post-v0.7.0 #987 D1.x): define `<ToolName>Request` (schemars `JsonSchema` derive; NO `deny_unknown_fields` per the #1052 wire-truthfulness pin) + `<ToolName>Tool` (zero-sized) with `impl McpTool` in `src/mcp/tools/<name>.rs`; register ONE `RegisteredTool::of::<…>()` line in `registered_tools()` in `src/mcp/registry.rs`; add a `tool_names::*` const + `ALL` slice entry (census/pin tests track the count automatically against `Profile::full().expected_tool_count()`); add the handler + dispatch arm in `src/mcp/mod.rs::handle_request()`; add a `d1_6_987_tests` parity-test mod. The pre-D1.6 step "add JSON definition in `tool_definitions()`" is gone — `tool_definitions()` is now a four-line iteration.
8. **Add tests** under `tests/` (integration) and in the module's unit-test suite

## Testing

The project has **1,886 lib tests + 49+ integration tests at 93.84% line coverage** as of v0.6.3.1 (was 1,600 lib / 93.08% on v0.6.3). v0.6.3 baseline numbers are frozen on the [evidence page](https://alphaonedev.github.io/ai-memory-mcp/evidence.html); v0.6.3.1 deltas are documented in the release notes. Modules each carry their own unit-test suite; integration tests live under `tests/`.

```bash
# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run a specific test
cargo test test_name

# Check formatting
cargo fmt --check

# Run clippy
cargo clippy -- -D warnings
```

Integration tests run through the CLI binary, creating temporary databases for isolation.

## Benchmarks

### Criterion (microbenchmarks)

Criterion benchmarks are in `benches/recall.rs`. They measure insert, recall, and search performance at a scale of 1,000 memories.

```bash
cargo bench
# recall/short_query, recall/medium_query, recall/long_query
# search/simple_search, search/filtered_search
# insert/store_memory
```

### LongMemEval (end-to-end accuracy)

The `benchmarks/longmemeval/` directory evaluates recall accuracy against the [LongMemEval](https://github.com/xiaowu0162/LongMemEval) dataset (ICLR 2025). Four harness configurations are available:

| Harness | Strategy | R@5 | Speed |
|---------|----------|-----|-------|
| `harness_99.py --no-expand` | Parallel FTS5, 10 cores | **97.0%** | 232 q/s (2.2s) |
| `harness_99.py` | LLM expansion + parallel FTS5 | **97.8%** | 142 q/s (3.5s) |
| `harness_fast.py` | Single-process native SQLite | 96.2% | 57 q/s (8.8s) |
| `harness.py` | CLI subprocess per operation | 96.2% | 1.2 q/s (414s) |

Best result: **97.8% R@5 (489/500), 99.0% R@10, 99.8% R@20** -- 499/500 at R@20.

```bash
# Quick run (keyword, ~2s)
python3 benchmarks/longmemeval/harness_99.py \
  --dataset-path /tmp/LongMemEval --variant S --no-expand --workers 10

# Full run with LLM expansion (requires Ollama + gemma3:4b)
python3 benchmarks/longmemeval/harness_99.py \
  --dataset-path /tmp/LongMemEval --variant S --workers 10
```

See `benchmarks/longmemeval/README.md` for full replication instructions.

## CI/CD Pipeline

GitHub Actions CI runs on every push and pull request. The four cargo
gates every PR must pass (see `CLAUDE.md` §Build & Test Commands):

1. **Check formatting** -- `cargo fmt --check`
2. **Clippy** -- `cargo clippy -- -D warnings -D clippy::all -D clippy::pedantic`
3. **Run tests** -- `AI_MEMORY_NO_CONFIG=1 cargo test`
4. **Dependency audit** -- `cargo audit`

Plus the script-based HARD-BLOCK gates wired into
`.github/workflows/c8-precheck.yml` (#1174 PR10):

- **C8 caller-context allowlist** -- `scripts/qc-codegraph-precheck.sh`
  (no new `CallerContext::for_agent("<literal>")` / `for_admin` sites
  outside `scripts/qc-codegraph-allowlists/`).
- **Vendor-monoculture + SECS_PER_\*** -- `scripts/check-vendor-literals.sh`
  (vendor identifiers only in the documented substrate carve-outs;
  no raw 3600/86400/604800 `Duration::from_secs` literals — use
  `SECS_PER_HOUR`/`SECS_PER_DAY`/`SECS_PER_WEEK`).
- **Hardcoded-literal duplication ratchet** --
  `scripts/check-hardcoded-literals.sh` against the frozen baseline.
- **Docs-vs-SSOT drift gate** -- `scripts/check-docs-vs-ssot.sh`
  (narrative counts in the docs must match the canonical Rust consts:
  schema version, tool counts, route/path counts, CLI subcommand
  counts, Memory field count, link-relation count, …).

Each script gate also runs a `--self-test` step proving it is
load-bearing. Coverage floors are enforced per-module from
`coverage/thresholds.toml` (`.github/workflows/coverage.yml`) —
thresholds rise across releases, never fall. Additional workflows:
`bench.yml` (p95 budgets), `token-budget.yml` (`tools/list` token
ceiling), `tool-count-drift.yml`, `fuzz.yml`, `mobile-runtime.yml`.

### Release Pipeline

On tag push (e.g., `v0.2.0`):

1. Builds release binaries for `x86_64-unknown-linux-gnu` and `aarch64-apple-darwin`
2. Packages as `.tar.gz`
3. Creates a GitHub Release with the artifacts

## Building from Source

```bash
git clone https://github.com/alphaonedev/ai-memory-mcp.git
cd ai-memory-mcp

# Debug build
cargo build

# Release build (optimized, stripped)
cargo build --release

# The binary is at target/release/ai-memory
```

### New Dependencies (v0.4.0)

- `candle-core`, `candle-nn`, `candle-transformers` -- HuggingFace Candle for local embedding model inference
- `hf-hub` -- HuggingFace Hub client for downloading embedding models
- `tokenizers` -- HuggingFace tokenizers for text preprocessing
- `reqwest` -- HTTP client for Ollama API communication (LLM inference)

All dependencies are always compiled; tier selection controls which features are activated at runtime.

Release profile settings (from `Cargo.toml`):
- `opt-level = 3`
- `strip = true` (strips symbols and debug info from the binary)
- `lto = "thin"` (link-time optimization)

---

## Working Under an Autonomous Campaign

When this repository is being driven by the `campaign` Python harness
(at `alphaonedev/agentic-mem-labs/tools/campaign/`, Apache 2.0 ©
AlphaOne LLC), the development workflow is the one described above,
plus the constraints in
[`ENGINEERING_STANDARDS.md` §7](ENGINEERING_STANDARDS.md#7-autonomous-campaign-workflow).

### Concurrent operation

- A live campaign holds a designated `release/vX.Y.Z` branch as its
  exclusive merge target. Human contributors can still open PRs against
  `develop` (or pre-existing release branches) without conflict.
- The campaign records every decision and PR to its ai-memory namespace
  (named after the campaign, e.g. `campaign-v063`). To see what the
  agent has done in the current iteration window:

      ai-memory --db ~/.claude/ai-memory.db list --namespace campaign-v063

- The append-only audit trail lives on a `campaign-log/vX.Y.Z` branch
  of `agentic-mem-labs`. One markdown report per iteration:

      git -C ~/agentic-mem-labs-log log --oneline campaign-log/v0.6.3

### Memory namespace as the campaign's operating substrate

Every campaign uses an ai-memory namespace named after the campaign.
The namespace contains: the campaign's overview/scope/hard rules,
approvals, code-quality standards, Engineering Standards alignment, a
snapshot of open issues + PRs at campaign start, one summary memory per
iteration, decisions, blockers, and "future"/deferred items.

Treat the namespace as both the agent's working memory and the
historical record. After a campaign ends, the namespace is preserved
indefinitely (`tier = long`).

### What human reviewers should focus on under a campaign

PRs from `campaign/<slug>` branches into `release/vX.Y.Z` are merged
with `gh pr merge --squash --delete-branch` once CI is green. The agent
self-reviews quality (clippy pedantic, fmt, tests). Human spot-checks
should focus on charter alignment, hard-rule compliance, test coverage,
and audit consistency on the `campaign-log/vX.Y.Z` branch.

The campaign is a *complement* to human development, not a replacement.
For everything outside the active charter — bug triage, design ADRs,
release cuts, dependency upgrades, security response — humans still
own the work.
