ai-memory for developers — build, embed, extend

Build from source

Rust 1.96+, one binary.

git clone https://github.com/alphaonedev/ai-memory-mcp
cd ai-memory-mcp
cargo build --release
# binary at target/release/ai-memory (~26 MB stripped)

All four gates must pass before a PR is opened:

cargo fmt --check
cargo clippy -- -D warnings -D clippy::all -D clippy::pedantic
AI_MEMORY_NO_CONFIG=1 cargo test
cargo audit

Three interfaces, one example each

Pick the surface that fits your call site.

All three share src/storage/ and src/validate.rs, so there is no behavioural divergence: whatever the CLI does, the MCP tool and the HTTP endpoint do too. Since #966 (Wave-2 Tier-C1, May 2026), DTO-bundling validation routes through the shared RequestValidator fluent surface, so adding a new cross-field invariant means one impl method on the facade instead of three audited per-surface duplicates.

MCP — stdio JSON-RPC 2.0

101 entries at --profile full (100 callable), 7 default in --profile core

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"demo","version":"0"}}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"memory_store","arguments":{"title":"hello","content":"first MCP memory","tier":"mid"}}}' \
  | ai-memory mcp --profile core
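
For callers that spawn the MCP server from their own process, the same two frames can be built programmatically. The sketch below only constructs the newline-delimited JSON-RPC 2.0 messages from the shell example; the helper names frame and store_frames are illustrative and not part of ai-memory.

```python
import json


def frame(msg_id, method, params):
    """Serialise one JSON-RPC 2.0 request as a single line (no trailing newline)."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params},
        separators=(",", ":"),
    )


def store_frames(title, content, tier="mid"):
    """Return the initialize and tools/call frames used in the shell example."""
    init = frame(1, "initialize", {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "demo", "version": "0"},
    })
    call = frame(2, "tools/call", {
        "name": "memory_store",
        "arguments": {"title": title, "content": content, "tier": tier},
    })
    return [init, call]


if __name__ == "__main__":
    # One frame per line; pipe the output into: ai-memory mcp --profile core
    print("\n".join(store_frames("hello", "first MCP memory")))
```

Keeping each frame on a single line matters because the stdio transport in the example feeds one message per line.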

HTTP — Axum REST

92 route registrations (78 unique paths) at /api/v1/ (+ /metrics)

ai-memory serve --host 127.0.0.1 --port 9077 &

curl -X POST http://127.0.0.1:9077/api/v1/memories \
  -H 'Content-Type: application/json' \
  -H 'X-Agent-Id: ai:my-app@host01' \
  -d '{"title":"hello","content":"first HTTP memory","tier":"mid"}'
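
The same request can be assembled from application code. This is a minimal sketch using only the Python standard library; build_store_request is a hypothetical helper name, and the function builds the request without sending it so it can be inspected or tested offline.

```python
import json
import urllib.request


def build_store_request(base_url, agent_id, title, content, tier="mid"):
    """Build (but do not send) the POST /api/v1/memories request from the curl example."""
    body = json.dumps({"title": title, "content": content, "tier": tier}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/memories",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Agent-Id": agent_id},
    )


if __name__ == "__main__":
    req = build_store_request(
        "http://127.0.0.1:9077", "ai:my-app@host01", "hello", "first HTTP memory"
    )
    print(req.get_method(), req.full_url)
    # To send it against a running daemon, uncomment:
    # with urllib.request.urlopen(req, timeout=5) as resp:
    #     print(resp.status, resp.read().decode("utf-8"))
```

On a daemon that requires store-path attestation (see the NHI section), an unsigned request like this one is expected to be rejected unless the operator has opted out.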

CLI — clap subcommands

87 subcommands in default build (89 with --features sal or --features sal-postgres), all support --json

ai-memory store --title hello --content "first CLI memory" --tier mid
ai-memory recall "hello"
ai-memory stats --json | jq .
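
Because every subcommand supports --json, a wrapper script can treat the CLI as a structured API. The sketch below assumes the ai-memory binary is on PATH and that a --json invocation prints a single JSON document on stdout; run_json is a hypothetical helper, and no particular output schema is assumed.

```python
import json
import subprocess
import sys


def run_json(argv, timeout=30):
    """Run a command that prints one JSON document on stdout and return it parsed."""
    proc = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    try:
        stats = run_json(["ai-memory", "stats", "--json"])
        print(json.dumps(stats, indent=2))
    except (OSError, subprocess.SubprocessError, ValueError) as exc:
        # Binary missing, non-zero exit, timeout, or output that is not JSON.
        print(f"ai-memory stats failed: {exc}", file=sys.stderr)
```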

Full surface: USER_GUIDE.md (MCP tools), API_REFERENCE.md (HTTP), CLI_REFERENCE.md (CLI). The canonical tool count is asserted by Profile::full().expected_tool_count() in src/profile.rs; see issue #862 for the 99+1 disambiguation.

LLM provider configuration matrix (#1067)

Provider-agnostic LLM substrate.

Every tier that calls an LLM (smart, autonomous, query expansion, auto-tag, contradiction detection, consolidate, reflect, atomise) routes through a single LlmProvider abstraction. The wire shape is selected via AI_MEMORY_LLM_BACKEND, which defaults to ollama for backward compatibility with v0.6.x.

Recognized backend aliases

# AI_MEMORY_LLM_BACKEND= one of:
ollama              # native /api/chat + /api/embed, no auth (default)
openai-compatible   # generic; requires AI_MEMORY_LLM_BASE_URL

# OR a vendor alias (pre-fills base URL + accepts vendor-canonical key env var):
openai      # https://api.openai.com/v1                              · OPENAI_API_KEY
xai         # https://api.x.ai/v1                                    · XAI_API_KEY
anthropic   # https://api.anthropic.com/v1                           · ANTHROPIC_API_KEY
gemini      # https://generativelanguage.googleapis.com/v1beta/openai · GEMINI_API_KEY | GOOGLE_API_KEY
deepseek    # https://api.deepseek.com/v1                            · DEEPSEEK_API_KEY
kimi        # https://api.moonshot.cn/v1                             · MOONSHOT_API_KEY | KIMI_API_KEY
qwen        # https://dashscope.aliyuncs.com/compatible-mode/v1      · DASHSCOPE_API_KEY | QWEN_API_KEY
mistral     # https://api.mistral.ai/v1                              · MISTRAL_API_KEY
groq        # https://api.groq.com/openai/v1                         · GROQ_API_KEY
together    # https://api.together.xyz/v1                            · TOGETHER_API_KEY
cerebras    # https://api.cerebras.ai/v1                             · CEREBRAS_API_KEY
openrouter  # https://openrouter.ai/api/v1                           · OPENROUTER_API_KEY
fireworks   # https://api.fireworks.ai/inference/v1                  · FIREWORKS_API_KEY
lmstudio    # http://localhost:1234/v1                               · (none / local)

Override knobs (universal precedence: CLI flag > env > config > default)

AI_MEMORY_LLM_BACKEND — selector (see list above)
AI_MEMORY_LLM_BASE_URL — overrides the alias default URL (use for self-hosted vLLM, llama.cpp-server, etc.)
AI_MEMORY_LLM_API_KEY — Bearer key; vendor-canonical env vars (above) are also accepted
AI_MEMORY_LLM_MODEL — model name passed through verbatim (grok-4, gpt-4o, claude-3-7-sonnet, deepseek-chat, qwen-max, …)
Legacy OLLAMA_BASE_URL still honored when AI_MEMORY_LLM_BACKEND=ollama
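
The precedence rule (CLI flag > env > config > default) can be pictured as a first-non-empty lookup. The sketch below is an illustration, not the logic in src/llm.rs: the cli and config dictionary keys are hypothetical, only a subset of the alias table is reproduced, and the legacy OLLAMA_BASE_URL fallback is left out.

```python
import os

# Subset of the alias table above; default base URLs copied from this page.
ALIAS_BASE_URLS = {
    "ollama": "http://localhost:11434",
    "openai": "https://api.openai.com/v1",
    "xai": "https://api.x.ai/v1",
    "lmstudio": "http://localhost:1234/v1",
}


def first_set(*candidates):
    """Return the first candidate that is neither None nor empty."""
    for value in candidates:
        if value:
            return value
    return None


def resolve_llm(cli=None, env=None, config=None):
    """Apply CLI flag > env > config > default to the backend and base URL."""
    cli = cli or {}        # hypothetical parsed CLI flags
    config = config or {}  # hypothetical parsed config file
    env = os.environ if env is None else env
    backend = first_set(
        cli.get("backend"), env.get("AI_MEMORY_LLM_BACKEND"),
        config.get("backend"), "ollama",
    )
    base_url = first_set(
        cli.get("base_url"), env.get("AI_MEMORY_LLM_BASE_URL"),
        config.get("base_url"), ALIAS_BASE_URLS.get(backend),
    )
    if base_url is None:
        # e.g. openai-compatible, which has no pre-filled URL
        raise ValueError(f"no default base URL for {backend!r}; set AI_MEMORY_LLM_BASE_URL")
    return {"backend": backend, "base_url": base_url}


if __name__ == "__main__":
    print(resolve_llm(env={"AI_MEMORY_LLM_BACKEND": "xai"}))
```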

Examples

# 1. Default — local Ollama with gemma3:4b
ai-memory serve   # no env needed; uses http://localhost:11434

# 2. Cloud — xAI Grok via vendor alias
export AI_MEMORY_LLM_BACKEND=xai
export XAI_API_KEY=xai-...
export AI_MEMORY_LLM_MODEL=grok-4
ai-memory serve

# 3. Self-hosted vLLM (private DC, posture 7)
export AI_MEMORY_LLM_BACKEND=openai-compatible
export AI_MEMORY_LLM_BASE_URL=https://vllm.internal.corp/v1
export AI_MEMORY_LLM_API_KEY=$(cat /run/secrets/vllm-token)
export AI_MEMORY_LLM_MODEL=meta-llama/Llama-3.1-70B-Instruct
ai-memory serve

Wire-shape canonical source: src/llm.rs::LlmProvider (the Ollama / OpenAiCompatible { api_key } enum). See the release notes for the cross-vendor compat table.

NHI agent_id semantics

Every memory carries a claimed identity.

metadata.agent_id is the AI Non-Human Identity marker. On an unsigned write the id is claimed, not attested — don't make security decisions on it alone. As of v0.7.0 (#626 Layer-3) a caller holding the agent's keypair can present a detached Ed25519 signature on the CLI (store --sign), MCP (memory_store), or HTTP (POST /api/v1/memories) store path; the daemon verifies it against the agent's bound public key and stamps metadata.attest_level = "agent_attested". As of v0.9.0 (#1751) store-path attestation is required by default: an unsigned direct write is rejected (403 ATTESTATION_FAILED), and operators opt out with AI_MEMORY_REQUIRE_AGENT_ATTESTATION=0 to restore the permissive claimed posture.

Resolution ladder (CLI + MCP)

1. Explicit caller value (--agent-id, MCP agent_id param, or metadata.agent_id in store request)
2. AI_MEMORY_AGENT_ID env var
3. (MCP only) initialize.clientInfo.name → ai:<client>@<hostname>:pid-<pid>
4. host:<hostname>:pid-<pid>-<uuid8> (stable per-process)
5. anonymous:pid-<pid>-<uuid8> (no hostname)

The HTTP daemon is multi-tenant and skips the process-level default: it reads agent_id from the request body, then X-Agent-Id, then assigns anonymous:req-<uuid8> and logs WARN.
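
The ladder and its HTTP variant can be sketched as two small functions. This is an illustration of the documented order, not the daemon's code. Several details are assumptions: uuid8 is taken to mean the first eight hex characters of a random UUID, the MCP rung is skipped when no hostname is available, and the HTTP body field is read as a top-level agent_id.

```python
import os
import socket
import uuid

# Generated once so the synthesized id stays stable for the life of the process.
_PROCESS_SUFFIX = uuid.uuid4().hex[:8]


def resolve_agent_id(explicit=None, env=None, mcp_client_name=None,
                     hostname=None, pid=None):
    """CLI + MCP ladder: explicit > env > MCP clientInfo > host default > anonymous."""
    env = os.environ if env is None else env
    pid = os.getpid() if pid is None else pid
    if hostname is None:
        hostname = socket.gethostname()  # pass "" to simulate "no hostname"
    if explicit:
        return explicit
    if env.get("AI_MEMORY_AGENT_ID"):
        return env["AI_MEMORY_AGENT_ID"]
    if mcp_client_name and hostname:
        return f"ai:{mcp_client_name}@{hostname}:pid-{pid}"
    if hostname:
        return f"host:{hostname}:pid-{pid}-{_PROCESS_SUFFIX}"
    return f"anonymous:pid-{pid}-{_PROCESS_SUFFIX}"


def resolve_http_agent_id(body, headers):
    """HTTP variant: request body > X-Agent-Id header > per-request anonymous id."""
    if body.get("agent_id"):
        return body["agent_id"]
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("x-agent-id"):
        return lowered["x-agent-id"]
    # The daemon also logs a WARN at this point; logging is omitted here.
    return f"anonymous:req-{uuid.uuid4().hex[:8]}"


if __name__ == "__main__":
    print(resolve_agent_id(env={}, mcp_client_name="demo"))
    print(resolve_http_agent_id({}, {"X-Agent-Id": "ai:my-app@host01"}))
```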

Read-path visibility caller (#1468 / #1469)

The ladder above resolves the write identity. The MCP read tools that enforce per-row scope=private ownership — memory_session_start, memory_list, memory_search, memory_recall — resolve their visibility caller from AI_MEMORY_AGENT_ID only (else None, trust-all). The pid-synthesized clientInfo id is deliberately not used: it embeds the live PID and could never match the metadata.agent_id an earlier process wrote, which would hide every prior-session private row from its owner. Set AI_MEMORY_AGENT_ID to drop cross-agent private rows from read results; leave it unset for single-tenant trust-all reads.
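
A rough model of that read-path rule, assuming each row is a dictionary with a scope field and a metadata.agent_id owner (the row shape and the function names are illustrative, not the storage schema):

```python
import os


def visibility_caller(env=None):
    """Read-path caller: AI_MEMORY_AGENT_ID if set, else None (trust-all)."""
    env = os.environ if env is None else env
    return env.get("AI_MEMORY_AGENT_ID") or None


def visible_rows(rows, caller):
    """Drop other agents' private rows; caller=None returns everything."""
    if caller is None:
        return list(rows)
    return [
        row for row in rows
        if row.get("scope") != "private"
        or row.get("metadata", {}).get("agent_id") == caller
    ]


if __name__ == "__main__":
    rows = [
        {"title": "mine", "scope": "private", "metadata": {"agent_id": "ai:a"}},
        {"title": "theirs", "scope": "private", "metadata": {"agent_id": "ai:b"}},
        {"title": "shared", "scope": "team", "metadata": {"agent_id": "ai:b"}},
    ]
    caller = visibility_caller({"AI_MEMORY_AGENT_ID": "ai:a"})
    print([row["title"] for row in visible_rows(rows, caller)])
```

The example makes the trade-off concrete: a caller id that changes on every process start would never equal the stored owner, so every private row would be filtered out.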

Validation regex: ^[A-Za-z0-9_\-:@./]{1,128}$. Permits prefixed forms (ai:, host:), @ scope separator, / for future SPIFFE-style ids. Rejects whitespace, null bytes, control chars, shell metacharacters.
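
Clients can pre-check ids before sending them. The sketch below applies the same character class and length bound in Python; is_valid_agent_id is a hypothetical helper name. It uses fullmatch rather than ^...$ because Python's $ also matches just before a trailing newline, which would let "id\n" through.

```python
import re

# Same character class and length bound as the documented regex.
AGENT_ID_RE = re.compile(r"[A-Za-z0-9_\-:@./]{1,128}")


def is_valid_agent_id(value):
    """Return True if value consists of 1 to 128 permitted characters."""
    return AGENT_ID_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["ai:my-app@host01", "bad id", "x;rm -rf", "a" * 129]:
        print(repr(candidate[:20]), is_valid_agent_id(candidate))
```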

Full reference (immutability invariants, special metadata keys, defaults-that-leak warnings): agent-identity.html.

Hooks and subscriptions

27-event hook pipeline + HMAC subscriptions.

The hook pipeline fires on 27 named substrate events (snake_case wire names from the HookEvent enum in src/hooks/events.rs): pre_store, post_store, pre_recall, post_recall, pre_link, post_reflect, etc. Hooks run synchronously (the default) or as an async fan-out (subscriptions).

Register a webhook subscription

ai-memory subscribe \
  --url https://my-app.example.com/ai-memory-events \
  --events post_store \
  --secret "$(cat /etc/my-app/hmac.key)"

Events are POSTed as JSON with an X-AI-Memory-Signature HMAC header. Unsigned subscribers are refused at registration. SSRF gate: loopback URLs require AI_MEMORY_ALLOW_LOOPBACK_WEBHOOKS=1.
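
On the receiving side, the subscriber should recompute the signature and compare it in constant time. This page does not state the digest or encoding, so the sketch below assumes HMAC-SHA256 over the raw request body, hex-encoded in X-AI-Memory-Signature; check hook-pipeline.md for the actual scheme before relying on it. The function names are illustrative.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, see lead-in)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, header_value) -> bool:
    """Constant-time check of the received X-AI-Memory-Signature value."""
    if not header_value:
        return False
    expected = sign(secret, body).encode("ascii")
    received = header_value.strip().lower().encode("utf-8")
    return hmac.compare_digest(expected, received)


if __name__ == "__main__":
    secret = b"demo-secret"
    body = b'{"event":"post_store"}'
    signature = sign(secret, body)
    print(verify(secret, body, signature))          # signature over the same bytes
    print(verify(secret, body + b" ", signature))   # body changed after signing
```

Verify against the exact bytes received, before any JSON parsing or re-serialisation, since re-encoding can change whitespace and key order.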

Full hook taxonomy + payload schemas: hook-pipeline.md.

Extend the substrate

New tools, new families, new hooks.

New MCP tool

Create src/mcp/tools/<name>.rs with a <Name>Request struct (#[derive(JsonSchema, Deserialize)]) and a zero-sized <Name>Tool implementing McpTool (#972 D1.x pattern)
Register the tool in registered_tools() in src/mcp/registry.rs (D1.6 #987 iterator collapses the historical tool_definitions() macro)
Add the handler under src/mcp/tools/<name>.rs (or an existing module): fn(&Connection, params) -> Result<Value>
Add a parity test in the same file's d1_x_NNN_tests mod via crate::mcp::parity_test_helpers::*

New CLI subcommand

Add variant to the top-level Command enum in src/daemon_runtime.rs (the v0.7.0 W6 refactor moved it out of src/main.rs)
Define an Args struct
Add dispatch case in main()
Implement cmd_* handler taking &Path + args

New HTTP endpoint

Add route in the Axum router (src/lib.rs; route-path SSOT in src/handlers/routes.rs)
Implement handler in src/handlers/ using the Db extractor

For deeper changes (new memory_kind, new family, new tier), the writeup is in DEVELOPER_GUIDE.md and the family/kind distinction is clarified in agent-skills.md.

Test results

What the NHI playbook verified.

The Track A NHI test playbook exercises the MCP tool surface, HTTP API parity, CLI dispatch, knowledge-graph traversal, capabilities v3 discovery, token budget, hooks, and chaos — the exact developer-facing surfaces you'd build against. Latest release-gate campaign (2026-05-22): 7,321 PASS / 0 FAIL across 269 test binaries, 22 issues (#1120-#1141) fixed in-campaign, no deferrals to v0.8.0.

Latest campaign index: 2026-05-22 · SHIP-RECOMMENDED · tip fd172f2cf
Engineering writeup: Track A build + install verification
For SME engineers: Reproducibility, methodology, schema ladder v15→v49, per-issue root cause, future-bug prevention
For decision-makers: Verdict, risk, cost, roadmap

Where to go from here.

Developer guide: Full reference
MCP tool reference: 101 entries at --profile full (100 callable)
HTTP API: 92 route registrations (78 unique paths)
CLI reference: 87 subcommands
Agent identity: NHI semantics
Hook pipeline: 27 events
Knowledge graph: Typed links + temporal validity
Recursive learning: reflect / atomise / consolidate, depth-capped
Integrations: Claude Code, Cursor, etc.
Engineering discipline: The #1558 literal burn-down: 497→28 baseline entries, CI ratchet, committed census