Persistent Memory for Any AI

Zero token cost until recalled. Built-in memory systems load your entire memory into every message. ai-memory uses zero context tokens until the AI calls memory_recall — only relevant memories come back, ranked and compressed via TOON format (79% smaller than JSON).

Works with Claude · ChatGPT · Grok · Cursor · Windsurf · Continue.dev · OpenClaw · Llama · any MCP client

97.8% Recall@5 · 17 MCP Tools · 4 Feature Tiers · 161 Tests
Get Started in 60 Seconds

LongMemEval Benchmark (ICLR 2025) — 500 questions, 6 categories

97.8% R@5 (489/500)
99.0% R@10 (495/500)
99.8% R@20 (499/500)
2.2s 232 q/s (keyword)
$0 Cloud API costs

Pure SQLite FTS5 + BM25 — zero cloud dependencies — full benchmark details & replication steps

Works With Any AI Platform

MCP is the universal integration layer. The HTTP API works with literally anything that can make a request. No vendor lock-in.


Claude Code

Anthropic's Claude Code, Claude Desktop, and any Claude-based tool

MCP Native

OpenAI Codex CLI

OpenAI's Codex command-line agent with TOML-based MCP config

MCP Native

Google Gemini CLI

Google's Gemini CLI with JSON-based MCP server configuration

MCP Native

Cursor IDE

AI-powered code editor with built-in MCP support

MCP Native

Windsurf

Codeium's AI IDE with MCP tool integration

MCP Native

Continue.dev

Open-source AI code assistant with YAML-based MCP config

MCP Native

xAI Grok

Grok and any xAI-based applications via remote MCP

Remote MCP (HTTPS)

META Llama

Llama Stack toolgroup registration via HTTP server

HTTP / MCP

OpenClaw

Self-hosted AI assistant with MCP via mcp.servers config

MCP Native

Any MCP Client

Any tool that speaks the Model Context Protocol -- present or future

Universal

MCP = native tool integration (stdio JSON-RPC)  |  HTTP = REST API on localhost:9077 (works with anything)  |  CLI = shell commands (scriptable, pipeable)

Install

One command. No dependencies for pre-built binaries. Three installation methods.

Recommended

macOS / Linux

Pre-built binary. Auto-detects OS & architecture.

curl -fsSL https://raw.githubusercontent.com/alphaonedev/ai-memory-mcp/main/install.sh | sh

Windows

PowerShell installer. Adds to PATH automatically.

irm https://raw.githubusercontent.com/alphaonedev/ai-memory-mcp/main/install.ps1 | iex

Cargo (crates.io)

From source. Needs Rust + C compiler.

cargo install ai-memory

Docker

Containerized HTTP server on port 9077.

docker build -t ai-memory .
docker run -p 9077:9077 -v data:/data ai-memory

cargo-binstall

Pre-built binary via cargo. No compile step.

cargo binstall ai-memory

Supported platforms: macOS (Intel + Apple Silicon)  •  Linux (x86_64 + ARM64)  •  Windows (x86_64)  •  WSL  •  Docker

Build from source? Ubuntu/Debian: sudo apt install build-essential pkg-config  •  Fedora/RHEL: sudo dnf install gcc pkg-config  •  macOS: Xcode CLT (pre-installed)  •  Windows: MSVC C++ build tools

  1. Optional: Ollama for Smart & Autonomous tiers

    The keyword and semantic tiers work with zero dependencies. The smart and autonomous tiers add LLM-powered query expansion, auto-tagging, and neural reranking via Ollama.

  2. Install Ollama (Smart & Autonomous tiers)

    The smart and autonomous tiers use local LLMs via Ollama for query expansion, auto-tagging, contradiction detection, and cross-encoder reranking. Skip this step if you only need keyword or semantic search.

    macOS

    # Install via Homebrew
    brew install ollama
    
    # Or download the macOS app:
    # https://ollama.com/download/mac
    
    # Start the Ollama service
    ollama serve &
    # (or launch the Ollama.app -- it runs as a menu bar item)
    
    # Pull models for your tier
    ollama pull nomic-embed-text  # Embeddings (smart+)
    ollama pull gemma4:e2b        # LLM -- Smart (~1GB)
    ollama pull gemma4:e4b        # LLM -- Autonomous (~2.3GB)

    Linux

    # One-line install script
    curl -fsSL https://ollama.com/install.sh | sh
    
    # Enable and start the systemd service
    sudo systemctl enable ollama
    sudo systemctl start ollama
    
    # Pull models for your tier
    ollama pull nomic-embed-text  # Embeddings (smart+)
    ollama pull gemma4:e2b        # LLM -- Smart (~1GB)
    ollama pull gemma4:e4b        # LLM -- Autonomous (~2.3GB)

    Windows

    # Install via winget
    winget install Ollama.Ollama
    
    # Or download the installer:
    # https://ollama.com/download/windows
    
    # Ollama runs as a system service after install
    
    # Pull models for your tier
    ollama pull nomic-embed-text  # Embeddings (smart+)
    ollama pull gemma4:e2b        # LLM -- Smart (~1GB)
    ollama pull gemma4:e4b        # LLM -- Autonomous (~2.3GB)

    Verify Ollama

    # Check Ollama is running and models are available
    curl http://localhost:11434/api/tags
    ollama run gemma4:e2b "Hello, world"   # Should respond in ~1s

    ai-memory connects to Ollama at localhost:11434 automatically. Override with ollama_url in ~/.config/ai-memory/config.toml or --ollama-url flag. If Ollama is unavailable, ai-memory gracefully falls back to the semantic tier.
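If you want to script the same availability check, a minimal probe against the Ollama tags endpoint looks like this. The function name and timeout are illustrative; ai-memory's own startup probe is internal.

```python
from urllib.request import urlopen
from urllib.error import URLError

def ollama_available(url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server answers at `url` (illustrative probe).

    ai-memory performs an equivalent check at startup; when it fails,
    the smart and autonomous tiers fall back to semantic."""
    try:
        with urlopen(f"{url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False
```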

  3. Configure your AI platform

    Choose the integration method that fits your setup.

    Claude Code · Codex CLI · Gemini CLI · Cursor · Windsurf · Continue.dev · Grok · Llama · OpenClaw · Any MCP Client

    Claude Code MCP Configuration Scopes:

    Scope            | File                          | Applies to
    User (global)    | ~/.claude.json                | All projects on your machine
    Project (shared) | .mcp.json in project root     | Everyone on the project (via git)
    Local (private)  | ~/.claude.json under projects | One project, just you

    User scope (recommended) -- merge mcpServers into your existing ~/.claude.json (macOS/Linux) or %USERPROFILE%\.claude.json (Windows):

    {
      "mcpServers": {
        "memory": {
          "command": "ai-memory",
          "args": ["--db", "~/.claude/ai-memory.db", "mcp", "--tier", "semantic"]
        }
      }
    }

    Restart Claude Code. It will discover all 17 memory tools natively. No daemon, no ports. MCP servers do not go in settings.json or settings.local.json. The --tier flag is required -- options: keyword, semantic (default), smart, autonomous. Smart/autonomous require Ollama.

    Windows: Use ai-memory.exe for the command and forward slashes in paths: "C:/Users/YourName/.claude/ai-memory.db"

    OpenAI Codex CLI Configuration Scopes:

    Scope         | File                               | Applies to
    Global (user) | ~/.codex/config.toml               | All projects on your machine
    Project       | .codex/config.toml in project root | Trusted projects only

    Windows: %USERPROFILE%\.codex\config.toml. Override config dir with CODEX_HOME env var.

    # OpenAI Codex CLI MCP configuration
    [mcp_servers.memory]
    command = "ai-memory"
    args = ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
    enabled = true

    CLI shortcut: codex mcp add memory -- ai-memory --db ~/.local/share/ai-memory/memories.db mcp --tier semantic

    Codex uses TOML with underscored key mcp_servers (not camelCase). Supports env, env_vars, enabled_tools, disabled_tools, startup_timeout_sec, tool_timeout_sec. Use /mcp in the TUI to view server status. Windows/WSL: WSL uses Linux home by default -- set CODEX_HOME to share config with Windows host. See Codex MCP docs.

    Google Gemini CLI Configuration Scopes:

    Scope         | File                                  | Applies to
    User (global) | ~/.gemini/settings.json               | All projects on your machine
    Project       | .gemini/settings.json in project root | Scoped to that project

    Windows: %USERPROFILE%\.gemini\settings.json. Env vars: $VAR / ${VAR} (all platforms), %VAR% (Windows).

    {
      "mcpServers": {
        "memory": {
          "command": "ai-memory",
          "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"],
          "timeout": 30000
        }
      }
    }

    CLI shortcut: gemini mcp add memory ai-memory -- --db ~/.local/share/ai-memory/memories.db mcp --tier semantic

    Avoid underscores in server names (use hyphens). Tool names are auto-prefixed as mcp_memory_<toolName>. Env vars in env field support $VAR / ${VAR} (all platforms) and %VAR% (Windows). Gemini sanitizes sensitive patterns (*TOKEN*, *SECRET*) from inherited env unless declared. Add "trust": true to skip confirmation. CLI: gemini mcp list/remove/enable/disable. See Gemini CLI MCP docs.

    Cursor IDE Configuration Scopes:

    Scope         | File                             | Applies to
    Global (user) | ~/.cursor/mcp.json               | All projects on your machine
    Project       | .cursor/mcp.json in project root | Overrides global for same-named servers

    Windows: %USERPROFILE%\.cursor\mcp.json. Also configurable via Settings > Tools & MCP.

    {
      "mcpServers": {
        "memory": {
          "command": "ai-memory",
          "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
        }
      }
    }

    Or add via Cursor Settings > Tools & MCP. Restart Cursor after editing. Verify with green dot in Settings. Supports env, envFile, ${env:VAR_NAME} interpolation (can be unreliable for shell profile vars -- use envFile as workaround). ~40 tool limit across all servers. See Cursor MCP docs.

    Windsurf (Codeium) Configuration Scopes:

    Scope       | File                                | Applies to
    Global only | ~/.codeium/windsurf/mcp_config.json | All projects (no project scope)

    Windows: %USERPROFILE%\.codeium\windsurf\mcp_config.json. Also configurable via MCP Marketplace or Settings > Cascade > MCP Servers.

    {
      "mcpServers": {
        "memory": {
          "command": "ai-memory",
          "args": ["--db", "~/.codeium/windsurf/ai-memory.db", "mcp", "--tier", "semantic"]
        }
      }
    }

    Supports ${env:VAR_NAME} interpolation in command, args, env, serverUrl, url, and headers. 100 tool limit across all servers. Can also add via MCP Marketplace or Settings > Cascade > MCP Servers. See Windsurf MCP docs.

    Continue.dev Configuration Scopes:

    Scope         | File                                      | Applies to
    User (global) | ~/.continue/config.yaml                   | All projects on your machine
    Project       | .continue/mcpServers/ dir in project root | Per-server YAML/JSON files

    Windows: %USERPROFILE%\.continue\config.yaml. Project dir auto-detects JSON configs from other tools.

    # Continue.dev MCP configuration
    mcpServers:
      - name: memory
        command: ai-memory
        args:
          - "--db"
          - "~/.continue/ai-memory.db"
          - "mcp"
          - "--tier"
          - "semantic"

    MCP tools only work in agent mode. Supports ${{ secrets.SECRET_NAME }} for secret interpolation. Project-level .continue/mcpServers/ directory auto-detects JSON configs from other tools (Claude Code, Cursor, etc.). See Continue MCP docs.

    xAI Grok Configuration:

    Scope       | Method                           | Applies to
    Per-request | API tools array (no config file) | Each API call individually

    Remote HTTPS only (no stdio). Start ai-memory behind an HTTPS reverse proxy.

    # Step 1: Start the ai-memory HTTP server
    ai-memory serve --host 127.0.0.1 --port 9077 &
    # Expose via HTTPS reverse proxy (nginx, caddy, cloudflare tunnel, etc.)
    
    # Step 2: Add the MCP server to your Grok API call
    curl https://api.x.ai/v1/responses \
      -H "Authorization: Bearer $XAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "grok-3",
        "tools": [{
          "type": "mcp",
          "server_url": "https://your-server.example.com/mcp",
          "server_label": "memory",
          "server_description": "Persistent AI memory with recall and search"
        }],
        "input": "What do you remember about our project?"
      }'

    HTTPS required. server_label is required. Supports Streamable HTTP and SSE transports. Optional: allowed_tools, authorization, headers. Works with xAI SDK, OpenAI-compatible Responses API, and Voice Agent API. See xAI Remote MCP docs.

    META Llama Stack Configuration:

    Scope        | Method                                   | Applies to
    Declarative  | run.yaml -- tool_groups section          | Deployment-wide (supports ${env.VAR})
    Programmatic | Python/Node SDK -- toolgroups.register() | Runtime registration

    Llama Stack uses toolgroup registration with an HTTP backend.

    # Step 1: Start the ai-memory HTTP server
    ai-memory serve --host 127.0.0.1 --port 9077 &
    
    # Step 2: Register as a Llama Stack toolgroup
    # In your Llama Stack config, register the MCP endpoint:
    #   toolgroup: ai-memory
    #   provider: remote::mcp-endpoint
    #   url: http://127.0.0.1:9077
    
    # Or use the REST API directly in custom tool definitions:
    #   POST /api/v1/memories, GET /api/v1/recall, etc.

    META Llama uses Llama Stack for tool registration. Run ai-memory serve and register as a toolgroup via Python SDK or run.yaml (supports ${env.VAR_NAME} interpolation). Transport migrating from SSE to Streamable HTTP. See Llama Stack Tools docs.

    OpenClaw Configuration:

    Scope         | File                 | Applies to
    Single config | Platform config file | All projects (single config file)

    Important: OpenClaw uses mcp.servers (NOT mcpServers). The key structure is different from most other platforms.

    {
      "mcp": {
        "servers": {
          "memory": {
            "command": "ai-memory",
            "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
          }
        }
      }
    }

    CLI shortcut:

    openclaw mcp set memory '{"command":"ai-memory","args":["--db","~/.local/share/ai-memory/memories.db","mcp","--tier","semantic"]}'

    Management: openclaw mcp list · openclaw mcp show <name> · openclaw mcp unset <name>. See OpenClaw MCP docs.

    Generic MCP Client Configuration:

    Transport | Method          | Details
    stdio     | ai-memory mcp   | JSON-RPC 2.0, spawned by AI client
    HTTP      | ai-memory serve | REST API on localhost:9077

    Point your MCP client at the ai-memory binary with the mcp subcommand:

    {
      "mcpServers": {
        "memory": {
          "command": "ai-memory",
          "args": ["--db", "path/to/memory.db", "mcp", "--tier", "semantic"]
        }
      }
    }

    The MCP server exposes 17 tools over stdio using JSON-RPC. Any client that speaks MCP will discover them automatically. Adjust the --db path to your preferred location.
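Under the hood the stdio transport is newline-delimited JSON-RPC 2.0. The first two messages a client sends can be sketched as below; the message shapes follow the MCP spec, while the clientInfo values are placeholders.

```python
import json

def jsonrpc(method, params=None, id=1):
    """Build one JSON-RPC 2.0 request line for the MCP stdio transport."""
    msg = {"jsonrpc": "2.0", "id": id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

# 1. Open the session (protocolVersion and clientInfo are placeholder values)
init = jsonrpc("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1"},
}, id=1)

# 2. Discover the tools the server exposes (ai-memory returns up to 17)
list_tools = jsonrpc("tools/list", id=2)
```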

  4. Verify it works

    Check that your AI has access to memory tools.

    # MCP: Ask your AI "What memory tools do you have?"
    # HTTP: curl http://127.0.0.1:9077/api/v1/health
    # CLI:  ai-memory stats

What It Does

Every capability at a glance. 4 feature tiers (keyword to autonomous), 17 MCP tools, three interfaces, one shared database. Works with any AI that supports MCP or HTTP.


Zero Token Cost

Built-in memory systems (Claude auto-memory, ChatGPT memory) load your entire memory into every conversation -- burning tokens and money on every message. ai-memory uses zero context tokens until recalled. Only relevant memories come back, ranked by score. Replace auto-memory and stop paying for 200+ lines of idle context.


Store and Recall

Save memories with a title, content, tier, tags, and priority. Recall them later with fuzzy search that ranks results by 6 factors including recency decay.


Three-Tier Memory

Short (6h), mid (7d), and long (permanent). Memories auto-promote to long-term after 5 accesses. TTL extends on every recall.


Full-Text + Semantic Search

SQLite FTS5 for keyword search plus vector embeddings for semantic similarity. Hybrid recall blends both FTS5 and cosine similarity for best-of-both-worlds relevance.


4 Feature Tiers

Scale from zero-dependency keyword search to full autonomous memory management. Each tier adds capabilities: keyword, semantic, smart, and autonomous.


Memory Links

Connect memories with typed relations: related_to, supersedes, contradicts, derived_from. Resolve contradictions with a single command.


LLM-Powered Features

Smart and autonomous tiers use Ollama (Gemma 4) for query expansion, auto-tagging, auto-consolidation, cross-encoder reranking, and contradiction analysis.


TOON Format

Token-Oriented Object Notation eliminates repeated field names in recall responses. Pass format: "toon" for 61% fewer bytes or "toon_compact" for 79% fewer. Field names declared once as a header, values as pipe-delimited rows. LLMs parse it natively.
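As an illustration of the idea (not the exact TOON grammar ai-memory emits), a header-once, pipe-delimited encoding can be sketched as:

```python
def to_toon(records):
    """Illustrative TOON-style encoder: field names once in a header row,
    then one pipe-delimited row of values per record."""
    if not records:
        return ""
    fields = list(records[0])
    header = "|".join(fields)
    rows = ["|".join(str(r[f]) for f in fields) for r in records]
    return "\n".join([header, *rows])

memories = [
    {"id": 1, "title": "Project uses Rust", "score": 8.4},
    {"id": 2, "title": "DB is SQLite", "score": 5.7},
]
toon = to_toon(memories)
# Header appears once; JSON would repeat every field name in every record.
```

The byte savings grow with the number of records, since field names are paid for exactly once.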


MCP Prompts

Two MCP prompts teach AI clients to use memory proactively. recall-first: 9 behavioral rules (recall at start, store corrections, TOON format, tier strategy, dedup). memory-workflow: quick reference card for all tool patterns. AI clients receive these at connection time via prompts/list.

Feature Tiers (4 Levels)

Each tier builds on the one below it. Choose based on your resources and needs. Set via ai-memory mcp --tier <name> or in ~/.config/ai-memory/config.toml.

Tier       | RAM     | Embedding Model                              | LLM                  | Dependencies                      | Key Features
keyword    | 0 MB    | None                                         | None                 | None                              | FTS5 full-text search, 13 MCP tools
semantic   | ~256 MB | all-MiniLM-L6-v2 (384-dim, local via Candle) | None                 | None (model auto-downloads ~90MB) | + Hybrid recall (FTS5 + cosine similarity), HNSW vector index, 14 MCP tools
smart      | ~1 GB   | nomic-embed-text-v1.5 (768-dim, via Ollama)  | Gemma 4 E2B (~1GB)   | Ollama                            | + LLM query expansion, auto-tagging, auto-consolidation, 17 MCP tools
autonomous | ~4 GB   | nomic-embed-text-v1.5 (768-dim, via Ollama)  | Gemma 4 E4B (~2.3GB) | Ollama                            | + Neural cross-encoder reranking (ms-marco-MiniLM), contradiction analysis, 17 MCP tools

Keyword Tier

Pure SQLite FTS5 full-text search. Zero ML dependencies, zero memory overhead. The binary is entirely self-contained. Ideal for low-resource environments, CI runners, or when you just need fast text matching.

Semantic Tier (default)

Adds dense vector embeddings via all-MiniLM-L6-v2 (384-dim), loaded locally through the Candle ML framework. Recall blends FTS5 keyword scores with cosine similarity using adaptive content-length weighting (50/50 for short memories, 85/15 FTS-weighted for long content). HNSW index for fast approximate nearest-neighbor search. The model auto-downloads from HuggingFace on first run (~90MB).
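The adaptive weighting can be sketched as follows. The 50/50 and 85/15 endpoints come from the description above; the linear interpolation between them and the length thresholds are illustrative assumptions, not the exact implementation:

```python
def hybrid_score(fts_score, cosine_sim, content_len,
                 short_len=200, long_len=2000):
    """Blend FTS5 and cosine scores with content-length-adaptive weights.

    The endpoints (50/50 for short content, 85/15 FTS-weighted for long)
    are from the docs; the linear ramp and thresholds are assumptions."""
    if content_len <= short_len:
        w_fts = 0.5
    elif content_len >= long_len:
        w_fts = 0.85
    else:
        t = (content_len - short_len) / (long_len - short_len)
        w_fts = 0.5 + t * 0.35
    return w_fts * fts_score + (1 - w_fts) * cosine_sim
```

The intuition: keyword match is a stronger signal for long documents, while short memories lean more on embedding similarity.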

Smart Tier

Upgrades to nomic-embed-text-v1.5 (768-dim) via Ollama for higher-quality embeddings. Adds an on-device LLM (Gemma 4 Effective 2B) that powers three new tools: memory_expand_query (semantic query broadening), memory_auto_tag (content-aware tagging), and memory_detect_contradiction (conflict detection). Requires Ollama running locally.

Autonomous Tier

Upgrades the LLM to Gemma 4 Effective 4B for more nuanced reasoning. Adds a neural cross-encoder reranker (ms-marco-MiniLM-L-6-v2) that re-scores (query, document) pairs after hybrid retrieval for significantly better recall precision. Full autonomous memory reflection and contradiction resolution. Requires Ollama.

Capability Matrix

Every capability mapped to its minimum tier. Each tier includes all capabilities from the tiers below it.

Capability                                            | keyword | semantic | smart | autonomous
Search & Recall
FTS5 keyword search (memory_search)                   | Yes | Yes | Yes | Yes
Semantic embedding (cosine similarity)                | --  | Yes | Yes | Yes
Hybrid recall (FTS5 + cosine, adaptive blend)         | --  | Yes | Yes | Yes
HNSW approximate nearest-neighbor index               | --  | Yes | Yes | Yes
LLM query expansion (memory_expand_query)             | --  | --  | Yes | Yes
Neural cross-encoder reranking (ms-marco-MiniLM)      | --  | --  | --  | Yes
Memory Management
Store, update, delete, promote                        | Yes | Yes | Yes | Yes
Link memories (4 relation types)                      | Yes | Yes | Yes | Yes
Bulk forget by pattern/namespace/tier                 | Yes | Yes | Yes | Yes
Manual consolidation (user-provided summary)          | Yes | Yes | Yes | Yes
Auto-consolidation (LLM-generated summary)            | --  | --  | Yes | Yes
Auto-tagging (memory_auto_tag)                        | --  | --  | Yes | Yes
Contradiction detection (memory_detect_contradiction) | --  | --  | Yes | Yes
Autonomous memory reflection                          | --  | --  | --  | Yes
Embedding Model
Model                                                 | None | all-MiniLM-L6-v2 | nomic-embed-text-v1.5 | nomic-embed-text-v1.5
Dimensions                                            | --   | 384              | 768                   | 768
Runtime                                               | --   | Candle (local)   | Ollama                | Ollama
Model size                                            | --   | ~90 MB           | ~274 MB               | ~274 MB
LLM (Language Model)
Model                                                 | --   | --   | Gemma 4 Effective 2B | Gemma 4 Effective 4B
Ollama tag                                            | --   | --   | gemma4:e2b           | gemma4:e4b
Model size                                            | --   | --   | ~7.2 GB              | ~9.6 GB
Resources
Total RAM                                             | 0 MB | ~256 MB | ~1 GB  | ~4 GB
External dependencies                                 | None | None    | Ollama | Ollama
MCP tools exposed                                     | 13   | 14      | 17     | 17
Ollama models to pull                                 | --   | --      | nomic-embed-text + gemma4:e2b | nomic-embed-text + gemma4:e4b

Tiers gate features, not models. The --tier flag controls which tools are exposed. The LLM model is independently configurable via llm_model in config.toml. For example, run autonomous tier (all features) with the faster e2b model: llm_model = "gemma4:e2b" (46 tok/s vs 26 tok/s for e4b). If Ollama is unavailable at startup, smart and autonomous tiers fall back to semantic automatically.

Configuration File

# ~/.config/ai-memory/config.toml
# Created automatically on first run with defaults commented out

tier = "autonomous"                   # keyword | semantic | smart | autonomous
db = "~/.claude/ai-memory.db"         # SQLite database path
ollama_url = "http://localhost:11434" # Ollama API endpoint
llm_model = "gemma4:e2b"             # independently configurable (e2b=46tok/s, e4b=26tok/s)
cross_encoder = true                 # Neural reranking (autonomous tier)
default_namespace = "global"         # Default namespace for new memories

17 MCP Tools (Universal Integration)

ai-memory runs as a Model Context Protocol (MCP) tool server over stdio. Any MCP-compatible AI client -- Claude, ChatGPT, Grok, Llama, or custom agents -- discovers these tools automatically.

[Architecture diagram: AI clients (Claude, ChatGPT, Grok, Llama) connect over stdio or HTTP to the MCP server (JSON-RPC, up to 17 tools, --tier keyword|semantic|smart|autonomous), backed by rusqlite (SQLite + FTS5, WAL mode, HNSW index). Smart+ tiers call Ollama locally (nomic-embed-text, Gemma 4 E2B/E4B) for query expansion, auto-tagging, cross-encoder reranking, and contradiction detection. Example launch: ai-memory --db path/to/memory.db mcp --tier smart]
memory_store

Store a new memory. Deduplicates by title+namespace. Detects contradictions with existing memories.

memory_recall

Fuzzy OR search with 6-factor ranking. Auto-touches recalled memories (extends TTL, may promote).

memory_search

Exact keyword AND search. Returns memories matching all terms.

memory_list

Browse memories with filters: namespace, tier, tags, date range.

memory_get

Retrieve a single memory by ID, including all its links.

memory_update

Update an existing memory: change title, content, tier, priority, or tags.

memory_delete

Delete a specific memory by ID. Links cascade automatically.

memory_promote

Promote a memory to long-term permanent storage. Clears expiry.

memory_forget

Bulk delete by pattern, namespace, or tier.

memory_link

Link two memories: related_to, supersedes, contradicts, or derived_from.

memory_get_links

Get all links for a memory by ID.

memory_consolidate

Merge multiple memories into one long-term summary.

memory_stats

Database statistics: counts by tier, namespaces, link count, DB size.

memory_capabilities

Returns available capabilities for the current feature tier. Lets the AI discover what tools and features are active.

memory_expand_query

LLM-powered query expansion. Broadens a recall query with synonyms and related terms for better recall coverage. (smart+ tiers)

memory_auto_tag

LLM-powered auto-tagging. Analyzes memory content and suggests relevant tags automatically. (smart+ tiers)

memory_detect_contradiction

LLM-powered contradiction analysis. Compares a memory against existing memories to detect conflicts and inconsistencies. (smart+ tiers)

20 HTTP API Endpoints (Universal Fallback)

Start with ai-memory serve (default: http://127.0.0.1:9077). The HTTP API works with any AI platform, any programming language, any framework. If it can make an HTTP request, it can use ai-memory.

Method | Endpoint               | Description
GET    | /health                | Deep health check (DB + FTS5 integrity)
GET    | /memories              | List memories (filter: namespace, tier, priority, date range, tags)
POST   | /memories              | Create memory (dedup on title+namespace, contradiction detection)
POST   | /memories/bulk         | Bulk create (up to 1000 items per request)
GET    | /memories/{id}         | Get memory by ID (includes links)
PUT    | /memories/{id}         | Update memory (partial update, validated)
DELETE | /memories/{id}         | Delete memory (links cascade)
POST   | /memories/{id}/promote | Promote memory to long-term (clears expiry)
GET    | /search                | FTS5 AND search with 6-factor ranking
GET    | /recall                | Fuzzy OR recall + touch + auto-promote
POST   | /recall                | Recall via POST body (for longer queries)
POST   | /forget                | Bulk delete by pattern/namespace/tier
POST   | /consolidate           | Merge 2-100 memories into one long-term summary
POST   | /links                 | Create memory link (4 relation types)
GET    | /links/{id}            | Get all links for a memory
GET    | /namespaces            | List namespaces with counts
GET    | /stats                 | Aggregate statistics
POST   | /gc                    | Run garbage collection on expired memories
GET    | /export                | Export all memories + links as JSON
POST   | /import                | Import memories + links from JSON

Integration Examples

# Python (works with any AI backend: OpenAI, Anthropic, local Llama, etc.)
import requests

def ai_store_memory(title, content, tier="mid"):
    requests.post("http://127.0.0.1:9077/api/v1/memories", json={
        "title": title, "content": content, "tier": tier
    })

def ai_recall(context):
    r = requests.get("http://127.0.0.1:9077/api/v1/recall", params={"context": context})
    return r.json()

# Use in your AI's tool/function definitions
# Works with OpenAI function calling, Anthropic tool use, etc.

25 CLI Commands (Universal)

Global flags: --db <path> and --json. Scriptable, pipeable, works in any shell. Use directly or wrap in your AI's tool layer.

Category | Command          | Description
Server   | mcp              | Run as MCP tool server over stdio (primary integration for MCP clients)
Server   | serve            | Start HTTP daemon (--host, --port, default 9077) -- universal API for any AI
Core     | store            | Store memory (-T title, -c content, --tier, --namespace, --tags, --priority, --confidence, --source)
Core     | update           | Update memory by ID (partial fields)
Core     | delete           | Delete memory by ID (links cascade)
Core     | promote          | Promote to long-term (clears expiry)
Query    | recall           | Fuzzy OR recall with 6-factor ranking (--namespace, --limit, --tags, --since)
Query    | search           | AND keyword search (--namespace, --tier, --limit, --since, --until, --tags)
Query    | get              | Get memory by ID (includes links)
Query    | list             | List with filters (--namespace, --tier, --limit, --since, --until, --tags)
Manage   | forget           | Bulk delete (--namespace, --pattern, --tier)
Manage   | link             | Link two memories (--relation: related_to, supersedes, contradicts, derived_from)
Manage   | consolidate      | Merge N memories into one (-T title, -s summary, --namespace)
Manage   | resolve          | Resolve contradiction: winner supersedes loser (demotes loser: priority=1, confidence=0.1)
Manage   | auto-consolidate | Auto-group by namespace+tag and consolidate (--dry-run, --short-only, --min-count, --namespace)
Ops      | gc               | Run garbage collection on expired memories
Ops      | stats            | Show statistics (counts, tiers, namespaces, links, DB size)
Ops      | namespaces       | List all namespaces with memory counts
Ops      | sync             | Sync databases (--direction pull|push|merge, dedup-safe upsert)
Ops      | shell            | Interactive REPL with color output (recall, search, list, get, stats, namespaces, delete)
I/O      | export           | Export all memories + links as JSON (stdout)
I/O      | import           | Import memories + links from JSON (stdin)
I/O      | completions      | Generate shell completions (bash, zsh, fish)
I/O      | man              | Generate roff man page to stdout
I/O      | mine             | Import memories from historical conversations (Claude, ChatGPT, Slack)

Three-Tier Memory

Memories are organized into three tiers that mirror human memory systems. Each tier has automatic TTL management, and memories flow upward through access patterns.

Short-Term

6h

Ephemeral context. Current task state, debugging notes, transient observations.

Extends +1h on each recall. Good for "what am I working on right now" context.

Mid-Term

7d

Working knowledge. Sprint goals, recent decisions, active project context.

Extends +1d on recall. Auto-promotes to long-term at 5 accesses.

Long-Term

Permanent. Architecture, user preferences, hard-won lessons, corrections.

Never expires. Highest tier boost (3.0) in recall ranking. The knowledge bedrock.
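The TTL and promotion rules above can be sketched as code. The constants come from the descriptions in this section; the function and field names are illustrative:

```python
from datetime import datetime, timedelta

# Initial TTL per tier (expiry = creation time + TTL; long never expires)
TTL = {"short": timedelta(hours=6), "mid": timedelta(days=7), "long": None}
# TTL extension applied on each recall
EXTEND = {"short": timedelta(hours=1), "mid": timedelta(days=1)}
PROMOTE_AT = 5  # accesses before a mid-term memory becomes long-term

def touch(memory):
    """Recall side effects: count the access, extend TTL, auto-promote mid -> long."""
    memory["access_count"] += 1
    if memory["tier"] == "mid" and memory["access_count"] >= PROMOTE_AT:
        memory["tier"] = "long"
        memory["expires_at"] = None  # long-term never expires
    elif memory["tier"] in EXTEND:
        memory["expires_at"] += EXTEND[memory["tier"]]
    return memory
```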

[Lifecycle diagram: Store (create, dedup on title+namespace) → Recall (touch + rank, +1 priority every 10 accesses) → TTL Extend (short +1h, mid +1d) → Auto-Promote (mid to long at 5 accesses, clears expiry) → Consolidate (merge N to 1, auto-consolidate groups) → Contradiction detect]

6-Factor Recall Scoring

Every recall query computes a composite score entirely in SQLite. Higher scores rank first. No external ML or embedding service required.

score = fts_rank * -1 + priority * 0.5 + MIN(access_count, 50) * 0.1 + confidence * 2.0 + tier_boost + 1/(1 + days * 0.1)
FTS Relevance -- SQLite FTS5 rank (negated: lower = better)
Priority -- 1-10 weighted by 0.5 (range: 0.5 - 5.0)
Access Count -- capped at 50, weighted by 0.1 (range: 0.0 - 5.0, rewards frequent use)
Confidence -- 0.0-1.0 weighted by 2.0 (range: 0.0 - 2.0)
Tier Boost -- long=3.0, mid=1.0, short=0.0
Recency -- 1/(1 + days_since_update * 0.1), today=1.0, 10d=0.5
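Worked example: a long-term memory with FTS rank -4.0, priority 7, 12 accesses, confidence 0.9, last updated 10 days ago. The function just restates the formula above (the real computation happens inside SQLite):

```python
def recall_score(fts_rank, priority, access_count, confidence, tier, days):
    """6-factor composite recall score, restated from the formula above."""
    tier_boost = {"long": 3.0, "mid": 1.0, "short": 0.0}[tier]
    return (fts_rank * -1                  # FTS relevance (negated rank)
            + priority * 0.5               # priority 1-10
            + min(access_count, 50) * 0.1  # access count, capped at 50
            + confidence * 2.0             # confidence 0.0-1.0
            + tier_boost                   # long=3.0, mid=1.0, short=0.0
            + 1 / (1 + days * 0.1))        # recency decay

score = recall_score(fts_rank=-4.0, priority=7, access_count=12,
                     confidence=0.9, tier="long", days=10)
# 4.0 + 3.5 + 1.2 + 1.8 + 3.0 + 0.5 = 14.0
```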

Recency Decay Curve

[Decay curve: factor = 1/(1 + days * 0.1) over days since last update -- today: 1.00, 10d: 0.50, 20d: 0.33, 50d: 0.17]

Security

Defense in depth, even for a local tool. Every input is validated, every error is sanitized, every write is transactional.

Transaction Safety

Every write operation is wrapped in a SQLite transaction. WAL mode enables concurrent reads without blocking. Schema migrations are atomic.

FTS5 Injection Prevention

Search queries are sanitized before reaching FTS5. All special characters including | (pipe/OR operator), ", *, ^, :, -, braces, and parentheses are stripped. Boolean operators (AND, OR, NOT, NEAR) are filtered as standalone tokens. Every term is double-quoted.
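That strategy can be sketched in a few lines (illustrative Python, not the actual Rust sanitizer):

```python
import re

FTS5_OPERATORS = {"AND", "OR", "NOT", "NEAR"}

def sanitize_fts5(query):
    """Strip FTS5 syntax characters, drop bare boolean operators,
    then double-quote every remaining term (sketch of the strategy)."""
    cleaned = re.sub(r'[|"*^:\-{}()]', " ", query)
    terms = [t for t in cleaned.split() if t.upper() not in FTS5_OPERATORS]
    return " ".join(f'"{t}"' for t in terms)

safe = sanitize_fts5('rust OR title:"dro*p"')
# Operators and syntax characters are gone; each surviving term is quoted,
# so FTS5 treats everything as literal phrase tokens.
```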

Body Size Limits

HTTP request bodies are capped at 50MB via DefaultBodyLimit. Prevents memory exhaustion from oversized payloads at the transport layer.

CORS (Permissive for Localhost)

The HTTP server applies CorsLayer::permissive() -- open CORS policy appropriate for localhost-bound services. Safe because the server defaults to 127.0.0.1 binding.

Sanitized Error Responses

Error messages never leak database internals, file paths, or stack traces. Handlers return generic "internal server error" strings; details go to tracing::error! only.

Bulk Limits (1000)

Bulk create and import operations cap at 1000 items per request (MAX_BULK_SIZE). Prevents memory exhaustion and denial-of-service from oversized batches.

AtomicBool Thread Safety

Color output uses AtomicBool with atomic ordering for thread-safe global state. No mutexes needed for the color-enabled flag across threads.

Link Validation in Sync

During database sync (pull, push, merge), every imported link is validated via validate::validate_link() before insertion. Invalid links are silently skipped to prevent corrupt cross-references.

JSON-RPC Version Validation

The MCP server validates that every incoming request has jsonrpc: "2.0". Non-conformant requests are rejected before any tool dispatch occurs.

Arguments Validation

MCP tool calls extract arguments from the request params object. Non-object arguments default to an empty object, preventing type-confusion attacks on tool handlers.

Input Validation

Shared validation layer across CLI, HTTP, and MCP. Title max 512B, content max 64KB, namespace alphanumeric, source whitelisted, priority 1-10, confidence 0.0-1.0.
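A sketch of those limits as a single validator (the limits are from the text; the function shape and error messages are illustrative):

```python
def validate_memory(title, content, namespace="global",
                    priority=5, confidence=1.0):
    """Enforce the shared limits: title <= 512 bytes, content <= 64 KB,
    alphanumeric namespace, priority 1-10, confidence 0.0-1.0."""
    if not title or len(title.encode()) > 512:
        raise ValueError("title must be 1-512 bytes")
    if len(content.encode()) > 64 * 1024:
        raise ValueError("content exceeds 64KB")
    if not namespace.isalnum():
        raise ValueError("namespace must be alphanumeric")
    if not 1 <= priority <= 10:
        raise ValueError("priority must be 1-10")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be 0.0-1.0")
    return True
```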

Localhost-Only Binding

The HTTP server binds to 127.0.0.1 by default. Your memories never leave your machine unless you explicitly configure otherwise.

Architecture

Single Rust binary. Three universal interfaces. Four feature tiers with optional local LLMs via Ollama.

[Architecture diagram: Claude / ChatGPT / Grok / Llama / any MCP client reach the CLI (25 commands), MCP server (17 tools, stdio), or HTTP API (20 endpoints, Axum) -- all universal. Requests pass through a shared validation layer (validate.rs) with structured errors (errors.rs), then the feature tiers: Keyword (FTS5 only, 0 MB, 13 tools), Semantic (MiniLM-L6 384d, 256 MB, 14 tools, candle local), Smart (nomic 768d + Gemma4 E2B, 1 GB, 17 tools), Autonomous (nomic 768d + Gemma4 E4B + reranker, 4 GB, 17 tools + cross-encoder). Smart+ tiers call Ollama at localhost:11434 (gemma4:e2b / e4b, nomic-embed-text). Storage: SQLite + FTS5 + HNSW (db.rs), WAL mode, schema v3, embeddings, 161 tests; tiers short 6h / mid 7d / long forever; instant-distance HNSW, cosine similarity, 6-factor ranking]

Feature Matrix

All three interfaces are universal -- any AI platform can use any of them. They share the same validation layer and database.

Capability             | CLI | HTTP API | MCP
Store memory           | Yes | Yes | Yes
Update memory          | Yes | Yes | Yes
Recall (fuzzy OR)      | Yes | Yes | Yes
Search (AND)           | Yes | Yes | Yes
Get by ID              | Yes | Yes | Yes
List with filters      | Yes | Yes | Yes
Delete                 | Yes | Yes | Yes
Promote                | Yes | Yes | Yes
Forget (bulk delete)   | Yes | Yes | Yes
Link memories          | Yes | Yes | Yes
Get links              | Yes | Yes | Yes
Consolidate            | Yes | Yes | Yes
Stats                  | Yes | Yes | Yes
Bulk create            | --  | Yes | --
Resolve contradictions | Yes | --  | --
Auto-consolidate       | Yes | --  | --
Sync databases         | Yes | --  | --
Interactive shell      | Yes | --  | --
Export / Import        | Yes | Yes | --
Garbage collection     | Yes | Yes | --
Namespaces list        | Yes | Yes | --
Shell completions      | Yes | --  | --
Man page               | Yes | --  | --

Interactive Shell

ai-memory shell opens a REPL with color-coded output. Tiers are red/yellow/green, priority is visualized as bars, namespaces appear in cyan.

$ ai-memory shell
ai-memory shell -- type 'help' for commands, 'quit' to exit
memory> recall database setup
  [long]  Project uses PostgreSQL 15          score: 8.42
          Main database is PostgreSQL 15 with pgvector for embeddings...
  [mid]   Database migration to v3            score: 5.71
          Sprint goal: migrate schema from v2 to v3 by end of week...
  [short] Debug: connection pool exhausted    score: 2.38
          Seeing connection pool exhaustion under load in staging...
  3 memory(ies) recalled
memory> stats
  total: 47, links: 12, db: 284 KB
  long: 18  mid: 21  short: 8

Usage Examples

All interfaces work with any AI platform. Choose the one that fits your setup.

CLI Usage

# Store a memory
ai-memory store -T "Project uses Rust 2021 edition" \
  -c "Rust 2021, Axum for HTTP, SQLite for storage." \
  --tier long --priority 7

# Recall relevant memories
ai-memory recall "what language and framework"

# Exact keyword search
ai-memory search "Axum"

# List all, JSON output
ai-memory list --json

HTTP API Usage

# Start the daemon
ai-memory serve &

# Store via API (works from any language, any AI backend)
curl -X POST http://127.0.0.1:9077/api/v1/memories \
  -H 'Content-Type: application/json' \
  -d '{"title":"Test","content":"It works.","tier":"short"}'

# Recall
curl "http://127.0.0.1:9077/api/v1/recall?context=test"

CI/CD Pipeline

GitHub Actions runs on every push and PR. Releases are automated on tag push with cross-platform binaries.

[Pipeline: Push → fmt check → clippy -D warnings → test (161 tests) → build release. Release job builds Linux + macOS binaries (ubuntu-latest + macos-latest; x86_64-linux + aarch64-darwin) on tag push v*]

LongMemEval Benchmark

ICLR 2025 dataset, 500 questions, 6 categories

Results

Config                       | R@1   | R@5   | R@10  | R@20  | Time | Speed
Parallel FTS5 (keyword)      | 86.2% | 97.0% | 98.2% | 99.4% | 2.2s | 232 q/s
LLM-expanded + parallel FTS5 | 86.8% | 97.8% | 99.0% | 99.8% | 3.5s | 142 q/s

Per-Category Breakdown (LLM-expanded)

Category                  | R@1    | R@5    | R@10   | R@20
single-session-assistant  | 100.0% | 100.0% | 100.0% | 100.0%
knowledge-update          | 91.0%  | 100.0% | 100.0% | 100.0%
single-session-user       | 88.6%  | 98.6%  | 100.0% | 100.0%
multi-session             | 88.0%  | 97.7%  | 98.5%  | 100.0%
temporal-reasoning        | 79.7%  | 96.2%  | 98.5%  | 99.2%
single-session-preference | 73.3%  | 93.3%  | 96.7%  | 100.0%
OVERALL                   | 86.8%  | 97.8%  | 99.0%  | 99.8%
499/500 recalled at R@20 · $0 cloud API costs · 3.5s recall on 10 cores · Pure SQLite FTS5 + BM25
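Recall@k counts a question as answered when the gold evidence appears in the top k retrieved results. A toy computation of the metric:

```python
def recall_at_k(ranked_ids, gold_id, k):
    """1 if the gold item appears in the top-k ranked results, else 0."""
    return int(gold_id in ranked_ids[:k])

# 3 toy questions: gold found at rank 1, rank 4, and missing from top 5
runs = [(["a", "b"], "a"), (["x", "y", "z", "g"], "g"), (["m", "n"], "q")]
r_at_5 = sum(recall_at_k(ids, gold, 5) for ids, gold in runs) / len(runs)
# 2 of 3 questions recalled within the top 5
```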

Reproduce

# 1. Clone dataset
git clone --depth 1 https://github.com/xiaowu0162/LongMemEval /tmp/LongMemEval
cd /tmp/LongMemEval/data
curl -sLO https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_s_cleaned.json
cd -

# 2. Install
cargo install --git https://github.com/alphaonedev/ai-memory-mcp.git
pip install tabulate requests

# 3. Run (keyword -- 2.2s)
python3 benchmarks/longmemeval/harness_99.py --dataset-path /tmp/LongMemEval --variant S --no-expand --workers 10

# 4. Run (LLM-expanded -- requires Ollama with gemma3:4b)
python3 benchmarks/longmemeval/harness_99.py --dataset-path /tmp/LongMemEval --variant S --workers 10