v0.6.4 · quiet-tools · shipped 2026-05-05

76% lighter on the wire.
Same 43-tool capability.

ai-memory v0.6.4 collapses the default MCP tool surface from 43 to 5 — saving ~4,700 input tokens per request on every eager-loading harness — without removing a single tool. The other 38 are still there, still callable, still functional. They just don't pre-pay their schema cost on every turn.

5 default · 43 full · 76.4% reduction · cl100k_base measured · schema v19 → v20
Three audiences, one release

What v0.6.4 means for you

Same release, three framings — pick yours.

👤 If you USE AI

Your AI bill drops. Nothing else changes.

Every time your AI assistant (Claude, ChatGPT, Cursor, Codex, Grok, Gemini) reaches for ai-memory, it used to spend ~6,200 input tokens just describing the available memory tools before reading your message. v0.6.4 cuts that to ~1,500. Your AI still does everything it did before — it just doesn't pre-pay for tools it doesn't need every turn. First-message responses on Codex / Grok / Gemini should feel snappier; your subscription cost drops automatically.

brew upgrade ai-memory && ai-memory doctor --tokens

That second command shows you exactly how much you're saving.

🛠️ If you BUILD with AI

Profile-aware tool surface. SDKs published. NHI guardrails.

  • --profile {core,graph,admin,power,full,custom} flag (CLI + AI_MEMORY_PROFILE env + [mcp].profile config). Resolution order: CLI > env > config > core.
  • memory_capabilities extended with family=<name> + include_schema=true for runtime tool registration.
  • Unloaded-tool calls return JSON-RPC -32601 with an actionable diagnostic naming the family and suggesting both --profile and --include-schema recovery paths.
  • SDKs published OIDC-style: @alphaone/ai-memory on npm + ai-memory-mcp on PyPI. requireProfile helper raises ProfileNotLoaded with structured .hint.
  • Cross-harness installer for 10 MCP harnesses: ai-memory install <name>.
  • [mcp.allowlist] per-agent capability allowlist. audit_log table (schema v20) records every capability-expansion event.
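
The resolution order and the recovery path in the bullets above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the shipped implementation: the function names and the config dict shape are hypothetical, and it assumes memory_capabilities is invoked through a standard MCP tools/call request with the family and include_schema arguments named above.

```python
import os

# Profile names come from the release notes. Function names and the
# config dict shape are hypothetical, chosen for this sketch only.
VALID_PROFILES = {"core", "graph", "admin", "power", "full", "custom"}


def resolve_profile(cli_value=None, env=None, config=None):
    """Return the active profile using the order CLI > env > config > core."""
    env = os.environ if env is None else env
    config = config or {}
    candidates = (
        cli_value,                             # --profile flag
        env.get("AI_MEMORY_PROFILE"),          # environment variable
        config.get("mcp", {}).get("profile"),  # [mcp].profile in config
    )
    for value in candidates:
        if value:
            if value not in VALID_PROFILES:
                raise ValueError(f"unknown profile: {value!r}")
            return value
    return "core"  # documented fallback


def capabilities_request(family, request_id=1):
    """Build a tools/call request asking for one family's schemas.

    Assumption: the arguments are passed as a plain JSON object.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "memory_capabilities",
            "arguments": {"family": family, "include_schema": True},
        },
    }


def is_unloaded_tool_error(response):
    """True when a JSON-RPC response carries error code -32601."""
    return response.get("error", {}).get("code") == -32601
```

A client could call a tool optimistically, check the response with is_unloaded_tool_error, and then send capabilities_request for the family named in the diagnostic before retrying.
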
🏢 If you DECIDE AI infrastructure

Token-tax leak closed. PII stays on-prem. No vendor risk.

Boris Cherny's published 90-day Claude Code instrumentation found that 73% of tokens go to nine waste patterns. v0.6.4 closes Pattern 6 ("just-in-case tool definitions") in one release: a 76.4% input-token reduction, worth ~$107/user/year on heavy single-user pricing, or ~$107K/year per 1,000 daily-active agent seats. Empirically validated by a 4-tier discovery matrix vs. live xAI Grok 4.3 (6/6 PASS, GATE GREEN). Apache-2.0, single Rust binary, runs on the developer's laptop: no SaaS, no PII exfiltration, no vendor lock-in. The architecture decision you make today doesn't have to be revisited.

Measured, not estimated

v0.6.4 in numbers

Every number on this page is anchored to a publicly verifiable source. Click through to see the methodology.

6,198 → 1,465 tokens advertised in tools/list
76.4% prefix-token reduction (cl100k_base)
5 / 43 default visible / total tools
10 harnesses with built-in installer
6/6 Discovery Gate cells PASS
93.84% line coverage · CI-gated ≥93%
v19 → v20 schema migration (audit_log)
2 of 2 SDKs published (npm + PyPI)
5 of 5 distribution channels (release / brew / ghcr / COPR / crates)

Honesty correction shipped with this release: the v0.6.4 RFC drafts originally claimed "~25,800 tokens / 87% reduction." Those numbers were measured with a MiniLM tokenizer, a sentence-embedder vocabulary that systematically over-counts JSON by ~4× vs. cl100k_base, the BPE vocabulary used here as the proxy for Claude/GPT input accounting. Real measurement: 6,198 → 1,465 tokens, a 76.4% reduction. We corrected the public claim before the release. Methodology →
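
The headline figures can be re-derived from the numbers quoted on this page. The short check below is only arithmetic. It assumes the withdrawn 25,800 figure described the same pre-change tool surface as the 6,198 measurement.

```python
# Figures quoted on this page.
BEFORE_TOKENS = 6_198      # full 43-tool surface, cl100k_base
AFTER_TOKENS = 1_465       # default 5-tool surface, cl100k_base
RFC_DRAFT_TOKENS = 25_800  # withdrawn MiniLM-based figure (assumed "before" count)

saved = BEFORE_TOKENS - AFTER_TOKENS
reduction_pct = 100 * saved / BEFORE_TOKENS
overcount_ratio = RFC_DRAFT_TOKENS / BEFORE_TOKENS

print(f"saved per request: {saved} tokens")               # 4733
print(f"reduction: {reduction_pct:.1f}%")                 # 76.4%
print(f"MiniLM vs cl100k_base: {overcount_ratio:.1f}x")   # 4.2x
```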

The substrate, named

v0.6.4 unlocks NHI guardrails — phase 1

Most "AI memory" products are chat memory. ai-memory is the agent substrate. v0.6.4 ships the first tier of NHI (Non-Human Identity) guardrails: per-agent capability allowlists and a capability-expansion audit log. Phase 2 (Ed25519 attestation) lands in v0.7.

Per-agent capability allowlist

Define what each agent can reach for

The new [mcp.allowlist] config table maps agent_id patterns to allowed family sets. Pattern resolution: exact > longest-prefix > * wildcard. Disabled by default for backward compatibility; enable it for production.

[mcp.allowlist]
"ai:claude-code@*"   = ["full"]
"ai:codex-cli@*"     = ["core", "graph"]
"ai:grok-cli-*"      = ["core"]
"*"                  = ["core"]   # wildcard default
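
A minimal sketch of how that resolution order could work, written against the table above. It is an assumption that a trailing * marks a prefix pattern; the function name is hypothetical and the shipped matcher may treat patterns differently.

```python
# Mirrors the [mcp.allowlist] table above as a plain dict.
ALLOWLIST = {
    "ai:claude-code@*": ["full"],
    "ai:codex-cli@*": ["core", "graph"],
    "ai:grok-cli-*": ["core"],
    "*": ["core"],
}


def allowed_families(agent_id, allowlist=ALLOWLIST):
    """Resolve an agent_id: exact match, then longest prefix, then '*'."""
    if agent_id in allowlist:  # 1. exact match wins
        return allowlist[agent_id]
    # 2. Assumption: a trailing '*' marks a prefix pattern.
    prefixes = [
        pattern[:-1]
        for pattern in allowlist
        if pattern != "*"
        and pattern.endswith("*")
        and agent_id.startswith(pattern[:-1])
    ]
    if prefixes:
        longest = max(prefixes, key=len)
        return allowlist[longest + "*"]
    # 3. Bare wildcard, or no families at all if it is absent.
    return allowlist.get("*", [])
```

With the table above, allowed_families("ai:codex-cli@ci") resolves to ["core", "graph"], while an unlisted agent such as "ai:unknown" falls through to the wildcard and gets ["core"].
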
Capability-expansion audit log

Every memory_capabilities --include-schema call gets logged

Schema migration v19 → v20 adds the audit_log table with columns for agent_id, event_type, requested_family, granted, attestation_tier, and timestamp. Three indexes (agent_id, timestamp, event_type) keep SOC/SIEM-style queries fast. The migration is idempotent and preserves every existing row.

SELECT agent_id, requested_family, granted
FROM   audit_log
WHERE  timestamp > datetime('now', '-24 hours')
   AND granted = 0;
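
To try the query without a live install, the sketch below builds a stand-in audit_log in an in-memory SQLite database. The column names follow the description above; the column types, index names, event_type value, and sample rows are assumptions made for illustration and may not match the real v20 schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names come from the release notes. Types, index names, and the
# sample rows are assumptions made for this sketch.
conn.executescript("""
CREATE TABLE IF NOT EXISTS audit_log (
    agent_id         TEXT NOT NULL,
    event_type       TEXT NOT NULL,
    requested_family TEXT,
    granted          INTEGER NOT NULL,
    attestation_tier TEXT,
    timestamp        TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_audit_agent ON audit_log (agent_id);
CREATE INDEX IF NOT EXISTS idx_audit_time  ON audit_log (timestamp);
CREATE INDEX IF NOT EXISTS idx_audit_event ON audit_log (event_type);
""")

# Last field is a SQLite time modifier applied to 'now' at insert time.
rows = [
    ("ai:grok-cli-1", "capability_expansion", "admin", 0, None, "-1 hours"),
    ("ai:codex-cli@ci", "capability_expansion", "graph", 1, None, "-2 hours"),
    ("ai:grok-cli-1", "capability_expansion", "power", 0, None, "-48 hours"),
]
conn.executemany(
    "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, datetime('now', ?))", rows
)

# Same query as above: denied expansion requests in the last 24 hours.
denied_last_day = conn.execute("""
    SELECT agent_id, requested_family, granted
    FROM   audit_log
    WHERE  timestamp > datetime('now', '-24 hours')
       AND granted = 0
""").fetchall()
print(denied_last_day)
```

Only the one-hour-old denied request comes back; the granted row and the two-day-old denial are filtered out.
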
Same binary, five tiers

Built for the agent era, not retrofitted from chat memory

ai-memory's architecture scales from one developer's laptop to a multi-region hive of agents without switching products. Every primitive listed below is in the v0.6.4 binary today; you turn flags on as you grow.

What no other "AI memory" product brings to the table

Capability · Vendor memory (Claude/ChatGPT) · SaaS memory (mem0 / Letta) · Vector DBs (Chroma / pgvector) · ai-memory v0.6.4
Local-first, no cloud roundtrip
Universal across AI vendors · ❌ vendor-locked · ⚠️ paid plans · ⚠️ glue code · ✅ MCP-native
Zero-token cost until recall · ⚠️ varies · n/a
Self-curating (auto-tag, dedup, contradictions) · ⚠️ partial
Multi-agent federation built in
Quorum W-of-N writes + CRDT merge
Per-agent NHI capability allowlist · ✅ (v0.6.4 phase 1)
Capability-expansion audit log · ✅ (schema v20)
Webhook event bus (HMAC-signed)
Apache-2.0, no vendor risk
Single binary install · n/a · n/a
Public per-release evidence pages · n/a

Trust the numbers

Why every claim on this page is anchored

Most product pages cite numbers without sources. Every claim above is anchored to a public, verifiable artifact:

Empirical LLM behavior

NHI Discovery Gate

Public 4-tier acceptance matrix — T1 Awareness / T2 Reactive / T3 Proactive / T4 Mesh. Run against a real OpenClaw harness driving live xAI Grok 4.3 against the v0.6.4 release binary. Per-cell evidence: full LLM transcripts, MCP wire logs, verdict JSON.

Verdict page → · Source repo →

Test coverage

Test Hub

Per-release CERT verdicts with green/red status, scenario-by-scenario evidence, schema-migration validation against real production-shaped DBs. v0.6.4 campaign: CERT GREEN across S25-S32 (new) + S1-S22 (functional replay under --profile full).

Live hub → · v0.6.4 campaign →

Token-cost methodology

Cross-harness benchmark

Documents how the 6,198 → 1,465 measurement was taken (which tokenizer, which BPE vocabulary, which harnesses), and includes the explicit honesty correction on the original RFC's 25,800 / 87% claim — the methodology gap explained, not papered over.

Methodology →

Release artifacts

Frozen claims

The evidence page tracks v0.6.x baselines (1,809 / 93.08% on v0.6.3, 1,886 / 93.84% on v0.6.3.1) alongside v0.6.4 metrics. CI-enforced ≥92% coverage gate per-module. Performance budgets fail PRs whose measured p95 exceeds the published target by >10%.
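
The two gates described above reduce to simple checks. The sketch below is one way to express them, assuming a nearest-rank p95 and per-module coverage reported as percentages; the function names and the percentile method are hypothetical, not taken from the project's CI.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def budget_ok(samples, target, tolerance=0.10):
    """Pass unless the measured p95 exceeds the target by more than 10%."""
    return p95(samples) <= target * (1 + tolerance)


def coverage_ok(module_coverage, minimum=92.0):
    """Pass only if every module meets the per-module coverage floor."""
    return all(pct >= minimum for pct in module_coverage.values())
```

For example, latencies of 1 through 100 ms give a p95 of 95 ms, which passes a 90 ms target (limit 99 ms) and fails an 80 ms target (limit 88 ms).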

Evidence page →

Get v0.6.4 now

Install in 60 seconds

macOS / Linux

Homebrew

brew install alphaonedev/tap/ai-memory
ai-memory install claude-code --apply

Cargo (any host)

From crates.io

cargo install ai-memory --version 0.6.4

Docker

ghcr.io

docker pull ghcr.io/alphaonedev/ai-memory:0.6.4

Fedora / RHEL / Rocky

COPR

sudo dnf copr enable alpha-one-ai/ai-memory
sudo dnf install ai-memory

TypeScript SDK

npm

npm install @alphaone/ai-memory

Python SDK

PyPI

pip install ai-memory-mcp
# import name remains ai_memory