What's New in v0.6.4 — quiet-tools

Three audiences, one release

What v0.6.4 means for you

Same release, three framings — pick yours.

👤 If you USE AI

Your AI bill drops. Nothing else changes.

Before v0.6.4, every time your AI assistant (Claude, ChatGPT, Cursor, Codex, Grok, Gemini) reached for ai-memory, it spent ~6,200 input tokens just describing the available memory tools before reading your message. v0.6.4 cuts that to ~1,500. Your AI still does everything it did before; it just no longer pre-pays every turn for tools it doesn't need. First-message responses on Codex / Grok / Gemini should feel snappier, and your subscription cost drops automatically.

brew upgrade ai-memory && ai-memory doctor --tokens

That second command shows you exactly how much you're saving.

🛠️ If you BUILD with AI

Profile-aware tool surface. SDKs published. NHI guardrails.

--profile {core,graph,admin,power,full,custom} flag (CLI + AI_MEMORY_PROFILE env + [mcp].profile config). Resolution order: CLI > env > config > core.
memory_capabilities extended with family=<name> + include_schema=true for runtime tool registration.
Unloaded-tool calls return JSON-RPC -32601 with an actionable diagnostic naming the family and suggesting both --profile and --include-schema recovery paths.
SDKs published OIDC-style: @alphaone/ai-memory on npm + ai-memory-mcp on PyPI. requireProfile helper raises ProfileNotLoaded with structured .hint.
Cross-harness installer for 10 MCP harnesses: ai-memory install <name>.
[mcp.allowlist] per-agent capability allowlist. audit_log table (schema v20) records every capability-expansion event.
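
The resolution order above (CLI > env > config > core) can be sketched in a few lines. This is a minimal illustration, not ai-memory's actual implementation: the function name resolve_profile and the dict-shaped config are assumptions; only the AI_MEMORY_PROFILE variable, the [mcp].profile key, and the core fallback come from this page.

```python
from typing import Mapping, Optional


def resolve_profile(
    cli_flag: Optional[str],
    env: Mapping[str, str],
    config: Mapping[str, Mapping[str, str]],
) -> str:
    """Return the first profile that is set, in the order CLI > env > config > "core".

    Hypothetical helper: `config` stands in for the parsed config file,
    where the profile lives under the [mcp] table.
    """
    candidates = (
        cli_flag,                                # --profile on the command line
        env.get("AI_MEMORY_PROFILE"),            # environment variable
        config.get("mcp", {}).get("profile"),    # [mcp].profile in the config file
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return "core"  # documented default when nothing is set


if __name__ == "__main__":
    cfg = {"mcp": {"profile": "admin"}}
    print(resolve_profile(None, {"AI_MEMORY_PROFILE": "graph"}, cfg))
```

In this sketch an empty string counts as "not set", so an exported but empty AI_MEMORY_PROFILE falls through to the config file; whether the real binary behaves the same way is not stated here.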

🏢 If you DECIDE AI infrastructure

Token-tax leak closed. PII stays on-prem. No vendor risk.

According to Boris Cherny's published 90-day Claude Code instrumentation, 73% of tokens go to nine waste patterns. v0.6.4 closes Pattern 6 ("just-in-case tool definitions") in one release: a 76.4% input-token reduction, worth ~$107/user/year on heavy single-user pricing, or ~$107K/year per 1,000 daily-active agent seats. The release was empirically validated by a 4-tier discovery matrix run against live xAI Grok 4.3 (6/6 PASS, GATE GREEN). ai-memory is Apache-2.0 and ships as a single Rust binary that runs on the dev's laptop: no SaaS, no PII exfiltration, no vendor lock-in. The architecture decision you make today doesn't have to be revisited.

Measured, not estimated

v0.6.4 in numbers

Every number on this page is anchored to a publicly verifiable source. Click through to see the methodology.

6,198 → 1,465 tokens advertised in tools/list

76.4% prefix-token reduction (cl100k_base)

5 / 43 default visible / total tools

10 harnesses with built-in installer

6/6 Discovery Gate cells PASS

93.84% line coverage · CI-gated ≥93%

v19 → v20 schema migration (audit_log)

2 of 2 SDKs published (npm + PyPI)

5 of 5 distribution channels (release / brew / ghcr / COPR / crates)

Honesty correction shipped with this release: the v0.6.4 RFC drafts originally claimed "~25,800 tokens / 87% reduction." Those numbers were measured against MiniLM, a sentence-embedder vocabulary that systematically over-counts JSON by ~4× vs. cl100k_base, the OpenAI BPE vocabulary this page uses as its reference for input-token accounting. Real measurement: 6,198 → 1,465 / 76.4%. We corrected the public claim before the release. Methodology →
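
The headline percentage follows directly from the two measured token counts. A quick check, where reduction_percent is an illustrative helper name and not part of ai-memory:

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage of tokens removed going from `before` to `after`."""
    return (before - after) / before * 100


if __name__ == "__main__":
    # Token counts quoted on this page for the tools/list payload.
    print(f"{reduction_percent(6198, 1465):.1f}%")  # prints 76.4%
```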

The substrate, named

v0.6.4 unlocks NHI guardrails — phase 1

Most "AI memory" products are chat memory. ai-memory is the agent substrate. v0.6.4 ships the first tier of NHI (Non-Human Identity) guardrails: per-agent capability allowlists and a capability-expansion audit log. Phase 2 (Ed25519 attestation) lands in v0.7.

Per-agent capability allowlist

Define what each agent can reach for

The new [mcp.allowlist] config table maps agent_id patterns to allowed family sets. Pattern resolution: exact > longest-prefix > * wildcard. The allowlist is disabled by default for backward compatibility; enable it for production.

[mcp.allowlist]
"ai:claude-code@*"   = ["full"]
"ai:codex-cli@*"     = ["core", "graph"]
"ai:grok-cli-*"      = ["core"]
"*"                  = ["core"]   # wildcard default

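The resolution rule (exact > longest-prefix > * wildcard) can be sketched as follows. This is a minimal illustration under one stated assumption: a pattern ending in * is treated as a prefix match on everything before the *. The function name resolve_families is hypothetical, and the real matcher may differ in details.

```python
from typing import Dict, List, Optional

# Mirrors the example config above.
ALLOWLIST: Dict[str, List[str]] = {
    "ai:claude-code@*": ["full"],
    "ai:codex-cli@*": ["core", "graph"],
    "ai:grok-cli-*": ["core"],
    "*": ["core"],
}


def resolve_families(agent_id: str, allowlist: Dict[str, List[str]]) -> Optional[List[str]]:
    """Pick the allowed families for an agent: exact > longest-prefix > "*"."""
    # 1. An exact key match wins outright.
    if agent_id in allowlist:
        return allowlist[agent_id]

    # 2. Otherwise take the matching prefix pattern with the longest prefix.
    #    ASSUMPTION: "foo*" means "agent_id starts with foo".
    best_prefix: Optional[str] = None
    for pattern in allowlist:
        if pattern == "*" or not pattern.endswith("*"):
            continue
        prefix = pattern[:-1]
        if agent_id.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    if best_prefix is not None:
        return allowlist[best_prefix + "*"]

    # 3. Fall back to the bare wildcard, if one is configured.
    return allowlist.get("*")


if __name__ == "__main__":
    print(resolve_families("ai:codex-cli@ci-runner", ALLOWLIST))
    print(resolve_families("ai:some-new-agent", ALLOWLIST))
```

Returning None when nothing matches and no wildcard exists leaves the deny-or-allow decision to the caller; this page does not say which way the binary defaults.
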
Capability-expansion audit log

Every `memory_capabilities --include-schema` call gets logged

Schema migration v19 → v20 adds the audit_log table with columns for agent_id, event_type, requested_family, granted, attestation_tier, timestamp. Three indexes (agent_id, timestamp, event_type) for SOC/SIEM-friendly queries. Idempotent migration; preserves every existing row.

SELECT agent_id, requested_family, granted
FROM   audit_log
WHERE  timestamp > datetime('now', '-24 hours')
   AND granted = 0;
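
To try the query without touching a real database, the sketch below builds a throwaway in-memory SQLite table with the six columns named above and runs the same SELECT. Column types, index names, the sample rows, and the event_type value capability_expansion are all assumptions made for illustration; the actual v20 schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ASSUMED column types; only the column names come from this page.
conn.execute(
    """
    CREATE TABLE audit_log (
        agent_id         TEXT    NOT NULL,
        event_type       TEXT    NOT NULL,
        requested_family TEXT,
        granted          INTEGER NOT NULL,
        attestation_tier TEXT,
        timestamp        TEXT    NOT NULL
    )
    """
)
# Three indexes, one per column the page lists (index names are hypothetical).
for column in ("agent_id", "timestamp", "event_type"):
    conn.execute(f"CREATE INDEX idx_audit_log_{column} ON audit_log ({column})")

# Sample rows: the last field is a SQLite datetime modifier relative to now.
sample_rows = [
    ("ai:grok-cli-1", "capability_expansion", "admin", 0, None, "-1 hours"),
    ("ai:claude-code@laptop", "capability_expansion", "graph", 1, None, "-2 hours"),
    ("ai:codex-cli@ci", "capability_expansion", "power", 0, None, "-48 hours"),
]
conn.executemany(
    "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, datetime('now', ?))",
    sample_rows,
)

# Same query as above: denied requests in the last 24 hours.
denied = conn.execute(
    """
    SELECT agent_id, requested_family, granted
    FROM   audit_log
    WHERE  timestamp > datetime('now', '-24 hours')
       AND granted = 0
    """
).fetchall()

print(denied)
```

Only the first sample row is both denied and recent, so it is the only one returned; the 48-hour-old denial falls outside the window.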

Same binary, five tiers

Built for the agent era, not retrofitted from chat memory

ai-memory's architecture scales from one developer's laptop to a multi-region hive of agents without switching products. Every primitive listed below is in the v0.6.4 binary today; you turn flags on as you grow.

Single Agent

1 dev, 1 host. SQLite local. ai-memory mcp

Many Agents

N AI clients sharing one host via HTTP+mTLS

Multi-Node Cluster

Quorum W-of-N writes, vector-clock CRDT-lite merge

Data-Center Swarm

Distributed replicas, cross-region quorum

Global Hive

Per-agent NHI allowlist, audit trail, governance
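
For readers unfamiliar with the Multi-Node Cluster terms, here is a generic textbook sketch of a W-of-N quorum check and a vector-clock comparison and merge. It illustrates the general technique only; it is not taken from ai-memory's source, and every name in it is hypothetical.

```python
from typing import Dict

Clock = Dict[str, int]  # node id -> number of writes seen from that node


def quorum_met(acks: int, w: int) -> bool:
    """A write succeeds once at least W of the N replicas acknowledge it."""
    return acks >= w


def compare(a: Clock, b: Clock) -> str:
    """Order two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither saw the other's write: needs a merge
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a: Clock, b: Clock) -> Clock:
    """Element-wise maximum: the clock that has seen everything both have."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


if __name__ == "__main__":
    left = {"node-a": 2, "node-b": 1}
    right = {"node-a": 1, "node-b": 3}
    print(compare(left, right), merge(left, right), quorum_met(acks=2, w=2))
```

Two clocks that compare as concurrent are the case a merge policy has to resolve; how ai-memory resolves the payload itself is not described on this page.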

What no other "AI memory" product brings to the table

Capability	Vendor memory (Claude/ChatGPT)	SaaS memory (mem0 / Letta)	Vector DBs (Chroma / pgvector)	ai-memory v0.6.4
Local-first, no cloud roundtrip	❌	❌	✅	✅
Universal across AI vendors	❌ vendor-locked	⚠️ paid plans	⚠️ glue code	✅ MCP-native
Zero-token cost until recall	❌	⚠️ varies	n/a	✅
Self-curating (auto-tag, dedup, contradictions)	❌	⚠️ partial	❌	✅
Multi-agent federation built in	❌	❌	❌	✅
Quorum W-of-N writes + CRDT merge	❌	❌	❌	✅
Per-agent NHI capability allowlist	❌	❌	❌	✅ (v0.6.4 phase 1)
Capability-expansion audit log	❌	❌	❌	✅ (schema v20)
Webhook event bus (HMAC-signed)	❌	❌	❌	✅
Apache-2.0, no vendor risk	❌	❌	✅	✅
Single binary install	n/a	n/a	❌	✅
Public per-release evidence pages	❌	❌	n/a	✅

Trust the numbers

Why every claim on this page is anchored

Most product pages cite numbers without sources. Every claim above is anchored to a public, verifiable artifact:

Empirical LLM behavior

NHI Discovery Gate

Public 4-tier acceptance matrix — T1 Awareness / T2 Reactive / T3 Proactive / T4 Mesh. Run against a real OpenClaw harness driving live xAI Grok 4.3 against the v0.6.4 release binary. Per-cell evidence: full LLM transcripts, MCP wire logs, verdict JSON.

Verdict page → · Source repo →

Test coverage

Test Hub

Per-release CERT verdicts with green/red status, scenario-by-scenario evidence, schema-migration validation against real production-shaped DBs. v0.6.4 campaign: CERT GREEN across S25-S32 (new) + S1-S22 (functional replay under --profile full).

Live hub → · v0.6.4 campaign →

Token-cost methodology

Cross-harness benchmark

Documents how the 6,198 → 1,465 measurement was taken (which tokenizer, which BPE vocabulary, which harnesses), and includes the explicit honesty correction on the original RFC's 25,800 / 87% claim — the methodology gap explained, not papered over.

Methodology →

Release artifacts

Frozen claims

The evidence page tracks v0.6.x baselines (1,809 / 93.08% on v0.6.3, 1,886 / 93.84% on v0.6.3.1) alongside v0.6.4 metrics. CI-enforced ≥92% coverage gate per-module. Performance budgets fail PRs whose measured p95 exceeds the published target by >10%.
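
The performance-budget rule can be read as a one-line comparison. A minimal sketch, assuming the p95 has already been measured and that the 10% tolerance applies to the published target; the function name and millisecond units are illustrative, not the project's CI code.

```python
def exceeds_budget(measured_p95_ms: float, target_p95_ms: float, tolerance: float = 0.10) -> bool:
    """True when the measured p95 is more than `tolerance` above the published target."""
    return measured_p95_ms > target_p95_ms * (1 + tolerance)


if __name__ == "__main__":
    # Hypothetical numbers: a 100 ms target tolerates anything up to 110 ms.
    print(exceeds_budget(115.0, 100.0))  # would fail the PR
    print(exceeds_budget(105.0, 100.0))  # within budget
```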

Evidence page →

Get v0.6.4 now

Install in 60 seconds

macOS / Linux

Homebrew

brew install alphaonedev/tap/ai-memory
ai-memory install claude-code --apply

Cargo (any host)

From crates.io

cargo install ai-memory --version 0.6.4

Docker

ghcr.io

docker pull ghcr.io/alphaonedev/ai-memory:0.6.4

Fedora / RHEL / Rocky

COPR

sudo dnf copr enable alpha-one-ai/ai-memory
sudo dnf install ai-memory

TypeScript SDK

npm

npm install @alphaone/ai-memory

Python SDK

PyPI

pip install ai-memory-mcp
# import name remains ai_memory

76% lighter on the wire. Same 43-tool capability.