Three end-to-end animated diagrams: write path (memory_store), read path (memory_recall with hybrid fusion), and federation (W-of-N quorum sync). Every box is a real function in src/db.rs or src/handlers.rs; every arrow is a real call.
The write path runs validation → embedding → SQLite insert → HNSW index update → federation fanout. Its p95 latency ranges from about 9 ms to 86 ms, depending on whether embedding is requested.
memory_recall combines three retrieval modalities and reconciles them into one ranked result set. Each modality has different strengths; the fusion model produces something none of them could alone. p95 hot-path: 18ms.
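The source does not name the fusion method, so the following is only an illustration of how several ranked lists can be reconciled into one. It uses reciprocal rank fusion (RRF), a common generic technique swapped in here as an assumption. The function name, the u64 ids, and the constant k are all hypothetical.

```rust
use std::collections::HashMap;

/// Fuse several ranked id lists with reciprocal rank fusion.
/// ASSUMPTION: RRF stands in for the real fusion model, which the source does not specify.
/// Each list is ordered best-first; `k` dampens the weight of top ranks (60.0 is a common choice).
fn reciprocal_rank_fusion(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, id) in ranking.iter().enumerate() {
            // rank is 0-based, so the best item contributes 1 / (k + 1).
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

An id that appears near the top of several lists outranks one that tops a single list, which is the sense in which a fused ranking can beat any one modality.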
When the caller passes context_tokens=[...], the recall biases the query embedding toward recent conversation:
embed(query) — what the caller explicitly asked. Anchors the recall.
embed(context_tokens.join(" ")) — recent conversation tokens. Biases toward "what the agent is currently thinking about".
Final query vector = unit-norm(0.7·primary + 0.3·context). Same magnitude as either input — drop-in for HNSW search.
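The blend above can be sketched in a few lines. This is a minimal illustration, assuming embeddings are plain f32 slices of equal length; the function names and the handling of the zero-vector case are hypothetical and not taken from src/db.rs.

```rust
// Weights from the text: 0.7 for the explicit query, 0.3 for the context.
const PRIMARY_WEIGHT: f32 = 0.7;
const CONTEXT_WEIGHT: f32 = 0.3;

/// Scale a vector to unit L2 norm. Returns None for a zero vector.
fn unit_norm(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(v.iter().map(|x| x / norm).collect())
}

/// Blend the primary and context embeddings, then renormalize.
/// Returns None if the dimensions differ or the blend is the zero vector
/// (hypothetical error handling, chosen for the sketch).
fn blend_query(primary: &[f32], context: &[f32]) -> Option<Vec<f32>> {
    if primary.len() != context.len() {
        return None;
    }
    let mixed: Vec<f32> = primary
        .iter()
        .zip(context.iter())
        .map(|(p, c)| PRIMARY_WEIGHT * *p + CONTEXT_WEIGHT * *c)
        .collect();
    unit_norm(&mixed)
}
```

Renormalizing matters because a weighted sum of two unit vectors generally has norm below 1 unless they point the same way; without it, distance-based HNSW scores would not be comparable to those of an unblended query.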
Each node accepts writes locally and fans them out to peers over mTLS. A quorum of W of N (default 2 of 3) must ACK before a write commits. The model is eventually consistent, with per-peer cursors driving catch-up after a partition. Federation's private endpoints live under /api/v1/sync/*.
Reads always serve from local SQLite. Zero network calls. Sub-50ms regardless of peer state.
Writes return 200 only after W peers ACK. The slowest of those W ACKs determines write latency. Default W=2.
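To make the latency rule concrete, here is a small sketch that computes when a write would commit given each peer's ACK time. It assumes W counts peer ACKs only, as the line above states; the function name and the use of None for a peer that never ACKs are hypothetical.

```rust
use std::time::Duration;

/// Time at which the W-th ACK arrives, or None if fewer than W peers ACKed.
/// `acks[i]` is Some(latency) if peer i ACKed, None if it never did.
fn quorum_latency(acks: &[Option<Duration>], w: usize) -> Option<Duration> {
    if w == 0 {
        // No ACKs required: the write commits immediately.
        return Some(Duration::ZERO);
    }
    // Keep only the ACKs that arrived, fastest first.
    let mut arrived: Vec<Duration> = acks.iter().flatten().copied().collect();
    arrived.sort();
    // The W-th fastest ACK is the one that completes the quorum.
    arrived.get(w - 1).copied()
}
```

With ACKs at 5 ms and 40 ms and one silent peer, W=2 commits at 40 ms: the write waits for the slowest ACK inside the quorum, not for the slowest peer overall.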
Detached peer catches up on rejoin via per-peer cursor. Convergent within one catchup_interval (default 30s).
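A per-peer cursor can be pictured as the last sequence number a peer is known to have received. The sketch below assumes a change log ordered by an ascending u64 sequence number; that layout, the Cursors type, and the catch_up method are all hypothetical and only illustrate the idea.

```rust
use std::collections::HashMap;

/// Last sequence number each peer is known to have received (hypothetical layout).
struct Cursors {
    last_seen: HashMap<String, u64>,
}

impl Cursors {
    /// Return the log entries the peer has not seen yet and advance its cursor.
    /// `log` must be sorted by ascending sequence number.
    /// Simplification: the cursor advances immediately; a real system would
    /// likely wait for the peer to acknowledge the batch first.
    fn catch_up<'a>(&mut self, peer: &str, log: &'a [(u64, String)]) -> &'a [(u64, String)] {
        let after = self.last_seen.get(peer).copied().unwrap_or(0);
        // Binary search for the first entry with seq > cursor.
        let start = log.partition_point(|(seq, _)| *seq <= after);
        let pending = &log[start..];
        if let Some((seq, _)) = pending.last() {
            self.last_seen.insert(peer.to_string(), *seq);
        }
        pending
    }
}
```

Because each peer has its own cursor, a node that was partitioned for a long time replays only what it missed, while peers that stayed connected receive nothing extra.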
QuorumNotMetPayload includes per-peer error classification (Unreachable / IdDrift / InFlight / InvalidPolicy / LocalWriteFailed).
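One way to picture that payload is an enum for the five classifications plus a list of per-peer failures. Only the type name QuorumNotMetPayload and the five variant names come from the text; every field name and the peers_with helper are hypothetical.

```rust
/// The five per-peer error classes named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PeerErrorKind {
    Unreachable,
    IdDrift,
    InFlight,
    InvalidPolicy,
    LocalWriteFailed,
}

/// One peer's failure (hypothetical shape).
#[derive(Debug)]
struct PeerFailure {
    peer: String,
    kind: PeerErrorKind,
}

/// Returned when fewer than W peers ACK (field names are assumptions).
#[derive(Debug)]
struct QuorumNotMetPayload {
    required: usize,
    acked: usize,
    failures: Vec<PeerFailure>,
}

impl QuorumNotMetPayload {
    /// Names of the peers that failed with the given classification.
    fn peers_with(&self, kind: PeerErrorKind) -> Vec<&str> {
        self.failures
            .iter()
            .filter(|f| f.kind == kind)
            .map(|f| f.peer.as_str())
            .collect()
    }
}
```

Classifying each peer separately lets a caller tell a network outage apart from a failure on a specific peer, rather than receiving a single opaque "quorum not met" error.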