The #1558 hardcoded-literal campaign (2026-06-09): a frozen baseline of 497 duplicated-literal entries across 2,847 production call sites was burned down in six batches to 28 irreducible entries on 108 sites — a −94% entry reduction, with roughly 2,700 production sites re-routed through named single-source-of-truth constants. The floor is enforced by a CI ratchet that HARD-BLOCKS new duplication, and every surviving entry carries a committed one-line justification.
This codebase is developed by AI coding agents under operator direction — that is the substrate's own AI-NHI workflow, documented, not hidden. The campaign exists because of an honest observation about that workflow, recorded verbatim in the gate's header comments: repeating the "no hardcoded literals" instruction to the agent did not stop the regression. A scattered magic string (the header's examples: format!("anonymous:req-{}", …) ×~8, "memory not found" ×~6) gets reproduced every time an agent pattern-matches surrounding, already-rotten code. The proven fix — the same one that ended the vendor-literal regression class before it — is a mechanical HARD-BLOCK in CI.
Deliberately not flagged, to keep the gate low-false-positive and therefore load-bearing: literals under 10 chars (short JSON keys), single-site literals, comment / use-path / attribute lines, const/static definition lines (those are the good pattern), and test code behind the shared production-vs-test boundary heuristic. Magic numbers are out of scope here — the SECS_PER_* class is already gated by the companion scripts/check-vendor-literals.sh, and a general numeric gate is too noisy to be load-bearing. The gate also ships --self-test: it injects a contrived new triplicated literal, verifies the HARD-BLOCK fires, and cleans up — proving in CI that the gate is enforcement, not decoration.
The baseline was frozen at campaign start: every double-quoted literal ≥10 chars appearing on ≥3 production sites, 497 entries totalling 2,847 sites. Six batches later, 28 entries on 108 sites remain — and the regenerated baseline file (scripts/qc-allowlists/hardcoded-literals-baseline.txt) is exactly those 28 lines. Everything else was a byte-preserving hoist: each routed const or helper produces the exact pre-sweep wire/SQL/log bytes.
From the campaign-start freeze to the post-batch-6 regeneration.
≈2,700 production call sites now reference a named const or shared helper instead of a repeated literal.
Quota DDL parity → identity sentinels → JSON-RPC wire consts → route paths → SQL/header/tracing/tool-name sweeps → the final field-name + census batch.
Every remaining entry classified with a one-line justification in the committed census.
Every internal/system principal string (DAEMON_PRINCIPAL, ANONYMOUS_INVALID, AI_CURATOR, …) as one named const — 82 production sites routed. These are authz-relevant: ownership gates exempt callers whose principal equals one of them. validate::RESERVED_AGENT_IDS is now built from the sentinel consts (it was a parallel literal list with a "MUST stay in sync" comment), pinned by an invariant test. One anonymous_request_id() helper replaced 10 divergent synthesis sites (#1560).
The JSON-RPC 2.0 version tag, reserved error codes (-32700/-32600/-32601/-32602), MCP method names, and the protocolVersion revision as named consts. The crate-root METHOD_* consts became aliases of the domain-canonical set.
One const per production HTTP route path — 74 consts. The router registers them, and the postgres surface gate (207 literals in postgres_gate.rs), the federation receiver, and the CLI doctor all match on them — so route gating structurally cannot drift from route registration.
Extended by +57 consts for wire/row keys (agent_pubkey … updated_since), routed across ~60 files including the full federation-sync response-key set, via the established json!-key / .get() / try_get() forms.
Filesystem-context helpers opening() / reading() / writing() plus reuse of the existing msg::invalid() — error prose synthesized in one place instead of re-typed per call site.
SQL transaction fragments (SQL_BEGIN_IMMEDIATE et al. — 49 copies collapsed), shared auth-header spellings, 14 duplicated tracing targets routed through consts, and tool-name / wire-enum values routed through their owning types (MemoryLinkRelation::as_str(), AttestLevel::as_str() — both link allowlist arrays are now built from the enum).
Full batch mechanics: CHANGELOG §"#1558 hardcoded-literal SSOT remediation campaign" and v0.7.0 release notes §#1558.
A gate that miscounts is worse than no gate: it either blocks honest work or silently waves duplication through. During the campaign the production-vs-test boundary heuristic was found wrong three times, and each finding was filed, fixed, and regression-pinned like any substrate defect:
The boundary only excluded mod tests blocks by name — a #[cfg(test)] mod l2_2_audit_tests with a non-standard name leaked test literals into the production baseline. Fixed: the attr+mod pairing is part of the boundary.
A file-level #![cfg(test)] inner attribute makes the entire file test code — those files no longer count toward production literal totals.
The boundary must not fire on a #[cfg(test)] mod x; declaration whose body lives in another file (one such line made the gate skip 13.9k production lines), and cfg(all(test, …)) modules are now caught.
The discipline point: enforcement tooling gets the same defect workflow as the substrate — issue filed at discovery, fix, regression pin, close. The gate's accuracy is itself under test (--self-test runs in CI).
The campaign did not end with "good enough." It ended with a committed census — scripts/qc-allowlists/hardcoded-literals-irreducible.md — classifying each of the 28 surviving entries with a one-line justification for why it cannot shrink further under the current boundaries. Four classes:
| Class | Entries | Meaning | Example survivor |
|---|---|---|---|
| CARVEOUT | 9 | Every production site lives in the 8 vendor carve-out files (llm.rs, config.rs, mine.rs, …) that hold the canonical vendor alias/default tables and are frozen for this campaign. |
http://localhost:11434 — all 9 sites are config.rs defaults/resolvers/template. |
| CARVEOUT-DOMINANT | 3 | ≥3 sites are carve-out-frozen; the residual sites are below the duplication threshold on their own and cannot reference a const the frozen owner does not export. | mini_lm_l6_v2 — 4 sites in the config.rs SSOT def; 1 residual match-arm pattern. |
| SEPARATE-CRATE | 11 | Sites live only in tools/* standalone QA/orchestration binaries — separate crates that cannot reference ai_memory:: consts. |
The T0-A1-CORE … T0-CONTRACT question ids in tools/t0-orchestrate. |
| HYBRID | 5 | Sites split between carve-out files and a tools binary; neither side can route to the other. | ANTHROPIC_API_KEY — the per-vendor key fallback table (frozen) + the t0 env table (separate crate). |
That census is the auditable answer to "why does any duplication remain?" — and because the baseline is a shrink-only ratchet, the floor can only ever go down from here.
RUST_LOG target filtering cannot match; converting them to real metadata targets means postgres SAL adapter events now emit under store::postgres / store::postgres::kg (an ai_memory=debug filter no longer matches them). #1560 — unifying 10 divergent anonymous-request-id synthesis sites onto one helper fixed 8 of them that stamped a full 36-char UUID against the documented uuid8 contract; anonymous principals in logs/audit rows now carry the 8-char suffix everywhere.
For an SME evaluating the substrate, the campaign is evidence on two axes:
Reserved principals, JSON-RPC wire constants, HTTP route paths, wire field names, SQL fragments, auth headers, tracing targets — each now has exactly one definition site, and several previously parallel lists (RESERVED_AGENT_IDS, both link allowlist arrays) are derived from their owning type instead of maintained alongside it. Drift between registration and gating, or between enum and allowlist, is now a compile-time impossibility rather than a review-time hope.
check-hardcoded-literals.sh runs in CI as a HARD-BLOCK beside fmt/clippy/test/audit and the companion vendor-literal gate. New duplication above the 28-entry floor fails the build; the baseline file can only shrink; --self-test proves the block actually fires. At campaign close the working tree held: cargo check (default + sal-postgres) clean, clippy -D warnings -D clippy::pedantic clean, cargo fmt --check clean, and both literal gates PASS.
The six batches were executed by AI coding agents under operator direction — the same AI-NHI workflow the substrate documents for itself. The honest engineering lesson the campaign encodes: agent pattern-matching reproduces whatever the surrounding code does, so quality directives must live in mechanical gates and committed artifacts (baseline, census, self-test), not in prompts. The operator sets the rule once; CI enforces it on every future session, human or NHI.
Related reading: Tracing atlas (the #1562 target fix in operational context) · Developer deep-dive · Engineering standards · Frozen claims.