Scenario 8 — Auto-tagging
Purpose
ai-memory's autonomous curator can auto-generate tags for memories that were written without them, using an on-node LLM (Ollama). This scenario validates that auto-generated tags round-trip through agent frameworks: specifically, that an agent can later recall by a tag it never explicitly wrote.
Opt-in. Requires the droplets to be sized s-4vcpu-16gb (Gemma
4 E2B needs the RAM). Default A2A campaigns skip this scenario
because the cost delta is significant.
Mechanics
When enabled (auto_tag=true workflow input):
- All four droplets are provisioned at s-4vcpu-16gb with Ollama and a pinned Gemma-4-E2B model.
- Agent A writes 20 memories in the scenario8-autotag namespace with no explicit tags. Content is varied enough that distinguishable tags can be generated.
- The ai-memory curator is invoked on node-4 with memory_auto_tag on the namespace.
- The curator generates tags per memory via the local LLM.
- Agent B recalls with a tag filter matching one of the generated tags.
- Assertion: Agent B's recall returns at least one memory whose generated tags include the filter tag.
Pass criterion
- Auto-tagger runs without errors.
- At least 50% of the 20 memories receive at least one generated tag (reasonable-coverage heuristic).
- Agent B's tag-filter recall returns a non-empty set.
- metadata.agent_id preservation: tags are added by the curator, but the original writer's agent_id remains unchanged.
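The four criteria can be read as one evaluation step. The following is a minimal sketch, not the scenario script itself: the `Memory` record, the `evaluate_scenario8` function, and its parameters are hypothetical names introduced for illustration, and the output keys simply mirror the report fields documented on this page. It also emits an empty `reasons` list on success, which may differ from what the real harness writes.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical record shape; the real ai-memory schema may differ."""
    memory_id: str
    agent_id: str
    tags: list[str] = field(default_factory=list)


def evaluate_scenario8(original, tagged, recalled, filter_tag,
                       curator_errors, min_coverage=0.5):
    """Apply the four pass criteria and build a report dict (illustrative)."""
    reasons = []

    # 1. The auto-tagger must run without errors.
    if curator_errors:
        reasons.append(f"curator reported {len(curator_errors)} error(s)")

    # 2. Coverage: share of memories with at least one generated tag.
    tagged_count = sum(1 for m in tagged if m.tags)
    coverage = tagged_count / len(tagged) if tagged else 0.0
    if coverage < min_coverage:
        reasons.append(f"tag coverage {coverage:.2f} below {min_coverage:.2f}")

    # 3. Tag-filter recall must return a memory that carries the filter tag.
    hits = [m for m in recalled if filter_tag in m.tags]
    if not hits:
        reasons.append(f"no recalled memory carries tag {filter_tag!r}")

    # 4. Authorship: agent_id must be unchanged for every memory.
    writers = {m.memory_id: m.agent_id for m in original}
    preserved = all(writers.get(m.memory_id) == m.agent_id for m in tagged)
    if not preserved:
        reasons.append("agent_id changed during tagging")

    return {
        "scenario": 8,
        "pass": not reasons,
        "capability_enabled": True,
        "memories_tagged": tagged_count,
        "tag_coverage_ratio": coverage,
        "recall_count_by_tag": len(hits),
        "authorship_preserved": preserved,
        "reasons": reasons,
    }
```

Collecting every failed criterion in `reasons`, rather than stopping at the first, keeps a red result diagnosable from the report alone.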
Report shape
{
"scenario": 8,
"pass": true,
"capability_enabled": true,
"memories_tagged": 18,
"tag_coverage_ratio": 0.9,
"recall_count_by_tag": 7,
"authorship_preserved": true,
"reasons": [""]
}
When scenario 8 is disabled (default): pass: null,
capability_enabled: false, reason = "auto_tag=false — scenario
skipped". Aggregator treats null as "not applicable" for
overall_pass computation.
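To illustrate the "not applicable" rule, here is a minimal sketch of how an aggregator could fold a skipped scenario into the overall result. The function below is an assumption about the aggregator's behavior based only on the description above, not its actual code, and the handling of the case where every scenario is skipped (treated here as passing) is a guess.

```python
import json


def overall_pass(reports):
    """Combine per-scenario results, ignoring skipped scenarios.

    A report whose "pass" is None (JSON null) is treated as not applicable
    and does not count for or against the overall result.
    Assumption: if every scenario is skipped, the result is True.
    """
    applicable = [r["pass"] for r in reports if r["pass"] is not None]
    return all(applicable)


# A skipped scenario 8 report; JSON null becomes Python None on load.
skipped = json.loads(
    '{"scenario": 8, "pass": null, "capability_enabled": false}'
)
```

With this rule, a campaign that runs with auto_tag=false is judged only on the scenarios it actually executed.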
What a green result proves
- The autonomous curator's tag-generation pipeline works end-to-end with local LLM inference on real DigitalOcean droplets.
- Agents can recall by tags they didn't write, enabling emergent discovery across agent boundaries.
What a red result would mean
- Ollama/Gemma integration regression — tags not generated.
- Tag storage path regression — tags generated but not returned on recall.
- Agent identity regression — tag generation overwriting the original writer's agent_id.
For three audiences
When enabled, your agents can discover each other's memories through automatically-generated tags, without anyone having to explicitly label what they wrote. It's the autoclassification layer that makes a shared memory actually navigable as it grows.
Differentiating feature: most memory stores require manual taxonomy management. ai-memory's optional LLM-backed auto-tagging removes that overhead. This scenario is the end-to-end evidence that the feature works on real infrastructure. The cost/benefit trade-off (3× droplet cost for LLM-capable sizing) is documented so operators can decide per campaign whether to exercise it.
Exercises memory_auto_tag, the Ollama integration, and the tag-filter read path in sequence. The coverage threshold (≥ 50%) is a heuristic because LLMs are non-deterministic; the test checks that the pipeline runs, not that every memory gets a tag. The retry/rerun strategy for handling transient LLM failures is documented in the scenario script.