Prerequisites
Install these before running the seed script. Quick check:
cypher-shell "RETURN 1" && python3 -c "import neo4j; print('✅ all good')"
| Dependency | Version | macOS | Ubuntu / Debian | Fedora |
|---|---|---|---|---|
| Neo4j | 2026.01+ | brew install neo4j | See below | See below |
| Python | 3.10+ | pre-installed or brew install python | sudo apt install python3 | sudo dnf install python3 |
| neo4j driver | 5.x | pip install neo4j | pip install neo4j | pip install neo4j |
| OpenClaw | latest | github.com/openclaw/openclaw | github.com/openclaw/openclaw | github.com/openclaw/openclaw |
macOS — step by step
# 1. Install Neo4j
brew install neo4j
brew services start neo4j

# 2. Install Python neo4j driver
pip install neo4j
Ubuntu / Debian — step by step
# 1. Add Neo4j repository and install
wget -O - https://debian.neo4j.com/neotechnology.gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/neo4j.gpg
echo "deb [signed-by=/usr/share/keyrings/neo4j.gpg] https://debian.neo4j.com stable latest" | sudo tee /etc/apt/sources.list.d/neo4j.list
sudo apt-get update && sudo apt-get install -y neo4j
sudo systemctl start neo4j

# 2. Install Python neo4j driver
pip install neo4j
Fedora — step by step
# 1. Add Neo4j repository and install
sudo rpm --import https://debian.neo4j.com/neotechnology.gpg.key
cat << EOF | sudo tee /etc/yum.repos.d/neo4j.repo
[neo4j]
name=Neo4j RPM Repository
# stable/latest tracks the calendar-versioned releases; stable/5 would pin the 5.x series
baseurl=https://yum.neo4j.com/stable/latest
gpgcheck=1
gpgkey=https://debian.neo4j.com/neotechnology.gpg.key
EOF
sudo dnf install -y neo4j
sudo systemctl start neo4j

# 2. Install Python neo4j driver
pip install neo4j
Installation
Quick install
# 1. Clone the repository
git clone https://github.com/alphaonedev/openclaw-graph.git
cd openclaw-graph

# 2. Run the seed script (creates all nodes, relationships, constraints)
python3 seed.py --dry-run   # preview first
python3 seed.py             # run it
# → 316 Skills, 27 SkillClusters, 217 RELATED_TO, 316 IN_CLUSTER (0.9s)

# 3. Deploy workspace stubs
cp workspace-stubs/*.md ~/.openclaw/workspace/

# 4. Verify
cypher-shell "MATCH (s:Skill) RETURN count(s)"
What Neo4j contains (v1.5)
| Label | Nodes | Description |
|---|---|---|
| Skill | 316 | Skills across 27 clusters |
| SkillCluster | 27 | Semantic skill groupings (first-class nodes) |
| Soul | 4 | Prime Directive, Identity, Safety, Heartbeat Protocol |
| OCMemory | 2 | Principal (placeholder), Infrastructure (placeholder) |
| AgentConfig | 9 | Every Session, Delegation, Safety, Heartbeats, Memory, TOON, Search Resilience, Schema Rules, Path Aliases |
| OCTool | 26 | All standard OpenClaw tools |
| OCAgent | 8 | Agent definitions |
| Bootstrap | 1 | Boot identity |
Relationships: 316 IN_CLUSTER (Skill→SkillCluster) + 217 RELATED_TO (Skill→Skill). Total footprint: ~10 MB. Single-node lookups and one-hop traversals complete in under a millisecond; the Performance Reference lists the multi-hop and full-scan figures.
Import Your Workspace
Already running OpenClaw with flat markdown files? The seed script loads them into Neo4j.
Quick start
# Run the seed script: imports all workspace data into Neo4j
python3 seed.py

# Preview what will be imported without writing anything
python3 seed.py --dry-run

# Deploy GRAPH stubs to workspace
cp workspace-stubs/*.md ~/.openclaw/workspace/
File → Node mapping
| File | Node type | Section → field | Notes |
|---|---|---|---|
| SOUL.md | Soul | ## heading → section | Ordered by priority (file order) |
| MEMORY.md | OCMemory | ## heading → domain | Timestamped on import date |
| USER.md | OCMemory | ## heading → User: <heading> | Queryable separately via STARTS WITH 'User:' |
| TOOLS.md | OCTool | ## heading → name | No workspace column: tools are global |
| AGENTS.md | AgentConfig | ## heading → key | Drives AGENTS.md stub query |
Format requirement: Each file is parsed by ## headings — each heading becomes one graph node. Files with no ## headings are imported as a single node titled Main. Files that already contain a GRAPH stub directive are automatically skipped.
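The parsing rule above can be sketched in a few lines of Python. This is an illustration rather than the code in seed.py: the function name split_sections is hypothetical, and dropping any text that precedes the first ## heading is an assumption the source does not confirm.

```python
import re

# Assumed stub marker, based on the <!-- GRAPH: ... --> directive format.
STUB_RE = re.compile(r"^\s*<!--\s*GRAPH:")
# Exactly two '#' followed by whitespace; '###' lines stay part of the body.
HEADING_RE = re.compile(r"^##\s+(.+?)\s*$")


def split_sections(text):
    """Return [(heading, body), ...], or None if the file is already a stub."""
    if STUB_RE.match(text):
        return None  # already converted, so the importer would skip it

    sections = []
    heading, body = None, []
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            if heading is not None:
                sections.append((heading, "\n".join(body).strip()))
            heading, body = match.group(1), []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        sections.append((heading, "\n".join(body).strip()))

    # No ## headings at all: the whole file becomes one node titled "Main".
    if not sections:
        return [("Main", text.strip())]
    return sections
```

Each returned pair corresponds to one graph node: the heading fills the field named in the mapping table and the body becomes its content.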
What --write-stubs does
# Before: flat file (~1,800 bytes)
~/.openclaw/workspace/SOUL.md

# After --write-stubs:
~/.openclaw/workspace/SOUL.md       # ← single-line GRAPH stub (144 bytes)
~/.openclaw/workspace/SOUL.md.bak   # ← original backed up automatically
OpenClaw resolves the stub at runtime from Neo4j — your content is preserved in the graph, not the file.
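The back-up-then-replace behaviour can be pictured with the sketch below. It is not the seed script's implementation: write_stub is a hypothetical helper, and refusing to overwrite an existing .bak file is an assumption made here for safety.

```python
import shutil
from pathlib import Path


def write_stub(path, cypher):
    """Back up `path` to `<path>.bak`, then replace it with a one-line GRAPH stub."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    if not backup.exists():  # assumption: never clobber an earlier backup
        shutil.copy2(path, backup)
    # Collapse whitespace so the Cypher stays on a single line.
    stub = f"<!-- GRAPH: {' '.join(cypher.split())} -->\n"
    path.write_text(stub, encoding="utf-8")
    return backup
```

Restoring the flat file is then a matter of copying SOUL.md.bak back over SOUL.md.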
Database Layout
Neo4j (bolt://localhost:7687)
├── Labels:
│   ├── Skill (316)          # Skills with full SKILL.md content
│   ├── SkillCluster (27)    # Semantic skill groupings
│   ├── Soul (4)             # Agent personality/behavior
│   ├── OCMemory (2)         # Namespaced memory nodes
│   ├── AgentConfig (9)      # Runtime config directives
│   ├── OCAgent (8)          # Namespaced agent nodes
│   ├── OCTool (26)          # Namespaced tool nodes
│   └── Bootstrap (1)        # Boot identity
├── Relationships:
│   ├── IN_CLUSTER (316)     # Skill → SkillCluster membership
│   └── RELATED_TO (217)     # Skill → Skill dependency graph
└── Scripts:
    └── seed.py              # Self-contained Neo4j seeder
Neo4j stores all graph data with proper graph modeling. Namespaced labels (OCAgent, OCMemory, OCTool) allow coexistence with other graph domains in a shared Neo4j instance. All nodes include a workspace property for multi-tenant isolation.
Query Reference
All queries use Cypher via cypher-shell or the Python neo4j driver:
Skill queries
# List all skills in a cluster
cypher-shell "MATCH (s:Skill)-[:IN_CLUSTER]->(c:SkillCluster {name:'devops-sre'}) RETURN s.name, s.description"

# Skill graph traversal: show a skill and its relationships
cypher-shell "MATCH (s:Skill {name:'cloudflare'})-[:RELATED_TO*1..2]-(t:Skill) RETURN DISTINCT t.name, t.cluster"

# Stats: total skills, clusters
cypher-shell "MATCH (s:Skill) RETURN count(s) AS skills"
cypher-shell "MATCH (c:SkillCluster) RETURN count(c) AS clusters"
Raw Cypher queries
# Via cypher-shell
cypher-shell "MATCH (s:Skill) RETURN count(s) AS total"

# Via Python
python3 -c "
from neo4j import GraphDatabase

# The context managers close the session and the driver when done
with GraphDatabase.driver('bolt://localhost:7687') as driver:
    with driver.session() as session:
        result = session.run('MATCH (s:Skill) RETURN count(s) AS total')
        print(result.single()['total'])
"
Workspace node queries
# List all Soul nodes
cypher-shell "MATCH (s:Soul) WHERE s.workspace = 'openclaw' RETURN s.section, s.priority ORDER BY s.priority"

# List all AgentConfig nodes
cypher-shell "MATCH (a:AgentConfig) WHERE a.workspace = 'openclaw' RETURN a.key ORDER BY a.id"

# List available tools
cypher-shell "MATCH (t:OCTool) WHERE t.available = true AND t.workspace = 'openclaw' RETURN t.name, t.notes ORDER BY t.name"
Output formats
The --workspace flag activates clean markdown output (no labels), which is what workspace.ts uses when resolving GRAPH directives. Without it, output is labeled for human reading.
| Data shape | Format |
|---|---|
| section + content | Soul markdown (## Section\n\nContent) |
| domain + content | Memory markdown (## Domain\n\nContent) |
| key + value | AgentConfig markdown (- **Key**: Value) |
| name + notes | Tool list (- **Name**: Notes) |
| Other | JSON (debug/dev use) |
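One way to read the table is as a dispatch on the column names a query returns. The sketch below illustrates that idea in Python; the real logic lives in the patched workspace.ts and may differ, and render_rows is a hypothetical name.

```python
import json


def render_rows(rows):
    """Choose an output format from the keys of the first row."""
    if not rows:
        return ""
    keys = set(rows[0])
    if keys == {"section", "content"}:  # Soul markdown
        return "\n\n".join(f"## {r['section']}\n\n{r['content']}" for r in rows)
    if keys == {"domain", "content"}:   # Memory markdown
        return "\n\n".join(f"## {r['domain']}\n\n{r['content']}" for r in rows)
    if keys == {"key", "value"}:        # AgentConfig markdown
        return "\n".join(f"- **{r['key']}**: {r['value']}" for r in rows)
    if keys == {"name", "notes"}:       # Tool list
        return "\n".join(f"- **{r['name']}**: {r['notes']}" for r in rows)
    return json.dumps(rows, indent=2)   # anything else: JSON for debugging
```

This is why the stub queries alias their columns (for example RETURN s.section AS section): the aliases are what select the markdown format.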
Workspace Setup
1. Apply the workspace.ts patch
The patch modifies OpenClaw's workspace.ts to resolve GRAPH directives:
cd /path/to/openclaw
git apply path/to/openclaw-graph/patches/workspace-cache-fix.patch
pnpm build
What the patch does:
- Intercepts readFileWithCache() to detect <!-- GRAPH: ... --> directives
- Executes Cypher queries via execFileAsync (non-blocking)
- Caches results with adaptive TTL: 60s × log₁₀(hit_count + 10)
- In-flight deduplication prevents thundering herd on cache expiry
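The caching behaviour in the list above can be sketched in Python with asyncio. The patch itself is TypeScript, so this is an analogy rather than its code: GraphCache and its fields are hypothetical names, and keeping the hit count across refreshes is an assumption.

```python
import asyncio
import math
import time


def adaptive_ttl(hit_count, base_seconds=60.0):
    """TTL = base × log10(hit_count + 10): 60 s at zero hits, 120 s at 90 hits."""
    return base_seconds * math.log10(hit_count + 10)


class GraphCache:
    def __init__(self, fetch):
        self._fetch = fetch    # async callable: cypher -> result
        self._entries = {}     # cypher -> {"value", "stored_at", "hits"}
        self._inflight = {}    # cypher -> task shared by concurrent callers

    async def get(self, cypher):
        entry = self._entries.get(cypher)
        if entry and time.monotonic() - entry["stored_at"] < adaptive_ttl(entry["hits"]):
            entry["hits"] += 1
            return entry["value"]
        # Expired or missing: join the refresh already running, or start one.
        task = self._inflight.get(cypher)
        if task is None:
            task = asyncio.create_task(self._refresh(cypher))
            self._inflight[cypher] = task
        return await task

    async def _refresh(self, cypher):
        try:
            value = await self._fetch(cypher)
            hits = self._entries.get(cypher, {}).get("hits", 0)
            self._entries[cypher] = {
                "value": value,
                "stored_at": time.monotonic(),
                "hits": hits,
            }
            return value
        finally:
            self._inflight.pop(cypher, None)
```

Because every caller that arrives during a refresh awaits the same task, a burst of requests on an expired entry triggers one Cypher query instead of one per request.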
2. Deploy workspace stubs
cp workspace-stubs/*.md ~/.openclaw/workspace/
Each stub is a single-line file like:
<!-- GRAPH: MATCH (s:Soul) WHERE s.workspace = 'openclaw' RETURN s.section AS section, s.content AS content ORDER BY s.priority ASC -->
3. Verify resolution
# Test that the directive resolves correctly
cypher-shell "MATCH (s:Soul) WHERE s.workspace = 'openclaw' RETURN s.section, s.content ORDER BY s.priority"
# Expected: 4 rows (Prime Directive, Identity, Safety, Heartbeat Protocol)
4. Customize your workspace
See the User Guide → Customizing for step-by-step personalization.
Cron Job Integration
openclaw-graph supports scheduled tasks (cron jobs) that query the graph database. Cron jobs typically run as lightweight OpenClaw sessions that:
- Load workspace context via GRAPH directives (same as interactive sessions)
- Execute domain-specific prompts (intel, monitoring, alerts)
- Write output to files or send notifications
Cron architecture
Cron Scheduler (crontab / launchd / systemd)
└── Every N minutes: run openclaw session
        │
        ▼
OpenClaw Session (cron mode)
├── Load AGENTS.md → GRAPH: AgentConfig
├── Load TOOLS.md → GRAPH: Tool
├── Execute prompt (task-specific)
└── Write output / send alerts
        │ --workspace --cypher
        ▼
Neo4j (bolt://localhost:7687)
├── AgentConfig → session behavior
├── Skill → task-relevant skills
└── SkillCluster → semantic grouping
Cron session bootstrap
Cron/sub-agent sessions use a minimal bootstrap. OpenClaw's MINIMAL_BOOTSTRAP_ALLOWLIST loads only:
- AGENTS.md (AgentConfig nodes — session behavior, delegation rules)
- TOOLS.md (available tools and usage notes)
SOUL.md and MEMORY.md are skipped for cron sessions to reduce prompt size.
AgentConfig nodes relevant to cron
| Node | Purpose |
|---|---|
| agentcfg-*-every-session | Bootstrap instructions (read SOUL, USER, memory) |
| agentcfg-*-delegation | Sub-agent spawning rules |
| agentcfg-*-search-resilience | Retry/fallback when web_search fails |
| agentcfg-*-schema-rules | Output validation requirements |
| agentcfg-*-path-aliases | Compact path notation ($PY, $DB, etc.) |
| agentcfg-*-toon | JSON compression for large payloads |
Example: cron job definition
{
"name": "daily-report",
"schedule": "0 8 * * *",
"workspace": "openclaw",
"prompt": "Generate today's daily report. Query relevant skills from the 'financial' cluster.",
"output": "~/.openclaw/workspace/reports/daily-$(date +%Y-%m-%d).md"
}
Querying skills from cron prompts
# Find relevant skills by cluster
cypher-shell "MATCH (s:Skill)-[:IN_CLUSTER]->(c:SkillCluster {name:'devops-sre'}) RETURN s.name, s.description"

# Get specific skill and its relationships
cypher-shell "MATCH (s:Skill {name:'kubernetes-ops'})-[:RELATED_TO*1..2]-(t:Skill) RETURN DISTINCT t.name, t.cluster"
Fleet Management
For multi-instance deployments, each OpenClaw instance gets its own workspace ID. All instances share the same Neo4j instance — workspace-scoped queries ensure isolation.
Seed separate workspaces
# Run seed script with different workspace names
python3 seed.py --workspace intel-agent
python3 seed.py --workspace code-agent
python3 seed.py --workspace ops-agent
Point stubs to the correct workspace
# Instance A stubs (~/.openclaw/workspace-intel/)
# SOUL.md:
# <!-- GRAPH: MATCH (s:Soul) WHERE s.workspace = 'intel-agent' RETURN s.section AS section, s.content AS content ORDER BY s.priority ASC -->
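With several instances, writing each stub by hand invites typos in the workspace ID. A small generator such as the sketch below could help; soul_stub is a hypothetical helper, and the identifier check is an added precaution because the workspace ID is interpolated into the Cypher text rather than passed as a parameter.

```python
import re

# Query text taken from the SOUL.md stub example; only the workspace ID varies.
SOUL_QUERY = (
    "MATCH (s:Soul) WHERE s.workspace = '{ws}' "
    "RETURN s.section AS section, s.content AS content ORDER BY s.priority ASC"
)


def soul_stub(workspace):
    """Build the single-line SOUL.md stub for one workspace ID."""
    # Reject anything that is not a plain identifier, since the value is
    # spliced directly into the query string.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", workspace):
        raise ValueError(f"invalid workspace id: {workspace!r}")
    return f"<!-- GRAPH: {SOUL_QUERY.format(ws=workspace)} -->"
```

Calling soul_stub("intel-agent") yields the stub line shown above for Instance A.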
Fleet topology example
| Port | Instance | Workspace ID |
|---|---|---|
| 18790 | A — Intel/OSINT | intel-agent |
| 18791 | B — Compute | compute-agent |
| 18792 | C — Browser | browser-agent |
| 18793 | D — Code/GitHub | code-agent |
| 18794 | E — Infra | infra-agent |
| 18795 | F — Memory/Graph | graph-agent |
| 18796 | G — General | general-agent |
| 18797 | H — Standby | standby-agent |
All instances connect to the same Neo4j instance. Workspace-scoped Cypher queries ensure complete isolation between agents.
Per-workspace tool overrides
# Disable delegation for the browser-only instance
cypher-shell "MATCH (t:OCTool {id:'tool-sessions-spawn', workspace:'browser-agent'}) SET t.available = false"
Service Management
Architecture note: openclaw-graph runs on Neo4j, which is a separate service. "Restarting the graph" means restarting Neo4j. The OpenClaw gateway connects to Neo4j via bolt://localhost:7687.
macOS (launchd)
OpenClaw installs a launchd service with label ai.openclaw.gateway.
# Start
launchctl bootstrap gui/$UID \
  ~/Library/LaunchAgents/ai.openclaw.gateway.plist

# Stop
launchctl bootout gui/$UID/ai.openclaw.gateway

# Restart (stop + start in one command)
launchctl kickstart -k gui/$UID/ai.openclaw.gateway

# Status: look for "state = running" and "pid ="
launchctl print gui/$UID/ai.openclaw.gateway | grep -E "state|pid"
macOS log files
# Structured log (stdout): primary log
tail -f ~/.openclaw/logs/gateway.log

# Error log (stderr)
tail -f ~/.openclaw/logs/gateway.err.log

# macOS unified log stream (real-time, subsystem ai.openclaw)
log stream --predicate 'subsystem == "ai.openclaw"' --level info

# Or use the bundled clawlog utility from the openclaw directory
bash scripts/clawlog.sh
Ubuntu (systemd)
OpenClaw installs a user-scope systemd unit: openclaw-gateway.service.
# Start
systemctl --user start openclaw-gateway.service

# Stop
systemctl --user stop openclaw-gateway.service

# Restart
systemctl --user restart openclaw-gateway.service

# Status
systemctl --user status openclaw-gateway.service
Ubuntu log files
# Live log stream via journald
journalctl --user -u openclaw-gateway.service -f

# Or read the flat log files directly
tail -f ~/.openclaw/logs/gateway.log
tail -f ~/.openclaw/logs/gateway.err.log
Fedora (systemd)
Same commands as Ubuntu — both use user-scope systemd.
# Start
systemctl --user start openclaw-gateway.service

# Stop
systemctl --user stop openclaw-gateway.service

# Restart
systemctl --user restart openclaw-gateway.service

# Status
systemctl --user status openclaw-gateway.service
Fedora log files
# Live log stream via journald
journalctl --user -u openclaw-gateway.service -f

# Or read the flat log files directly
tail -f ~/.openclaw/logs/gateway.log
tail -f ~/.openclaw/logs/gateway.err.log
DB health check (all platforms)
These run directly against Neo4j — no gateway required.
# Quick connectivity check
cypher-shell "RETURN 1"

# Node count by label
cypher-shell "MATCH (n) WHERE n.workspace = 'openclaw' RETURN labels(n)[0] AS type, count(n) AS n ORDER BY type"

# Full stats: skills, clusters, relationships
cypher-shell "MATCH (s:Skill) RETURN count(s) AS skills"
cypher-shell "MATCH (c:SkillCluster) RETURN count(c) AS clusters"
cypher-shell "MATCH ()-[r:RELATED_TO]->() RETURN count(r) AS relationships"
Log file locations (all platforms)
| File | Contains |
|---|---|
| ~/.openclaw/logs/gateway.log | Gateway stdout: normal operational output |
| ~/.openclaw/logs/gateway.err.log | Gateway stderr: errors and warnings |
Query Metrics
cypher-shell queries can be profiled using Neo4j's built-in PROFILE prefix to inspect execution plans and timing. QueryMetrics nodes are populated automatically every time a GRAPH directive resolves from a workspace stub. No configuration needed — it just works after install.
New install? QueryMetrics starts empty and fills itself automatically. The first entry appears after OpenClaw loads any workspace stub for the first time.
View query performance
# Neo4j query profile: shows execution plan and timing
cypher-shell "PROFILE MATCH (s:Skill) WHERE s.workspace = 'openclaw' RETURN count(s)"

# Check Neo4j metrics via browser
# Open http://localhost:7474 → :sysinfo
What the dashboard shows
| Column | Meaning |
|---|---|
| hits | Times this Cypher query returned ≥1 row |
| miss | Times this Cypher query returned 0 rows |
| rate | Hit rate percentage (hits / total) |
| avg | Rolling mean execution time (hits only, Welford algorithm) |
| p95 | High-water mark: rises instantly on slow queries, decays slowly |
| last_hit | UTC timestamp of most recent successful hit |
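The columns above can all be maintained incrementally, without storing past samples. The sketch below shows one way to do that in Python; QueryStats is a hypothetical name and the decay factor of 0.99 is an assumption, since the source only says the p95 value "decays slowly".

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    hits: int = 0
    miss: int = 0
    avg_ms: float = 0.0
    p95_ms: float = 0.0

    def record(self, elapsed_ms, row_count, decay=0.99):
        """Update the counters after one query (decay value is assumed)."""
        if row_count == 0:
            self.miss += 1  # misses do not affect the timing columns
            return
        self.hits += 1
        # Welford-style running mean: shift the mean toward the new sample.
        self.avg_ms += (elapsed_ms - self.avg_ms) / self.hits
        # High-water mark: jump up at once, otherwise shrink a little.
        self.p95_ms = max(elapsed_ms, self.p95_ms * decay)

    @property
    def rate(self):
        """Hit rate as a percentage of all recorded queries."""
        total = self.hits + self.miss
        return 100.0 * self.hits / total if total else 0.0
```

Note that a high-water mark of this kind approximates an upper percentile rather than computing a true 95th percentile.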
Upgrading
When you upgrade to a new release, you re-run the seed script against the existing DB. Workspace nodes live in the DB, so back up any customizations first (see Backup & Restore).
Upgrade workflow
# 1. Pull latest seed script
cd openclaw-graph && git pull

# 2. Re-run seed script (idempotent, uses MERGE)
python3 seed.py

# 3. Verify
cypher-shell "MATCH (s:Skill) WHERE s.workspace = 'openclaw' RETURN count(s)"
Recommended: The seed script uses MERGE (not CREATE) — it is safe to re-run. Existing nodes are updated, new nodes are created.
Backup & Restore
Full DB backup
# Neo4j dump (requires neo4j-admin)
neo4j-admin database dump neo4j --to-path=/tmp/neo4j-backup/
Export workspace nodes to JSON
# Export via cypher-shell
cypher-shell "MATCH (s:Skill) WHERE s.workspace = 'openclaw' RETURN s" > skills-backup.json
cypher-shell "MATCH (s:Soul) WHERE s.workspace = 'openclaw' RETURN s" > soul-backup.json
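cypher-shell prints its own text formats (plain or verbose tables), so the redirected files above are not JSON documents that a parser can load. If machine-readable JSON is needed, the Python driver can produce it. The following is a minimal sketch: export_label and to_json are hypothetical helpers, and the label allowlist is copied from the Database Layout section.

```python
import json

ALLOWED_LABELS = {
    "Skill", "SkillCluster", "Soul", "OCMemory",
    "AgentConfig", "OCTool", "OCAgent", "Bootstrap",
}


def to_json(rows):
    """Serialize property dicts; default=str covers temporal values."""
    return json.dumps(rows, indent=2, sort_keys=True, default=str)


def export_label(label, workspace, path, uri="bolt://localhost:7687", auth=None):
    """Write all nodes of one label in one workspace to `path` as JSON."""
    # Labels cannot be passed as query parameters, so validate before formatting.
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {label}")
    from neo4j import GraphDatabase  # lazy import: to_json() works without the driver

    query = (
        f"MATCH (n:{label}) WHERE n.workspace = $workspace "
        "RETURN properties(n) AS props"
    )
    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            rows = [record["props"] for record in session.run(query, workspace=workspace)]
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(to_json(rows))
    return len(rows)
```

For example, export_label("Soul", "openclaw", "soul-backup.json") would write the four Soul nodes as a JSON array of property maps.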
Environment Variables
| Variable | Default | Description |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | Neo4j connection URI |
| NEO4J_USER | neo4j | Neo4j username (if auth enabled) |
| NEO4J_PASSWORD | | Neo4j password (if auth enabled) |
| PYTHON | python3 | Path to Python binary |
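A script that talks to Neo4j can pick up the same variables with the defaults from the table. The sketch below assumes that an empty NEO4J_PASSWORD means authentication is disabled; load_neo4j_config is a hypothetical helper, not part of seed.py.

```python
import os


def load_neo4j_config(env=None):
    """Read connection settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", "bolt://localhost:7687")
    user = env.get("NEO4J_USER", "neo4j")
    password = env.get("NEO4J_PASSWORD", "")
    # Assumption: no password set means the server runs with auth disabled.
    auth = (user, password) if password else None
    return {"uri": uri, "auth": auth}
```

The result can be passed straight to the driver, for example GraphDatabase.driver(cfg["uri"], auth=cfg["auth"]).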
Troubleshooting
GRAPH directive returns empty
Symptom: Workspace file loads but content is empty.
Check:
# 1. Is Neo4j running?
cypher-shell "RETURN 1"

# 2. Does the workspace have nodes?
cypher-shell "MATCH (a:AgentConfig) WHERE a.workspace = 'openclaw' RETURN count(a) AS n"

# 3. Is the directive syntax correct?
# Must be: <!-- GRAPH: <cypher> -->
# Must be the FIRST LINE of the file
# Cypher must be a single line
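The three syntax rules in step 3 are easy to check mechanically. The sketch below is a hypothetical linter, not a tool shipped with openclaw-graph, and its regular expression is an approximation of what the patched workspace.ts accepts.

```python
import re

# Approximate directive shape: <!-- GRAPH: <cypher> --> on a single line.
DIRECTIVE_RE = re.compile(r"^<!--\s*GRAPH:\s*(.+?)\s*-->\s*$")


def check_stub(text):
    """Return a list of problems in a workspace stub (empty list means OK)."""
    lines = text.splitlines()
    if not lines:
        return ["file is empty"]
    first = lines[0]
    if DIRECTIVE_RE.match(first):
        return []
    if "GRAPH:" not in first:
        if any("GRAPH:" in line for line in lines[1:]):
            return ["directive is not on the first line"]
        return ["no GRAPH directive found"]
    if "-->" not in first:
        return ["directive spans multiple lines; keep the Cypher on one line"]
    return ["first line does not match <!-- GRAPH: <cypher> -->"]
```

Running it over each file in ~/.openclaw/workspace/ narrows an empty-content symptom down to either a malformed stub or a data problem in Neo4j.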
"Label X does not exist"
Cause: The seed script hasn't been run, or nodes weren't created for this label.
Fix:
python3 seed.py
Stale workspace content
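Cause: Graph query cache hasn't expired. The TTL starts at 60s and grows with hit count (60s × log₁₀(hit_count + 10)), so frequently loaded stubs stay cached longer.
Fix: Wait for the TTL to elapse (60 seconds for a rarely hit query), or restart the OpenClaw process. The cache is in-memory only, so a restart clears it.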
Neo4j connection refused
Cause: Neo4j is not running or not listening on bolt://localhost:7687.
Fix:
# macOS
brew services start neo4j

# Ubuntu / Fedora
sudo systemctl start neo4j

# Verify
cypher-shell "RETURN 1"
Performance Reference
| Query | Neo4j (bolt) |
|---|---|
| Skill PK lookup | <1ms |
| AgentConfig (9 nodes) | <1ms |
| Tool query (26 nodes) | <1ms |
| Cluster traversal (IN_CLUSTER) | <1ms |
| Multi-hop reasoning (2-hop RELATED_TO) | ~3ms |
| Full skill scan (316) | ~2ms |
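Lookups and single-hop traversals complete in under a millisecond; the 2-hop traversal and the full skill scan take a few milliseconds. Because Neo4j uses index-free adjacency, traversal cost scales with the part of the graph a query touches rather than with total graph size. All 5 workspace GRAPH directives resolve in parallel.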
Full benchmark data: benchmarks/results.md