Architecture
How OpenClaw’s Gateway, Agent Runtime, and Memory system work under the hood.
System Overview
OpenClaw is a three-layer system: Clients at the top, the Gateway daemon in the middle, and the Agent Runtime at the bottom. Everything communicates over WebSocket with JSON frames.
┌─────────────────────────────────────────────────────┐
│ Clients │
│ macOS App · CLI · Web Admin · Automations │
├──────────────────────┬──────────────────────────────┤
│ │ WebSocket (JSON frames) │
│ ┌───────────────────▼───────────────────┐ │
│ │ Gateway (daemon) │ │
│ │ Provider connections · WS API · Validation │ │
│ │ Events: agent, chat, presence, health, cron │ │
│ └───────┬────────────────────┬──────────────────┘ │
│ │ │ │
│ ┌───────▼──────┐ ┌───────▼──────────────────┐ │
│ │ Channels │ │ Nodes │ │
│ │ WhatsApp │ │ macOS · iOS · Android │ │
│ │ Telegram │ │ canvas · camera · screen │ │
│ │ Discord │ │ location · voice │ │
│ │ Slack · ... │ │ │ │
│ └──────────────┘ └────────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Agent Runtime (derived from pi-mono) │
│ ┌──────────┐ ┌──────────┐ ┌────────┐ ┌──────────┐ │
│ │ Tools │ │ Skills │ │ Memory │ │ Models │ │
│ └──────────┘ └──────────┘ └────────┘ └──────────┘ │
└─────────────────────────────────────────────────────┘
Gateway (Daemon)
The Gateway is the heart of OpenClaw. A single long-lived process that:
- Owns all messaging surfaces — WhatsApp via Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat
- Exposes a typed WebSocket API — requests, responses, and server-push events
- Validates inbound frames against JSON Schema
- Emits events:
agent,chat,presence,health,heartbeat,cron - Binds to
localhost:18789by default - One Gateway per host — it’s the only place that opens a WhatsApp session
Wire Protocol
Transport: WebSocket, text frames with JSON payloads.
- First frame MUST be
connect - Requests:
{type:"req", id, method, params}→{type:"res", id, ok, payload|error} - Events:
{type:"event", event, payload, seq?, stateVersion?} - Token auth via
OPENCLAW_GATEWAY_TOKEN - Idempotency keys required for side-effecting methods (
send,agent)
// Example: send a message through the Gateway
{
"type": "req",
"id": "msg-001",
"method": "send",
"params": {
"channel": "whatsapp",
"to": "+1234567890",
"body": "Hello from OpenClaw!"
}
}
// Response
{
"type": "res",
"id": "msg-001",
"ok": true,
"payload": { "messageId": "wa-abc123" }
}
Agent Runtime
OpenClaw runs a single embedded agent runtime derived from pi-mono. Key characteristics:
- Session management, discovery, and tool wiring are OpenClaw-owned (not pi-mono)
- Uses a single workspace directory as the agent’s working directory
- Bootstrap files (
AGENTS.md,SOUL.md, etc.) injected on first turn of each session - Skills loaded from three locations with precedence rules
Agent Loop
The core perception-action cycle:
- Receive — Message arrives via channel
- Route — Gateway determines target agent
- Context — Session loaded, bootstrap files injected
- Think — LLM processes context + available tools
- Act — Tool calls executed (read, exec, browser, etc.)
- Observe — Results fed back to LLM
- Respond — Reply sent back through channel
- Memory — Session updated, compaction if needed
Steering while streaming: inbound messages can be injected into a running agent turn. Queue modes include steer, followup, or collect.
Sessions & Memory
Sessions
Transcripts stored as JSONL at:
~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl
Session ID is stable and chosen by OpenClaw. Each line in the JSONL file represents one message or tool result in the conversation history.
Memory
Persistent memory via AGENTS.md — the agent can read and write its own operating instructions. This is the primary long-term memory mechanism. Think of it as the agent’s editable “brain file.”
Compaction
When sessions grow beyond the model’s context window, auto-compaction kicks in:
- Silent memory flush — durable notes saved to disk
- Older conversation summarized into a compact entry
- Recent messages kept intact
- Summary persists in session JSONL
Session Pruning
Separate from compaction. Trims old tool results from in-memory context before each LLM call without rewriting on-disk history. Useful with Anthropic API cache-TTL mode.
Nodes
Remote devices (macOS/iOS/Android/headless) connect via WebSocket with role: node. Nodes expose device-specific commands:
canvas.*— render agent-editable HTMLcamera.*— capture photos/videoscreen.record— screen recordinglocation.get— device location
Pairing is device-based with an approval store. Local connections can be auto-approved.
Model Configuration
Model refs use provider/model format:
anthropic/claude-sonnet-4
openai/gpt-4o
openrouter/moonshotai/kimi-k2
Supports OAuth subscriptions and API keys with failover rotation. If one provider returns an error or is rate-limited, the runtime transparently falls back to the next configured provider.
Security Model
- Gateway binds to localhost by default
- Token auth (256-bit minimum) or password auth
- File permissions:
chmod 600for config,chmod 700for directories - Remote access via Tailscale/SSH tunnel
- Tool sandboxing via Docker containers
- Skills treated as untrusted code by default
Design Principles
Local-First
Your data stays on your hardware. No cloud dependency for core functionality.
Data Sovereignty
Three-layer architecture prioritizes keeping all context and memory local.
Single Gateway
One daemon controls all channels. Simplicity over microservices.
Conversation-First
Configure through natural language, not YAML files.
Production Agent Design Patterns
OpenClaw’s architecture embodies production agent design: Gateway pattern, session management, memory compaction, tool sandboxing. These are the exact patterns covered in AgentWay’s Architecture learning path.
Explore Agent Architecture Concepts