Skip to main content

Architecture

How OpenClaw’s Gateway, Agent Runtime, and Memory system work under the hood.

System Overview

OpenClaw is a three-layer system: Clients at the top, the Gateway daemon in the middle, and the Agent Runtime at the bottom. Everything communicates over WebSocket with JSON frames.

┌─────────────────────────────────────────────────────┐
│  Clients                                             │
│  macOS App · CLI · Web Admin · Automations           │
├──────────────────────┬──────────────────────────────┤
│                      │ WebSocket (JSON frames)       │
│  ┌───────────────────▼───────────────────┐  │
│  │            Gateway (daemon)                    │  │
│  │  Provider connections · WS API · Validation    │  │
│  │  Events: agent, chat, presence, health, cron   │  │
│  └───────┬────────────────────┬──────────────────┘  │
│          │                    │                       │
│  ┌───────▼──────┐    ┌───────▼──────────────────┐   │
│  │   Channels    │    │   Nodes                   │   │
│  │  WhatsApp     │    │  macOS · iOS · Android    │   │
│  │  Telegram     │    │  canvas · camera · screen  │   │
│  │  Discord      │    │  location · voice          │   │
│  │  Slack · ...  │    │                            │   │
│  └──────────────┘    └────────────────────────────┘   │
├─────────────────────────────────────────────────────┤
│  Agent Runtime (derived from pi-mono)                │
│  ┌──────────┐ ┌──────────┐ ┌────────┐ ┌──────────┐ │
│  │  Tools   │ │  Skills  │ │ Memory │ │  Models  │ │
│  └──────────┘ └──────────┘ └────────┘ └──────────┘ │
└─────────────────────────────────────────────────────┘

Gateway (Daemon)

The Gateway is the heart of OpenClaw. A single long-lived process that:

  • Owns all messaging surfaces — WhatsApp via Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat
  • Exposes a typed WebSocket API — requests, responses, and server-push events
  • Validates inbound frames against JSON Schema
  • Emits events: agent, chat, presence, health, heartbeat, cron
  • Binds to localhost:18789 by default
  • One Gateway per host — it’s the only place that opens a WhatsApp session

Wire Protocol

Transport: WebSocket, text frames with JSON payloads.

  • First frame MUST be connect
  • Requests: {type:"req", id, method, params}{type:"res", id, ok, payload|error}
  • Events: {type:"event", event, payload, seq?, stateVersion?}
  • Token auth via OPENCLAW_GATEWAY_TOKEN
  • Idempotency keys required for side-effecting methods (send, agent)
// Example: send a message through the Gateway
{
  "type": "req",
  "id": "msg-001",
  "method": "send",
  "params": {
    "channel": "whatsapp",
    "to": "+1234567890",
    "body": "Hello from OpenClaw!"
  }
}

// Response
{
  "type": "res",
  "id": "msg-001",
  "ok": true,
  "payload": { "messageId": "wa-abc123" }
}

Agent Runtime

OpenClaw runs a single embedded agent runtime derived from pi-mono. Key characteristics:

  • Session management, discovery, and tool wiring are OpenClaw-owned (not pi-mono)
  • Uses a single workspace directory as the agent’s working directory
  • Bootstrap files (AGENTS.md, SOUL.md, etc.) injected on first turn of each session
  • Skills loaded from three locations with precedence rules

Agent Loop

The core perception-action cycle:

  1. Receive — Message arrives via channel
  2. Route — Gateway determines target agent
  3. Context — Session loaded, bootstrap files injected
  4. Think — LLM processes context + available tools
  5. Act — Tool calls executed (read, exec, browser, etc.)
  6. Observe — Results fed back to LLM
  7. Respond — Reply sent back through channel
  8. Memory — Session updated, compaction if needed

Steering while streaming: inbound messages can be injected into a running agent turn. Queue modes include steer, followup, or collect.

Sessions & Memory

Sessions

Transcripts stored as JSONL at:

~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl

Session ID is stable and chosen by OpenClaw. Each line in the JSONL file represents one message or tool result in the conversation history.

Memory

Persistent memory via AGENTS.md — the agent can read and write its own operating instructions. This is the primary long-term memory mechanism. Think of it as the agent’s editable “brain file.”

Compaction

When sessions grow beyond the model’s context window, auto-compaction kicks in:

  1. Silent memory flush — durable notes saved to disk
  2. Older conversation summarized into a compact entry
  3. Recent messages kept intact
  4. Summary persists in session JSONL

Session Pruning

Separate from compaction. Trims old tool results from in-memory context before each LLM call without rewriting on-disk history. Useful with Anthropic API cache-TTL mode.

Nodes

Remote devices (macOS/iOS/Android/headless) connect via WebSocket with role: node. Nodes expose device-specific commands:

  • canvas.* — render agent-editable HTML
  • camera.* — capture photos/video
  • screen.record — screen recording
  • location.get — device location

Pairing is device-based with an approval store. Local connections can be auto-approved.

Model Configuration

Model refs use provider/model format:

anthropic/claude-sonnet-4
openai/gpt-4o
openrouter/moonshotai/kimi-k2

Supports OAuth subscriptions and API keys with failover rotation. If one provider returns an error or is rate-limited, the runtime transparently falls back to the next configured provider.

Security Model

  • Gateway binds to localhost by default
  • Token auth (256-bit minimum) or password auth
  • File permissions: chmod 600 for config, chmod 700 for directories
  • Remote access via Tailscale/SSH tunnel
  • Tool sandboxing via Docker containers
  • Skills treated as untrusted code by default

Design Principles

Local-First

Your data stays on your hardware. No cloud dependency for core functionality.

Data Sovereignty

Three-layer architecture prioritizes keeping all context and memory local.

Single Gateway

One daemon controls all channels. Simplicity over microservices.

Conversation-First

Configure through natural language, not YAML files.

Production Agent Design Patterns

OpenClaw’s architecture embodies production agent design: Gateway pattern, session management, memory compaction, tool sandboxing. These are the exact patterns covered in AgentWay’s Architecture learning path.

Explore Agent Architecture Concepts