Curriculum

3 Tracks, 12 Paths. Frequent milestones, branching choices, daily quests.

3 Tracks · 12 Paths · ~40h Total · 6,000 Bonus XP · ~5d Avg Completion

~40h includes lessons, exercises, flashcard review, and reflection time

Track 1 · Awakening

3 paths · ~8.5 hours · Linear, required

L1 Explorer → L2 Apprentice

You discover the power of agents. Master the core primitives: what an agent is, how tools work, and how to control LLM output. Everything else builds on this.

P1

Agent Basics

~2h · 3 lessons + 1 exercise

+500 XP

SDK: query(), for await..of, message.type, session_id

What is an Agent

LLM ≠ Agent; Loop, State, Tools, Termination. query() as the simplest agent.

+50

The Minimal Agent Loop

Input → Reason → Act → Output. for await streaming maps message types to loop stages.

+50
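The Input → Reason → Act → Output loop above can be sketched without any SDK. Everything in this block is a hypothetical stand-in: the `Message` shape, `stubModel`, and the `add` tool are illustration names, not the SDK's actual types. The sketch collects messages in an array for simplicity, whereas the lesson describes consuming a stream with `for await`.

```typescript
// Hypothetical message shapes for illustration; not the SDK's actual types.
type Message =
  | { type: "tool_call"; tool: string; args: number[] }
  | { type: "tool_result"; value: number }
  | { type: "result"; text: string };

// What the "Reason" step can decide: call a tool or finish.
type Decision =
  | { kind: "act"; tool: string; args: number[] }
  | { kind: "finish"; text: string };

// Stub model: asks for one addition, then finishes using the observation.
function stubModel(input: string, observations: number[]): Decision {
  if (observations.length === 0) {
    return { kind: "act", tool: "add", args: [2, 3] };
  }
  return { kind: "finish", text: `${input} = ${observations[0]}` };
}

// Tool registry used by the "Act" step.
const tools: Record<string, (args: number[]) => number> = {
  add: (args) => args.reduce((sum, n) => sum + n, 0),
};

// Loop = state (observations) + tools + termination (finish or maxTurns).
function runAgent(input: string, maxTurns = 5): Message[] {
  const messages: Message[] = [];
  const observations: number[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const decision = stubModel(input, observations); // Reason
    if (decision.kind === "finish") {
      messages.push({ type: "result", text: decision.text }); // Output
      return messages;
    }
    const tool = tools[decision.tool];
    if (tool === undefined) {
      messages.push({ type: "result", text: `unknown tool: ${decision.tool}` });
      return messages;
    }
    messages.push({ type: "tool_call", tool: decision.tool, args: decision.args });
    const value = tool(decision.args); // Act
    observations.push(value); // State carried into the next turn
    messages.push({ type: "tool_result", value });
  }
  messages.push({ type: "result", text: "stopped: max turns reached" });
  return messages;
}

const messages = runAgent("2 + 3");
const messageTypes = messages.map((m) => m.type);
```

Each message type corresponds to one loop stage, which is the mapping the lesson asks learners to observe. A one-shot call would produce only the final `result`.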

Failure Taxonomy

Hallucination, tool misuse, unverifiable output. Handling error messages.

+50

Turn a chatbot into a stateful loop

Capture session_id, resume conversation, observe state persistence.

+100

Gate: run query(), capture session_id, explain loop vs one-shot call

P2

Tool Use

~3h · 3 lessons + 3 exercises

+500 XP

SDK: allowedTools, disallowedTools, tools config, canUseTool

Function Calling Mental Model

Schema → Call → Result → Continue. Observe tool_call and tool_result in stream.

+50

Tool Design Principles

Atomicity, clear descriptions, helpful errors

+50

Tool Orchestration

When to call a tool, when not to, and how to feed results back into context

+50

Implement Calculator Tool

allowedTools: ["Bash"]; the agent uses node -e for computation

+100

File & Search Tool Mocks

Parameter validation + error handling

+100

Build a canUseTool Gate

Permission callback: log, filter, control via canUseTool

+100
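A permission callback of the kind this exercise describes might look like the sketch below. The `ToolRequest` and `PermissionDecision` types, the `gate` function, and the pattern list are all assumptions made for illustration; the SDK's real `canUseTool` signature and return shape should be taken from its documentation.

```typescript
// Hypothetical types for illustration; not the SDK's real canUseTool signature.
type PermissionDecision =
  | { behavior: "allow" }
  | { behavior: "deny"; message: string };

interface ToolRequest {
  toolName: string;
  input: Record<string, unknown>;
}

// Every decision is appended here so the gate doubles as an audit trail.
const auditLog: string[] = [];

// Shell patterns treated as dangerous in this sketch; a real list would be longer.
const dangerousPatterns: RegExp[] = [/\brm\s+-rf\b/, /\bsudo\b/, /\bmkfs\b/];

function gate(request: ToolRequest): PermissionDecision {
  const raw = request.input["command"];
  const command = typeof raw === "string" ? raw : "";

  let decision: PermissionDecision = { behavior: "allow" };
  if (request.toolName === "Bash") {
    const hit = dangerousPatterns.find((pattern) => pattern.test(command));
    if (hit !== undefined) {
      decision = { behavior: "deny", message: `blocked by pattern ${hit.source}` };
    }
  }

  auditLog.push(`${decision.behavior} ${request.toolName}: ${command}`);
  return decision;
}

const blocked = gate({ toolName: "Bash", input: { command: "rm -rf /tmp/project" } });
const allowed = gate({ toolName: "Bash", input: { command: "node -e 'console.log(1 + 1)'" } });
```

Logging before returning keeps denied calls visible, which is what the path gate checks for.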

Gate: tool calls logged, canUseTool blocks dangerous ops, multi-tool sequences observed

P3

Prompts & Structured Output

~3h · 5 lessons + 2 exercises

+500 XP

SDK: systemPrompt, outputFormat, Zod schema, structured_output, maxThinkingTokens

System Prompt Boundaries

Behavioral constraints vs task instructions

+50

Structured Output with Schemas

Pydantic / Zod → JSON Schema → type-safe output. outputFormat + structured_output

+50

Reducing Hallucination

Retrieval, constraints, verification

+50

Prompt ↔ Tool Interaction

"Only output executable content" strategy

+50

Extended Thinking & Reasoning Control

Adaptive thinking, effort parameter. maxThinkingTokens cost/quality tradeoff.

+50

Schema-constrained LLM Output

Define a model, then validate that 99% of outputs are parseable

+100
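Measuring how many outputs parse can be sketched with a hand-written validator. The `Ticket` shape and the sample outputs below are made up for illustration; in practice a Zod schema would likely replace `parseTicket`, but this sketch avoids external dependencies.

```typescript
// Hypothetical target shape for illustration.
interface Ticket {
  title: string;
  priority: "low" | "high";
}

// Returns a typed Ticket, or null if the raw text is not valid JSON
// or does not match the expected shape.
function parseTicket(raw: string): Ticket | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  const title = record["title"];
  const priority = record["priority"];
  if (typeof title !== "string") return null;
  if (priority !== "low" && priority !== "high") return null;
  return { title, priority };
}

// Fraction of outputs that parse and validate.
function parseRate(outputs: string[]): number {
  if (outputs.length === 0) return 0;
  const valid = outputs.filter((output) => parseTicket(output) !== null).length;
  return valid / outputs.length;
}

// Made-up model outputs: two valid, one missing a field, one not JSON at all.
const sampleOutputs: string[] = [
  '{"title": "Fix login", "priority": "high"}',
  '{"title": "Update docs", "priority": "low"}',
  '{"title": "Missing priority"}',
  "Sure! Here is the JSON you asked for...",
];

const rate = parseRate(sampleOutputs);
```

Running the same function over a larger batch gives the parse rate to compare against the exercise target.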

Compare 2 Prompt Strategies

Generic vs tool-dense, measure failure rate

+100

Gate: structured output validates 100%, system prompt preset understood, thinking budget tradeoff measured

Track 2 · Forging

3 paths + capstone · ~13 hours · Linear, required

L2 Apprentice → L3 Builder

You forge your own tools. Memory, context engineering, evaluation, model routing, design patterns, and your first complete agent framework. Completing this track means you can build agent loops independently.

P4

Memory & Context Engineering

~3.5h · 5 lessons + 3 exercises

+500 XP

SDK: resume, forkSession, continue, v2 Session APIs, betas: context-1m

Short-term vs Long-term Memory

Conversation summary, vectors, events. Short-term = session; Long-term = resume.

+50

Write Strategy

What to write, when, how to compress

+50

Memory Corruption & Repair

Stale info, wrong facts, injection attacks

+50

Context Window Management

Sliding window, threshold compression, prefix caching (92% reuse). betas: context-1m

+50

Prefix Reuse & Cost Optimization

Cache-friendly prompt structure, long-context pricing, batching

+50

Session Summarizer

Key field extraction: constraints, preferences

+100

Task Memory Cards

Task → decision → outcome → lesson learned

+100

Build a Context Compressor

Auto-compress at threshold, preserve key info, measure token savings

+100
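One way to approach threshold compression is sketched below. The `Turn` shape, the `pinned` flag, the four-characters-per-token estimate, and the truncating `summarizeTurns` placeholder are all assumptions; a real compressor would use a tokenizer and a model-written summary.

```typescript
interface Turn {
  role: "user" | "assistant";
  text: string;
  pinned?: boolean; // key info that must survive compression
}

// Rough estimate (about 4 characters per token); an assumption, not a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function totalTokens(turns: Turn[]): number {
  return turns.reduce((sum, turn) => sum + estimateTokens(turn.text), 0);
}

// Placeholder summarizer: keeps the first 20 characters of each dropped turn.
function summarizeTurns(turns: Turn[]): Turn {
  const text = "Summary: " + turns.map((turn) => turn.text.slice(0, 20)).join(" | ");
  return { role: "assistant", text };
}

// Below the threshold, return the turns unchanged. Above it, keep pinned
// turns and the most recent turns, and replace the rest with one summary.
// Pinned turns are moved ahead of the summary.
function compress(turns: Turn[], threshold: number, keepRecent: number): Turn[] {
  if (totalTokens(turns) <= threshold) return turns;
  const cut = Math.max(0, turns.length - keepRecent);
  const older = turns.slice(0, cut);
  const recent = turns.slice(cut);
  const pinned = older.filter((turn) => turn.pinned === true);
  const droppable = older.filter((turn) => turn.pinned !== true);
  if (droppable.length === 0) return turns;
  return [...pinned, summarizeTurns(droppable), ...recent];
}

const conversation: Turn[] = [
  { role: "user", text: "Constraint: never touch the production database.", pinned: true },
  { role: "assistant", text: "Explored the repo layout. ".repeat(20) },
  { role: "user", text: "Listed open issues. ".repeat(20) },
  { role: "user", text: "Now fix the failing test." },
  { role: "assistant", text: "Working on it." },
];

const compressed = compress(conversation, 100, 2);
const tokensSaved = totalTokens(conversation) - totalTokens(compressed);
```

`tokensSaved` is the figure the exercise asks learners to measure, under this sketch's rough token estimate.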

Gate: resume, fork, continue sessions demonstrated. V2 vs V1 session model explained.

P5

Evaluation, Routing & Recovery

~4h · 7 lessons + 4 exercises

+500 XP

SDK: fallbackModel, setModel(), supportedModels(), error codes, accountInfo()

Why Agents Must Be Evaluated

No eval = no iteration

+50

Eval Types Spectrum

Parse → typecheck → unit → golden → LLM-as-judge

+50

Statistical View: N Trials

Agents are stochastic; a single passing run does not establish reliability

+50

Cost-Aware Evaluation

Token budget vs success rate tradeoff

+50

Error Recovery Strategies

Retry, minimal fix, rollback, degrade. fallbackModel for auto-recovery.

+50

Model Selection Strategy

Opus/Sonnet/Haiku tradeoffs, cost vs capability matrix

+50

Intelligent Model Routing

Small model classifies → big model reasons, dynamic escalation via setModel()

+50

Output Parser + Schema Validator

Build a grader that checks structure

+100

Run 20 Trials, Compute Stats

Success rate, variance, avg token cost

+100
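The statistics for this exercise can be computed as in the sketch below. The trial data is made up, and "variance" is interpreted here as the variance of the pass/fail outcome, p(1 − p); the course may intend a different quantity, such as variance of token cost.

```typescript
interface Trial {
  passed: boolean;
  tokens: number;
}

interface TrialStats {
  trials: number;
  successRate: number;
  variance: number; // variance of a single pass/fail outcome: p * (1 - p)
  standardError: number; // uncertainty of the estimated success rate
  avgTokens: number;
}

function trialStats(trials: Trial[]): TrialStats {
  const n = trials.length;
  if (n === 0) throw new Error("need at least one trial");
  const passes = trials.filter((trial) => trial.passed).length;
  const successRate = passes / n;
  const variance = successRate * (1 - successRate);
  const standardError = Math.sqrt(variance / n);
  const avgTokens = trials.reduce((sum, trial) => sum + trial.tokens, 0) / n;
  return { trials: n, successRate, variance, standardError, avgTokens };
}

// Made-up data: 20 trials where every fourth run fails and failures cost more tokens.
const trials: Trial[] = [];
for (let i = 0; i < 20; i++) {
  const passed = i % 4 !== 3;
  trials.push({ passed, tokens: passed ? 1000 : 1400 });
}

const report = trialStats(trials);
// 15 of 20 pass: successRate 0.75, variance 0.1875, avgTokens 1100.
```

With only 20 trials the standard error is close to 0.1, so a measured 75% success rate is compatible with a fairly wide range of true rates.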

Generate → Execute → Verify → Fix

Error fed back until pass or max retries

+100
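The feedback loop in this exercise can be sketched with a stub generator. Here `generate` is a hypothetical stand-in for a model call: it returns a buggy candidate first and a fixed one once it is given an error. `verify` plays the Execute and Verify steps with a single assertion.

```typescript
type Candidate = (a: number, b: number) => number;

// Stub "Generate" step: the first draft subtracts; given an error, it adds.
function generate(lastError: string | null): Candidate {
  return lastError === null ? (a, b) => a - b : (a, b) => a + b;
}

// "Execute" + "Verify": run the candidate and compare against the expected value.
// Returns null on success, otherwise an error message to feed back.
function verify(candidate: Candidate): string | null {
  try {
    const actual = candidate(2, 3);
    return actual === 5 ? null : `expected 5, got ${actual}`;
  } catch (error) {
    return String(error);
  }
}

interface RepairOutcome {
  passed: boolean;
  attempts: number;
  errors: string[]; // one entry per failed attempt, useful as a repair log
}

// One initial attempt plus up to maxRetries fixes.
function repairLoop(maxRetries: number): RepairOutcome {
  const errors: string[] = [];
  let lastError: string | null = null;
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    const candidate = generate(lastError); // Generate (or Fix, when lastError is set)
    lastError = verify(candidate);
    if (lastError === null) {
      return { passed: true, attempts: attempt, errors };
    }
    errors.push(lastError);
  }
  return { passed: false, attempts: maxRetries + 1, errors };
}

const outcome = repairLoop(3); // passes on the second attempt
const noRetries = repairLoop(0); // fails: the first draft is never fixed
```

Bounding the loop with `maxRetries` is what keeps a persistently failing generator from consuming an unbounded token budget.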

Build a Cost-Aware Model Router

Route by complexity, compare cost/quality across models via supportedModels()

+100
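Routing by complexity might be sketched as follows. The scoring heuristic, the thresholds, and the relative cost units are all placeholders invented for illustration; they are not real pricing, and a real router would probably use a small model as the classifier rather than keyword matching.

```typescript
type Tier = "haiku" | "sonnet" | "opus";

// Placeholder relative cost units; not real pricing.
const relativeCost: Record<Tier, number> = { haiku: 1, sonnet: 4, opus: 20 };

// Hypothetical heuristic: longer tasks and certain keywords score higher (0 to 1).
function complexity(task: string): number {
  const lengthScore = Math.min(task.length / 400, 0.5);
  const hardWords = ["refactor", "architecture", "prove", "multi-step"];
  const lower = task.toLowerCase();
  const keywordScore = hardWords.some((word) => lower.includes(word)) ? 0.5 : 0;
  return lengthScore + keywordScore;
}

// Escalate to a larger tier as the score rises.
function route(task: string): Tier {
  const score = complexity(task);
  if (score < 0.2) return "haiku";
  if (score < 0.6) return "sonnet";
  return "opus";
}

const tasks: string[] = [
  "Classify this email as spam or not",
  "Summarize the following meeting notes into five bullet points and list every action item with its owner and due date.",
  "Refactor the billing module architecture across services",
];

const routes = tasks.map(route);
const routedCost = routes.reduce((sum, tier) => sum + relativeCost[tier], 0);
const alwaysLargestCost = tasks.length * relativeCost.opus;
```

Comparing `routedCost` with `alwaysLargestCost` gives the per-route cost tracking the path gate asks for, although quality would have to be measured separately.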

Gate: 20-trial eval report, model router picks optimal model, cost tracked per route

P6

Design Patterns & Capstone

~5h · 4 lessons + 1 exercise + capstone

+500 XP

SDK: hooks.PreToolUse, hooks.PostToolUse, hooks.Stop, settingSources, permissionMode: plan

5 Workflow Patterns

Chaining, Routing, Parallelization, Orchestrator-Worker, Evaluator-Optimizer

+50

When NOT to Use an Agent

If-else → Workflow → Agent decision tree

+50

Observability

Structured logging, tracing, cost tracking via hooks.PreToolUse + hooks.PostToolUse

+50

Agent Bootstrap & Initialization

Cold start vs warm start, repo warmup, permissionMode: plan for planning-before-execution

+50

Routing vs Orchestrator-Worker

Implement both, compare tradeoffs

+100

Capstone: Mini Agent Framework

Tool registry + loop + eval + fix + logging. Uses 7+ SDK features; at least 4 of 5 tests must pass.

+500

Gate: L3 Builder unlocked — capstone uses hooks, structured output, fallback, sessions, settings

Track 3 · Expedition

6 paths (choose 2+) · ~20 hours · Branching

L3 Builder → L5 Expert

You set out into uncharted territory. Deep specializations — choose your own adventure and complete any 2 paths to unlock the Production capstone.

Choose 2+ paths below. Different choices = different Badge Shelves.

Recommended Routes

Full-Stack Agent

Build & ship complete agent systems

P7 · P10 · P12

AI Engineer

Knowledge systems & multi-agent orchestration

P8 · P9 · P12

Safety-First

Trust, control & production readiness

P7 · P11 · P12

P7

MCP & Tool Ecosystem

~2h · 2 lessons + 1 exercise

+500 XP

SDK: createSdkMcpServer(), tool(), mcpServers, mcp__*__*, mcpServerStatus()

MCP Protocol Structure

Server / client / tool / resource. createSdkMcpServer() + tool() in-process.

+50

Using Existing MCP Servers

GitHub, filesystem, database integrations via mcpServers stdio/http config.

+50

Build a Minimal MCP Server

Expose custom tools via mcp__*__* naming, verify with mcpServerStatus()

+100

Gate: MCP server runs, agent calls mcp__*__* tools, mcpServerStatus() connected

P8

Agentic RAG

~3h · 2 lessons + 2 exercises

+500 XP

SDK: MCP + subagents, outputFormat for relevance scoring, multi-step tool calls

Traditional RAG vs Agentic RAG

Agent decides when, what, whether to re-retrieve. MCP search tool + outputFormat relevance scoring.

+50

Multi-step Retrieval Strategy

Decompose → retrieve → evaluate → re-retrieve

+50

Self-deciding Search Agent

Judge if retrieval needed or answer directly

+100

Iterative RAG with Quality Check

Retrieve → check relevance → re-query if insufficient

+100
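The retrieve → check relevance → re-query cycle can be sketched over a toy corpus. The corpus, the term-overlap `relevance` score, and the pre-written list of query reformulations are all assumptions; in an agentic setup the model would score relevance and rewrite the query itself.

```typescript
interface Doc {
  id: string;
  text: string;
}

// Toy corpus for illustration.
const corpus: Doc[] = [
  { id: "a", text: "session resume restores conversation state" },
  { id: "b", text: "hooks fire before and after tool use" },
  { id: "c", text: "sandbox config restricts network access" },
];

// Relevance = fraction of query terms found in the document (a crude stand-in
// for model-scored relevance).
function relevance(query: string, doc: Doc): number {
  const terms = query.toLowerCase().split(/\s+/).filter((term) => term.length > 0);
  if (terms.length === 0) return 0;
  const words = doc.text.split(" ");
  const hits = terms.filter((term) => words.indexOf(term) !== -1).length;
  return hits / terms.length;
}

interface Hit {
  doc: Doc | null;
  score: number;
}

// Best-scoring document for one query.
function retrieve(query: string): Hit {
  let best: Hit = { doc: null, score: 0 };
  for (const doc of corpus) {
    const score = relevance(query, doc);
    if (score > best.score) best = { doc, score };
  }
  return best;
}

// Try each reformulation in turn until the best hit clears the threshold.
function iterativeRetrieve(queries: string[], threshold: number) {
  let best: Hit = { doc: null, score: 0 };
  let attempts = 0;
  for (const query of queries) {
    attempts++;
    const hit = retrieve(query);
    if (hit.score > best.score) best = hit;
    if (best.score >= threshold) break;
  }
  return {
    doc: best.doc,
    score: best.score,
    sufficient: best.score >= threshold,
    reQueryCount: Math.max(0, attempts - 1),
  };
}

const result = iterativeRetrieve(["how do I continue a chat", "session resume"], 0.5);
```

Returning `reQueryCount` alongside the score makes the re-query behaviour observable, which is what the path gate tracks.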

Gate: multi-step retrieval, relevance scores above threshold, re-query count tracked

P9

Multi-Agent Systems

~3h · 3 lessons + 2 exercises

+500 XP

SDK: agents{}, AgentDefinition, model: haiku|sonnet|opus, maxTurns, hooks.SubagentStart/Stop

When to Go Multi-Agent

Specialization, parallelism, separation of concerns. agents{} with model overrides.

+50

Orchestration Patterns

Orchestrator-Worker, Peer-to-Peer, Pipeline. maxTurns + model per agent.

+50

Communication & State Sharing

Message passing, shared memory, handoff

+50

Planner + Executor Duo

One agent plans, another executes tools

+100

Writer + Reviewer Pipeline

Generate → critique → revise cycle

+100

Gate: 2+ agents run, SubagentStart/Stop hooks fire, cleanup implemented

P10

Code Generation Agent

~4h · 3 lessons + 3 exercises

+500 XP

SDK: Read/Write/Edit/Bash, outputFormat for plans, enableFileCheckpointing, rewindFiles()

Template-based Generation

Filling a template beats generating full text; DSL + strict interface

+50

Compile / Typecheck / Render Tools

Layered verification for generated code

+50

Self-healing Strategies

Compile → runtime → output assertion, prioritized. enableFileCheckpointing + rewindFiles() for rollback.

+50

Strict Scene Schema

Schema → compiler → LLM fills schema only

+100

Typecheck + Render Preview

Structured error output + low-res preview

+100

3-layer Self-healing Loop

TS error → runtime exception → visual assertion

+100

Gate: 10 scripts, ≥ 70% success, repair logs with diffs, file checkpointing demonstrated

P11

Human-in-the-Loop Design

~2h · 3 lessons + 1 exercise

+500 XP

SDK: permissionMode (all 4), canUseTool (advanced), AskUserQuestion, hooks.PermissionRequest

Approval Flows & Permission Tiers

When to auto-execute vs ask. permissionMode: default → acceptEdits → bypass → plan.

+50

Checkpoint & Rollback

Undo-safe execution, AskUserQuestion for confirmation, audit trails.

+50

Calibrating Autonomy

Progressive trust: fully supervised → semi-auto → full auto

+50

Build a Permission-Gated Agent

Classify actions by risk via canUseTool, confirm destructive ops, log decisions.

+100

Gate: 3-tier permission system works, destructive ops blocked, AskUserQuestion demonstrated

P12

Production & Final Capstone

~6h · 3 lessons + 1 exercise + capstone · Requires 2+ above

+500 XP

SDK: sandbox{}, all 12 hooks, plugins[], known issues (14)

Sandboxing & Safety

Isolated execution, sandbox config, network control, command exclusion.

+50

Rate Limiting & Cost Control

Token budgets, retry budgets, circuit breakers via hooks control plane.

+50
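A token budget combined with a simple circuit breaker could look like the sketch below. The class, its method names, and the thresholds are hypothetical; the lesson wires this kind of logic through hooks, which this sketch does not model. It also omits the half-open/reset state a production breaker would normally have.

```typescript
// Minimal breaker: blocks calls after too many consecutive failures
// or once the token budget is spent.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private tokensUsed = 0;
  private readonly maxFailures: number;
  private readonly tokenBudget: number;

  constructor(maxFailures: number, tokenBudget: number) {
    this.maxFailures = maxFailures;
    this.tokenBudget = tokenBudget;
  }

  // Returns the reason the next call is blocked, or null if it may proceed.
  blockedReason(): string | null {
    if (this.tokensUsed >= this.tokenBudget) return "token budget exhausted";
    if (this.consecutiveFailures >= this.maxFailures) return "circuit open";
    return null;
  }

  // Record the outcome of a call; a success resets the failure streak.
  record(success: boolean, tokens: number): void {
    this.tokensUsed += tokens;
    this.consecutiveFailures = success ? 0 : this.consecutiveFailures + 1;
  }
}

// Three failures in a row open the circuit.
const flaky = new CircuitBreaker(3, 10000);
flaky.record(false, 500);
flaky.record(false, 500);
const beforeTrip = flaky.blockedReason();
flaky.record(false, 500);
const afterTrip = flaky.blockedReason();

// Successful but expensive calls exhaust the budget instead.
const expensive = new CircuitBreaker(3, 10000);
expensive.record(true, 6000);
expensive.record(true, 6000);
const budgetState = expensive.blockedReason();
```

Checking `blockedReason()` before each model or tool call is the control point; which hook hosts that check is left to the lesson.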

Monitoring & Alerting

Success rate dashboards, anomaly detection

+50

Add Production Guardrails

Token budget + timeout + structured logging

+100

Final Capstone: End-to-End Agent

MCP tools + subagents + eval harness + sandbox + all 12 hooks + N-trial report + public repo

+500

Gate: reproducible demo, eval report, cost report, sandbox verified — L5 Expert unlocked

Progression

Level ↔ Path Map

L1 Explorer (0 XP): Start P1 Agent Basics

L2 Apprentice (1,500 XP): Complete P1+P2, working on P3

L3 Builder (6,000 XP): Complete Track 1+2 through P6 Capstone

L4 Architect (8,000 XP): Complete 2+ Advanced paths

L5 Expert (10,000 XP): Complete P12 Production + Final Capstone

L6 Master (20,000 XP): Create courses, MCP servers, open-source

Economics

XP Economy

Path XP Breakdown

Each path awards 500 XP total = content XP (lessons + exercises) + completion bonus. The 500 XP is granted as a one-time reward upon path completion, not subject to daily cap.

Metric: Value

Path completion rewards: 12 × 500 = 6,000 XP

Daily cap (regular activities): 300 XP

Daily quest XP: +20–30 XP

Reflection day bonus: 3 × 50 = 150 XP

First Path complete: ~Day 3

Avg completion interval: ~5 days

Branch choices (Autonomy): 5 options, pick 2+

Days to L3 Builder (6,000 XP): ~20 days

Days to L5 Expert (10,000 XP): ~35 days

Streak protection: 1 rest day/week

Streak protection: 1 rest day per week won't break your streak. Sustainable learning over grinding.

Achievements

Badge Triggers

Knowledge

Initiated: 1st lesson

Versed: 10 lessons

Scholar: All lessons (40+)

Practice

Tried: 1st exercise

Skilled: 10 exercises

Master: All exercises (25+)

Habit

Started: 3-day streak

Consistent: 7-day streak

Unstoppable: 30-day streak

Social badges (Reviewer, Mentor, Pillar) are in the Special Badges category.

Micro-engagement

Daily Quests

5-minute rituals to keep the streak alive. One quest per day — lower the entry barrier, raise the DAU.

Quick Prompt

Given a scenario, write the system prompt or pick the right tool

+20 XP

Failure Spot

Read an agent trace, find the failure point and classify the error type

+20 XP

Tool Match

Given a task description, select the correct tool + parameters from a set

+20 XP

Code Review

Review another learner's exercise submission, leave constructive feedback

+30 XP

Code Review Flow: Peer-to-peer review with AI-assisted quality checks. Submit your exercise → get matched with another learner's submission → leave feedback → earn XP when your feedback is marked helpful.

Validation

Design Rationale

Frequent milestones

Gamification meta-analysis: "Rules/Goals + Challenge" yields highest impact (d=0.566). 12 path completions at ~5-day intervals vs 2 at ~25-day intervals.

Autonomy via branching

SDT research: Autonomy effect g=0.638. Track 3's "choose 2 of 5" with recommended routes provides guided choice without decision fatigue.

Streak Freeze for sustainability

1 rest day/week without breaking streak. Aligns with "sustainable growth" principle.

Zen breathing rhythm

Reflection Days between tracks create deliberate pauses — compress knowledge before expanding. Mirrors context window management in agent design.

Relatedness via Social track

SDT's third pillar. Daily code reviews + Social badges address the biggest gap in solo-learning gamification: community connection.

Competence perception

Gamification has minimal impact on competence (g=0.277). Mitigated by showing real execution results at every gate — not just XP.

Summary

Before vs After

Before: 2 mega-paths (~16–20h each) → After: 12 focused paths (~2–6h each)

Before: First milestone at ~Day 21 → After: First milestone at ~Day 3

Before: No branching, same order → After: 5 branches + recommended routes

Before: 1,000 XP from path bonuses → After: 6,000 XP from path bonuses

Before: Solo progression only → After: Daily quests + social badges + reflections

Before: No context / model routing content → After: Context engineering + model routing + HITL
