# LLM & Prompts
Large language models are the agent's "brain." Prompt engineering is a key skill for building effective agents.
## The LLM's Role in an Agent
The LLM serves as the core decision engine in an agent system:
- Understanding - Parse user intent and context
- Reasoning - Analyze problems and devise solutions
- Decision-making - Choose which tools to use and how to act
- Generation - Produce code, documents, replies, and more
## System Prompt Design

The system prompt defines the agent's "persona" and behavioral boundaries. A good system prompt should include the components described below. In the Claude Agent SDK, it can be passed in one of two forms:
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

// Form 1: Simple string
const agent1 = query({
  prompt: "Review this PR",
  options: {
    systemPrompt: "You are a senior code reviewer. Be thorough but constructive."
  }
});

// Form 2: Preset with append (preserves Claude Code defaults)
const agent2 = query({
  prompt: "Review this PR",
  options: {
    systemPrompt: {
      type: "preset",
      preset: "claude_code",
      append: "\n\nFocus on security vulnerabilities and performance."
    }
  }
});
```
### 1. Identity Definition

```
You are an expert software engineer assistant.
You help users write, debug, and improve code.
You are precise, helpful, and safety-conscious.
```
### 2. Capability Description

```
You have access to the following tools:
- read_file: Read contents of a file
- write_file: Create or modify files
- run_command: Execute shell commands
- search_code: Search codebase for patterns
```
### 3. Behavioral Guidelines

```
Guidelines:
- Always read files before modifying them
- Explain your reasoning before taking actions
- Ask for clarification when requirements are unclear
- Never execute destructive commands without confirmation
```
### 4. Output Format

```
Response Format:
1. First, analyze the request
2. Then, explain your approach
3. Execute necessary actions
4. Summarize what was done
```
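Taken together, the four components can be assembled into a single string before being passed as `systemPrompt`. A minimal sketch; `buildSystemPrompt` and the `SystemPromptParts` shape are illustrative helpers, not part of the SDK:

```typescript
// Illustrative helper: concatenates the four system-prompt components
// (identity, capabilities, guidelines, output format) into one string.
interface SystemPromptParts {
  identity: string;
  capabilities: string[];
  guidelines: string[];
  outputFormat: string;
}

function buildSystemPrompt(parts: SystemPromptParts): string {
  return [
    parts.identity,
    "",
    "You have access to the following tools:",
    ...parts.capabilities.map((c) => `- ${c}`),
    "",
    "Guidelines:",
    ...parts.guidelines.map((g) => `- ${g}`),
    "",
    "Response Format:",
    parts.outputFormat,
  ].join("\n");
}

const prompt = buildSystemPrompt({
  identity: "You are an expert software engineer assistant.",
  capabilities: [
    "read_file: Read contents of a file",
    "write_file: Create or modify files",
  ],
  guidelines: ["Always read files before modifying them"],
  outputFormat: "1. Analyze the request\n2. Explain your approach",
});
```

The resulting string can be passed directly as the `systemPrompt` option, or as the `append` field of the preset form shown above.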
## Prompting Techniques

### Chain of Thought (CoT)

Ask the model to show its reasoning to improve accuracy on complex tasks:

```
Think step by step:
1. What is the user asking for?
2. What information do I need?
3. What's the best approach?
4. What tools should I use?
```
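One simple way to apply this is to prepend the scaffold to each task prompt. A sketch; the helper name is illustrative:

```typescript
// Reusable step-by-step scaffold, matching the prompt text above.
const COT_SCAFFOLD = [
  "Think step by step:",
  "1. What is the user asking for?",
  "2. What information do I need?",
  "3. What's the best approach?",
  "4. What tools should I use?",
].join("\n");

// Illustrative helper: wraps a user task with the reasoning scaffold.
function withChainOfThought(task: string): string {
  return `${COT_SCAFFOLD}\n\nTask: ${task}`;
}
```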
### Few-shot Learning

Provide examples to guide the model's behavior:

```
Example 1:
User: "Create a Python function to calculate fibonacci"
Action: write_file("fib.py", "def fibonacci(n):...")

Example 2:
User: "Fix the bug in app.js"
Action: read_file("app.js")
Action: write_file("app.js", "// fixed version...")
```
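Few-shot blocks like this are often generated from structured data rather than hand-written. A sketch, assuming a simple (user request, expected actions) shape; `formatFewShot` is an illustrative helper:

```typescript
// Illustrative shape for one few-shot example: a user request and the
// tool actions the model should take in response.
interface FewShotExample {
  user: string;
  actions: string[];
}

// Renders examples into the numbered "Example N:" format shown above.
function formatFewShot(examples: FewShotExample[]): string {
  return examples
    .map((ex, i) =>
      [
        `Example ${i + 1}:`,
        `User: "${ex.user}"`,
        ...ex.actions.map((a) => `Action: ${a}`),
      ].join("\n")
    )
    .join("\n\n");
}
```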
### Structured Output

Use JSON or a specific format to keep outputs parsable:

```
Respond in this JSON format:
{
  "thought": "your reasoning",
  "action": "tool_name",
  "action_input": { ... }
}
```
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

// Define output schema with Zod
const ReviewSchema = z.object({
  summary: z.string().describe("One-line summary"),
  issues: z.array(z.object({
    severity: z.enum(["critical", "warning", "info"]),
    file: z.string(),
    line: z.number(),
    message: z.string()
  })),
  approved: z.boolean()
});

const response = query({
  prompt: "Review the code in src/auth.ts",
  options: {
    model: "claude-sonnet-4-5",
    outputFormat: {
      type: "json_schema",
      json_schema: {
        name: "CodeReview",
        strict: true,
        schema: zodToJsonSchema(ReviewSchema)
      }
    }
  }
});

for await (const message of response) {
  if (message.type === "result" && message.structured_output) {
    // Guaranteed to match schema — type-safe!
    const review = ReviewSchema.parse(message.structured_output);
    console.log(`Approved: ${review.approved}`);
    review.issues.forEach(i =>
      console.log(`[${i.severity}] ${i.file}:${i.line} — ${i.message}`)
    );
  }
}
```
### SDK Insight: Schema Guarantees

`outputFormat` with `strict: true` guarantees the response matches your Zod schema exactly. Access the validated result via `message.structured_output`. This eliminates the need for manual JSON parsing and error handling — the SDK handles it.
## Context Management
LLMs have context length limits, so you need to manage context strategically:
### Context Window
| Model | Context length | Notes |
|---|---|---|
| Claude 3.5 | 200K tokens | Great for large codebases |
| GPT-4 Turbo | 128K tokens | Strong general capability |
| Gemini 1.5 | 1M tokens | Ultra-long context |
### Management Strategies
- Sliding window - Keep recent conversations, drop older ones
- Summary compression - Compress history into summaries
- Retrieval augmentation - Fetch relevant context on demand
- Tiered storage - Store important info in long-term memory
## Model Selection
Use different models for different tasks:
- Complex reasoning - Claude 3.5 Sonnet, GPT-4
- Code generation - Claude 3.5 Sonnet, Codex
- Fast responses - Claude 3 Haiku, GPT-3.5
- Local deployment - Llama 3, CodeLlama
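A simple router over these tiers might look like the sketch below. The task categories and model identifiers mirror the list above, but treat the mapping (and the exact model ID strings) as illustrative assumptions:

```typescript
type TaskKind = "reasoning" | "codegen" | "fast" | "local";

// Illustrative task-to-model routing table based on the tiers above;
// the model ID strings are placeholders, not verified API identifiers.
const MODEL_FOR_TASK: Record<TaskKind, string> = {
  reasoning: "claude-3-5-sonnet",
  codegen: "claude-3-5-sonnet",
  fast: "claude-3-haiku",
  local: "llama-3",
};

function pickModel(kind: TaskKind): string {
  return MODEL_FOR_TASK[kind];
}
```

The chosen ID would then be passed as the `model` option of `query`, as in the structured-output example above.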
## Extended Thinking Control

Use `maxThinkingTokens` to control the reasoning budget. Higher values improve quality on complex tasks but increase cost and latency. Start with the default and increase only when tasks require deep reasoning.
## Best Practices
- Keep the system prompt concise but complete
- Use concrete examples instead of abstract descriptions
- Define the output format clearly
- Set a reasonable temperature (usually 0-0.3)
- Test edge cases and error handling
## Next Steps
- Tools & Actions - Learn how agents call tools
- Memory Systems - Understand agent memory mechanisms
- Claude Code Prompts - See real system prompt examples
## Try It: Schema-Constrained Output
Build an agent that always returns validated, type-safe JSON.
- Define a Zod schema for a task analysis: `{ task, complexity, estimatedMinutes, requiredTools }`
- Use `outputFormat` to enforce the schema
- Send 5 different task descriptions and verify all outputs validate
- Compare: run the same prompts without `outputFormat` — how often does the output break?