Planning is the agentic pattern in which an LLM autonomously decides what sequence of steps to execute to accomplish a larger task. Rather than executing a predefined workflow, the agent creates its own workflow based on the specific goal and available tools.
This is fundamentally different from prompt chaining (Chapter 7), where the developer hardcodes the sequence of steps. With planning, the LLM itself determines the decomposition.
Many real-world tasks cannot be decomposed into steps ahead of time: a research question may branch in directions that only become apparent after the first search, and a debugging session depends on what each experiment reveals. In cases like these, the number and nature of the steps depend on the input. Planning lets the agent figure this out dynamically.
┌──────────────────┐
│ User Goal │
└────────┬─────────┘
│
┌────────▼─────────┐
│ LLM: Create │
│ step-by-step │
│ plan │
└────────┬─────────┘
│
┌────────▼─────────┐
│ Execute Step 1 │──► Observe result
└────────┬─────────┘
│
┌────────▼─────────┐
│ Execute Step 2 │──► Observe result
└────────┬─────────┘
│
┌────────▼─────────┐
│ (Optional) │
│ Revise plan │
│ based on │
│ observations │
└────────┬─────────┘
│
┌────────▼─────────┐
│ Execute Step N │──► Observe result
└────────┬─────────┘
│
┌────────▼─────────┐
│ Synthesize │
│ final result │
└──────────────────┘
# See code/planning.py for the full implementation
def plan_and_execute(llm, goal, tools, max_replans=3):
    """Create a plan, execute it step by step, replan if needed."""
    # Step 1: Generate initial plan
    plan = llm.generate(
        f"You have these tools: {format_tools(tools)}\n\n"
        f"Create a step-by-step plan to accomplish:\n{goal}\n\n"
        f"Output as a numbered list. Each step should specify "
        f"which tool to use and what arguments to provide."
    )
    results = []
    for replan_attempt in range(max_replans):
        steps = parse_plan(plan)
        for i, step in enumerate(steps):
            result = execute_step(step, tools)
            results.append({"step": step, "result": result})
            # Check if we need to replan
            if result.get("error"):
                plan = llm.generate(
                    f"Original goal: {goal}\n"
                    f"Plan so far: {plan}\n"
                    f"Step {i+1} failed: {result['error']}\n"
                    f"Results so far: {results}\n"
                    f"Create a revised plan to complete the goal."
                )
                break
        else:
            break  # All steps completed without errors
    # Synthesize final result
    return llm.generate(
        f"Goal: {goal}\n"
        f"Steps executed and results: {results}\n"
        f"Synthesize a final response."
    )
Chain-of-Thought (CoT) prompting (Wei et al., 2022) can be viewed as planning at the reasoning level. By asking the LLM to “think step by step,” we encourage it to decompose its reasoning into explicit intermediate steps before producing an answer.
CoT is the simplest form of planning and requires no tools or infrastructure — just a prompt modification:
# Without CoT
response = llm.generate("What is 127 × 43?")

# With CoT
response = llm.generate(
    "What is 127 × 43? Think step by step."
)
# LLM: "127 × 43 = 127 × 40 + 127 × 3 = 5080 + 381 = 5461"
The ReAct pattern (Yao et al., 2022) interleaves reasoning (planning) with action (tool use). Rather than creating a complete plan upfront, the agent plans one step at a time, executes it, observes the result, and then plans the next step.
This is more robust than upfront planning because the agent can adapt its plan based on what it discovers:
Thought: I need to find the population of Tokyo.
Action: web_search("Tokyo population 2025")
Observation: Tokyo has approximately 13.96 million people...
Thought: Now I need the population of New York for comparison.
Action: web_search("New York City population 2025")
Observation: New York City has approximately 8.3 million people...
Thought: I now have both numbers and can make the comparison.
Action: Return final answer comparing both cities.
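A trace like the one above can be produced by a simple loop that alternates LLM calls and tool calls, feeding each observation back into the prompt. A minimal sketch, where the `Thought:`/`Action:`/`Final Answer:` markers and the `name("arg")` action syntax are assumed conventions:

```python
def react_loop(llm, question, tools, max_turns=10):
    """Interleave Thought/Action/Observation turns until a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        # Ask for one Thought (and possibly one Action), not a full plan
        turn = llm.generate(transcript + "Thought:")
        transcript += "Thought:" + turn + "\n"
        if "Final Answer:" in turn:
            return turn.split("Final Answer:")[1].strip()
        if "Action:" in turn:
            # Expect e.g.  Action: web_search("Tokyo population 2025")
            action = turn.split("Action:")[1].strip()
            name, _, arg = action.partition("(")
            result = tools[name.strip()](arg.rstrip(")").strip('"'))
            # Append the observation so the next Thought can react to it
            transcript += f"Observation: {result}\n"
    return "Gave up after max_turns."
```

The key design choice is that planning happens one step at a time: the model never commits to step 3 before seeing the observation from step 2.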
The Plan-and-Solve approach (Wang et al., 2023) improves upon basic CoT by first asking the LLM to devise a plan, then following that plan to solve the problem:
# See code/planning.py for the full implementation
def plan_and_solve(llm, problem):
    """First plan, then solve following the plan."""
    plan = llm.generate(
        f"Let's first understand the problem and devise a plan.\n"
        f"Problem: {problem}\n\n"
        f"Plan:"
    )
    solution = llm.generate(
        f"Problem: {problem}\n"
        f"Plan: {plan}\n\n"
        f"Now execute the plan step by step:\n"
    )
    return solution
Real-world plans often need to change. A research agent might discover that a particular subtopic is irrelevant, or a coding agent might find that its initial approach won’t work. Adaptive planning allows the agent to revise its plan based on what it learns during execution:
# See code/planning.py for the full implementation
def adaptive_plan_execute(llm, goal, tools, max_steps=20):
    """Execute with dynamic replanning."""
    context = {"goal": goal, "completed": [], "observations": []}
    for _ in range(max_steps):  # cap iterations to avoid an endless loop
        # Generate or revise plan
        plan = llm.generate(
            f"Goal: {context['goal']}\n"
            f"Completed steps: {context['completed']}\n"
            f"Observations: {context['observations']}\n"
            f"What should be the next step? "
            f"Reply 'DONE' if the goal is achieved."
        )
        if "DONE" in plan:
            break
        # Execute the next step
        result = execute_step(plan, tools)
        context["completed"].append(plan)
        context["observations"].append(result)
    return synthesize(llm, context)
The HuggingGPT approach (Shen et al., 2023) demonstrated planning at an even higher level — using an LLM to plan which AI models to invoke from the Hugging Face ecosystem.
This shows planning as a meta-capability — the agent plans not just actions, but which specialized tools (entire models) to deploy.
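In the same spirit, model selection can itself be treated as a planning step: given a task and a catalog of specialist models, the planner LLM chooses which one to invoke. A toy sketch — the catalog format and routing prompt here are illustrative assumptions, not HuggingGPT's actual protocol:

```python
def route_to_model(llm, task, model_catalog):
    """Ask the planner LLM to pick a specialist model, then invoke it.

    model_catalog maps name -> (description, callable).
    """
    catalog_text = "\n".join(
        f"- {name}: {desc}" for name, (desc, _) in model_catalog.items()
    )
    choice = llm.generate(
        f"Task: {task}\nAvailable models:\n{catalog_text}\n"
        f"Reply with the single best model name."
    ).strip()
    if choice not in model_catalog:
        return {"error": f"planner chose unknown model: {choice}"}
    _, model_fn = model_catalog[choice]
    return {"model": choice, "output": model_fn(task)}
```

The validation step matters: since the planner's choice is free text, routing must fail gracefully when it names a model that does not exist in the catalog.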
Planning is powerful but less predictable than Reflection or Tool Use. Use it when: