Chapter 6: Path Planning and Decision Making

Planning is the cognitive core of an autonomous vehicle. Given a complete understanding of the world (perception), the vehicle’s precise location (localization), and predictions of how other agents will behave (prediction), the planner must determine what the vehicle should do — from high-level route decisions down to the exact trajectory the vehicle will follow.

The Planning Hierarchy

Planning operates at multiple levels of abstraction:

1. Route Planning (Mission Planning)

Question: Which roads should I take to get from origin to destination?

This is essentially the same problem solved by Google Maps or Waze: find the optimal path through a road network, considering distance, traffic, and road type.

Algorithms: Dijkstra’s algorithm, A*, or their variants on a road graph. The graph nodes are intersections, edges are road segments, and edge weights encode distance, travel time, or a cost function combining both.

2. Behavioral Planning (Decision Making)

Question: What high-level maneuver should I execute right now?

Given the route and the current traffic situation, the behavioral planner decides:

Should I follow the car ahead?
Should I change lanes?
Should I stop for this yellow light or proceed?
Should I yield to this pedestrian?
Should I merge onto the highway?

3. Motion Planning (Trajectory Planning)

Question: What exact path and speed profile should I follow?

Given the high-level decision, the motion planner computes a smooth, feasible, safe, and comfortable trajectory — a sequence of (x, y, heading, speed, time) waypoints.

Graph Search Methods

A* Algorithm

A* is the classic informed search algorithm. It finds the shortest path in a graph by expanding the most promising nodes first:

\[f(n) = g(n) + h(n)\]

where:

$g(n)$ = actual cost from start to node $n$
$h(n)$ = heuristic estimate of cost from $n$ to goal (must be admissible: never overestimates)
$f(n)$ = estimated total cost

A* is used primarily for route planning on road networks. For motion planning in continuous space, it is adapted by discretizing the space.

Hybrid A*

Standard A* operates on a discrete grid, producing paths with sharp turns that a car cannot physically follow. Hybrid A* addresses this:

States include continuous (x, y, heading) rather than just grid cells
Expansion uses kinematically feasible motions (forward, left-forward, right-forward) with a fixed step size
The Reeds-Shepp or Dubins curves provide a lower bound for the heuristic (shortest path for a car-like robot)

Hybrid A* is widely used for parking and low-speed maneuvering.

State Lattice Planning

A state lattice pre-computes a set of feasible trajectories (primitives) that connect states at regular intervals. During planning, the algorithm searches through combinations of these primitives using A* or dynamic programming.

Advantages: all trajectories are kinematically feasible by construction. The resolution and coverage of the lattice can be tuned for different scenarios (dense lattice for parking, sparse for highway).

Sampling-Based Methods

Rapidly-Exploring Random Trees (RRT)

RRT builds a tree of feasible trajectories by randomly sampling the configuration space:

Sample a random configuration $q_\text{rand}$
Find the nearest node $q_\text{near}$ in the existing tree
Extend from $q_\text{near}$ toward $q_\text{rand}$ by a step, creating $q_\text{new}$
If the path from $q_\text{near}$ to $q_\text{new}$ is collision-free, add $q_\text{new}$ to the tree
Repeat until the goal is reached

RRT*: An asymptotically optimal variant that rewires the tree when a shorter path is found.

Advantages and Limitations

RRT is good at finding feasible paths in complex environments quickly, but the resulting paths are typically non-smooth and suboptimal. Post-processing (smoothing, optimization) is usually required.

Optimization-Based Planning

The dominant approach in modern autonomous driving: formulate trajectory planning as a mathematical optimization problem.

Trajectory Optimization

Define the planned trajectory as a sequence of states $\mathbf{s}_0, \mathbf{s}_1, …, \mathbf{s}_N$ at discrete time steps, and solve:

\[\min_{\mathbf{s}_{0:N}} \sum_{t=0}^{N} c(\mathbf{s}_t, \mathbf{a}_t) + c_\text{terminal}(\mathbf{s}_N)\]

subject to:

Dynamics: $\mathbf{s}_{t+1} = f(\mathbf{s}_t, \mathbf{a}_t)$ (vehicle motion model)
Collision avoidance: Minimum distance to all obstacles and predicted agent positions
Road boundaries: Stay within the lane (or the drivable area)
Kinematic limits: Maximum steering angle, acceleration, deceleration
Comfort: Limits on lateral acceleration and jerk

Cost Function Components

A typical cost function includes:

Progress cost: Encourage forward progress along the route $c_\text{progress} = -w_p \cdot s_\text{along}$

Lane centering: Penalize deviation from lane center $c_\text{center} = w_c \cdot d_\text{lateral}^2$

Speed tracking: Follow the desired speed (speed limit or traffic flow) $c_\text{speed} = w_v \cdot (v - v_\text{desired})^2$

Smoothness: Penalize jerky motion $c_\text{smooth} = w_j \cdot (\ddot{v}^2 + \dot{\kappa}^2)$

where $\ddot{v}$ is longitudinal jerk and $\dot{\kappa}$ is curvature rate.

Safety: Penalize proximity to other agents $c_\text{safety} = w_s \cdot \sum_i \max(0, d_\text{safe} - d_i)^2$

Traffic rules: Hard or soft constraints for traffic laws (red lights, stop signs, yield rules).

Solvers

The resulting optimization is typically non-convex (due to collision avoidance constraints). Common approaches:

Sequential Quadratic Programming (SQP): Approximates the non-convex problem as a sequence of quadratic subproblems
Interior Point Methods: Handle inequality constraints (collision avoidance) via barrier functions
ILQR (Iterative Linear Quadratic Regulator): A shooting method that linearizes dynamics and approximates costs as quadratic around a trajectory, then iterates

Frenet Frame Planning

A popular approach in structured roads: plan in the Frenet frame, which decomposes motion into:

Longitudinal (s): Along the lane center line
Lateral (d): Perpendicular to the lane center line

This decoupling simplifies the problem:

Lateral planning: Generate candidate d(t) profiles (quintic polynomials for lane changes, constant d for lane keeping)
Longitudinal planning: Generate candidate s(t) profiles (acceleration, constant speed, deceleration)
Combine: Evaluate all combinations of lateral and longitudinal profiles
Select: Choose the combination with the lowest cost that satisfies all constraints

Decision Making Under Uncertainty

The POMDP Framework

Autonomous driving can be formulated as a Partially Observable Markov Decision Process (POMDP):

States: The full world state (all agents, traffic lights, road conditions)
Actions: The ego vehicle’s possible actions (accelerate, brake, change lane, etc.)
Observations: What the sensors can perceive (noisy, incomplete)
Transitions: How the world evolves (including other agents’ behaviors)
Rewards: Negative for collisions, positive for progress, negative for discomfort

Solving POMDPs exactly is computationally intractable for realistic driving scenarios. Practical approaches use approximations:

Monte Carlo Tree Search (MCTS)

Build a search tree of possible future scenarios:

Select: Traverse the tree using UCB (Upper Confidence Bound) to balance exploration and exploitation
Expand: Add a new child node
Simulate: Roll out a random policy to estimate the value
Backpropagate: Update node values up the tree

Scenario-Based Planning

Generate a discrete set of “scenarios” representing different possible behaviors of other agents:

Scenario 1: The car ahead maintains speed
Scenario 2: The car ahead brakes hard
Scenario 3: The pedestrian crosses the road

Plan a trajectory that is safe across all scenarios, or plan the best trajectory for each scenario and select based on risk tolerance.

The Responsibility-Sensitive Safety (RSS) Model

Developed by Mobileye, RSS provides a formal mathematical framework for safe driving:

Key Principles

Safe following distance: The ego vehicle must maintain enough distance to stop safely if the car ahead brakes at its maximum deceleration
Safe lateral distance: Minimum gap when passing or being passed
Right of way: Rules for intersections, merges, and priority situations
Limited visibility: Conservative behavior when perception is limited (blind corners)

Mathematical Formulation

For longitudinal safety, the safe following distance is:

\[d_\text{safe} = v_r \cdot \rho + \frac{v_r^2}{2 a_\text{min,brake}} - \frac{v_f^2}{2 a_\text{max,brake}} + \frac{1}{2} a_\text{max,accel} \cdot \rho^2 + v_r \cdot \rho \cdot a_\text{max,accel} / a_\text{min,brake}\]

where $v_r$ is the rear vehicle’s speed, $v_f$ is the front vehicle’s speed, $\rho$ is the reaction time, and the acceleration parameters represent worst-case assumptions.

RSS provides a provable safety guarantee under its assumptions. However, overly conservative RSS parameters can make the vehicle too cautious to make progress in dense traffic.

Comfort and Naturalism

A technically safe trajectory is not enough — it must also feel natural and comfortable to passengers. Key comfort metrics:

Lateral acceleration: < 2 m/s² for normal driving, < 4 m/s² for emergency maneuvers
Longitudinal acceleration: < 2 m/s² for comfort, < 4 m/s² for normal braking
Jerk (rate of acceleration change): < 1.5 m/s³ — smooth acceleration and braking transitions
Heading rate: Smooth, predictable steering

Ride comfort optimization is a significant engineering challenge. Early autonomous vehicles were often criticized for jerky, robotic behavior. Modern systems explicitly optimize for smoothness and human-like driving patterns.

Real-Time Requirements

The planner must produce trajectories fast enough to react to the dynamic world:

Replan frequency: 10–20 Hz (every 50–100 ms)
Planning horizon: 5–10 seconds ahead
Latency budget: 50 ms or less for the planning computation
Fallback: If the planner fails to find a solution in time, execute an emergency stop or the previous plan

Summary

Planning is where perception meets action. The key insights:

Planning operates at multiple levels: route → behavior → trajectory
Optimization-based methods dominate modern systems, formulating planning as constrained trajectory optimization
Safety models like RSS provide formal guarantees but must be balanced with practical drivability
Uncertainty from prediction must be incorporated through scenario-based or probabilistic planning
Comfort is an explicit design objective, not just a side effect
Real-time performance is critical — the planner must deliver solutions at 10–20 Hz

The next chapter covers how planned trajectories are executed: vehicle control systems.

← Previous: Prediction and Behavior Modeling

Next: Vehicle Control Systems →

← Back to Table of Contents