Chapter 1: Sensor Technologies and Hardware

An autonomous vehicle’s ability to drive safely depends entirely on its ability to perceive the world. This chapter explores the sensor technologies that serve as the vehicle’s eyes, ears, and spatial awareness — and the trade-offs between them.

The Sensor Suite

Modern autonomous vehicles use a combination of complementary sensors, each with distinct strengths and weaknesses. No single sensor is sufficient for safe autonomous driving; redundancy and diversity are essential.

| Sensor | Range | Precision | Works in Dark | Works in Rain/Fog | Cost | Data Type |
|---|---|---|---|---|---|---|
| Camera | 50–250 m | High (2D) | Poor | Moderate | Low | Color images |
| LiDAR | 10–300 m | Very High (3D) | Yes | Moderate | Medium–High | 3D point clouds |
| Radar | 10–300 m | Moderate | Yes | Yes | Low | Velocity + range |
| Ultrasonic | 0.1–5 m | Moderate | Yes | Yes | Very Low | Range |
| IMU | N/A | High | Yes | Yes | Low–Medium | Acceleration, rotation |
| GPS/GNSS | Global | 1–10 m | Yes | Yes | Low | Global position |

Cameras

Cameras are the most information-rich sensors. A single camera frame contains millions of pixels with color, texture, and contextual information that no other sensor can match.

How They Work

A camera sensor converts photons into electrical signals using a CMOS (Complementary Metal-Oxide-Semiconductor) or CCD (Charge-Coupled Device) imaging chip. Each pixel measures the intensity of light at its location, typically across three color channels (RGB). Modern automotive cameras operate at resolutions from 1 to 8 megapixels, capturing frames at 30–60 Hz.

Camera Configurations

Autonomous vehicles typically use multiple cameras with overlapping fields of view to achieve 360° coverage.

Tesla uses 8 cameras for its vision-only system. Waymo’s 6th-generation sensor suite includes 13 cameras providing overlapping 360° coverage with both near-field and far-field resolution.

Stereo Vision

A stereo camera pair (two cameras separated by a known baseline distance) can estimate depth through triangulation, similar to human binocular vision. Given a point visible in both cameras, the disparity (pixel offset between left and right images) is inversely proportional to depth:

\[d = \frac{f \cdot B}{\text{disparity}}\]

where $f$ is the focal length and $B$ is the baseline distance.

Stereo vision is computationally expensive and struggles with textureless surfaces, but provides dense depth maps without active illumination.
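
To make the relationship concrete, here is a minimal Python sketch; the focal length, baseline, and disparity values are illustrative, not taken from any particular camera rig:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth from stereo disparity: d = f * B / disparity."""
    if disparity_px <= 0:
        raise ValueError("Disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Illustrative values: 1000-pixel focal length, 30 cm baseline.
# A 6-pixel disparity maps to 50 m; 60 pixels maps to only 5 m.
# A one-pixel disparity error therefore matters far more at long
# range, which is why stereo depth accuracy degrades with distance.
print(depth_from_disparity(6, focal_px=1000, baseline_m=0.3))   # 50.0 m
print(depth_from_disparity(60, focal_px=1000, baseline_m=0.3))  # 5.0 m
```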

Strengths and Limitations

Strengths:

- Rich color, texture, and semantic detail (traffic lights, signs, lane markings)
- High resolution at low cost
- Passive sensing: no emitted signal, so no interference between vehicles

Limitations:

- No direct depth measurement from a single camera
- Poor performance in darkness and strong glare
- Degraded by rain, fog, and a dirty lens

LiDAR (Light Detection and Ranging)

LiDAR is often considered the “gold standard” sensor for autonomous driving due to its precise 3D measurement capabilities. It works by emitting laser pulses and measuring the time for each pulse to return after bouncing off an object.

How It Works

A LiDAR sensor emits rapid pulses of near-infrared laser light (typically at wavelengths of 905 nm or 1550 nm). For each pulse, the sensor measures:

  1. Time of flight (ToF): The round-trip time gives precise distance:
\[d = \frac{c \cdot t}{2}\]

where $c$ is the speed of light and $t$ is the round-trip time.

  2. Angle: The direction of the laser beam (azimuth and elevation).
  3. Intensity: How much light was reflected (depends on surface reflectivity).

By scanning thousands of pulses per second across different angles, LiDAR builds a 3D point cloud — a collection of (x, y, z, intensity) points representing the surfaces in the environment.
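
To make the geometry concrete, the following small Python sketch (with illustrative values) converts a single return's time of flight and beam angles into an (x, y, z, intensity) point, the basic operation behind every point in the cloud:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def lidar_return_to_point(t_round_trip_s, azimuth_deg, elevation_deg, intensity):
    """Convert one LiDAR return (ToF + beam angles) into an (x, y, z, intensity) point."""
    r = C * t_round_trip_s / 2.0   # d = c * t / 2
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = r * math.cos(el) * math.cos(az)
    y = r * math.cos(el) * math.sin(az)
    z = r * math.sin(el)
    return (x, y, z, intensity)

# A pulse returning after ~333 ns corresponds to a surface roughly 50 m away.
print(lidar_return_to_point(333e-9, azimuth_deg=10.0, elevation_deg=-2.0, intensity=0.7))
```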

Types of LiDAR

Mechanical spinning LiDAR: The original design, popularized by Velodyne. A rotating assembly spins the laser emitters 360°. Provides excellent 360° coverage but has moving parts that can wear out. The iconic Velodyne HDL-64E (64 channels) and the smaller VLP-16 “Puck” (16 channels) are classic examples.

Solid-state LiDAR: No large moving parts. Uses techniques like MEMS micromirrors, optical phased arrays (OPA), or flash illumination. More compact, cheaper to manufacture at scale, and more reliable. Companies in this space include Innoviz (MEMS-based), Luminar, and Hesai.

FMCW LiDAR: Frequency-Modulated Continuous Wave LiDAR measures the frequency shift of reflected light, providing instantaneous velocity in addition to range. This is a significant advantage for detecting and tracking moving objects. Aeva and Aurora are developing FMCW LiDAR systems.

Key Specifications

Modern automotive LiDAR sensors typically offer:

- Maximum range on the order of 200–300 m (shorter for low-reflectivity targets)
- Angular resolution of roughly 0.1–0.2°
- Hundreds of thousands to a few million points per second
- Scan rates of 10–20 Hz

Strengths and Limitations

Strengths:

- Direct, centimeter-level 3D measurement of the environment
- Works in complete darkness, since it provides its own illumination
- 360° coverage from a single spinning unit

Limitations:

- Performance degrades in heavy rain, fog, and snow
- Higher cost than cameras and radar
- No color or texture information, and sparse returns at long range

The Camera vs. LiDAR Debate

One of the most contentious debates in autonomous driving: is LiDAR necessary, or can cameras alone suffice?

Tesla’s position: Cameras only. CEO Elon Musk has called LiDAR a “crutch” and argued that since humans drive with vision alone, machines should be able to as well. Tesla removed radar from its vehicles in 2021 and ultrasonic sensors in 2022, relying entirely on 8 cameras.

Waymo’s position: Multi-sensor (cameras + LiDAR + radar). Co-CEO Dmitri Dolgov has argued that LiDAR’s direct 3D measurement provides a critical safety layer that cameras cannot match. Waymo develops its own custom LiDAR sensors optimized for cost and performance.

The technical truth is nuanced. Cameras provide richer semantic information; LiDAR provides more reliable geometric information. The best-performing systems as of 2026 use both. However, the rapid improvement of vision-only systems (especially with end-to-end learning) is narrowing the gap.

Radar (Radio Detection and Ranging)

Radar uses radio waves to detect objects, measure their distance, and — crucially — their radial velocity via the Doppler effect.

How It Works

A radar transmitter emits radio waves (typically at 77 GHz for automotive long-range radar, 24 GHz for short-range). When these waves hit an object, they are reflected back. The system measures:

  1. Range: From the round-trip time of the signal.
  2. Velocity: From the Doppler frequency shift. An approaching object shifts the frequency up; a receding object shifts it down (see the numeric sketch after this list):
\[v_r = \frac{\Delta f \cdot c}{2 f_0}\]

where $\Delta f$ is the frequency shift, $c$ is the speed of light, and $f_0$ is the transmit frequency.

  3. Angle: Modern MIMO (Multiple-Input, Multiple-Output) radar uses multiple antenna elements to estimate the angle of arrival.
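
Here is the numeric sketch referenced above: the Doppler relation evaluated in Python for the 77 GHz carrier mentioned earlier, with an illustrative frequency-shift value:

```python
C = 299_792_458.0  # speed of light, m/s

def radial_velocity(delta_f_hz, carrier_hz=77e9):
    """Radial velocity from Doppler shift: v_r = (delta_f * c) / (2 * f0)."""
    return delta_f_hz * C / (2.0 * carrier_hz)

# A +10.3 kHz shift at 77 GHz corresponds to roughly 20 m/s (~72 km/h) of closing speed.
print(radial_velocity(10.3e3))  # ≈ 20.05 m/s
```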

Types of Automotive Radar

Automotive radar is usually grouped by range:

Short-range radar (SRR): Traditionally 24 GHz. Used for parking assistance, blind-spot monitoring, and cross-traffic alerts.

Long-range radar (LRR): Typically 77 GHz. Used for adaptive cruise control and forward collision warning at highway speeds.

4D Imaging Radar

The latest generation of automotive radar — 4D imaging radar — represents a significant advancement. By using large antenna arrays (MIMO configurations with hundreds of virtual channels), 4D imaging radar can resolve objects in range, velocity, azimuth, and elevation. This produces radar “point clouds” that begin to approach LiDAR-like spatial resolution, while retaining radar’s all-weather capability and direct velocity measurement.
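
To make “hundreds of virtual channels” concrete: in a MIMO array, the number of virtual channels is the product of transmit and receive elements, so an (illustrative) configuration with 12 transmit and 16 receive antennas yields $12 \times 16 = 192$ virtual channels.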

Companies developing 4D imaging radar: Arbe, Continental, ZF, Vayyar.

Strengths and Limitations

Strengths:

- Works reliably in rain, fog, snow, and darkness
- Direct velocity measurement via the Doppler effect
- Long range at low cost, with decades of automotive deployment behind it

Limitations:

- Low angular resolution compared to LiDAR and cameras
- Limited ability to classify what it detects
- Clutter and multipath reflections from guardrails, bridges, and manhole covers
- Traditional radar often filters out stationary returns, which can hide stopped vehicles

Ultrasonic Sensors

Ultrasonic sensors emit high-frequency sound waves (typically 40–48 kHz) and measure the echo return time. They are the simplest and cheapest ranging sensor.
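
For a quick sense of scale, the same time-of-flight relation used for LiDAR applies, only with the speed of sound in air (about 343 m/s at room temperature): an echo returning after 6 ms corresponds to a distance of roughly one meter:

\[d = \frac{v_{\text{sound}} \cdot t}{2} \approx \frac{343 \cdot 0.006}{2} \approx 1.0 \text{ m}\]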

Specifications

Typical automotive ultrasonic sensors offer:

- Range: roughly 0.1–5 m
- Operating frequency: 40–48 kHz
- Cost: very low (the cheapest ranging sensor on the vehicle)

Use Cases

Primarily used for:

- Automated and assisted parking
- Low-speed maneuvering alerts (curbs, posts, and pedestrians very close to the vehicle)
- Covering the near-field blind zone that cameras, LiDAR, and radar handle poorly

Note: Tesla removed ultrasonic sensors from its vehicles in late 2022, replacing their functionality with camera-based depth estimation. Most other manufacturers continue to use them.

Inertial Measurement Unit (IMU)

An IMU combines:

- A 3-axis accelerometer, measuring linear acceleration along each axis
- A 3-axis gyroscope, measuring angular velocity (rotation rate) about each axis

High-end IMUs also include magnetometers (measuring Earth’s magnetic field for heading).

How They Work

MEMS (Micro-Electro-Mechanical Systems) accelerometers use a tiny proof mass suspended by springs. When the device accelerates, the proof mass moves relative to the housing, and the displacement is measured capacitively. MEMS gyroscopes use the Coriolis effect on a vibrating structure to measure rotation.

IMU in Autonomous Vehicles

The IMU is critical for:

- Dead reckoning: estimating the vehicle’s motion between GPS fixes or during GPS outages
- Estimating orientation (roll, pitch, yaw) for localization and vehicle control
- Providing high-rate motion measurements (hundreds of hertz or more) that bridge the gaps between slower sensors

The main limitation is drift: small measurement errors accumulate over time, causing position estimates to diverge. This is why the IMU is always fused with other sensors (GPS, LiDAR, cameras).
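
A small sketch of why drift matters: double-integrating even a tiny constant accelerometer bias produces a position error that grows quadratically with time. The bias value below is illustrative.

```python
def position_error_from_bias(bias_mps2, seconds):
    """Position error from double-integrating a constant accelerometer bias:
    error = 0.5 * bias * t^2."""
    return 0.5 * bias_mps2 * seconds ** 2

# An (illustrative) 0.01 m/s^2 bias seems negligible, but after 60 s of pure
# inertial dead reckoning it amounts to ~18 m of position error, hence the
# need to fuse the IMU with GPS, LiDAR, or camera updates.
for t in (1, 10, 60):
    print(t, "s ->", position_error_from_bias(0.01, t), "m")
```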

GPS / GNSS

Global Navigation Satellite Systems (GNSS), including GPS (US), GLONASS (Russia), Galileo (EU), and BeiDou (China), provide global position by measuring signal travel times from multiple satellites and solving for the receiver’s location (trilateration).
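
The sketch below illustrates that position solve under idealized assumptions (no atmospheric delays, satellite clock errors, or measurement noise); the satellite geometry and receiver position are purely illustrative:

```python
import numpy as np

def solve_position(sat_positions, pseudoranges, iterations=10):
    """Estimate receiver position and clock bias from satellite positions (m)
    and pseudoranges (m) via Gauss-Newton least squares."""
    x = np.zeros(4)  # [x, y, z, clock_bias], clock bias expressed in meters
    for _ in range(iterations):
        vectors = sat_positions - x[:3]           # receiver-to-satellite vectors
        ranges = np.linalg.norm(vectors, axis=1)
        residuals = pseudoranges - (ranges + x[3])
        # Jacobian of predicted pseudoranges w.r.t. [x, y, z, bias]
        H = np.hstack([-vectors / ranges[:, None], np.ones((len(ranges), 1))])
        dx, *_ = np.linalg.lstsq(H, residuals, rcond=None)
        x += dx
    return x[:3], x[3]

# Purely illustrative geometry: four "satellites" ~20,000 km away,
# a receiver near the origin, and a 30 m receiver clock error.
true_pos = np.array([1000.0, 2000.0, 100.0])
clock_bias = 30.0
sats = np.array([
    [20e6, 0.0, 10e6],
    [0.0, 20e6, 10e6],
    [-15e6, 5e6, 12e6],
    [5e6, -15e6, 15e6],
])
pseudoranges = np.linalg.norm(sats - true_pos, axis=1) + clock_bias
pos, bias = solve_position(sats, pseudoranges)
print(pos, bias)  # recovers true_pos and the 30 m clock bias
```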

Standard GPS Accuracy

Consumer GPS provides accuracy of ~2–5 meters, which is insufficient for lane-level positioning.

RTK-GPS

Real-Time Kinematic (RTK) GPS uses a fixed base station with a known position to provide correction signals, achieving accuracy of 1–2 centimeters. RTK is used in some autonomous vehicle systems for precise localization, though it requires infrastructure (base stations) and clear sky visibility.

Limitations

- Multipath errors in “urban canyons”, where signals reflect off tall buildings
- Complete outages in tunnels and parking garages
- Low update rates (typically 1–10 Hz) compared to the other sensors on the vehicle

The Sensor Fusion Imperative

No single sensor provides a complete picture. The real power of an AV’s perception comes from sensor fusion — combining data from all sensors to create a unified, robust understanding of the environment.

Consider a scenario: a car is approaching an intersection in light rain at dusk.

- The cameras can still read the traffic light and lane markings, but fading light and glare from wet pavement degrade their reliability.
- The LiDAR returns a precise 3D outline of nearby vehicles and pedestrians, though the rain slightly reduces its effective range.
- The radar is unaffected by rain and darkness and reports the closing speed of cross traffic.
- The IMU and GPS keep the vehicle’s own position and motion estimate steady.

By fusing all these inputs, the system achieves a reliable understanding that no single sensor could provide alone. We will explore sensor fusion algorithms in detail in Chapter 4.

Waymo’s 6th-Generation Sensor Suite

As a concrete example of a production sensor suite, Waymo’s 6th-generation system (deployed commercially starting February 2026) includes:

- 13 cameras providing overlapping 360° coverage with both near-field and far-field resolution
- 4 LiDAR units
- 6 radar units
- An array of external microphones for detecting sirens and other audio cues

The entire suite is designed for redundancy: if any single sensor fails, the remaining sensors can still provide sufficient information for safe operation. The 6th-generation system features significantly improved resolution, range, and field of view compared to previous generations, while reducing the number of individual sensors (from 29 to 23) through better integration.

Compute Hardware

Processing sensor data in real time requires significant computational power. Modern autonomous vehicles use:

- Automotive-grade GPUs and SoCs (for example, NVIDIA’s DRIVE platform) for running perception, prediction, and planning neural networks
- Custom ASICs, such as Tesla’s in-house FSD computer, optimized for a specific software stack
- Redundant compute modules, so that a single hardware failure does not disable the driving system

The compute requirements are enormous: a typical AV generates 1–2 TB of raw sensor data per hour. Processing this data — running neural networks for detection, tracking, prediction, and planning — requires 50–500+ TOPS of AI compute, depending on the approach.
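
A back-of-the-envelope sketch of where that data volume comes from; every per-sensor figure below is an assumption for illustration, not a measurement from a specific vehicle:

```python
# Illustrative raw data-rate estimate (bytes per second).
cameras = 8 * 1_500_000 * 1.5 * 30   # 8 cameras, 1.5 MP, 12-bit raw Bayer (1.5 B/px), 30 Hz
lidar   = 1_200_000 * 16             # ~1.2 M points/s at ~16 bytes per point
other   = 10_000_000                 # radar, ultrasonic, IMU, GNSS: small by comparison

bytes_per_second = cameras + lidar + other
terabytes_per_hour = bytes_per_second * 3600 / 1e12
print(f"{terabytes_per_hour:.1f} TB/hour")   # ≈ 2 TB/hour, in line with the figure above
```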

Summary

The sensor suite is the foundation of an autonomous vehicle. Each sensor brings unique strengths:

- Cameras: rich color and semantic detail at low cost
- LiDAR: precise, direct 3D geometry, day or night
- Radar: range and velocity in any weather
- Ultrasonic: cheap, reliable close-range detection
- IMU: high-rate measurement of the vehicle’s own motion
- GPS/GNSS: absolute global position

The next chapter explores how computer vision algorithms extract meaning from this raw sensor data — turning pixels and point clouds into objects, lanes, and traffic lights.

