An autonomous vehicle’s ability to drive safely depends entirely on its ability to perceive the world. This chapter explores the sensor technologies that serve as the vehicle’s eyes, ears, and spatial awareness — and the trade-offs between them.
Modern autonomous vehicles use a combination of complementary sensors, each with distinct strengths and weaknesses. No single sensor is sufficient for safe autonomous driving; redundancy and diversity are essential.
| Sensor | Range | Precision | Works in Dark | Works in Rain/Fog | Cost | Data Type |
|---|---|---|---|---|---|---|
| Camera | 50–250 m | High (2D) | Poor | Moderate | Low | Color images |
| LiDAR | 10–300 m | Very High (3D) | Yes | Moderate | Medium–High | 3D point clouds |
| Radar | 10–300 m | Moderate | Yes | Yes | Low | Velocity + range |
| Ultrasonic | 0.1–5 m | Moderate | Yes | Yes | Very Low | Range |
| IMU | N/A | High | Yes | Yes | Low–Medium | Acceleration, rotation |
| GPS/GNSS | Global | 1–10 m | Yes | Yes | Low | Global position |
Cameras are the most information-rich sensor. A single camera frame contains millions of pixels with color, texture, and contextual information that no other sensor can match.
A camera sensor converts photons into electrical signals using a CMOS (Complementary Metal-Oxide-Semiconductor) or CCD (Charge-Coupled Device) imaging chip. Each pixel measures the intensity of light at its location, typically across three color channels (RGB). Modern automotive cameras operate at resolutions from 1 to 8 megapixels, capturing frames at 30–60 Hz.
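As a rough sense of scale, the raw (uncompressed) data rate of one camera follows directly from resolution, color depth, and frame rate. The figures below are illustrative assumptions, not a specific product's spec:

```python
# Back-of-envelope raw data rate for a single automotive camera.
# Assumed figures: 8 MP, 3 bytes per pixel (RGB), 30 frames per second.
def camera_data_rate_mb_s(megapixels: float, bytes_per_pixel: int, fps: int) -> float:
    """Raw, uncompressed data rate in MB/s."""
    return megapixels * 1e6 * bytes_per_pixel * fps / 1e6

print(camera_data_rate_mb_s(8, 3, 30))  # 720.0 (MB/s before compression)
```

At rates like this, on-sensor compression and careful bandwidth budgeting become essential parts of camera system design.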
Autonomous vehicles typically use multiple cameras with overlapping fields of view to achieve 360° coverage.
Tesla uses 8 cameras for its vision-only system. Waymo’s 6th-generation sensor suite includes 13 cameras providing overlapping 360° coverage with both near-field and far-field resolution.
A stereo camera pair (two cameras separated by a known baseline distance) can estimate depth through triangulation, similar to human binocular vision. Given a point visible in both cameras, the disparity (pixel offset between left and right images) is inversely proportional to depth:
\[d = \frac{f \cdot B}{\text{disparity}}\]

where $f$ is the focal length and $B$ is the baseline distance.
Stereo vision is computationally expensive and struggles with textureless surfaces, but provides dense depth maps without active illumination.
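The triangulation formula above can be sketched directly. The focal length and baseline here are hypothetical values, chosen only to illustrate the inverse relationship between disparity and depth:

```python
def stereo_depth(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from stereo disparity: d = f * B / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: f = 1000 px, baseline = 0.3 m.
# A 10 px disparity puts the point at ~30 m; halving the disparity doubles the depth.
near = stereo_depth(1000.0, 0.3, 10.0)  # ~30 m
far = stereo_depth(1000.0, 0.3, 5.0)    # ~60 m
```

Note how depth resolution degrades with distance: at long range, a sub-pixel disparity error translates into meters of depth error.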
Strengths:

- Richest semantic information: color, texture, and readable text (signs, signals, lane markings)
- Low cost and compact packaging
- Passive sensing, with no emissions and no interference between vehicles

Limitations:

- No direct depth measurement; distance must be inferred or triangulated from stereo
- Poor performance in darkness and strong glare
- Degraded by rain, fog, and lens contamination
LiDAR is often considered the “gold standard” sensor for autonomous driving due to its precise 3D measurement capabilities. It works by emitting laser pulses and measuring the time for each pulse to return after bouncing off an object.
A LiDAR sensor emits rapid pulses of near-infrared laser light (typically at wavelengths of 905 nm or 1550 nm). For each pulse, the sensor measures the round-trip time and computes the distance:

\[d = \frac{c \cdot t}{2}\]

where $c$ is the speed of light and $t$ is the round-trip time.
By scanning thousands of pulses per second across different angles, LiDAR builds a 3D point cloud — a collection of (x, y, z, intensity) points representing the surfaces in the environment.
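Under the time-of-flight principle above, turning a single return into a 3D point takes one range equation plus a spherical-to-Cartesian transform. A minimal sketch:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_range_m(round_trip_s: float) -> float:
    """Range from pulse time of flight: d = c * t / 2."""
    return C * round_trip_s / 2.0

def to_cartesian(range_m: float, azimuth_rad: float, elevation_rad: float):
    """Convert one LiDAR return (range, azimuth, elevation) to (x, y, z)."""
    horizontal = range_m * math.cos(elevation_rad)
    return (horizontal * math.cos(azimuth_rad),
            horizontal * math.sin(azimuth_rad),
            range_m * math.sin(elevation_rad))

# A pulse returning after ~667 ns corresponds to a target roughly 100 m away.
d = tof_range_m(667e-9)
point = to_cartesian(d, azimuth_rad=0.1, elevation_rad=0.02)
```

A real sensor repeats this conversion for every return, at hundreds of thousands to millions of returns per second, to build the point cloud.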
Mechanical spinning LiDAR: The original design, pioneered by Velodyne. A rotating assembly spins the laser emitters 360°. Provides excellent 360° coverage but has moving parts that can wear out. The Velodyne VLP-16 (16 channels) and the iconic HDL-64E (64 channels) are classic examples.
Solid-state LiDAR: No moving parts. Uses techniques like MEMS mirrors, optical phased arrays (OPA), or flash illumination. More compact, cheaper, and more reliable. Companies: Innoviz (MEMS-based), Luminar, Hesai.
FMCW LiDAR: Frequency-Modulated Continuous Wave LiDAR measures the frequency shift of reflected light, providing instantaneous velocity in addition to range. This is a significant advantage for detecting and tracking moving objects. Aeva and Aurora are developing FMCW LiDAR systems.
Modern automotive LiDAR sensors typically offer ranges of 100–300 m, angular resolution on the order of 0.1°, and frame rates of 10–20 Hz, producing hundreds of thousands to millions of points per second.
Strengths:

- Direct, centimeter-level 3D measurement independent of ambient light
- Works in complete darkness (active illumination)
- Point clouds map naturally onto geometric tasks: obstacle detection, free-space estimation, localization

Limitations:

- Degraded by rain, fog, and snow, which scatter and absorb the laser pulses
- Higher cost than cameras or radar
- No color or texture information; returns become sparse at long range
One of the most contentious debates in autonomous driving: is LiDAR necessary, or can cameras alone suffice?
Tesla’s position: Cameras only. CEO Elon Musk has called LiDAR a “crutch” and argued that since humans drive with vision alone, machines should be able to as well. Tesla removed radar from its vehicles in 2021 and ultrasonic sensors in 2022, relying entirely on 8 cameras.
Waymo’s position: Multi-sensor (cameras + LiDAR + radar). CEO Dmitri Dolgov has argued that LiDAR’s direct 3D measurement provides a critical safety layer that cameras cannot match. Waymo develops its own custom LiDAR sensors optimized for cost and performance.
The technical truth is nuanced. Cameras provide richer semantic information; LiDAR provides more reliable geometric information. The best-performing systems as of 2026 use both. However, the rapid improvement of vision-only systems (especially with end-to-end learning) is narrowing the gap.
Radar uses radio waves to detect objects, measure their distance, and — crucially — their radial velocity via the Doppler effect.
A radar transmitter emits radio waves (typically at 77 GHz for automotive long-range radar, 24 GHz for short-range). When these waves hit an object, they are reflected back. The system measures the object's range from the echo delay and its radial velocity from the Doppler shift:

\[v = \frac{c \cdot \Delta f}{2 f_0}\]

where $\Delta f$ is the frequency shift, $c$ is the speed of light, and $f_0$ is the transmit frequency.
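The Doppler relationship can be evaluated directly to recover radial velocity from a measured shift. A sketch, using an illustrative 77 GHz carrier:

```python
C = 299_792_458.0  # speed of light, m/s

def radial_velocity_m_s(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial velocity from Doppler shift: v = c * delta_f / (2 * f0)."""
    return C * doppler_shift_hz / (2.0 * carrier_hz)

# A ~5.14 kHz shift at a 77 GHz carrier corresponds to roughly 10 m/s (36 km/h)
# of closing speed: even road-relevant velocities produce only kHz-scale shifts.
v = radial_velocity_m_s(5.14e3, 77e9)
```

The tiny shift relative to the carrier is why automotive radar uses FMCW-style modulation rather than measuring the shift on the raw carrier directly.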
The latest generation of automotive radar — 4D imaging radar — represents a significant advancement. By using large antenna arrays (MIMO configurations with hundreds of virtual channels), 4D imaging radar can resolve objects in range, velocity, azimuth, and elevation. This produces radar “point clouds” that begin to approach LiDAR-like spatial resolution, while retaining radar’s all-weather capability and direct velocity measurement.
Companies developing 4D imaging radar: Arbe, Continental, ZF, Vayyar.
Strengths:

- Direct radial velocity measurement via the Doppler effect
- Robust in rain, fog, snow, and darkness
- Low cost, with decades of automotive deployment (adaptive cruise control, automatic emergency braking)

Limitations:

- Much lower angular resolution than LiDAR or cameras (for traditional radar)
- Sparse, noisy returns; clutter from metallic objects such as guardrails and manhole covers
- Difficulty distinguishing stationary obstacles from the stationary background
Ultrasonic sensors emit high-frequency sound waves (typically 40–48 kHz) and measure the echo return time. They are the simplest and cheapest ranging sensor.
Primarily used for:

- Parking assistance and automated parking
- Low-speed maneuvering near curbs, bollards, and walls
- Detecting obstacles in the blind zone immediately around the vehicle body
Note: Tesla removed ultrasonic sensors from its vehicles in late 2022, replacing their functionality with camera-based depth estimation. Most other manufacturers continue to use them.
An IMU combines:

- Accelerometers (three axes), measuring linear acceleration
- Gyroscopes (three axes), measuring angular velocity
High-end IMUs also include magnetometers (measuring Earth’s magnetic field for heading).
MEMS (Micro-Electro-Mechanical Systems) accelerometers use a tiny proof mass suspended by springs. When the device accelerates, the proof mass moves relative to the housing, and the displacement is measured capacitively. MEMS gyroscopes use the Coriolis effect on a vibrating structure to measure rotation.
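The spring-mass principle reduces to one line of algebra: at equilibrium, the inertial force on the proof mass balances the spring force ($ma = kx$), so acceleration follows from the measured displacement. The mass and stiffness values below are illustrative only, not real device parameters:

```python
def accel_from_displacement(spring_k_n_per_m: float, proof_mass_kg: float,
                            displacement_m: float) -> float:
    """MEMS accelerometer model: m * a = k * x at equilibrium, so a = k * x / m."""
    return spring_k_n_per_m * displacement_m / proof_mass_kg

# Illustrative numbers: a 1 mg proof mass on a 1 N/m spring deflects about
# 9.8 micrometers under 1 g of acceleration.
a = accel_from_displacement(1.0, 1e-6, 9.81e-6)  # ~9.81 m/s^2
```

The micrometer-scale deflections are why the readout is capacitive: tiny gap changes between interdigitated plates are far easier to measure electrically than optically or mechanically.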
The IMU is critical for:

- Dead reckoning when GNSS is unavailable (tunnels, urban canyons, parking garages)
- High-rate ego-motion estimation between slower sensor updates
- Compensating for the vehicle's own motion when interpreting other sensors' data
The main limitation is drift: small measurement errors accumulate over time, causing position estimates to diverge. This is why the IMU is always fused with other sensors (GPS, LiDAR, cameras).
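The quadratic character of that drift is easy to see: a constant accelerometer bias, integrated twice, produces position error that grows with $t^2$. The bias value below is an assumed order of magnitude, not a measured spec:

```python
def position_drift_m(bias_m_s2: float, t_s: float) -> float:
    """Position error from double-integrating a constant accelerometer bias:
    0.5 * b * t^2."""
    return 0.5 * bias_m_s2 * t_s ** 2

# With an assumed 0.01 m/s^2 bias:
for t in (1, 10, 60):
    print(t, "s ->", position_drift_m(0.01, t), "m")
# ~0.005 m after 1 s, ~0.5 m after 10 s, ~18 m after 60 s
```

An error that is negligible over one second becomes many meters within a minute, which is exactly why absolute references must correct the IMU continuously.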
Global Navigation Satellite Systems (GNSS), including GPS (US), GLONASS (Russia), Galileo (EU), and BeiDou (China), provide global position by trilateration: measuring signal travel times from at least four satellites to solve for the receiver's position and clock offset.
Consumer GPS provides accuracy of ~2–5 meters, which is insufficient for lane-level positioning.
Real-Time Kinematic (RTK) GPS uses a fixed base station with a known position to provide correction signals, achieving accuracy of 1–2 centimeters. RTK is used in some autonomous vehicle systems for precise localization, though it requires infrastructure (base stations) and clear sky visibility.
No single sensor provides a complete picture. The real power of an AV’s perception comes from sensor fusion — combining data from all sensors to create a unified, robust understanding of the environment.
Consider a scenario: a car is approaching an intersection in light rain at dusk.

- Cameras read the traffic light's color and the lane markings, but glare off the wet road and fading light reduce their confidence.
- LiDAR measures the precise 3D positions of nearby vehicles and pedestrians, though raindrops add scattered noise to the point cloud.
- Radar tracks the speed of cross traffic, unaffected by the rain or the low light.
- The IMU and GNSS maintain the vehicle's own motion and position estimates throughout.
By fusing all these inputs, the system achieves a reliable understanding that no single sensor could provide alone. We will explore sensor fusion algorithms in detail in Chapter 4.
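The intuition behind fusion can be captured in a few lines: when several sensors independently estimate the same quantity, weighting each by its inverse variance yields a combined estimate more certain than any single input (this is the static, one-measurement case of the Kalman update). The variances below are illustrative assumptions, not real sensor specifications:

```python
def fuse(estimates):
    """Inverse-variance fusion. estimates: list of (value, variance) pairs.
    Returns (fused_value, fused_variance)."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    fused_value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return fused_value, 1.0 / total

# Camera depth estimate (noisy), LiDAR (precise), radar (moderate),
# all measuring range to the same vehicle, in meters:
value, variance = fuse([(41.0, 4.0), (39.8, 0.04), (40.5, 1.0)])
# The fused estimate lands near the LiDAR value, with variance below
# that of any single sensor.
```

The most trusted sensor dominates the result but never fully overrides the others, which is what makes the combination robust when one sensor degrades.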
As a concrete example of a production sensor suite, Waymo's 6th-generation system (deployed commercially starting February 2026) includes 13 cameras, 4 LiDAR units, and 6 radar units (23 sensors in total).
The entire suite is designed for redundancy: if any single sensor fails, the remaining sensors can still provide sufficient information for safe operation. The 6th-generation system features significantly improved resolution, range, and field of view compared to previous generations, while reducing the number of individual sensors (from 29 to 23) through better integration.
Processing sensor data in real time requires significant computational power. Modern autonomous vehicles use high-performance GPUs or dedicated AI accelerators, automotive-grade systems-on-chip, and in some cases custom silicon, with redundant compute paths for safety.
The compute requirements are enormous: a typical AV generates 1–2 TB of raw sensor data per hour. Processing this data — running neural networks for detection, tracking, prediction, and planning — requires 50–500+ TOPS of AI compute, depending on the approach.
The sensor suite is the foundation of an autonomous vehicle. Each sensor brings unique strengths: cameras contribute rich semantic detail, LiDAR precise 3D geometry, radar velocity measurement and all-weather robustness, ultrasonic sensors cheap close-range coverage, and the IMU and GNSS the vehicle's own motion and global position.
The next chapter explores how computer vision algorithms extract meaning from this raw sensor data — turning pixels and point clouds into objects, lanes, and traffic lights.
| ← Previous: Introduction to Autonomous Vehicles | Next: Computer Vision and Deep Learning → |