Generated using AI. Be aware that everything might not be accurate.

Chapter 6: Build Cache and BuildKit

The previous chapters focused on image size — making the final image smaller. This chapter focuses on a related but distinct concern: build speed. Fast builds are not just a developer convenience; in CI pipelines that run on every commit, slow builds translate directly into slower feedback loops and higher compute costs.

Docker’s layer cache and BuildKit’s advanced caching primitives are the tools for addressing this.

Layer Ordering for Cache Efficiency

As established in Chapter 1, Docker invalidates a layer’s cache when its instruction or inputs change, and invalidates all downstream layers as a consequence. The rule is simple:

Put layers that change infrequently at the top. Put layers that change frequently at the bottom.

For a typical application:

Layer content	Change frequency
Base OS image	Months
System packages	Weeks
Language runtime config	Weeks
Dependency manifests (`requirements.txt`)	Days to weeks
Dependency installation	On manifest change
Application source code	Every commit

A Dockerfile that honours this order gets a cache hit on all expensive layers (system packages, dependency installation) on the vast majority of builds, and only rebuilds from the source copy layer onward.
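The ordering in the table can be sketched as a single-stage Dockerfile. This is a minimal illustration, not a prescribed layout: the libpq5 package, the PYTHONUNBUFFERED setting and the app.main module are placeholder assumptions.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: libpq5, PYTHONUNBUFFERED and app.main are placeholders.

# Base OS image: changes every few months
FROM python:3.12-slim-bookworm

# System packages: change every few weeks
RUN apt-get update && \
    apt-get install -y --no-install-recommends libpq5 && \
    rm -rf /var/lib/apt/lists/*

# Language runtime config: changes rarely
ENV PYTHONUNBUFFERED=1
WORKDIR /app

# Dependency manifest: copied alone so source edits do not invalidate it
COPY requirements.txt .

# Dependency installation: re-runs only when requirements.txt changes
RUN pip install --no-cache-dir -r requirements.txt

# Application source: changes on every commit, so it goes last
COPY . .

CMD ["python", "-m", "app.main"]
```

With this layout, a commit that touches only source files invalidates the final COPY and nothing above it.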

BuildKit

BuildKit is Docker’s next-generation build engine. It is the default builder in Docker Engine 23.0 and later; on older versions it can be enabled explicitly with:

DOCKER_BUILDKIT=1 docker build .
# or through Buildx, which always builds with BuildKit:
docker buildx build .

BuildKit provides several features relevant to image optimisation:

Parallel stage execution — independent stages in a multi-stage build run in parallel
Cache mounts — a persistent cache that survives across builds without being baked into layers
Bind mounts — mount build-context files without copying them into a layer
Secrets — inject sensitive values at build time without them appearing in the image

The most important for size optimisation is the cache mount.
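Of the other features listed above, parallel stage execution is the one most visible in build times. A rough sketch follows, assuming a hypothetical project with a frontend/ directory whose npm build script writes to dist/ and a Python entry point app.main. The two named stages do not reference each other, so BuildKit is free to run them concurrently.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: frontend/, the "build" npm script, dist/ and app.main are assumptions.

# Stage A: Python dependencies (independent of stage B)
FROM python:3.12-slim-bookworm AS python-deps
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage B: front-end assets (independent of stage A)
FROM node:22-slim AS frontend
WORKDIR /ui
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci
COPY frontend/ .
RUN npm run build

# Final stage: depends on both, so it starts once A and B have finished
FROM python:3.12-slim-bookworm
WORKDIR /app
COPY --from=python-deps /opt/venv /opt/venv
COPY --from=frontend /ui/dist ./static
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
CMD ["python", "-m", "app.main"]
```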

BuildKit Cache Mounts

The fundamental tension in package management is:

pip install --no-cache-dir produces clean layers but re-downloads every package whenever the install layer is rebuilt.
pip install with its cache enabled avoids repeated downloads but bakes the cache into the layer, adding size.

BuildKit cache mounts resolve this tension. A cache mount is a directory that persists between builds on the same machine, but is never included in the image layer:

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

The pip cache at /root/.cache/pip persists across builds (fast) but does not appear in the final image (small). You get the speed of caching and the cleanliness of --no-cache-dir.

apt cache mount

RUN --mount=type=cache,target=/var/cache/apt \
    --mount=type=cache,target=/var/lib/apt/lists \
    apt-get update && \
    apt-get install -y --no-install-recommends libpq5

Note: with cache mounts for apt, you no longer need rm -rf /var/lib/apt/lists/* in the same RUN — the lists are in the cache mount, not the layer.
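One caveat applies on the official Debian and Ubuntu based images: they ship an /etc/apt/apt.conf.d/docker-clean file that deletes downloaded packages after each install, which would leave the /var/cache/apt mount empty. The variant below, modelled on the pattern in Docker's own documentation, removes that file and adds sharing=locked so that parallel builds do not use the apt directories at the same time. The debian:bookworm-slim base and the libpq5 package are illustrative choices.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# Stop the base image from deleting downloaded .deb files after each install,
# so the cache mount below actually retains them.
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
      > /etc/apt/apt.conf.d/keep-cache

# sharing=locked makes concurrent builds wait for each other
# instead of touching the apt directories simultaneously.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends libpq5
```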

npm cache mount

RUN --mount=type=cache,target=/root/.npm \
    npm ci

Full BuildKit Python example

FROM python:3.12-slim-bookworm AS builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

Full example: code/dockerfiles/buildkit_cache_example.Dockerfile
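The referenced file is not reproduced here. As a sketch of how the builder stage above might be paired with a runtime stage, assuming a hypothetical entry point app.main, the virtual environment can be copied across so that the final image contains the installed packages but neither the pip cache nor the builder's intermediate layers:

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: the runtime stage and app.main are assumptions, not the author's file.

FROM python:3.12-slim-bookworm AS builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
# The pip cache lives in the cache mount and is not written to the layer
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

FROM python:3.12-slim-bookworm AS runtime
WORKDIR /app
# Only the populated virtual environment crosses the stage boundary
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
CMD ["python", "-m", "app.main"]
```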

Remote Cache for CI Pipelines

On CI runners, each job typically starts from a clean environment with no local Docker layer cache. Without remote caching, every CI build starts from scratch — including re-downloading and re-installing all dependencies.

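BuildKit supports remote cache backends, which store the build cache outside the runner so that a fresh job can import it. The most convenient for GitHub Actions is the GHA cache backend. It needs the GitHub Actions runtime token and cache URL, which docker/build-push-action supplies automatically; in a plain run step they have to be exposed to the shell first: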

docker buildx build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  -t myapp:latest .

mode=max exports cache for every layer, including intermediate stages of multi-stage builds. mode=min, the default, exports only the layers that end up in the final image.

For registry-based caching (works on any CI platform):

docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:buildcache \
  --cache-to type=registry,ref=registry.example.com/myapp:buildcache,mode=max \
  -t myapp:latest .

This imports cache from the registry before building and pushes the updated cache after building. The first build runs cold and populates the cache; later builds reuse every layer whose inputs are unchanged.

Inline Cache for Simple Cases

If you push your images to a registry and want simple cache reuse without a dedicated cache image, use inline cache:

docker buildx build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --cache-from myapp:latest \
  -t myapp:latest .

BUILDKIT_INLINE_CACHE=1 embeds cache metadata in the image itself. Once that image has been pushed, subsequent builds can import the cache by pointing --cache-from at it. This is simpler than a dedicated cache image but less efficient: inline cache supports only mode=min, so intermediate stages are not cached.

Bind Mounts for Build Inputs

For files that are only needed during a RUN step and should not appear in the layer, use a bind mount instead of COPY:

RUN --mount=type=bind,source=requirements.txt,target=/tmp/requirements.txt \
    pip install --no-cache-dir -r /tmp/requirements.txt

The requirements.txt is never copied into a layer — it is only available to the pip install command. This is a minor size optimisation (manifest files are small) but a useful pattern when combined with cache mounts.
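A minimal sketch of that combination: the manifest is bind-mounted for the duration of the install and pip's download cache is kept in a cache mount, so neither ends up in the layer. The base image is the one used earlier in this chapter; everything else comes from the snippets above.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim-bookworm

# requirements.txt is visible only during this RUN; the pip cache persists
# between builds on the same builder but is never written to the image.
RUN --mount=type=bind,source=requirements.txt,target=/tmp/requirements.txt \
    --mount=type=cache,target=/root/.cache/pip \
    pip install -r /tmp/requirements.txt
```

Note that --no-cache-dir is dropped here; passing it would stop pip from using the mounted cache.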

Key Takeaways

Order Dockerfile instructions from least-changing to most-changing; expensive reinstalls should only happen when their inputs actually change.
BuildKit (DOCKER_BUILDKIT=1 or docker buildx build) is required for cache mounts and parallel stages.
Cache mounts (--mount=type=cache) provide build-time caching without bloating layers.
Use GHA or registry cache backends in CI pipelines to avoid cold builds on every run.
mode=max caches all intermediate stages; prefer it for multi-stage builds in CI.



Gaëlle Candel