Chapter 5: Package Manager Best Practices


Package managers are where most image bloat originates. Every package manager has caches, recommended packages, documentation, and other artifacts that it installs alongside what you asked for. None of these belong in a production container image. This chapter covers the flags and patterns that eliminate them.


apt / apt-get (Debian and Ubuntu)

apt-get is the package manager in Debian- and Ubuntu-based images, including python:*-slim-bookworm and all ubuntu:* images.

The canonical pattern

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        libpq5 \
        ca-certificates && \
    rm -rf /var/lib/apt/lists/*

Three rules, all in one RUN:

  1. apt-get update and install together: always run update immediately before install, in the same RUN. If they are in separate layers, Docker may reuse a cached update layer whose index is stale. The install step then requests package versions that the mirrors no longer serve, and the build fails.

  2. --no-install-recommends: apt installs “recommended” packages by default. These are packages the maintainer suggests but does not require. On a full system they improve the user experience; in a container they can add 50–150 MB of utilities you will never use.

  3. rm -rf /var/lib/apt/lists/*: apt stores its package index (the data downloaded by apt-get update) in /var/lib/apt/lists/. This index can be 30–80 MB. It is only needed during installation. Deleting it in the same RUN instruction that created it keeps it out of the layer.
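
The three rules above can be contrasted with what happens when they are broken. The following is a minimal sketch, assuming a debian:bookworm-slim base; the stage names split-layers and single-layer are hypothetical labels for illustration, not part of any convention.

```dockerfile
# Anti-pattern (for comparison only): three layers, cleanup arrives too late.
FROM debian:bookworm-slim AS split-layers
RUN apt-get update
RUN apt-get install -y --no-install-recommends libpq5 ca-certificates
# This hides the files from the final filesystem view, but the index is
# still stored in the layer created by "RUN apt-get update" above.
RUN rm -rf /var/lib/apt/lists/*

# Preferred: one layer, so the index is never persisted.
FROM debian:bookworm-slim AS single-layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        libpq5 \
        ca-certificates && \
    rm -rf /var/lib/apt/lists/*
```

Building each stage with docker build --target and comparing the results with docker history should show the package index still occupying space in the first variant.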

Listing packages explicitly

Install only what you need, with explicit version pins for reproducibility where appropriate:

# Explicit: clear and auditable
RUN apt-get update && \
    apt-get install -y --no-install-recommends libpq5=15.* && \
    rm -rf /var/lib/apt/lists/*

Use apt-cache show <package> to check what a package recommends before installing it with --no-install-recommends.
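
One way to run that check without touching the real image is a throwaway stage. This is only a sketch: the stage name inspect is hypothetical, and the debian:bookworm-slim base is an assumption that should be swapped for whatever base the image actually uses.

```dockerfile
# Throwaway stage for inspection only; "inspect" is a hypothetical stage name.
FROM debian:bookworm-slim AS inspect
# Print the dependency fields apt would act on for libpq5.
# Depends are always installed; Recommends are what the flag skips.
RUN apt-get update && \
    apt-cache show libpq5 | grep -E '^(Package|Version|Depends|Recommends|Suggests):'
```

With BuildKit, the output of RUN steps is collapsed by default, so building with docker build --target inspect --progress=plain . is probably needed to see the printed fields.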


pip (Python)

--no-cache-dir

pip maintains an HTTP cache of downloaded packages in ~/.cache/pip. This cache speeds up repeated installs on the same machine but serves no purpose inside a build layer — the next build starts from a clean layer anyway.

RUN pip install --no-cache-dir -r requirements.txt

Without --no-cache-dir, the pip cache is baked into the layer alongside the installed packages, which can add tens to hundreds of megabytes for nothing. This is one of the most common causes of Python image bloat.

--only-binary :all:

This flag instructs pip to only install pre-built binary wheels and refuse to compile from source. It prevents surprise compilation in the runtime image (which would require build tools) and speeds up installation:

RUN pip install --no-cache-dir --only-binary :all: -r requirements.txt

The downside: if a wheel is not available for your platform, the install fails. Use it in runtime stages where you want to guarantee that no build tools are needed.

Separate build and runtime dependencies

In multi-stage builds, install only what the runtime needs in the runtime stage:

# builder: build tools available; installs runtime and dev dependencies
RUN pip install --no-cache-dir -r requirements.txt -r requirements-dev.txt

# runtime: only the packages in requirements.txt; no build deps
RUN pip install --no-cache-dir -r requirements.txt

Keep a requirements-dev.txt for test and development dependencies and never install it in the runtime stage.
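
A fuller picture of the split may help. The sketch below shows one common variant, not necessarily the layout this chapter has in mind: the builder installs into a virtual environment and the runtime stage copies that environment across instead of running pip again. The image tag, the /opt/venv path, the app/ directory, and main.py are all assumptions made for illustration.

```dockerfile
# builder: has compilers and headers, never shipped
FROM python:3.12-slim-bookworm AS builder
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential libpq-dev && \
    rm -rf /var/lib/apt/lists/*
# Install into an isolated venv so it can be copied as one directory.
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# runtime: only shared libraries and the prepared venv
FROM python:3.12-slim-bookworm AS runtime
RUN apt-get update && \
    apt-get install -y --no-install-recommends libpq5 && \
    rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Hypothetical application layout.
COPY app/ /app/
CMD ["python", "/app/main.py"]
```

Because requirements-dev.txt is never copied into either stage here, dev dependencies cannot leak into the final image by accident.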


apk (Alpine Linux)

Alpine’s package manager is faster and more size-efficient than apt by default.

--no-cache

The --no-cache flag is the equivalent of combining apk update with cache cleanup:

RUN apk add --no-cache libpq

This fetches a fresh package index, installs the package, and keeps neither the index nor the downloaded packages on disk, all in one operation and without a separate cleanup command.
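
For comparison, here is a rough sketch of the manual sequence that the flag replaces. The alpine:3.20 tag and the stage names manual-cleanup and no-cache are assumptions chosen for illustration.

```dockerfile
# Manual equivalent: update, install, then delete the cached index.
FROM alpine:3.20 AS manual-cleanup
RUN apk update && \
    apk add libpq && \
    rm -rf /var/cache/apk/*

# Same result with the flag: nothing is cached in the first place.
FROM alpine:3.20 AS no-cache
RUN apk add --no-cache libpq
```

Both stages should end up the same size; the second is simply harder to get wrong.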

The virtual package pattern

For packages needed only at build time (compilers, headers), use --virtual to group them under a named metapackage that can be removed atomically after compilation:

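RUN apk add --no-cache libpq && \
    apk add --no-cache --virtual .build-deps \
        gcc \
        musl-dev \
        postgresql-dev && \
    pip install --no-cache-dir psycopg2 && \
    apk del .build-deps

After apk del .build-deps, gcc, musl-dev, and postgresql-dev are all removed in the same layer, so they never persist in the image. The .build-deps name is a convention; it can be anything. The compiled psycopg2 .so remains because pip installed it into the Python path, where apk does not track it. It still loads the libpq shared library at runtime, which is why libpq is installed outside the virtual group: otherwise apk del would remove it along with postgresql-dev.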


npm / yarn (Node.js)

npm ci instead of npm install

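npm ci is the install command meant for automated builds. It installs exactly the versions recorded in package-lock.json, fails if package.json and the lockfile disagree, and deletes any existing node_modules before installing: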

COPY package*.json ./
RUN npm ci

Omit dev dependencies

RUN npm ci --omit=dev

This excludes packages in devDependencies from the install. For applications built in a separate stage (using webpack, esbuild, etc.), the runtime stage only needs production dependencies.
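
Put together, a two-stage Node.js build might look like the sketch below. It assumes a build script in package.json that writes to dist/, and an entry point at dist/index.js; those names and the node:20-alpine tag are illustrative, not taken from any particular project.

```dockerfile
# build stage: needs devDependencies (bundler, compiler, etc.)
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Assumes package.json defines a "build" script that emits ./dist
RUN npm run build

# runtime stage: production dependencies and built output only
FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force
COPY --from=build /app/dist ./dist
# Hypothetical entry point.
CMD ["node", "dist/index.js"]
```

Copying package*.json before the rest of the source keeps the dependency layer cacheable when only application code changes.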

Clean the npm cache

RUN npm ci && npm cache clean --force

Or use a temp cache directory:

RUN npm ci --cache /tmp/npm-cache && rm -rf /tmp/npm-cache
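
For projects that use yarn instead, a roughly equivalent install is sketched below. It assumes Yarn Classic (1.x) and a committed yarn.lock; newer Yarn releases use different flags, so treat this as a starting point rather than a general recipe. The base image tag is illustrative.

```dockerfile
# Assumes Yarn Classic (1.x) is available in the base image.
FROM node:20-alpine
WORKDIR /app
COPY package.json yarn.lock ./
# --frozen-lockfile: fail instead of rewriting yarn.lock
# --production: skip devDependencies
RUN yarn install --frozen-lockfile --production && \
    yarn cache clean
```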

Summary Table

Manager | Install flag            | Cache cleanup               | Build-only deps
--------+-------------------------+-----------------------------+----------------------------------
apt-get | --no-install-recommends | rm -rf /var/lib/apt/lists/* | Manual list + cleanup in same RUN
pip     | --no-cache-dir          | Built-in when flag used     | Separate requirements.txt files
apk     | --no-cache              | Built-in when flag used     | --virtual .build-deps + apk del
npm     | --omit=dev              | npm cache clean --force     | Multi-stage build

Key Takeaways

