The problem with shipping the whole kitchen
You write a Go service. It compiles to a single binary. You wrap it in a Docker container to ship it to production. The image finishes building at 820 megabytes. You stare at the terminal. Your actual binary is 14 megabytes. The rest is the Go compiler, the standard library, build tools, and development dependencies that your application never touches at runtime.
Shipping a container with the entire build environment is like buying a single novel and paying for the freight of the entire printing press. It slows down deployments, increases attack surface, and wastes registry storage. Multi-stage builds solve this by separating the compilation environment from the runtime environment. You compile in a heavy image, extract the artifact, and drop it into a clean, minimal image.
What a multi-stage build actually does
A multi-stage Dockerfile contains multiple FROM instructions. Each FROM starts a new build stage with its own filesystem, environment variables, and layer history. Docker does not merge stages automatically. You explicitly copy artifacts from one stage to another using COPY --from=<stage_name>.
Think of a car manufacturing plant. The assembly floor has industrial cranes, welding robots, paint booths, and diagnostic computers. Those tools are essential to build the car. The finished vehicle does not need the factory to drive. It only needs an engine, wheels, and a driver. The first Docker stage is the factory floor. The second stage is the driveway. You leave the heavy machinery behind and only load the finished product.
Go makes this pattern especially effective. With cgo disabled (CGO_ENABLED=0), the Go toolchain produces a statically linked binary that contains everything it needs to run. With cgo enabled, which is the default when a C compiler is available, packages such as net and os/user can add a dynamic dependency on libc. A static binary does not require a system-wide Go installation, a package manager, or a dynamic linker. You can run it on an image that contains absolutely nothing but the executable itself.
Multi-stage builds are cheap. Leave the compiler behind.
The minimal working example
Here is the simplest two-stage Dockerfile that compiles a Go module and ships only the binary.
# Stage 1: heavy build environment with compiler and source
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
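# Disable cgo so the binary does not depend on the build image's glibc (Alpine uses musl)
RUN CGO_ENABLED=0 go build -o myapp .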
# Stage 2: lightweight runtime with only the compiled binary
FROM alpine:latest
WORKDIR /app
COPY --from=builder /src/myapp .
CMD ["./myapp"]
The first stage pulls the official Go image. It sets the working directory, copies your source code, and runs the compiler. The AS builder clause assigns a name to this stage so you can reference it later.
The second stage pulls a minimal Linux distribution. It creates a working directory, copies the compiled binary from the builder stage, and sets the default command. The first stage's filesystem stays behind in the local build cache and never becomes part of the final image. The final image contains only the Alpine base plus your binary.
How the build process unfolds
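Docker resolves a multi-stage Dockerfile stage by stage. The legacy builder runs the stages in order from top to bottom. BuildKit, the default builder in current Docker releases, builds only the stages the target depends on and can run independent stages in parallel. Each FROM instruction produces an intermediate result that lives in the local build cache. Only the final stage receives your tag. The earlier stages are not pushed to your registry unless you build and tag them explicitly.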
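To make the point about intermediate stages concrete, here is a small sketch that reuses the minimal example. The tags myapp:latest and myapp:build in the comments are hypothetical names chosen for illustration. The --target flag of docker build stops the build at a named stage, which is one way to tag and inspect a stage that would otherwise stay in the build cache.

```dockerfile
# Hypothetical tags; run these from the directory that holds this Dockerfile.
#   docker build -t myapp:latest .                   builds through the final stage
#   docker build --target builder -t myapp:build .   stops at the stage named "builder"
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

FROM alpine:latest
WORKDIR /app
COPY --from=builder /src/myapp .
CMD ["./myapp"]
```

Tagging the builder stage this way is mainly useful for debugging a failed compile. The image you deploy should still come from the final stage.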
When Docker hits COPY --from=builder, it looks up the filesystem state of the stage named builder at the exact moment that stage finished. It extracts the requested path and injects it into the current stage's filesystem. The source stage's layers are never included in the final image. This keeps the output small and secure.
Layer caching works per stage. If you change a source file, Docker rebuilds the builder stage from the first instruction affected by the change. In the runtime stage, the layers before COPY --from stay cached as long as the base image and their own instructions are unchanged. The COPY --from layer is keyed on the content of the copied file, so it reruns whenever the binary changes, for example after a source edit or a change to compiler flags, and is reused when a rebuild produces an identical binary.
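The sketch below shows how instruction order in the runtime stage interacts with the cache. The ca-certificates package is an assumption, standing in for whatever runtime setup your service needs. Placing that setup before COPY --from means a source edit does not force it to run again.

```dockerfile
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

FROM alpine:latest
# Not affected by source edits: cached until the base image or this line changes
RUN apk add --no-cache ca-certificates
WORKDIR /app
# Re-executed only when the content of /src/myapp differs from the cached copy
COPY --from=builder /src/myapp .
CMD ["./myapp"]
```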
Build caching respects stage boundaries. Optimize each stage separately.
A realistic production setup
Production Dockerfiles often benefit from more than two stages. They also need deterministic caching, security hardening, and explicit dependency management. Go's module system interacts directly with Docker's layer cache. If you copy your entire source tree before downloading dependencies, any source change invalidates the layer that holds the downloaded modules. The build then downloads modules from scratch every time.
The convention is to copy go.mod and go.sum first, run go mod download, and then copy the rest of the source. This isolates dependency resolution from source changes. You also disable CGO for pure Go binaries, switch to the scratch image instead of Alpine, and run the process as a non-root user.
Here is a production-ready pattern that applies those conventions.
# Stage 1: download dependencies in isolation for cache stability
FROM golang:1.23 AS deps
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
# Stage 2: compile the binary with deterministic flags
FROM deps AS builder
WORKDIR /src
COPY . .
# Disable CGO to produce a fully static binary
ENV CGO_ENABLED=0
# Strip debug info and optimize for size
RUN go build -trimpath -ldflags="-s -w" -o /out/myapp .
# Stage 3: empty runtime image with no OS libraries
FROM scratch
WORKDIR /app
# Copy only the binary from the builder stage
COPY --from=builder /out/myapp .
# Run as an unprivileged numeric UID:GID; scratch has no /etc/passwd to resolve names
USER 65534:65534
CMD ["./myapp"]
The deps stage isolates module downloads. Docker caches this layer until go.mod or go.sum changes. The builder stage inherits the cached dependencies, copies the source, and compiles. The -trimpath flag removes local filesystem paths from the binary, improving reproducibility. The -ldflags="-s -w" flags strip the symbol table and DWARF debugging information, shaving megabytes off the final binary.
The final stage uses scratch, Docker's reserved empty base image. It contains no shell, no package manager, and no libc. It holds only what you explicitly copy into it. Because scratch has no /etc/passwd, a user name such as nobody cannot be resolved there, so the USER instruction takes a numeric UID and GID instead. Running as an unprivileged user limits the damage if the application is compromised, although it does not rule out privilege escalation on its own. The scratch image is the smallest possible runtime for a static Go binary.
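If you prefer a readable user name on scratch, one common approach is to supply a minimal /etc/passwd yourself. The sketch below is an illustration under assumptions: the user name app, the UID and GID 10001, and the /tmp/passwd path are all hypothetical choices, not part of the original setup.

```dockerfile
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/myapp .
# Write a one-line passwd file describing a hypothetical unprivileged user "app"
RUN echo "app:x:10001:10001::/nonexistent:/sbin/nologin" > /tmp/passwd

FROM scratch
# Give the empty image just enough account data to resolve the name "app"
COPY --from=builder /tmp/passwd /etc/passwd
COPY --from=builder /out/myapp /app/myapp
USER app
ENTRYPOINT ["/app/myapp"]
```

A numeric USER value avoids the extra file entirely. The named variant mostly helps when tools or logs display the account name.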
Static binaries on scratch are the gold standard. Ship nothing but the executable.
Common pitfalls and build failures
Multi-stage builds introduce a few specific failure modes. Most stem from misunderstanding how Docker layers interact with Go's compilation model.
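The most frequent startup failure is a confusing no such file or directory or not found error for a binary that clearly exists in the image. This happens when you compile with cgo enabled and try to run the binary on scratch or on an image with a different libc, such as a glibc-linked binary on musl-based Alpine. With cgo enabled, the binary can link dynamically against the build image's C library, and at startup the kernel looks for the dynamic loader that the binary names. If the runtime image lacks that loader, the exec call fails even though the file is there. Set CGO_ENABLED=0 when targeting scratch or minimal images, or match the runtime image's libc to the build image when you need cgo. A separate error, exec format error, points to a CPU architecture mismatch instead, for example an arm64 binary started on an amd64 host.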
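When a dependency genuinely needs cgo, one option is to keep the build and runtime stages on the same distribution so the C library matches. The sketch below assumes a Debian bookworm pairing and a service that imports a C library. Treat the image tags as one workable combination, not the only one.

```dockerfile
# Build with cgo enabled (assumes the code links against a C library)
FROM golang:1.23-bookworm AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=1 go build -o /out/myapp .

# Run on the same Debian release so the glibc version matches the build stage
FROM debian:bookworm-slim
COPY --from=builder /out/myapp /usr/local/bin/myapp
# Numeric unprivileged UID:GID
USER 65534:65534
CMD ["/usr/local/bin/myapp"]
```

The image is larger than scratch, but the dynamic loader and libc that the binary expects are present at runtime.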
Another common failure is a not found or no such file or directory error during the COPY --from step. This usually means you referenced the wrong stage name or a source path that does not exist in that stage. A relative build output such as -o myapp lands under the builder's WORKDIR, so the COPY source must spell out that full path. Docker does not warn you about missing source files in earlier stages. It simply fails the build with a missing-path error. Verify that the stage name matches the AS clause exactly, and use absolute source paths.
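The path rules are easier to see side by side. This sketch reuses the minimal example and adds two commented-out lines that would fail. The stage name build and the path /app/myapp in those comments are deliberately wrong, for illustration only.

```dockerfile
FROM golang:1.23 AS builder
WORKDIR /src
COPY . .
# The binary lands at /src/myapp because WORKDIR is /src
RUN CGO_ENABLED=0 go build -o myapp .

FROM alpine:latest
WORKDIR /app
# Source: absolute path inside the builder stage.
# Destination: "." resolves against this stage's WORKDIR, so the file becomes /app/myapp.
COPY --from=builder /src/myapp .
# Either of these would fail the build:
#   COPY --from=build /src/myapp .     no stage named "build"; Docker looks for an image by that name
#   COPY --from=builder /app/myapp .   the path does not exist in the builder stage
CMD ["./myapp"]
```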
Cache invalidation surprises cause slow builds. If you place COPY . . before go mod download, any source change invalidates that layer and every layer after it. Docker then discards the cached module download and fetches everything again, which can stretch a build from seconds to minutes. Keep dependency downloads isolated from source copies.
Module downloads time out when the build environment lacks network access or sits behind a corporate firewall. The build does not inherit GOPROXY from your local shell or go env settings unless you pass it in explicitly. Go already defaults to https://proxy.golang.org,direct, so if that proxy is unreachable, set GOPROXY in the build stage to a proxy your network allows, for example through a build argument.
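One way to pass a proxy setting into the build is a build argument. In the sketch below, goproxy.example.internal is a placeholder host, and the default value simply mirrors Go's own default so the build behaves the same when no argument is given.

```dockerfile
FROM golang:1.23 AS deps
# Override at build time, for example:
#   docker build --build-arg GOPROXY=https://goproxy.example.internal,direct .
# (goproxy.example.internal is a placeholder, not a real proxy)
ARG GOPROXY=https://proxy.golang.org,direct
# Persist the value as an environment variable so stages built FROM deps inherit it
ENV GOPROXY=${GOPROXY}
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
```

Changing the build argument invalidates the cached go mod download layer, which is usually what you want when switching proxies.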
The go command fails with go.mod file not found in current directory or any parent directory if you forget to copy go.mod. Older Go releases report cannot find main module instead. The compiler rejects unused imports with imported and not used if you accidentally leave debug code in the final build. Treat the Dockerfile as part of your build pipeline. Test it locally before pushing.
Cache layers are fragile. Copy dependencies first, then source.
When to use multi-stage builds
Multi-stage builds are not required for every project. They add complexity to the Dockerfile and introduce an extra stage to debug. Pick the right tool for your deployment target.
Use a multi-stage build when your language requires a compiler or build toolchain that is too large to ship to production. Use a multi-stage build when you need to isolate dependency resolution from source code to optimize Docker layer caching. Use a multi-stage build when you want to run the final container on a minimal or empty image to reduce attack surface. Use a single-stage build when your application is interpreted and the runtime image already contains everything needed to execute the code. Use a single-stage build when you are prototyping locally and want the fastest feedback loop without managing stage boundaries.
Build complexity should match deployment risk. Keep it simple until you need the separation.