Introduction
Docker builds images automatically by reading instructions from a Dockerfile. This is a plain-text file that contains, in order, the commands needed to build your image. The file must follow a specific format and use a specific set of instructions. If you have any questions, or you are new to Docker or to writing Dockerfiles, start with the Dockerfile Reference page.
When building Docker images you should aim for the smallest size possible. Furthermore, if you share layers among images, it will be easier (and faster) to deploy targeted apps.
As you may know, Dockerfile instructions create new layers. You may have noticed that many publicly available Dockerfiles use the following trick, chaining the install and the cleanup in a single RUN:
FROM debian
RUN apt-get -qq update && apt-get install -qq -y --no-install-recommends bzip2 gcc && rm -rf /var/lib/{apt,dpkg,cache,log}/ /tmp/* /var/tmp/*
This is a reasonable solution. Adding bzip2 and gcc to the debian base image in a single command produces a 212MB image (at the time of writing):
$ docker build -t="debian-bzip-gcc-v1" .
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM debian
---> 1b3ec9d977fb
Step 2/2 : RUN apt-get -qq update && apt-get install -qq -y --no-install-recommends bzip2 gcc && rm -rf /var/lib/{apt,dpkg,cache,log}/ /tmp/* /var/tmp/*
---> Running in e89204303fe4
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 6487 files and directories currently installed.)
.
.
.
Setting up bzip2 (1.0.6-8.1) ...
Setting up gcc (4:6.3.0-4) ...
Processing triggers for libc-bin (2.24-11+deb9u1) ...
Removing intermediate container e89204303fe4
---> b83f4616276b
Successfully built b83f4616276b
Successfully tagged debian-bzip-gcc-v1:latest
$ docker history debian-bzip-gcc-v1
IMAGE          CREATED          CREATED BY                                       SIZE
b83f4616276b   28 seconds ago   /bin/sh -c apt-get -qq update && apt-get ins…   112MB
1b3ec9d977fb   12 days ago      /bin/sh -c #(nop) CMD ["bash"]                   0B
<missing>      12 days ago      /bin/sh -c #(nop) ADD file:7d3b21b18d7bc6d6d…    100MB
$ docker images debian-bzip-gcc-v1
REPOSITORY           TAG      IMAGE ID       CREATED              SIZE
debian-bzip-gcc-v1   latest   b83f4616276b   About a minute ago   212MB
Every layer uses disk space. You can see this for yourself when pulling images from a registry. For another example in action, check out the Dockerfile for buildpack-deps.
OK, that’s it. If you are interested in even smaller Docker images, let’s move on to another example.
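The single-RUN trick only saves space because the cleanup happens in the same instruction as the install: files deleted by a later RUN still occupy the earlier layer. The sketch below restates the idea with one change that is our assumption rather than part of the original build: RUN passes its command to /bin/sh, which on Debian is dash and does not perform brace expansion, so a pattern like /var/lib/{apt,dpkg,cache,log}/ is most likely handed to rm literally and removes nothing. Here the cleanup targets /var/lib/apt/lists/* instead. We have not rebuilt the image with this variant, so the resulting size is not stated.

```dockerfile
FROM debian
# Install and clean up in ONE instruction, so the downloaded package
# lists never end up stored in a layer.
RUN apt-get -qq update \
 && apt-get install -qq -y --no-install-recommends bzip2 gcc \
 && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Anti-pattern, left commented out on purpose: a separate cleanup step
# adds a layer but does not shrink the layer created by the install step.
# RUN apt-get update && apt-get install -y --no-install-recommends bzip2 gcc
# RUN rm -rf /var/lib/apt/lists/*
```

Running docker history on both variants should show where the bytes actually live.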
Use multi-stage builds
Since Docker 17.05 you can use multi-stage builds to reduce the size of your final image. They work without the need to jump through hoops to reduce the number of intermediate layers or to remove intermediate files during the build.
Most of the time you can both benefit from the build cache and minimize the number of image layers.
Example of build stages:
1. Install the tools you need to build your app.
2. Install / update library dependencies.
3. Generate your application.
In this example we will build an image for a Go app.
Start with main.go:
package main

import "fmt"

func main() {
	fmt.Println("Hello, world!")
}
First, let’s containerize this app with the following Dockerfile:
FROM golang:1.8
WORKDIR /go/src/app
ADD . /go/src/app
RUN go-wrapper download
RUN go-wrapper install
CMD ["/go/bin/app"]
Build and run the image with:
$ docker build -t="go-v1" .
Sending build context to Docker daemon 3.072kB
Step 1/6 : FROM golang:1.8 as build
---> 0d283eb41a92
Step 2/6 : WORKDIR /go/src/app
---> Using cache
---> 179e54f72c42
Step 3/6 : ADD . /go/src/app
---> 497ef265fb8b
Step 4/6 : RUN go-wrapper download
---> Running in db28e8cebfc3
+ exec go get -v -d
Removing intermediate container db28e8cebfc3
---> a2ff96353469
Step 5/6 : RUN go-wrapper install
---> Running in 6f9c999af90b
+ exec go install -v
app
Removing intermediate container 6f9c999af90b
---> 990d17245be8
Step 6/6 : CMD ["/go/bin/app"]
---> Running in be3400a7efef
Removing intermediate container be3400a7efef
---> 11ba6fa5350b
Successfully built 11ba6fa5350b
Successfully tagged go-v1:latest
$ docker run go-v1
Hello, world!
Now let’s check how big our image is:
$ docker images go-v1
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
go-v1        latest   11ba6fa5350b   14 seconds ago   715MB
OK, let’s try a multi-stage Docker build. With this approach you use multiple FROM instructions in your Dockerfile. Each FROM can use a different base and begins a new stage of the build. You selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image. Let’s adapt the Go example to use multi-stage builds:
FROM golang:1.8 as build
WORKDIR /go/src/app
ADD . /go/src/app
RUN go-wrapper download
RUN go-wrapper install
FROM golang:1.8
COPY --from=build /go/bin/app /
CMD ["/app"]
Build and run:
$ docker build -t="go-v2" .
Sending build context to Docker daemon 4.096kB
Step 1/8 : FROM golang:1.8 as build
---> 0d283eb41a92
Step 2/8 : WORKDIR /go/src/app
---> Using cache
---> 179e54f72c42
Step 3/8 : ADD . /go/src/app
---> ba0b86b4db8e
Step 4/8 : RUN go-wrapper download
---> Running in 8c35165c884f
+ exec go get -v -d
Removing intermediate container 8c35165c884f
---> c9b852cd2bb6
Step 5/8 : RUN go-wrapper install
---> Running in be3d7d1bdcb2
+ exec go install -v
app
Removing intermediate container be3d7d1bdcb2
---> 4a26015829f9
Step 6/8 : FROM golang:1.8
---> 0d283eb41a92
Step 7/8 : COPY --from=build /go/bin/app /
---> 11a3bf1274ce
Step 8/8 : CMD ["/app"]
---> Running in 737e17331cd5
Removing intermediate container 737e17331cd5
---> 47b521392bfc
Successfully built 47b521392bfc
Successfully tagged go-v2:latest
$ docker run --rm -it go-v2
Hello, world!
It works! Go ahead and inspect the image:
$ docker images go-v2
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
go-v2        latest   47b521392bfc   17 seconds ago   710MB
The go-v2 target image consists of fewer intermediate layers, so we saved a few MB. The saving is small because the final stage is still based on the full golang:1.8 image. Can we save more?
Use Alpine versions of base images
Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.
There are many publicly available base images that use Alpine (instead of Debian, Ubuntu, CentOS…) as their ‘core’. If you are looking for one, just make sure the tag contains the ‘-alpine’ suffix.
The Alpine Linux image is only about 5MB and it has access to a package manager, so it should cover 98% of real-world cases ;)
There is an Alpine version of the golang:1.8 image. Compare the size of both:
$ docker images "golang:1.8*"
REPOSITORY   TAG          IMAGE ID       CREATED       SIZE
golang       1.8          0d283eb41a92   10 days ago   713MB
golang       1.8-alpine   4cb86d3661bf   2 weeks ago   257MB
257MB instead of 713MB! All we did was add -alpine to the tag in the FROM instruction.
Going back to our case, replace the regular golang image with the Alpine version:
FROM golang:1.8-alpine
WORKDIR /go/src/app
ADD . /go/src/app
RUN go-wrapper download
RUN go-wrapper install
CMD ["/go/bin/app"]
Build & run:
$ docker build -t="go-v3" -f Dockerfile-v3 .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM golang:1.8-alpine
---> 4cb86d3661bf
Step 2/6 : WORKDIR /go/src/app
Removing intermediate container e52de6fd1e15
---> 6d0cacab27a1
Step 3/6 : ADD . /go/src/app
---> 5e631e28f90e
Step 4/6 : RUN go-wrapper download
---> Running in d302fc0274ff
+ exec go get -v -d
Removing intermediate container d302fc0274ff
---> ff6e6651161b
Step 5/6 : RUN go-wrapper install
---> Running in a728130421c1
+ exec go install -v
app
Removing intermediate container a728130421c1
---> 820d37d80391
Step 6/6 : CMD ["/go/bin/app"]
---> Running in 7a539c1adf69
Removing intermediate container 7a539c1adf69
---> c5b936627ac1
Successfully built c5b936627ac1
Successfully tagged go-v3:latest
$ docker run --rm -it go-v3
Hello, world!
It worked. Where is the catch? Vanilla images use the full glibc as their standard C library, while Alpine uses musl libc. musl takes less space, but some dependencies in your project that were built for glibc may not work with it. So before you replace all FROM instructions in your Dockerfiles repo, make sure it is fine for your case.
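The Alpine image and the multi-stage technique can also be combined. The following is a minimal sketch, not something we built or measured for this article: the stage name build is our own choice, and the plain alpine image as the final base is an assumption. Because the binary is compiled in an Alpine-based stage and run in an Alpine-based stage, it sees the same musl libc in both places.

```dockerfile
# Build stage: Alpine-based Go toolchain.
FROM golang:1.8-alpine as build
WORKDIR /go/src/app
ADD . /go/src/app
RUN go-wrapper download
RUN go-wrapper install

# Final stage: plain Alpine, without the Go toolchain.
FROM alpine
COPY --from=build /go/bin/app /
CMD ["/app"]
```

If your project links against C libraries, check that they are available for Alpine before relying on this layout.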
Remove all OS related things from base image (even smaller base images!)
Google’s distroless base images contain only your app and its runtime dependencies. There are no package managers, shells or anything else you would expect to find in a regular Linux distro (vanilla images).
How do you get them? The distroless project uses the gcr.io Docker registry. At the time of writing, the following images were published:
- gcr.io/distroless/base
- gcr.io/distroless/python2.7
- gcr.io/distroless/python3
- gcr.io/distroless/nodejs
- gcr.io/distroless/java
- gcr.io/distroless/java/jetty
- gcr.io/distroless/cc
- gcr.io/distroless/dotnet
Going back to our example, let’s use a distroless base image now. According to its documentation, the base image can be used to run Go apps.
FROM golang:1.8 as stage1
WORKDIR /go/src/app
ADD . /go/src/app
RUN go-wrapper download
RUN go-wrapper install
FROM gcr.io/distroless/base
COPY --from=stage1 /go/bin/app /
CMD ["/app"]
Build & run to see if we are good or not:
$ docker build -t="go-v4" .
Sending build context to Docker daemon 6.144kB
Step 1/8 : FROM golang:1.8 as stage1
---> 0d283eb41a92
Step 2/8 : WORKDIR /go/src/app
---> Using cache
---> 179e54f72c42
Step 3/8 : ADD . /go/src/app
---> cbfdbc805980
Step 4/8 : RUN go-wrapper download
---> Running in 1440550687ea
+ exec go get -v -d
Removing intermediate container 1440550687ea
---> 6e26b4a6d177
Step 5/8 : RUN go-wrapper install
---> Running in 8f0ba0fd5008
+ exec go install -v
app
Removing intermediate container 8f0ba0fd5008
---> 23edaf0d592e
Step 6/8 : FROM gcr.io/distroless/base
latest: Pulling from distroless/base
bb8371eaf726: Pull complete
Digest: sha256:4f28178a3746a9145742c5802e4a2479b2cd39f6359db5ec8b7e7f7b4a592039
Status: Downloaded newer image for gcr.io/distroless/base:latest
---> 89c6ea43854e
Step 7/8 : COPY --from=stage1 /go/bin/app /
---> 15a4c3f5b291
Step 8/8 : CMD ["/app"]
---> Running in 52154b62c25f
Removing intermediate container 52154b62c25f
---> 8c2fcc22abbc
Successfully built 8c2fcc22abbc
Successfully tagged go-v4:latest
$ docker run --rm -it go-v4
Hello, world!
Well, that went fine. What about final size?
$ docker images "go-v*"
REPOSITORY   TAG      IMAGE ID       CREATED             SIZE
go-v4        latest   8c2fcc22abbc   16 seconds ago      18.1MB
go-v3        latest   c5b936627ac1   About an hour ago   259MB
go-v2        latest   47b521392bfc   18 hours ago        710MB
go-v1        latest   11ba6fa5350b   19 hours ago        715MB
The final app is 18.1MB when using a multi-stage build with the distroless image from Google. That is almost 700MB less than our first image. Excellent!
Besides the size of the final image, there is something more you should notice: there are no extra binaries or libraries, and there is no shell available in this image. Debugging is therefore harder, but the attack surface is much smaller.
We recommend a multi-stage approach with different stages, integrated with your CI environment. For example:
- debug stage with all debugging symbols, tools enabled
- testing stage with your application that gets populated with test data
- production stage with your app working on real data, no extra dependencies, shells
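The stages listed above could be laid out in a single Dockerfile along these lines. Treat it as a sketch under assumptions: the stage names (build, test, debug, production) are ours, the test stage assumes the package actually has Go tests, and the debug stage relies on the :debug tag that the distroless project documents as adding a busybox shell, which you should verify for the image you use. A CI job could then pick a stage with docker build --target test . or docker build --target production .

```dockerfile
# Build stage: compile the app once; later stages copy the binary from here.
FROM golang:1.8 as build
WORKDIR /go/src/app
ADD . /go/src/app
RUN go-wrapper download
RUN go-wrapper install

# Testing stage: starts from the build stage and runs the package tests.
FROM build as test
RUN go test -v

# Debug stage: distroless image variant that includes a shell (assumed tag).
FROM gcr.io/distroless/base:debug as debug
COPY --from=build /go/bin/app /
CMD ["/app"]

# Production stage: no shell, no package manager. It is last, so a plain
# "docker build ." without --target produces this image.
FROM gcr.io/distroless/base as production
COPY --from=build /go/bin/app /
CMD ["/app"]
```

Keeping the production stage last means the default build stays the safe one, while the debug and test images are produced only when a CI step asks for them explicitly.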