3. Building Docker Images

What Is a Dockerfile?

A Dockerfile:

  • is a small "program" that creates an image
  • is run with docker build -t name_of_image .
    • where . is the build context (the current directory), which is also where Docker looks for the Dockerfile by default
    • -t name_of_image tags the resulting image with that name
  • When the build finishes, the image is in the local Docker image store, ready to be run

Producing the Next Image with Each Step

  • Each line (step) in a Dockerfile takes the image from the previous line and makes another image
  • The previous image is unchanged
  • Only the resulting filesystem is carried forward from line to line; running processes and shell state are not
    • Multiple commands on one line therefore behave differently from the same commands on separate lines
  • Hence, you don't want large temporary files to span lines; otherwise the image ends up too large
    • e.g. download a large file, edit it, and delete it: done on one line, the image stays small; done on separate lines, the large file remains in an earlier layer and the image is big
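The download, edit, delete case can be sketched as follows. This is only an illustration: the URL and the paths are hypothetical placeholders, and curl is assumed to be installable from the base image's package repository.

```dockerfile
FROM debian:sid
RUN apt-get -y update && apt-get -y install curl

# Variant A (commented out): three separate steps.
# The archive is saved into the layer created by the first RUN, so deleting
# it in the third RUN does not make the final image any smaller.
# RUN curl -fsSL -o /tmp/big.tar.gz https://example.com/big.tar.gz
# RUN tar -xzf /tmp/big.tar.gz -C /opt
# RUN rm /tmp/big.tar.gz

# Variant B: one step.
# The archive is downloaded, unpacked, and deleted before the layer is saved,
# so it never ends up in any image layer.
RUN curl -fsSL -o /tmp/big.tar.gz https://example.com/big.tar.gz \
 && tar -xzf /tmp/big.tar.gz -C /opt \
 && rm /tmp/big.tar.gz
```

Running docker history on each resulting image is one way to compare the per-layer sizes of the two variants.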

Details on working with Dockerfiles are available in the Dockerfile reference.

Caching with Each Step

  • docker build saves the output of each step in a cache
    • Watch the build output for "Using cache"
  • Docker skips lines that have not changed since the last build, which saves a huge amount of time and resources
  • Once one line changes, that line and every line after it are rebuilt
  • Tip for editing a Dockerfile: always put the parts that change most often at the end of the Dockerfile
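A minimal sketch of that ordering, reusing the nano and notes.txt names that appear later in these notes. It assumes a notes.txt file exists next to the Dockerfile.

```dockerfile
FROM debian:sid

# Rarely changes: this step is served from the cache on most rebuilds.
RUN apt-get -y update && apt-get -y install nano

# Changes often: placed last, so editing notes.txt only invalidates the
# cache from this line onward and the package installation is not repeated.
ADD notes.txt /notes.txt

CMD ["nano", "/notes.txt"]
```

If the ADD line came before the apt-get step, every edit to notes.txt would force the packages to be installed again.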

Dockerfile != Shell Scripts

  • Dockerfiles look like shell scripts
  • But they are not the same
    • A process started on one line will not be running on the next line
    • Each line runs for the duration of its own container; then the container is shut down and saved into an image. The next line gets a fresh start
  • If two programs need to pass values to each other in the same container, they have to be on the same line
  • Environment variables can be passed to later lines using the ENV instruction

In summary: each line in a Dockerfile is effectively its own call to docker run
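The difference from a shell script can be seen in a small experiment. The variable name GREETING is made up for this sketch, and the output described in the comments assumes the default working directory of the busybox image, which is /.

```dockerfile
FROM busybox

# The cd only affects the shell of this one step.
RUN cd /tmp
# A fresh shell starts here, so this prints "/" and not "/tmp".
RUN pwd

# Commands that depend on each other go on the same line; this prints "/tmp".
RUN cd /tmp && pwd

# A plain shell variable is lost when its step ends.
RUN GREETING=hello
# Prints "greeting is: " because GREETING is not set in this new shell.
RUN echo "greeting is: $GREETING"

# ENV is stored in the image metadata and is carried forward.
ENV GREETING=hello
# Prints "greeting is: hello".
RUN echo "greeting is: $GREETING"
```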

Building Dockerfiles

The Most Basic Dockerfile

Create a simple Dockerfile with the following lines:

FROM busybox
RUN echo "building simple docker image"
CMD echo "hello container"

Build an image from this Dockerfile with docker build -t hello . The build output looks like this:

Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM busybox
latest: Pulling from library/busybox
5f5dd3e95e9f: Pull complete 
Digest: sha256:9f1c79411e054199210b4d489ae600a061595967adb643cd923f8515ad8123d2
Status: Downloaded newer image for busybox:latest
 ---> dc3bacd8b5ea
Step 2/3 : RUN echo "building simple docker image"
 ---> Running in 1990ae4f8398
building simple docker image
Removing intermediate container 1990ae4f8398
 ---> cf5a3650fa24
Step 3/3 : CMD echo "hello container"
 ---> Running in 3e3b64874c2c
Removing intermediate container 3e3b64874c2c
 ---> 4a0a8c1c2d1b
Successfully built 4a0a8c1c2d1b
Successfully tagged hello:latest
  1. In step 1/3, the base image dc3bac... is pulled.
  2. In step 2/3, a container 1990ae... is created from that image, and echo "building ..." is executed in it by the RUN instruction.
    1. The container is removed at the end of step 2/3, as nobody is using it any more; its result is saved as image cf5a36...
  3. In step 3/3, a new container 3e3b... is created and the echo command is recorded using CMD (it is not executed during the build); the result is saved as the new image 4a0a...

Running the image 4a0a... via docker run --rm hello will print "hello container".

Installing a Program with Docker Build

Create a new Dockerfile:

FROM debian:sid
RUN apt-get -y update
RUN apt-get -y upgrade
RUN apt-get -y install nano
CMD "nano" "/tmp/notes"
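A commonly used variant of this Dockerfile is sketched below. It is not the version from these notes: it drops the upgrade step, and it combines the package index update and the install into a single RUN so that a cached, outdated index is never reused by a later install step. Removing the apt lists in the same step keeps them out of the layer.

```dockerfile
FROM debian:sid

# Update, install, and clean up in one step so the package lists
# are deleted before the layer is saved.
RUN apt-get -y update \
 && apt-get -y install --no-install-recommends nano \
 && rm -rf /var/lib/apt/lists/*

# Exec form: nano is started directly, without a wrapping shell.
CMD ["nano", "/tmp/notes"]
```

Building it with docker build -t example/nanoer . would produce the image name that the next section starts from.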

Adding a File through Docker Build

Start from the previously built image (assuming it was tagged example/nanoer):

FROM example/nanoer
ADD notes.txt /notes.txt
CMD "nano" "/notes.txt"

In the same directory, create a notes.txt file containing some text. This Dockerfile adds that file to the image.

Dockerfile syntax

The FROM statement

  • Indicates which image to download and start from
  • Must be the first instruction in the Dockerfile (only ARG and comments may come before it)

The MAINTAINER Statement

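  • Defines the author of this Dockerfile
  • Deprecated in current Docker releases; the Dockerfile reference recommends a LABEL instruction instead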
MAINTAINER Firstname Lastname <email@example.com>

The RUN Statement

  • Runs the command line, waits for it to finish, and saves the result
RUN unzip install.zip -d /opt/install/

The ADD Statement

  • Adds local files: ADD run.sh /run.sh
  • Adds the contents of local tar archives
    • ADD project.tar.gz /install/ uncompresses the tar.gz and adds its contents to the image
  • Works with URLs as well (a downloaded file is not unpacked)
    • ADD https://project.example.com/download/1.0/project.rpm /project/

The ENV Statement

  • Sets environment variables
  • Both during the build and when running the image
ENV DB_HOST=db.production.example.com
ENV DB_PORT=5432
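The following sketch shows the same two variables being read both at build time and at run time. The image name envdemo in the comments is a made-up tag for illustration.

```dockerfile
FROM busybox

ENV DB_HOST=db.production.example.com
ENV DB_PORT=5432

# Visible during the build: the shell of this step sees both variables.
RUN echo "building against $DB_HOST:$DB_PORT"

# Visible at run time: shell form is used so /bin/sh expands the variables.
CMD echo "connecting to $DB_HOST:$DB_PORT"

# Build and run (hypothetical image name):
#   docker build -t envdemo .
#   docker run --rm envdemo
# Override a value for one run:
#   docker run --rm -e DB_HOST=localhost envdemo
```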

The ENTRYPOINT and CMD Statement

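  • ENTRYPOINT specifies the start of the command to run; arguments given to docker run are appended to it
  • CMD specifies the whole default command to run; arguments given to docker run replace it
  • If both are set, CMD supplies the default arguments for ENTRYPOINT
  • If the container acts like a command-line program, use ENTRYPOINT
  • If you are unsure, use CMD; it is the more common choice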
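How the two instructions combine can be sketched with a tiny image. The image name greeter is a made-up tag for this example.

```dockerfile
FROM busybox

# Fixed start of the command.
ENTRYPOINT ["echo", "hello"]

# Default arguments, used only when docker run is given none.
CMD ["container"]

# Build and run (hypothetical image name):
#   docker build -t greeter .
#   docker run --rm greeter          prints: hello container
#   docker run --rm greeter world    prints: hello world
```

The second run replaces only the CMD part, which is what makes the container behave like a command-line program.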

Shell Form vs. Exec Form

  • ENTRYPOINT and CMD can use both forms
  • Shell form looks like a normal shell command and is run through /bin/sh -c:
    • nano notes.txt
  • Exec form is a JSON array and runs the program directly, without a shell:
    • ["/bin/nano", "notes.txt"]
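One practical consequence of the two forms is variable expansion, sketched below. The variable NAME is made up for this illustration.

```dockerfile
FROM busybox

ENV NAME=container

# Shell form: run through /bin/sh -c, so the shell expands $NAME
# and the container prints "hello container".
CMD echo "hello $NAME"

# Exec form alternative (commented out): no shell is involved, so the
# text is passed to echo unchanged and the container prints "hello $NAME".
# CMD ["echo", "hello $NAME"]
```

Only the last CMD in a Dockerfile takes effect, which is why the alternative is left as a comment.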

The EXPOSE Statement

  • Documents that the container listens on a port
    • EXPOSE 8080 does not publish the port by itself; publish it with -p 8080:8080 (or -P for all exposed ports) in docker run
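A small sketch of EXPOSE together with the docker run flags that actually publish the port. It assumes the httpd applet is available in the busybox image being used; the directory /www and the image name webdemo are made up for this example.

```dockerfile
FROM busybox

# A one-page site to serve (hypothetical content).
RUN mkdir /www && echo "hello container" > /www/index.html

# Documents the port; does not publish it on the host.
EXPOSE 8080

# -f keeps httpd in the foreground, -p sets the port, -h sets the web root.
CMD ["httpd", "-f", "-p", "8080", "-h", "/www"]

# Build and run (hypothetical image name):
#   docker build -t webdemo .
#   docker run --rm -p 8080:8080 webdemo    publishes 8080 on host port 8080
#   docker run --rm -P webdemo              publishes 8080 on a random host port
```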

VOLUME Statement

  • Defines shared or ephemeral volumes
    • VOLUME ["/shared-data"] creates a volume at that path in the container, which later containers can inherit (for example with --volumes-from)
    • A Dockerfile cannot map a host path; host directories are mapped at run time with docker run -v /host/path/:/container/path/

Tip: Avoid relying on shared host folders in an image, as that makes it work only on the current computer
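A minimal sketch of a volume declared in the image, with any host mapping left to docker run. The file seed.txt and the image name voldemo are made up for this example.

```dockerfile
FROM busybox

# Content written before the VOLUME line is copied into the new volume
# when a container starts.
RUN mkdir /shared-data && echo "seed" > /shared-data/seed.txt

# Declares a mount point; Docker creates an anonymous volume for it
# unless docker run supplies something else.
VOLUME ["/shared-data"]

CMD ["ls", "-la", "/shared-data"]

# Build and run (hypothetical image name and host path):
#   docker build -t voldemo .
#   docker run --rm voldemo
#   docker run --rm -v /host/path:/shared-data voldemo
```

Keeping the host path on the command line leaves the image itself usable on any computer.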

WORKDIR Statement

  • Sets the working directory for the build steps that follow it, and the directory the container starts in after docker run
WORKDIR /install/
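The effect on later steps can be sketched like this; the file name where.txt is made up for the example.

```dockerfile
FROM busybox

# Creates /install/ if it does not exist and makes it the current directory.
WORKDIR /install/

# Relative paths now resolve against /install/, so this writes
# /install/where.txt containing the line "/install".
RUN pwd > where.txt

# The container also starts in /install/, so the relative path still works.
CMD ["cat", "where.txt"]
```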

The USER Statement

  • Sets which user the container will run as
  • Useful when a shared network directory requires a fixed username or user ID
USER arthur
USER 1000
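A named user has to exist in the image before USER can refer to it. The sketch below assumes an Alpine base image, whose adduser accepts -D (no password) and -u (numeric ID).

```dockerfile
FROM alpine

# Create the account with a fixed numeric ID so that file ownership
# matches on a shared directory.
RUN adduser -D -u 1000 arthur

# Everything after this line, and the container itself, runs as arthur.
USER arthur

# Reports the user and group IDs the container is running with.
CMD ["id"]
```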

TODO: Read the Docker reference guide

Multi-project Dockerfiles

It was actually very common to have one Dockerfile to use for development (which contained everything needed to build your application), and a slimmed-down one to use for production, which only contained your application and exactly what was needed to run it. This has been referred to as the “builder pattern”. Maintaining two Dockerfiles is not ideal.

With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don't want in the final image.

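# Stage 1: the builder, which has curl installed
FROM ubuntu:16.04 AS builder
RUN apt-get -y update
RUN apt-get -y install curl
RUN curl https://google.com | wc -c > google-size

# Stage 2: the final image, which receives only the file copied from the builder
FROM alpine
COPY --from=builder /google-size /google-size
ENTRYPOINT echo google is this big; cat google-size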

Avoid golden images

Golden image: a locally modified image (like a legacy from a previous developer that nobody dares to modify)

Preventing the Golden Image Problem

  • Include installers in the project. If any dependencies are needed to build the image, check them in with the project
  • Have a canonical (authoritative) build system that builds everything from scratch
    • From a base image
    • Build until the final stage
  • Tag builds with the git hash of the code that built them
  • Use small base images, e.g. Alpine
  • Always build the images you share publicly from Dockerfiles
  • Don't leave passwords in layers
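Tagging a build with its git hash could look like the sketch below. The build argument GIT_COMMIT and the image name myapp are made up for this example; org.opencontainers.image.revision is the label key the OCI image specification defines for a source revision.

```dockerfile
FROM alpine

# Passed in by the build command; "unknown" is used if it is omitted.
ARG GIT_COMMIT=unknown

# Records the commit inside the image so it can be read back later
# with docker inspect.
LABEL org.opencontainers.image.revision=$GIT_COMMIT

# Build from a git checkout (hypothetical image name):
#   docker build \
#     --build-arg GIT_COMMIT=$(git rev-parse --short HEAD) \
#     -t myapp:$(git rev-parse --short HEAD) .
```

With both the tag and the label set, any running image can be traced back to the exact commit that produced it.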