Optimizing Docker Images

Developers like working with Docker for its flexibility and ease of use. When creating applications, it’s worth investing extra time in optimizing Docker images and Dockerfiles. Optimization helps teams share smaller images, improve performance, and debug problems more easily. Below are some recommendations for creating better images and Dockerfiles.

Optimizing Docker Images

Large Docker images are harder to share because they take longer to push and pull. They also take longer to start, which slows down deployment. Optimizing images therefore helps both the development and the production process.

Select Proper Base Images

Many of the images available on Docker Hub, the official ones in particular, are already optimized. Instead of building your own, it’s usually a good idea to use one of them. For example, if you need Redis, you can either build it on top of an Ubuntu image or pull the redis image directly. The prebuilt redis image is the better option because its maintainers have already taken care of any redundant packages.

Use Multi-stage Builds

Multi-stage builds (available since Docker 17.05) let you build an application in one stage and then copy the result into a clean environment for deployment. This ensures that only the necessary runtime libraries and dependencies are part of the final image.
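
As a minimal sketch of this idea, the Dockerfile below compiles a program in one stage and copies only the binary into a second, smaller stage. It assumes a Go module with a main package at the root of the build context. The image tags, the stage name build, and the paths are illustrative choices, not part of the original recommendation.

```dockerfile
# Stage 1: compile inside an image that contains the full Go toolchain.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 yields a statically linked binary that can run on a minimal base.
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: start from a small runtime image and copy in only the compiled binary.
FROM alpine:3.20
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

The compiler, source code, and build cache stay in the first stage, so none of them add to the size of the final image.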

Reduce Number of Layers

When building an image, pay attention to the layers your Dockerfile creates. Each RUN instruction creates a new layer, so combining instructions can reduce both the number of layers and the image size. A simple example is apt-get. Users often run it like this:

RUN apt-get -y update
RUN apt-get install -y python

This creates two layers. Combining the commands creates a single layer in the final image:

RUN apt-get -y update && apt-get install -y python

Combining related commands keeps the layer count down, and it shrinks the image when temporary files are deleted in the same instruction that created them.
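
The sketch below extends the combined command with a cleanup step. It assumes an Ubuntu 22.04 base and installs python3 rather than python, since the package name differs between releases. Both choices are assumptions made for illustration.

```dockerfile
FROM ubuntu:22.04
# Update, install, and clean up in a single RUN so that the downloaded
# package index is removed before the layer is committed.
RUN apt-get update \
 && apt-get install -y python3 \
 && rm -rf /var/lib/apt/lists/*
```

If the rm step were placed in a separate RUN instruction, the package index would still be stored in the earlier layer and the image would not get smaller.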

Build Custom Base Images

Docker caches image layers. If several images need the same layers, it’s a good idea to optimize those layers and build a custom base image from them. Images that share the base reuse its cached layers, which speeds up load times and makes the shared setup easier to track.
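
One way to picture this is a small shared base like the sketch below. The file name Dockerfile.base, the tag mycompany/base:1.0, and the package list are all hypothetical and only serve as an example.

```dockerfile
# Dockerfile.base (hypothetical name), assumed to be built and tagged as mycompany/base:1.0.
FROM ubuntu:22.04
# Packages that every service is assumed to need.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates python3 \
 && rm -rf /var/lib/apt/lists/*

# Each service's own Dockerfile would then begin with:
#   FROM mycompany/base:1.0
```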

Build on Top of Production Images

Test images need extra tools and libraries to exercise features. It’s a good idea to use the production image as the base and build test images on top of it. The test-only files then stay out of the base, so production images remain small and clean for deployment.
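
A test image built this way might look like the following sketch. It assumes the production image is Debian-based and has already been built and tagged as myapp:1.0. That tag, the file name Dockerfile.test, and the tests directory are hypothetical.

```dockerfile
# Dockerfile.test (hypothetical name)
# Start from the production image, assumed to be tagged myapp:1.0.
FROM myapp:1.0
# Test-only tooling is installed here and never reaches the production image.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
# Test files are added on top of the unchanged production layers.
COPY ./tests/ /home/program/tests/
```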

Avoid Storing Application Data

Storing application data inside the container bloats your images and ties the data to the lifetime of the container. For production environments, always use volumes to keep the data separate from the container.
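
In a Dockerfile, a data directory can be declared as a volume, as in this minimal sketch. The path /var/lib/myapp is a hypothetical data directory chosen for the example.

```dockerfile
FROM ubuntu:22.04
# /var/lib/myapp is a hypothetical data directory.
# Declaring it as a volume keeps whatever is written there at run time
# out of the image layers and out of the container's writable layer.
VOLUME /var/lib/myapp
```

At run time, a named volume can be attached to that path with the -v option of docker run, for example -v myapp-data:/var/lib/myapp, where myapp-data is again an assumed name.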

Best Practices for Writing Dockerfiles

Dockerfiles allow developers to codify processes, which makes them a great tool for improving how Docker images are built. Here are a few practices that will help you improve your development.

Design Ephemeral Containers

Try to design containers that are easy to create and destroy. If containers are too dependent on peripheral environments and configurations, they are harder to maintain. So designing stateless containers can help simplify the system.

Use .dockerignore to Optimize Images

When you run a build, all the files and directories in the build context are sent recursively to the Docker daemon. A large or deeply nested context can result in larger images and slower build times. You can use a .dockerignore file to exclude files and folders that the build doesn’t need.
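
The sketch below shows how the two files work together. The .dockerignore patterns appear as comments because they live in a separate file. The patterns, the node:20-alpine base, and the /app path are assumptions made for illustration.

```dockerfile
# The .dockerignore file next to this Dockerfile is assumed to contain
# these illustrative patterns, one per line:
#   .git
#   node_modules
#   *.log
FROM node:20-alpine
WORKDIR /app
# Only files that .dockerignore does not exclude are part of the build context,
# so this COPY cannot pull .git, node_modules, or log files into the image.
COPY . .
```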

Use Multi-stage Builds

Multi-stage builds have been available since Docker 17.05. They let developers define multiple build stages in the same Dockerfile and copy artifacts from one stage to another within the Dockerfile itself. As a result, you can have smaller, optimized artifacts in your final image without resorting to complicated scripts.

Install Required Packages Only

A Dockerfile should install only the bare minimum of packages needed to run the service. Every package takes up space in the image, and tools such as ping or a text editor may be unnecessary for the service that runs in the container. Understanding the requirements of a particular service helps you write Dockerfiles that produce optimized images.
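
On Debian-based images, one way to approach this is apt-get’s --no-install-recommends option, sketched below. The debian:12-slim base and the nginx package stand in for whatever the service actually needs.

```dockerfile
FROM debian:12-slim
# --no-install-recommends installs only hard dependencies and skips the
# optional packages that apt would otherwise pull in alongside nginx.
RUN apt-get update \
 && apt-get install -y --no-install-recommends nginx \
 && rm -rf /var/lib/apt/lists/*
```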

Think Microservices

Designing Dockerfiles with a microservices architecture in mind can be helpful. It’s not always possible to deploy one process per container, but developers can think proactively about how to distribute their processes and make decisions that help deploy services in a decoupled manner. Containers are a natural fit for modular design, so your Dockerfiles should take advantage of the opportunities Docker provides.

Consider the Effect of Instructions on Layers

Since Docker 1.10, only the RUN, COPY, and ADD instructions create new layers. Other instructions don’t directly affect the size of the final image, so be vigilant when you use these three. Combining multiple commands also decreases the number of layers, and fewer layers often mean smaller images.
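
The annotated sketch below separates the two kinds of instruction. The base image, the ./app directory, and start.sh are hypothetical.

```dockerfile
FROM alpine:3.20
# Metadata instructions: recorded in the image configuration, no files added.
ENV APP_ENV=production
EXPOSE 8080
LABEL description="layer demo"
# Filesystem instructions: each one adds a layer whose size counts toward the image.
COPY ./app/ /opt/app/
RUN apk add --no-cache curl
# CMD is metadata as well; start.sh is assumed to exist in ./app.
CMD ["/opt/app/start.sh"]
```

Running docker history on the built image lists each layer with its size, which makes it easy to see which instructions contribute to the total.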

Sort Multi-line Arguments

Whenever an instruction has multi-line arguments, sort them alphanumerically to make the code easier to maintain. Unsorted arguments can lead to duplicates and are harder to update. A good example:

RUN apt-get update && apt-get install -y \
  apache2 \
  git \
  iputils-ping \
  python

Avoid Using :latest

If you use FROM [imagename]:latest, you can run into problems whenever the image changes, and those problems can be difficult to trace. Using a specific tag ensures that you know exactly which image is being pulled from the Docker registry.
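
As a small illustration, the sketch below contrasts a floating tag with a pinned one. The ubuntu image and the 22.04 tag are only examples.

```dockerfile
# Floating tag: the image behind it can change between builds.
# FROM ubuntu:latest

# Pinned tag: every build starts from the same release.
FROM ubuntu:22.04
```

For stricter reproducibility, FROM also accepts an image digest in place of a tag, since a tag can still be moved to a newer build of the same release.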

Add Only Required Files from Directory

Dockerfile instructions are executed in order, and Docker rebuilds only the layers that aren’t already in its cache. Suppose you have a package.json for npm and a requirements.txt for pip, both in the mycode folder. You could write the following Dockerfile:

COPY ./mycode/ /home/program/
WORKDIR /home/program
RUN npm install
RUN pip install -r requirements.txt

However, every time any file in mycode changes, both RUN commands have to be rerun. Instead, the dependency files can be copied separately, before the rest of the code:

WORKDIR /home/program

COPY ./mycode/package.json /home/program/package.json
RUN npm install

COPY ./mycode/requirements.txt /home/program/requirements.txt
RUN pip install -r requirements.txt

# Copy the remaining source files last so that editing them reuses the cached dependency layers
COPY ./mycode/ /home/program/

Now each RUN command depends only on the files copied before it. A change to requirements.txt reruns only pip install, and a change to any other source file reruns neither install step. Because layers are built in order, a change to package.json still reruns both, so it helps to put the file that changes least often first. Looking at dependencies like this can help you write better Dockerfiles.

Further Study

The above techniques and best practices should help you build smaller Docker images and write better Dockerfiles.


About the author

Zak H

Zak H. lives in Los Angeles. He enjoys the California sunshine and loves working in emerging technologies and writing about Linux and DevOps topics.