Lessons learned with Docker

For the last projects I built, I decided to take a shot at Docker, for several reasons that seemed good to me.
I wanted to go live fast, with a production environment easily reproducible on my dev laptop.
I was working alone, and scripting the installation and configuration of all the layers (application, proxy, cache, DB) was prohibitive because I would spend too much time on it. Docker seemed really attractive as each layer lives in a separate container.
Plus, deploying a "dockerized" application is cheap nowadays thanks to providers like Digital Ocean (I am a customer and a fan).
Usually, I would buy a book or thoroughly follow recommended tutorials before using a new technology. The main drawback is that it is often time consuming and you don't really get immediately usable skills. So this time I decided to get my hands dirty directly. As expected, this was an interesting journey, and these are the lessons I learnt along the way.

Layer caching is awesome

This is the first thing you learn, especially when you have really low bandwidth.
At first, when building my Docker images, I would naively write the following Dockerfile (deliberately truncated to highlight the problem).

FROM python:3.6
ADD . .
RUN pip install -r requirements.txt

But every time I built the image, it took way too long to finish, especially with a connection that is 2 Mbps at best. I currently live in Senegal and realized I was a spoiled kid in France with a fiber connection of at least 50 Mbps. You quickly run out of patience and have to do better.
Fortunately, this was extremely easy to solve, thanks again StackOverflow. I quickly learnt that a better version of what I was trying to do was the following.

FROM python:3.6
ADD requirements.txt .
RUN pip install -r requirements.txt
ADD . .

Docker creates a layer for each instruction, except metadata ones like MAINTAINER, and caches it. If the conditions that led to the creation of a layer are unchanged, Docker can reuse the cached version. In this specific case, if requirements.txt is unchanged, the second layer, produced by the first ADD instruction, is retrieved from the cache. The conditions to create the third layer have not changed either, since the filesystem it builds on is the cached result of that ADD and the RUN command itself is the same, so it also hits the cache. So no more costly downloads on every change: they only occur when they have to, i.e. when my requirements/dependencies change.
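An easy way to see the cache in action is to build the image twice in a row; the second build should reuse the cached layers and finish almost instantly (myapp is just an example tag):

docker build -t myapp .
# change a source file, but not requirements.txt, then rebuild:
docker build -t myapp .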

Takeaway: in your Dockerfile, try to write the parts that are less likely to change first.

ADD and COPY

Then I noticed that what you saw above was not really wise. I was copying every source file needed to build the project into the Docker image, which also meant copying tests and static assets that were not needed. So I decided to build the project externally and then copy it into the image, so that it only contains what I need. I would do the following in my Dockerfile.

ADD myproject.tar.gz .
RUN pip install -q myproject.tar.gz

But the second instruction would not work. I was a bit perplexed as it was working on the host OS (my laptop) but not in the image. After some debugging, namely building the image without the pip install part and checking what was inside it, I found that the archive had in fact already been unpacked. Then, I found out that ADD has an auto-unarchive feature for some supported compression formats, unlike its sibling COPY.

Takeaway: avoid using ADD if you don’t want the auto-unarchive feature.

I, unfortunately, encountered a major issue with those two. I did not like the fact that most images use the root user by default, so I changed it in my images before doing anything else, with the USER instruction. You would then expect the other instructions, including ADD/COPY, to take that user into account, but this was not the case. It was really problematic: I would change the user in the image, copy some static assets into an nginx image, and the container would not be able to serve them because the user had no right to read the files. This one took me some hours before I found the cause, as I was really not expecting this behavior. The workaround was to manually change the owner of the copied files in the Dockerfile. Not cool, especially as it requires a RUN instruction, so at least one more layer in your image. It took years, but it has finally been solved and you can now use the --chown flag with ADD and COPY.
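For example, instead of a COPY followed by a RUN chown (which adds a layer), the files can now be copied and given an owner in one instruction. A minimal sketch, assuming the nginx user that ships with the official image and an illustrative static/ directory:

FROM nginx:1.15
# give the static assets to the non-root nginx user at copy time,
# instead of adding a RUN chown layer afterwards
COPY --chown=nginx:nginx static/ /usr/share/nginx/html/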

Build in the appropriate environment

Building a project externally before copying it into the image is not the ideal way to proceed. I would build the artifact to deploy on macOS, or whatever OS my CI was based on, and then copy it into a container based on Ubuntu. It worked well in this case because the artifact and the build are cross-platform. But what if they weren't? I would spend a hell of a lot of time making it work. At the time, I was not aware of the multi-stage build feature in Docker. It is really convenient when you want reproducible builds and minimal images. In the first stage, you build the artifact, and in the second stage, you copy the built artifact into a minimal base image, et voilà. Concretely, this is what it could look like.

FROM python:3.6 as build
COPY requirements.txt .
RUN pip install -q -r requirements.txt
COPY . .  
RUN python setup.py -q sdist
...
FROM python:3.6-slim-jessie
COPY --from=build /usr/local/lib/python3.6/site-packages/ /usr/local/lib/python3.6/site-packages/
COPY --from=build /usr/local/bin /usr/local/bin
...

In the first stage, we create the artifact. We can use a "heavy" image; it does not matter, as it will not be the base of the final image. We can even use a container with a full development environment. Don't worry, the layers created in the build stage won't be part of the final image. The final stage is our concrete image. We named the first stage build, and we can then reference it in the last one to copy all the artifacts we need. The base of the final image will be python:3.6-slim-jessie, which is lighter than the build image. My first images weighed 300 MB; I reduced that to 100 MB with this feature. It is even more impressive with projects where the final artifact is a binary, like Go projects, which can run in a slim Alpine container that weighs only tens of MB including the binary.
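As a side note, you can also stop the build at a given stage with the --target flag, which is handy for debugging or testing the build stage on its own (the tag is only illustrative):

docker build --target build -t myproject:build .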

Takeaway: Run the build in a container to make it reproducible and to get a slim target image.

Go easy on layers

Earlier, we saw the superpower of layer caching. It can be tempting to put each instruction in its own layer because, you know, it may be easier to read and change later. It could look like the following.

FROM python:3.6
...
RUN apt-get update
RUN apt-get install wget
...

But it is not a great idea. Creating a layer comes with its own overhead. Thus doing this will considerably increase the size of the resulting image.

So the usual practice is to only split instructions like this while testing the Dockerfile locally. If we have several instructions to run, some we are confident about and others less so, we write the former first and add the latter after them. That way, if the build fails, fixing and rebuilding only re-runs the instructions that are not yet cached, not all of them. Once we are confident, we group the instructions into a single one.

FROM python:3.6
...
RUN apt-get update && apt-get install wget
...

This is what you will usually see in the official library repositories, like the following one.

https://github.com/docker-library/python/blob/master/3.6/jessie/Dockerfile
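You will also notice that these official Dockerfiles typically clean up the apt cache inside the same RUN. Since a file deleted in a later layer still takes up space in the earlier ones, the cleanup only shrinks the image if it happens in the same instruction. Something along these lines:

FROM python:3.6
...
RUN apt-get update \
    && apt-get install -y --no-install-recommends wget \
    && rm -rf /var/lib/apt/lists/*
...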

Takeaway: Do not create unnecessary layers to avoid getting bloated images.

Build working images

It is important to only build images you are confident will work, especially images where you deploy custom code, like the Python application example I introduced above.

Usually, you get this confidence by writing a test suite and automatically running it before building any artifact, the very concept of CI. You run unit tests, and sometimes integration and E2E tests, before building a JAR or any other artifact, so you should do the same with your Docker images. This is a great match for our new favorite, the multi-stage build. I stated above that it gives us a reproducible environment; I am going to leverage it to run our tests before even building the different assets.

So how does it translate to our previous example? Just the way you would expect, given the explanation above.

FROM python:3.6 as build
COPY requirements.txt .
RUN pip install -q -r requirements.txt
COPY . .
RUN flake8 && pytest && python setup.py -q sdist
...
FROM python:3.6-slim-jessie
COPY --from=build /usr/local/lib/python3.6/site-packages/ /usr/local/lib/python3.6/site-packages/
COPY --from=build /usr/local/bin /usr/local/bin
...

We run the tests just before building the package. If the tests fail, the build fails too, independently of the platform running Docker. You get a report, you can fix the problem, and then relaunch the artifact build.

Takeaway: Build the safest artifacts possible by testing in the build phase.

Simplify CI and deployments

At first, I was storing several types of artifacts. This implied managing multiple credentials across multiple platforms (because I was working with free online services, no money :) ), setting up an upload per artifact type (Docker registry, PyPI), and building then pushing each artifact after each successful CI build. Anyway, this is only useful if all the artifacts are actually used. That could be the case if I were using several types of deployments (containers in the development environment, VMs in production), or if I were developing an application for a client and had agreed on a specific artifact to deliver, for example.

But as I owned the application and planned to use Docker all the way from my laptop to my production environment, it simply did not make sense for me to manage several types of artifacts. And it did sting me several times when I made mistakes and wanted to delete the built versions. Finally, I decided that the only artifact that matters is the Docker image. I am now using a Docker registry as my single artifact repository, as this is the only thing I need to deploy on whatever environment I deploy to. One artifact, one registry, for simpler CI builds and deployments.
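In practice, the tail end of a successful CI build now boils down to a couple of commands (the registry address, image name and tag are just placeholders):

docker build -t registry.example.com/myproject:1.0.0 .
docker push registry.example.com/myproject:1.0.0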

Takeaway: This one is rather general, but do not build unnecessary artifacts on your road to production; keep your deployments simple.

The next (mis)steps

All in all, I am really glad I learnt this way. Docker was in fact the right tool for that. You can learn it as you go, and Docker always has something to teach you along the way. I made mistakes, I learnt from them, and there is still a lot to learn before being sufficiently proficient. How marvelous.

The next steps I want to explore:

  • improve security of my containers
  • deploy a simple and informative monitoring for my containers
  • play more with container orchestration
  • check what LinuxKit can bring to the table.

Last modified on 2023-10-28