
Basic Tutorial: Using Docker and Python

Published Apr 14, 2021 · Last updated Oct 10, 2021

This isn't intended to be an in-depth tutorial on Docker or Flask. Both tools have excellent documentation and I highly suggest you read it. The quick brief on Docker is this: it lets you bundle all of your application's dependencies into a portable container that can be run on any machine with a container runtime. This simplifies your infrastructure, because the host only needs the container runtime installed; you don't have to worry about a specific Python/Node/Java version being present, since it is installed in the container image. The container image is defined by a series of directives in a Dockerfile, and that Dockerfile is what we will be writing in this post. I will try to explain why I write my Dockerfile a certain way, and if you have any questions feel free to ask.

The Dockerfile will look like this.

FROM python:3.9-slim-buster

RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir /code
WORKDIR /code

COPY requirements.txt .
RUN python3.9 -m pip install --no-cache-dir --upgrade \
    pip \
    setuptools \
    wheel
RUN python3.9 -m pip install --no-cache-dir \
    -r requirements.txt
COPY . .

EXPOSE 5000

CMD ["python3.9", "app.py"]

The first line, FROM python:3.9-slim-buster, determines what image we're inheriting from. I went with 3.9-slim-buster instead of 3.9-alpine. While Alpine starts as a smaller image (44.7MB vs. 114MB), it can sometimes be hard to find precompiled binaries for it, which means packages may have to be compiled from source inside the image. You may end up having to install git and other tools to achieve this, which will increase the image size, and compiling from source can make the build take a while. The slim-buster image is a nice in-between. I rarely end up with an image over 1 GB.
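
If you're curious, you can compare the base image sizes on your own machine before committing to one. These are standard Docker CLI commands; docker images lists every local image in the python repository along with its size.

$ docker pull python:3.9-slim-buster
$ docker pull python:3.9-alpine
$ docker images python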

We then install any underlying system packages we need with the following directive.

RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

build-essential gives us a C compiler and the other tools needed to install Python packages with C extensions, in our case psycopg2. We also install libpq-dev, which psycopg2 builds against. The && rm -rf ... cleans up apt-get's package lists to minimize image size. This has to be in the same directive. If you were to write it as follows.

RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev
RUN rm -rf /var/lib/apt/lists/*

You would not actually reduce the image size, because the package lists are removed in a later layer but still exist in the previous layer. If the image is squashed, I believe this way of writing it is fine. I am not particularly familiar with that process, but to my understanding it consolidates all the layers, so the files would actually be removed. I've never done this, so I can't give an informed opinion on squashing images.
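
Either way, you can check per-layer sizes after a build to confirm the cleanup actually paid off. docker history lists every layer in the image along with how much space it adds (the tag name here matches the build command used later in this post).

$ docker build -t flask-docker .
$ docker history flask-docker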

The next two directives simply create a directory and make it our working directory. There is not much to say here.

RUN mkdir /code
WORKDIR /code

We then bring in our code and install our various pip dependencies.

COPY requirements.txt .
RUN python3.9 -m pip install --no-cache-dir --upgrade \
    pip \
    setuptools \
    wheel
RUN python3.9 -m pip install --no-cache-dir \
    -r requirements.txt
COPY . .

The order of these matters. Docker uses caching to determine whether or not to rebuild a layer, and when a layer invalidates the cache, all subsequent layers are rebuilt as well. For the COPY directive, the cache key is calculated from a checksum of the files being copied. If you were to have the series of directives as follows.

COPY . .
RUN python3.9 -m pip install --no-cache-dir --upgrade \
    pip \
    setuptools \
    wheel
RUN python3.9 -m pip install --no-cache-dir \
    -r requirements.txt

Both of the subsequent RUN layers would always be rebuilt, slowing down the build. In reality you only need to upgrade pip, setuptools, and wheel when you have new dependencies to install, and the same goes for the actual install of the requirements.txt file. By copying over only the requirements.txt file, we do those time-consuming steps only when requirements.txt actually changes. From there we simply copy in our new code, expose a port, and define a command to be run when the container is run. Finally, when installing pip dependencies you should use the --no-cache-dir flag: it stops pip from keeping downloaded packages in a cache for quick reinstalls later, which is unnecessary inside a Docker image and just takes up space.

The way the file is written, only the last three layers are rebuilt on subsequent builds of the image, so rebuilds are quick. Go ahead and build the image both ways and see the difference (you might have to make code changes for Docker to pick up a checksum change). The caching also determines which layers need to be pushed to a Docker image repository, as well as which layers need to be pulled down to the host machine running the application on deployment. The way we wrote this Dockerfile means that, again, only the last three layers, which are tiny, get pushed to the repo and pulled down to the application host, so deploys are quick too. This is the best-case scenario: if you push changes that also modify requirements.txt, those layers will need to be pushed as well.
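
On a related note, a .dockerignore file helps keep that final COPY . . layer stable: anything listed in it is excluded from the build context, so files like the .git directory or __pycache__ won't change the checksum Docker computes for the layer. A minimal one for a project like this might look something like the following (the exact entries will depend on your repo).

.git
__pycache__/
*.pyc
.venv/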

Our app.py will look like this.

from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello World!'


if __name__ == '__main__':
    # the /etc/hosts in docker containers doesn't like 127.0.0.1
    #  so use 0.0.0.0 instead.
    app.run(host="0.0.0.0")

And requirements.txt will look like this.

flask
psycopg2
sqlalchemy

Obviously our code isn't currently using any database, but I included psycopg2 and SQLAlchemy to showcase a more realistic example.
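
Purely as a sketch of where those dependencies would come in, connecting to Postgres through SQLAlchemy and psycopg2 typically looks something like this. The connection string is made up for illustration and isn't part of this app.

from sqlalchemy import create_engine, text

# Hypothetical connection string pointing at a Postgres instance;
# this app doesn't actually define or use one.
engine = create_engine("postgresql+psycopg2://user:password@db:5432/appdb")

with engine.connect() as conn:
    # Simple round-trip query to confirm the connection works.
    print(conn.execute(text("SELECT 1")).scalar())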

To build and run you can use the following commands.

$ docker build -t flask-docker .
$ docker run -it -p 5000:5000 flask-docker
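
Once the container is running, you can hit the app from another terminal to confirm the route defined in app.py responds.

$ curl http://localhost:5000/
Hello World!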

I hope this post was helpful in understanding certain ways to write Dockerfiles for your Python app. I used Flask in this example, but it isn't particularly different for Django; basically only the CMD directive would change. For both, you'd likely want to use uWSGI or Gunicorn to actually run the web server once deployed, with Nginx or Apache as a reverse proxy.
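
As a rough sketch, assuming you added gunicorn to requirements.txt, the CMD directive for this app might become something like the following, where app:app refers to the app object defined in app.py.

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]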

I find the Dockerfile best practices section of the Docker documentation to be really helpful.
