Docker Swarm continuous deployment using native service health checks

This blog post explains how to configure Docker service health checks using native docker service commands.

Docker Swarm is a container orchestration tool, similar to Kubernetes, OpenShift, ECS, and EKS, that ships as part of Docker Engine. Read more about Swarm in the official Docker docs.


Create swarm cluster

A swarm cluster contains at least one manager node and, optionally, worker nodes. This blog post focuses on configuring health checks rather than on cluster creation. You can read more about cluster creation here

Create a new swarm cluster and initialize it

$ docker swarm init --advertise-addr 192.168.99.100
Swarm initialized: current node (dxn1zf6l61qsb1josjja83ngz) is now a manager.

To add a worker to this swarm, run the following command:

docker swarm join \
--token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
192.168.99.100:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

List the cluster nodes using the docker node ls command

$ docker node ls

ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
dxn1zf6l61qsb1josjja83ngz *  manager1  Ready   Active        Leader

Create Service

  1. The Docker image used in this article is built from the GitHub project DockerHealthCheckDemo.
  2. SSH into the manager node and execute the following command to create a new service, employee_service, with 2 replicas (containers). This is just a basic command; more details will be added later.
$ docker service create --replicas 2 -p 8080:8080 --name employee_service employee_springboot:latest

The service uses the Spring Boot Docker image employee_springboot:latest and publishes port 8080

You can always list the available services using the docker service ls command

$ docker service ls

ID            NAME              SCALE  IMAGE
9uk4639qpg7n  employee_service  2/2    employee

Continuous deployment

Update the service by bringing up new containers from the updated image, keeping a delay of 300 seconds between container updates. Continuous deployment can be achieved using the following command.

$ docker service update --force --detach=false --update-parallelism=1 --update-delay=300s --update-failure-action=rollback --update-order=start-first employee_service

Let’s look at each parameter

force → Force the update even if no changes require it

detach=false → Do not exit immediately and wait for the service to converge

update-parallelism=1 → Update one container at a time

update-delay=300s → Delay between updating one container and the next (optional)

update-failure-action=rollback → Roll back to the previous state if the update fails

update-order=start-first → Start the new container before stopping the existing one
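In a deployment pipeline, the same flags are usually combined with a new image tag rather than --force. The following is a minimal sketch of such a wrapper, assuming a hypothetical function name deploy_image and a hypothetical tag such as employee_springboot:1.0.1; neither appears in the original project. The DOCKER variable is also an assumption, added only so the command can be previewed without touching the cluster.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

# deploy_image SERVICE IMAGE [DELAY]
# Rolls SERVICE over to IMAGE one container at a time and rolls back on failure.
# DELAY defaults to 300s, the value used in the command above.
deploy_image() {
    "$DOCKER" service update \
        --detach=false \
        --update-parallelism=1 \
        --update-delay="${3:-300s}" \
        --update-failure-action=rollback \
        --update-order=start-first \
        --image "$2" \
        "$1"
}
```

For example, deploy_image employee_service employee_springboot:1.0.1 would start the rolling update, and running it with DOCKER=echo set beforehand only prints the arguments that would be passed to docker.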

However, we have a problem here. By default, when no health check is defined, Docker treats a new container as ready as soon as it is running, irrespective of the application status. So HTTP requests from clients are forwarded to the new container before the application has come up, and they fail with an error

Container Health check

To prevent this error, we need to add a custom health check for the container. There are a couple of ways to do it

  1. Add a HEALTHCHECK instruction to the Dockerfile (preferred way)
  2. Or add the --health-cmd flag to the docker service update command

Both do the same thing; they differ only in where the configuration lives.

HEALTHCHECK Instruction

First, let’s look at an example HEALTHCHECK instruction

### Use the AdoptOpenJDK 11 JRE (OpenJ9) image
FROM adoptopenjdk:11-jre-openj9-bionic

### Copy JAR file from local machine to container
COPY target/*.jar app.jar

### Expose the port
EXPOSE 8080

### Health check endpoint
HEALTHCHECK --start-period=2m --interval=30s --timeout=5s CMD curl -f http://localhost:8080/api/v1/health/find/status | grep UP || exit 1

### Start the Spring Boot application
CMD ["java","-jar","app.jar"]

As shown in the Dockerfile, the container is given a 2-minute start period, and the health check runs every 30 seconds with a 5-second timeout on each attempt.
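The inline curl and grep pipeline can also be moved into a small script, which is easier to read and to test on its own. The following is a minimal sketch, assuming a hypothetical file name healthcheck.sh and hypothetical function names body_is_up and check_health; the HEALTH_URL variable and the 4-second curl limit are assumptions as well, chosen to stay under the 5-second timeout.

```shell
#!/bin/sh
# healthcheck.sh: hypothetical helper script, not part of the original project.
HEALTH_URL="${HEALTH_URL:-http://localhost:8080/api/v1/health/find/status}"

# body_is_up BODY
# Succeeds when the response body contains "UP", like the grep in the Dockerfile.
body_is_up() {
    printf '%s' "$1" | grep -q 'UP'
}

# check_health
# Fails on HTTP errors, on requests slower than 4 seconds, or on a body without "UP".
check_health() {
    body="$(curl -fsS --max-time 4 "$HEALTH_URL")" || return 1
    body_is_up "$body"
}
```

To use it as the health check, the script would end with a line such as check_health || exit 1, be copied into the image, and be referenced from the HEALTHCHECK instruction in place of the inline command.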

The HEALTHCHECK instruction accepts the following options

--interval=DURATION (default: 30s)

--timeout=DURATION (default: 30s)

--start-period=DURATION (default: 0s)

--retries=N (default: 3)

The health check first runs interval seconds after the container starts, and then again interval seconds after each previous check completes. If a single run of the check takes longer than timeout seconds, the check is considered to have failed. It takes retries consecutive failures for the container to be considered unhealthy. The start period provides initialization time for containers that need time to bootstrap; failures during that period do not count towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures are counted towards the maximum number of retries.
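To get a feel for these numbers, it helps to estimate how long a container that never becomes healthy would stay in the starting state. The following is a back-of-the-envelope sketch, not Docker's exact scheduling: it assumes failures are only counted once the start period is over, and that each counted attempt takes at most one interval plus one timeout. The function name unhealthy_after is hypothetical.

```shell
# unhealthy_after START_PERIOD INTERVAL TIMEOUT RETRIES  (durations in seconds)
# Rough estimate of the time until a never-healthy container is marked unhealthy.
unhealthy_after() {
    echo $(( $1 + $4 * ($2 + $3) ))
}

# Values from the Dockerfile above: 2m start period, 30s interval, 5s timeout, default 3 retries
unhealthy_after 120 30 5 3   # prints 225
```

With update-failure-action=rollback, this is roughly how long a broken release would take to be detected for each container, so shorter intervals trade extra health check traffic for faster rollbacks.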

health-cmd flag

The second way is to pass the --health-cmd flag to docker service update

$ docker service update --force --detach=false --update-parallelism=1 --update-delay=300s --update-failure-action=rollback --update-order=start-first --health-cmd="curl -f http://localhost:8080/api/v1/health/find/status | grep UP || exit 1" --health-start-period=2m --health-interval=30s --health-timeout=5s employee_service

We pass the parameters with the same values as in the HEALTHCHECK instruction in the Dockerfile

  • --health-cmd="curl -f http://localhost:8080/api/v1/health/find/status | grep UP || exit 1"
  • --health-start-period=2m
  • --health-interval=30s
  • --health-timeout=5s
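After the update, it can be useful to confirm what the service actually stores. The following is a minimal sketch that prints the health check from the service spec as JSON; the function name show_service_healthcheck and the DOCKER override variable are assumptions made for illustration.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

# show_service_healthcheck SERVICE
# Prints the health check stored in the service spec as JSON.
show_service_healthcheck() {
    "$DOCKER" service inspect \
        --format '{{json .Spec.TaskTemplate.ContainerSpec.Healthcheck}}' \
        "$1"
}
```

Running show_service_healthcheck employee_service should list the test command and the configured durations. The service spec only holds values set on the service itself, so a health check that comes solely from the image's HEALTHCHECK instruction is expected to show up as null here.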

Testing

  1. Clone the repo and build the image
$ git clone https://github.com/pavankjadda/DockerHealthCheckDemo.git
$ cd DockerHealthCheckDemo
$ docker build -t employee_springboot .

2. Create new service

$ docker service create --replicas 2 -p 8080:8080 --name employee_service employee_springboot:latest
New service with two containers

Once the service is up, docker ps should show two containers, 1a995cde59cc and e7b913e2c3f4, and their status should be healthy
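Instead of reading the docker ps table by eye, the health filter can do the counting. The following is a minimal sketch, assuming a hypothetical function name healthy_count; it relies on the com.docker.swarm.service.name label that Swarm attaches to the containers it creates, and the DOCKER override variable is an assumption added for previewing and testing.

```shell
# DOCKER can be overridden to substitute a stub for the docker CLI.
DOCKER="${DOCKER:-docker}"

# healthy_count SERVICE
# Counts containers on this node that belong to SERVICE and report health=healthy.
healthy_count() {
    "$DOCKER" ps --quiet \
        --filter "label=com.docker.swarm.service.name=$1" \
        --filter "health=healthy" | grep -c . || true
}
```

On the single-node cluster used here, healthy_count employee_service should eventually report 2. On a multi-node swarm, docker ps only sees local containers, so the count would have to be taken on each node.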

3. Update the service

$ docker service update --force --detach=false --update-parallelism=1 --update-delay=120s --update-failure-action=rollback --update-order=start-first employee_service
Docker Service Updating

The docker service update command brings up another container, 665ee54f8be, while the existing containers keep serving requests. The status of the new container shows as health: starting. You can read more about container health status here

After a minute, container 665ee54f8be comes up and container e7b913e2c3f4 is taken down. After 2 minutes, both old containers have been taken down and the new containers are serving user requests
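One way to check that clients really see no errors is to keep requesting the service from a second terminal while the update runs. The following is a minimal sketch, assuming a hypothetical function name count_failures and a CURL override variable added for testing; it also assumes the health endpoint is reachable through the published port 8080 on the manager address used earlier.

```shell
# CURL can be overridden to substitute a stub for the curl CLI.
CURL="${CURL:-curl}"

# count_failures URL ATTEMPTS [PAUSE_SECONDS]
# Requests URL ATTEMPTS times, pausing between requests, and prints how many requests failed.
count_failures() {
    failures=0
    i=0
    while [ "$i" -lt "$2" ]; do
        "$CURL" -fsS --max-time 2 -o /dev/null "$1" || failures=$((failures + 1))
        sleep "${3:-1}"
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, count_failures http://192.168.99.100:8080/api/v1/health/find/status 300 probes the service for about five minutes. A result of 0 suggests the update did not drop requests, while a non-zero count points to containers receiving traffic before the application was ready.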

Docker Service Updated with new containers

Note: We can remove the update-delay flag from the docker service update command unless we specifically want a delay between container updates, as in an AWS ECS blue-green deployment.

Conclusion

Using the docker service update command and the Dockerfile HEALTHCHECK instruction, we can continuously deploy applications without interrupting the user workflow.

Pavan Kumar Jadda