Docker


Rough notes on setting up an Ubuntu 22.04LTS server with docker and snap

IP allocations

First, we set up a static IP on the network device that would handle all external traffic and a DHCP on the network device that would access the management network, which is connected for maintenance.

To do so, we created the following file:

/etc/netplan/01-netcfg.yaml

using the following command:

sudo nano /etc/netplan/01-netcfg.yaml;

and added the following content to it:

# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      addresses: [192.168.45.13/24]
      gateway4: 192.168.45.1
      nameservers:
          addresses: [1.1.1.1,8.8.8.8]
    eth1:
      dhcp4: yes

To apply the changes, we executed the following:

sudo netplan apply;

Update everything (the operating system and all packages)

Usually, it is a good idea to update your system before making significant changes to it:

sudo apt update -y; sudo apt upgrade -y; sudo apt autoremove -y;

Install docker via snap

In this setup, we did not use the docker version available on the Ubuntu repositories, we went for the ones from the snap. To install it, we used the following commands:

sudo apt install snapd;
sudo snap install docker;

Increase network pool for docker daemon

To handle the following problem:

ERROR: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network

We modified the following file

/var/snap/docker/current/config/daemon.json

using the command:

sudo nano /var/snap/docker/current/config/daemon.json;

and set the content to be as follows:

{
    "log-level":        "error",
    "storage-driver":   "overlay2",
    "default-address-pools": [
        {
            "base": "172.80.0.0/16",
            "size": 24
        },
        {
            "base": "172.90.0.0/16",
            "size": 24
        }
    ]
}

We executed the following command to restart the docker daemon and get the network changes applied:

sudo snap disable docker;
sudo snap enable docker;

Gave access to our user to manage the docker

We added our user to the docker group so that we could manage the docker daemon without sudo rights.

sudo addgroup --system docker;
sudo adduser $USER docker;
newgrp docker;
sudo snap disable docker;
sudo snap enable docker;

After that, we made sure that the access rights to the volumes were correct:

sudo chown -R www-data:www-data /volumes/*
sudo chown -R tux:tux /volumes/letsencrypt/ /volumes/reverse/private/

Deploying

After we copied everything in place, we executed the following command to create our containers and start them with the appropriate networks and volumes:

export COMPOSE_HTTP_TIMEOUT=600;
docker-compose up -d --remove-orphans;

We had to increase the timeout as we were getting the following error:

ERROR: for container_a  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Updating the databases and performing any repairs

First, we connected to a terminal of the database container using the following command:

docker exec -it mariadb_c1 /bin/bash;

From there, we executed the following commands:

mysql_upgrade --user=root --password;
mysqlcheck -p -o --all-databases;

Bulk / Batch stopping docker containers

The following commands will help you stop many docker containers simultaneously. Of course, you can change the command stop to another, for example rm or whatever suits your needs.

You need to keep in mind that if you have dependencies between containers, you might need to execute the commands below more than once.

Stop all docker containers.

docker container stop $(docker container ls -q);
#This command creates a list of all containers.
#Using the -q parameter, we only get back the container ID and not all information about them.
#Then it will stop each container one by one.

Stop specific docker containers using a filter on their name.

docker container stop $(docker container ls -q --filter name=_web);
#This command finds all containers that their name contains _web.
#Using the -q parameter, we only get back the container ID and not all information about them.
#Then it will stop each container one by one.

A personal note

Check the system for things you might need to configure, like a crontab or other services.


Bind for 0.0.0.0:443 failed: port is already allocated

On a Docker installation that we have, we updated the image files for our containers using the following command:

docker images --format "{{.Repository}}:{{.Tag}}" | grep ':latest' | xargs -L1 docker pull;

Then we tried to update our container, as usual, using the docker-compose command.

export COMPOSE_HTTP_TIMEOUT=180; # We extend the timeout to ensure there is enough time for all containers to start
docker-compose up -d --remove-orphans;

Unfortunately, we got the following error:

export COMPOSE_HTTP_TIMEOUT=180;
docker-compose up -d --remove-orphans;

Starting entry ... 
Starting entry ... error

ERROR: for entry  Cannot start service entry: driver failed programming external connectivity on endpoint entry (d3a5d95f55c4e872801e92b1f32d9693553bd553c414a371b8ba903cb48c2bd5): Bind for 0.0.0.0:443 failed: port is already allocated

ERROR: for entry  Cannot start service entry: driver failed programming external connectivity on endpoint entry (d3a5d95f55c4e872801e92b1f32d9693553bd553c414a371b8ba903cb48c2bd5): Bind for 0.0.0.0:443 failed: port is already allocated
ERROR: Encountered errors while bringing up the project.

We used the docker container ls command to check which container was hoarding port 443, but none was doing so. Because of this, we assumed that docker ran into a bug. The first step we took (and the last) which solved the problem was to restart the docker service as follows:

sudo service docker restart;

This command was enough to fix our problem without messing with docker further.


Incorrect definition of table mysql.column_stats: expected column

2022-06-07  7:17:02 3051 [ERROR] Incorrect definition of table mysql.column_stats: expected column 'hist_type' at position 9 to have type enum('SINGLE_PREC_HB','DOUBLE_PREC_HB','JSON_HB'), found type enum('SINGLE_PREC_HB','DOUBLE_PREC_HB').
2022-06-07  7:17:02 3051 [ERROR] Incorrect definition of table mysql.column_stats: expected column 'histogram' at position 10 to have type longblob, found type varbinary(255).

While checking the logs of a MariaDB docker container, we found the above error lines repeating thousands of times. It appears that there was an issue during the migration of the database to a newer version. The solution was to manually execute the command mysql_upgrade. To execute it, we first had to gain access to a shell inside the container, we did that using docker exec -it CONTAINER_NAME /bin/bash as below:

# Gain shell access to the database container
docker exec -it mariadb_alpha /bin/bash;
# In the shell of the container, we executed the following to automatically fix a variety of problems/errors
mysql_upgrade --user=root --password;

Unknown/unsupported storage engine: InnoDB

2022-05-30 06:09:48+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.8.3+maria~jammy started.
2022-05-30 06:09:48+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2022-05-30 06:09:48+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.8.3+maria~jammy started.
2022-05-30 06:09:49+00:00 [Note] [Entrypoint]: MariaDB upgrade information missing, assuming required
2022-05-30 06:09:49+00:00 [Note] [Entrypoint]: MariaDB upgrade (mariadb-upgrade) required, but skipped due to $MARIADB_AUTO_UPGRADE setting
2022-05-30  6:09:49 0 [Note] mariadbd (server 10.8.3-MariaDB-1:10.8.3+maria~jammy) starting as process 1 ...
2022-05-30  6:09:49 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-05-30  6:09:49 0 [Note] InnoDB: Number of transaction pools: 1
2022-05-30  6:09:49 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-05-30  6:09:49 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
2022-05-30  6:09:49 0 [Warning] mariadbd: io_uring_queue_init() failed with ENOSYS: check seccomp filters, and the kernel version (newer than 5.1 required)
2022-05-30  6:09:49 0 [Warning] InnoDB: liburing disabled: falling back to innodb_use_native_aio=OFF
2022-05-30  6:09:49 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
2022-05-30  6:09:49 0 [Note] InnoDB: Completed initialization of buffer pool
2022-05-30  6:09:49 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
2022-05-30  6:09:49 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB 10.5.4.
2022-05-30  6:09:49 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2022-05-30  6:09:49 0 [Note] InnoDB: Starting shutdown...
2022-05-30  6:09:49 0 [ERROR] Plugin 'InnoDB' init function returned error.
2022-05-30  6:09:49 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2022-05-30  6:09:49 0 [Note] Plugin 'FEEDBACK' is disabled.
2022-05-30  6:09:49 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2022-05-30  6:09:49 0 [ERROR] Aborting

Recently, we were working on a MariaDB installation in Docker which was using the latest version of the container. The definition in our configuration file was as follows:

# An excerpt from our docker-compose.yml
mariadb_alpha:
    depends_on:
      - another_container
    image: mariadb
    container_name: mariadb_alpha
    networks:
      - mariadb_alpha
    volumes:
      - /alpha/mysql:/var/lib/mysql
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: qwerty
      MYSQL_DATABASE: aplha
      MYSQL_USER: user
      MYSQL_PASSWORD: password

After an update, the database stopped working and the logs were giving the above errors. Specifically, we got the error that InnoDB was an unknown or unsupported storage engine which is really bad! The command we used to view the logs is the following:

# We used the following docker command to view the logs of the container
docker container logs mariadb_alpha;

We noticed the following line from the records, which was extremely useful:

InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB 10.5.4.

From this information, we were able to understand that the last time the database functioned properly, it was using MariaDB version 10.5.4. By visiting the official docker image website for MariaDB, we were able to see that there was a version tagged 10.5. We modified our YML file and changed the image of the container to the one below which uses the 10.5 tagged image:

# An excerpt from our docker-compose.yml
mariadb_alpha:
    ...
    image: mariadb:10.5
    ...

Then, we rebuilt our container using the docker-compose command:

#We increase the timeout to avoid issues
export COMPOSE_HTTP_TIMEOUT=180;
docker-compose up -d --remove-orphans;

After the setup was complete, we were able to see that the container was working as expected! We reverted the change in the YML file back to image: mariadb and executed docker-compose once more. The MariaDB container was updated to the latest version and was working as expected again!


Rough notes on setting up an Ubuntu server with docker

Static IP

First, we set up a static IP to our Ubuntu server using netplan. To do so, we created the following file:

/etc/netplan/01-netcfg.yaml

using the following command

sudo nano /etc/netplan/01-netcfg.yaml;

and added the following content to it:

# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    enp3s0f0:
      dhcp4: no
      addresses: [192.168.45.13/24]
      gateway4: 192.168.45.1
      nameservers:
          addresses: [1.1.1.1,8.8.8.8]

To apply the changes, we executed the following:

sudo netplan apply;

Update everything (the operating system and all packages)

Usually, it is a good idea to update your system before making significant changes to it:

sudo apt update -y; sudo apt upgrade -y; sudo apt autoremove -y;

Install docker

In this setup we did not use the docker version that is available on the Ubuntu repositories, we went for the official ones from docker.com. To install it, we used the following commands:

sudo apt-get install ca-certificates curl gnupg lsb-release;
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg;
echo   "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null;
sudo apt-get update;
sudo apt-get install docker-ce docker-ce-cli containerd.io;

Install docker-compose

Again, we installed the official docker-compose from github.com instead of the one available in the Ubuntu repositories. At the time that this post was created, version 1.29.2 was the recommended one:

sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose;
sudo chmod +x /usr/local/bin/docker-compose;

Increase network pool for docker daemon

To handle the following problem:

ERROR: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network

We created the following file,

/etc/docker/daemon.json

using the command:

sudo nano /etc/docker/daemon.json;

and added the following content to it:

{
  "default-address-pools": [
    {
      "base": "172.80.0.0/16",
      "size": 24
    },
    {
      "base": "172.90.0.0/16",
      "size": 24
    }
  ]
}

We executed the following command to restart the docker daemon and get the network changes applied:

sudo systemctl restart docker;

Gave access to our user to manage docker

We added our user to the docker group so that we could manage the docker daemon without sudo rights.

sudo usermod -aG docker $USER;

Deploying

After we copied everything in place, we executed the following command to create our containers and start them with the appropriate networks and volumes:

export COMPOSE_HTTP_TIMEOUT=120;
docker-compose up -d --remove-orphans;

We had to increase the timeout as we were getting the following error:

ERROR: for container_a  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Stopping all containers using a filter on the name

docker container stop $(docker container ls -q --filter name=_web);

The above command will find all containers whose names contain _web and stop them. That command is actually two commands where one is nested inside the other.

#This command finds all containers that their name contains _web, using the -q parameter, we only get back the container ID and not all information about them.
docker container ls -q --filter name=_web;
#The second command takes as input the output of the nested command and stops all containers that are returned.
docker container stop $(docker container ls -q --filter name=_web);