Do not use the snap version of Docker for Ray on Ubuntu 20.04 LTS


If you are trying to deploy a local Ray cluster on Ubuntu machines and you are getting the following error:

Shared connection to 192.168.1.74 closed.
     Running docker exec ray_container printenv HOME
       Full command is ssh -tt -i ~/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_71415f9f14/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s [email protected] bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (docker exec ray_container printenv HOME)'
 Shared connection to 192.168.1.74 closed.
     Running docker cp /tmp/ray_tmp_mount/default/~/ray_bootstrap_config.yaml ray_container:/home/ray/ray_bootstrap_config.yaml
       Full command is ssh -tt -i ~/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_71415f9f14/c21f969b5f/%C -o ControlPersist=10s -o ConnectTimeout=120s [email protected] bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && (docker cp /tmp/ray_tmp_mount/default/~/ray_bootstrap_config.yaml ray_container:/home/ray/ray_bootstrap_config.yaml)'
 lstat /tmp/ray_tmp_mount/default/~: no such file or directory
 Shared connection to 192.168.1.74 closed.
 2021-06-09 11:41:09,299    INFO node_provider.py:93 -- ClusterState: Writing cluster state: ['192.168.1.70', '192.168.1.74']

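The key line is the lstat failure: docker cp cannot find /tmp/ray_tmp_mount/default/~ even though Ray has just written it there. This is most likely because the snap version of Docker runs confined with its own private /tmp, so it does not see files staged in the host's /tmp. In that case, remove the snap version of Docker and install Docker by following the official Docker instructions.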
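
Before reinstalling anything, it may help to confirm that the docker binary on your PATH really comes from snap. The following is a minimal sketch; the helper name is_snap_docker is made up for illustration, and it assumes snap exposes its binaries under /snap/bin (the default on Ubuntu) or /var/lib/snapd/snap/bin.

```shell
# Hypothetical helper: succeed if the given path looks like a snap-provided binary.
# Assumes the usual snap binary directories; adjust if your system differs.
is_snap_docker() {
  case "$1" in
    /snap/*|/var/lib/snapd/snap/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Locate the docker client on PATH (empty if it is not installed).
docker_path="$(command -v docker || true)"

if [ -z "$docker_path" ]; then
  echo "docker not found on PATH"
elif is_snap_docker "$docker_path"; then
  echo "docker at $docker_path appears to be the snap version"
else
  echo "docker at $docker_path does not appear to come from snap"
fi
```

Run this on every node of the cluster, since the error above is raised on the remote machine (192.168.1.74), not necessarily on the one where you launch Ray.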

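# From https://docs.docker.com/engine/install/ubuntu/
# Remove the snap package first; the apt-get remove below only covers apt packages.
sudo snap remove docker
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates curl gnupg lsb-release

# Add Docker's GPG key and apt repository (arch=amd64 assumes a 64-bit x86 machine).
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine from the official repository.
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

# Allow the current user to run docker without sudo.
sudo addgroup --system docker
sudo adduser "$USER" docker
sudo systemctl restart docker
# newgrp opens a new shell with the docker group active, so run it last.
newgrp docker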
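
Once the installation finishes, a quick sanity check can save another failed cluster launch. The sketch below only inspects the local shell: it checks that the current user is in the docker group and shows which docker binary is on the PATH. The helper name in_group is hypothetical and not part of Docker or Ray.

```shell
# Hypothetical helper: succeed if group $1 appears in the space-separated list $2.
in_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# id -nG prints the names of all groups the current process belongs to.
if in_group docker "$(id -nG)"; then
  echo "current shell is in the docker group"
else
  echo "not in the docker group yet; log out and back in, or run: newgrp docker"
fi

# Show where the docker client comes from; after the reinstall it should no longer be under /snap.
command -v docker || echo "docker not found on PATH"
```

If both checks look right on every node, rerun the Ray cluster launch and the docker cp step should be able to read the bootstrap file from /tmp.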
