# Docker Guide
## Install docker

```shell
# step 1: install tools
```
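The block above survives only as its first comment line. A sketch of the usual prerequisite steps on an apt-based system, following Docker's install instructions for Ubuntu (the repository URL assumes Ubuntu; adjust for your distribution):

```shell
# step 1: install tools needed to add Docker's apt repository
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

# step 2: add Docker's GPG key and the stable repository (Ubuntu assumed)
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
```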
## Install docker-ce for a given version

```shell
# Step 1: search versions
```
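This block is also truncated. A sketch of the usual version-pinning flow; `<VERSION>` is a placeholder to fill in from the search output:

```shell
# Step 1: search available versions
apt-cache madison docker-ce

# Step 2: install the chosen version (pick one string from the list above)
sudo apt-get install -y docker-ce=<VERSION>
```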
## Test docker

```shell
sudo docker version
```
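A hedged way to check the result is to probe for the client before calling it; this sketch prints the version when docker is installed and a plain message otherwise:

```shell
# Verify the installation without assuming docker is present
if command -v docker >/dev/null 2>&1; then
  docker version 2>/dev/null || echo "docker installed, but the daemon is not reachable"
else
  echo "docker is not installed"
fi
```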
## Docker namespace

On the host:

```shell
id
uid=1000(kezunlin) gid=1000(kezunlin) groups=1000(kezunlin),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),128(sambashare)
sudo docker images
sudo docker run -it --name kzl -v /home/kezunlin/workspace/:/home/kezunlin/workspace nvidia/cuda
```
Inside the container:

```shell
root@6f167ef72a80:/home/kezunlin/workspace# ll
total 48
drwxrwxr-x 12 1000 1000 4096 Nov 30 10:04 ./
drwxr-xr-x  3 root root 4096 Nov 30 10:14 ../
drwxrwxr-x 10 1000 1000 4096 Dec  5  2017 MyGit/
drwxrwxr-x 12 1000 1000 4096 Oct 31 03:01 blog/
drwxrwxr-x  5 1000 1000 4096 Sep 20 07:33 opencv/
drwxrwxr-x  4 1000 1000 4096 Oct 31 07:55 openmp/
drwxrwxr-x  5 1000 1000 4096 Jan  9  2018 qt/
drwxrwxr-x  2 1000 1000 4096 Jan  4  2018 ros/
drwxrwxr-x  4 1000 1000 4096 Nov 16  2017 voc/
drwxrwxr-x  5 1000 1000 4096 Aug  7 03:19 vs/
root@6f167ef72a80:/home/kezunlin/workspace# touch 1.txt
root@6f167ef72a80:/home/kezunlin/workspace# id
uid=0(root) gid=0(root) groups=0(root)
```
Back on the host:

```shell
ll /home/kezunlin/workspace/
total 48
drwxrwxr-x 12 kezunlin kezunlin 4096 11月 30 18:14 ./
drwxr-xr-x 47 kezunlin kezunlin 4096 11月 30 18:04 ../
-rw-r--r--  1 root     root        0 11月 30 18:14 1.txt
drwxrwxr-x 12 kezunlin kezunlin 4096 10月 31 11:01 blog/
drwxrwxr-x  5 kezunlin kezunlin 4096  9月 20 15:33 opencv/
drwxrwxr-x  4 kezunlin kezunlin 4096 10月 31 15:55 openmp/
drwxrwxr-x  5 kezunlin kezunlin 4096  1月  9  2018 qt/
drwxrwxr-x  2 kezunlin kezunlin 4096  1月  4  2018 ros/
drwxrwxr-x  4 kezunlin kezunlin 4096 11月 16  2017 voc/
drwxrwxr-x  5 kezunlin kezunlin 4096  8月  7 11:19 vs/
```

Note that `1.txt`, created as root inside the container, is owned by `root:root` on the host: by default Docker does not remap user namespaces, so uid 0 in the container is uid 0 on the host. Conversely, the host user's uid 1000 shows up as a bare number inside the container, because the image's `/etc/passwd` has no entry for that user.
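One common way to avoid root-owned files in a bind mount is to run the container as the host user, using docker run's standard `-u` flag. A minimal sketch, reusing the image and mount path from the text:

```shell
# Build the uid:gid pair that docker run -u expects
USER_SPEC="$(id -u):$(id -g)"
echo "$USER_SPEC"   # e.g. 1000:1000 for the host user above

# Then (not executed here; requires docker):
# sudo docker run -it -u "$USER_SPEC" \
#   -v /home/kezunlin/workspace/:/home/kezunlin/workspace nvidia/cuda
```

Files the container creates in the mount are then owned by the host user, at the cost of the container user having no `/etc/passwd` entry inside the image.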
## Install nvidia-docker2

To run a CUDA container, the host only needs the NVIDIA driver; the CUDA toolkit does not have to be installed on the host.
### Install
#### Remove nvidia-docker 1.0

```shell
# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
```
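Only the comment line of this block survived. The removal commands, following the nvidia-docker upgrade instructions (reproduced from memory; verify against the project README before running):

```shell
# remove any container that uses a nvidia-docker 1.0 volume, then the package itself
docker volume ls -q -f driver=nvidia-docker | \
  xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker
```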
#### Add the package repositories

vim repo.sh

```shell
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
```
#### Run the script

```shell
chmod +x repo.sh
./repo.sh
```
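The `repo.sh` shown earlier is cut off after its first line. A sketch of the full script, following the repository-setup steps from the nvidia-docker documentation (reproduced from memory; check the project page before use):

```shell
#!/bin/sh
# add the nvidia-docker apt repository and its signing key
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
```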
#### Install nvidia-docker2 and reload the Docker daemon configuration

```shell
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
```
### Test

```shell
sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
```

Output:

```shell
Unable to find image 'nvidia/cuda:latest' locally
```
Or with an interactive tty:

```shell
sudo docker run --runtime=nvidia -t -i --privileged nvidia/cuda bash
```
## Advanced Topics

### Default runtime

The default runtime used by the Docker® Engine is runc; the NVIDIA runtime can become the default one by configuring the docker daemon with `--default-runtime=nvidia`. Doing so removes the need to add the `--runtime=nvidia` argument to `docker run`, and it is also the only way to have GPU access during `docker build`.
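Instead of the daemon flag, the default runtime can be set in `/etc/docker/daemon.json`. This fragment mirrors the runtime entry that the nvidia-docker2 package installs, with `default-runtime` added:

```json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```

After editing the file, restart the docker daemon (or send it SIGHUP) for the change to take effect.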
### Environment variables

The behavior of the runtime can be modified through environment variables (such as `NVIDIA_VISIBLE_DEVICES`). Those environment variables are consumed by nvidia-container-runtime and are documented in that project's README. The official CUDA images use default values for these variables.
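For illustration, `NVIDIA_VISIBLE_DEVICES` selects which GPUs a container sees. These invocations require a GPU host with the nvidia runtime; the device indices are examples:

```shell
# expose only GPU 0 to the container
sudo docker run --runtime=nvidia --rm -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda nvidia-smi

# expose two specific GPUs
sudo docker run --runtime=nvidia --rm -e NVIDIA_VISIBLE_DEVICES=0,1 nvidia/cuda nvidia-smi

# expose no GPUs at all
sudo docker run --runtime=nvidia --rm -e NVIDIA_VISIBLE_DEVICES=none nvidia/cuda nvidia-smi
```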
## Docker commands

```shell
sudo docker image list
```
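A few more everyday commands that pair with the one above (standard docker CLI; the container name `kzl` is the one created earlier):

```shell
sudo docker container list -a   # list containers, including stopped ones
sudo docker start kzl           # restart the stopped container
sudo docker exec -it kzl bash   # open a shell inside the running container
sudo docker stop kzl            # stop it again
sudo docker rm kzl              # remove the container
sudo docker rmi nvidia/cuda     # remove the image
```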
## Kubernetes with GPU

Kubernetes GPU support, as of version 1.9, has gone through three stages:

- kubernetes 1.3 first supported GPUs, but only a single GPU card per node;
- kubernetes 1.6 added support for multiple GPU cards;
- kubernetes 1.8 provides GPU support through the device plugin mechanism.
```shell
ls /dev/nvidia*
/dev/nvidia0  /dev/nvidia2  /dev/nvidia4  /dev/nvidia6  /dev/nvidiactl
/dev/nvidia1  /dev/nvidia3  /dev/nvidia5  /dev/nvidia7
```

In Kubernetes 1.8~1.9, k8s-device-plugin collects the GPU information on each node, and GPU resources are managed and scheduled based on that information. It has to be used together with nvidia-docker2. k8s-device-plugin is also provided by NVIDIA and can be run in Kubernetes as a DaemonSet.
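With the device plugin running, a pod requests GPUs through the `nvidia.com/gpu` resource. A minimal sketch (the pod and container names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod            # hypothetical name
spec:
  containers:
  - name: cuda
    image: nvidia/cuda
    resources:
      limits:
        nvidia.com/gpu: 1  # resource exposed by k8s-device-plugin
```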
## History
- 20180903: created.