GPUs
VSC Tier-1 Cloud users can also deploy VMs with different kinds of GPUs. A full GPU card is attached directly to the VM via PCI passthrough and is not shared between VMs.
See section Instance types and flavors for more information about the different GPUs available (GPUv* flavors).
Nvidia GPUs require the proprietary Nvidia driver to work. The sections below explain how to install the Nvidia driver, and keep it up to date, on each public OS image available in VSC Tier-1 Cloud.
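Before installing a driver, it can help to confirm that the passed-through card is visible inside the VM. The following is a minimal sketch, assuming the lspci tool (from the pciutils package) is installed. The helper name has_nvidia_gpu is hypothetical, and 10de is Nvidia's PCI vendor ID.

```shell
# has_nvidia_gpu: hypothetical helper. Reads `lspci -nn` output on stdin and
# succeeds if any device carries Nvidia's PCI vendor ID (10de).
has_nvidia_gpu() {
    grep -qi '\[10de:'
}

# Only query the PCI bus when lspci is actually installed.
if command -v lspci >/dev/null 2>&1; then
    if lspci -nn | has_nvidia_gpu; then
        echo "Nvidia GPU visible on the PCI bus"
    else
        echo "No Nvidia GPU found: check that the VM uses a GPUv* flavor"
    fi
fi
```

This check works without any Nvidia driver installed, so it separates flavor problems from driver problems.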
Ubuntu
Add the graphics-drivers PPA repository:
sudo add-apt-repository ppa:graphics-drivers/ppa
Install the ubuntu-drivers tool:
sudo apt install ubuntu-drivers-common
List the detected GPUs and the drivers available for them:
ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:06.0 ==
modalias : pci:v000010DEd00001EB8sv000010DEsd000012A2bc03sc02i00
vendor : NVIDIA Corporation
model : TU104GL [Tesla T4]
driver : nvidia-driver-470-server - distro non-free
driver : nvidia-driver-515 - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-525-server - distro non-free
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-525 - distro non-free recommended
driver : nvidia-driver-515-server - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
Install the latest Nvidia driver (replace 525 with the latest version available in your case):
sudo apt install nvidia-driver-525
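To avoid reading the version number by hand, the package marked as recommended can be extracted from the ubuntu-drivers output. This is only a sketch: recommended_driver is a hypothetical helper name, and it assumes the output keeps the "driver : package - ... recommended" layout shown above.

```shell
# recommended_driver: hypothetical helper. Reads `ubuntu-drivers devices`
# output on stdin and prints the first package flagged as "recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Usage on a VM (installation is left as an explicit manual step):
#   ubuntu-drivers devices | recommended_driver
#   sudo apt install "$(ubuntu-drivers devices | recommended_driver)"
```

For the Tesla T4 listing above, the helper would print nvidia-driver-525.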
Reboot your VM.
Check that the Nvidia driver and CUDA are available after the reboot:
$ nvidia-smi
Thu Feb  2 16:47:42 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:06.0 Off |                    0 |
| N/A   45C    P8    16W /  70W |      6MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       833      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
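For scripted checks, the same information can be requested in a machine-readable form with the query options of nvidia-smi. The sketch below assumes nvidia-smi is on the PATH; gpu_summary and the NVIDIA_SMI override variable are hypothetical names introduced only for illustration.

```shell
# gpu_summary: hypothetical helper. Prints "name, driver version, total memory"
# for each GPU. NVIDIA_SMI is an assumed override so the binary can be swapped.
gpu_summary() {
    "${NVIDIA_SMI:-nvidia-smi}" \
        --query-gpu=name,driver_version,memory.total \
        --format=csv,noheader
}

# Report clearly whether the driver is missing or merely not working.
if command -v "${NVIDIA_SMI:-nvidia-smi}" >/dev/null 2>&1; then
    gpu_summary || echo "nvidia-smi failed: the driver may not be loaded" >&2
else
    echo "nvidia-smi not found: the driver is not installed" >&2
fi
```

On the Tesla T4 VM shown above, the output should look similar to "Tesla T4, 525.78.01, 15360 MiB".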
Alma Linux/CentOS/Red Hat 8.x
Add the EPEL repository:
sudo dnf install epel-release
Add the Nvidia repository:
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
Install the Nvidia driver and its CUDA support packages:
sudo dnf install nvidia-driver nvidia-driver-cuda nvidia-driver-NVML
Reboot your VM.
Check that the Nvidia driver and CUDA are available after the reboot:
$ nvidia-smi
Thu Feb  2 15:36:40 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:06.0 Off |                    0 |
| N/A   63C    P0    31W /  70W |      2MiB / 15360MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
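If nvidia-smi reports an error after the reboot instead of a table like the one above, a reasonable first check is whether the nvidia kernel module is loaded at all. The following is a minimal sketch; nvidia_module_loaded is a hypothetical helper name, and the optional file argument exists only so the check can be tried against a saved module list.

```shell
# nvidia_module_loaded: hypothetical helper. Succeeds if a module named exactly
# "nvidia" appears in the module list (default: /proc/modules).
nvidia_module_loaded() {
    grep -q '^nvidia ' "${1:-/proc/modules}" 2>/dev/null
}

if nvidia_module_loaded; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is not loaded; inspect 'sudo dmesg | grep -i nvidia'"
fi
```

The pattern ends with a space so that related modules such as nvidia_uvm do not count as a match on their own.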
Debian 11
Enable the contrib and non-free Debian repositories and install the required packages:
sudo apt install software-properties-common linux-headers-$(uname -r) -y
sudo add-apt-repository contrib
sudo add-apt-repository non-free
sudo apt install dirmngr ca-certificates software-properties-common apt-transport-https dkms curl -y
Import the GPG key of the Nvidia repository:
curl -fSsL https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/3bf863cc.pub | sudo gpg --dearmor | sudo tee /usr/share/keyrings/nvidia-drivers.gpg > /dev/null 2>&1
Add the Nvidia repository, then install the driver and CUDA:
echo 'deb [signed-by=/usr/share/keyrings/nvidia-drivers.gpg] https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/ /' | sudo tee /etc/apt/sources.list.d/nvidia-drivers.list
sudo apt update
sudo apt install nvidia-driver cuda nvidia-smi nvidia-settings -y
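The repository line above hard-codes the debian11 directory. As a sketch only, it can be generated by a small function, which makes the three parts (keyring path, base URL, release directory) easier to see. The name nvidia_repo_line is hypothetical, and passing any directory other than debian11 is an assumption: it must exist on the Nvidia server and its signing key must be imported separately.

```shell
# nvidia_repo_line: hypothetical helper. Prints the APT source line for a given
# Nvidia repo directory name (default: debian11, the value used in this section).
nvidia_repo_line() {
    distro="${1:-debian11}"
    keyring="/usr/share/keyrings/nvidia-drivers.gpg"
    base="https://developer.download.nvidia.com/compute/cuda/repos"
    echo "deb [signed-by=${keyring}] ${base}/${distro}/x86_64/ /"
}

# Usage:
#   nvidia_repo_line | sudo tee /etc/apt/sources.list.d/nvidia-drivers.list
```

Called without arguments, the function prints exactly the line written to nvidia-drivers.list in the step above.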
Reboot your VM.
Check that the Nvidia driver and CUDA are available after the reboot:
$ nvidia-smi
Wed Feb  8 16:32:36 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A2           On   | 00000000:00:06.0 Off |                    0 |
|  0%   46C    P0    20W /  60W |      0MiB / 15356MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
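Once the driver comes from a package repository, regular system updates (apt upgrade or dnf upgrade) also update it, and the VM usually needs a reboot before the new kernel module is used. The sketch below compares the loaded module version with the one installed on disk. It assumes that /sys/module/nvidia/version is exposed by the loaded driver and that modinfo can find the nvidia module; versions_differ is a hypothetical helper name.

```shell
# versions_differ: hypothetical helper. Succeeds when both versions are known
# (non-empty) and they are not equal.
versions_differ() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

# Version of the module currently loaded in the kernel (empty if not loaded).
loaded=$(cat /sys/module/nvidia/version 2>/dev/null || true)
# Version of the module installed on disk (empty if modinfo cannot find it).
installed=$(modinfo -F version nvidia 2>/dev/null || true)

if versions_differ "$loaded" "$installed"; then
    echo "Driver updated from $loaded to $installed: reboot the VM"
fi
```

When either version cannot be determined, the sketch stays silent rather than guessing.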