VMI Specifications for Nvidia A100 Series


Vaibhav Raizada

Jun 18, 2021, 10:07:26 AM
to kubevirt-dev
Hi,

I am trying to write a VMI specification with GPU access. The GPU card is an Nvidia A100 (A100-SXM4-40GB), but I am not sure how to write the GPU device name in the specification. For a Tesla T4 I used TU104GL_Tesla_T4, but I am not sure what to use for the A100 series.

---
kind: VirtualMachineInstance
metadata:
  labels:
    special: vmi-gpu
  name: vmi-gpu
spec:
  domain:
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
      gpus:
      - deviceName: nvidia.com/??????????????????????????
        name: gpu1
      rng: {}
    machine:
      type: ""
    resources:
      requests:
        memory: 1024M
  terminationGracePeriodSeconds: 0
  volumes:
  - containerDisk:
      image: ovaleanu/centos:latest
    name: containerdisk
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        ssh_pwauth: True
        password: centos
        chpasswd: { expire: False }
    name: cloudinitdisk



Thanks,
Vaibhav

Vladik Romanovsky

Jun 18, 2021, 11:44:57 AM
to Vaibhav Raizada, kubevirt-dev
Hi Vaibhav,

Thank you for raising this topic.
Perhaps it would be best to start by reading our user guide about host device assignment with KubeVirt [1].

In general, the name of the resource/device depends on the device plugin that provides it.
KubeVirt has a built-in generic mechanism for discovering and allocating host devices, including GPU and vGPU (PCI/MDEVs in general).

As an admin, you simply provide a list of devices that are permitted in the cluster (as below), naming each resource according to your preference.
KubeVirt will then discover these devices on the cluster nodes and start a device plugin for each.

configuration:
  permittedHostDevices:
    pciHostDevices:
    - pciVendorSelector: "10DE:1EB8"
      resourceName: "nvidia.com/Tesla_T4"

When a user requests this device for a VMI, it needs to be referenced by this name:

gpus:
- deviceName: nvidia.com/Tesla_T4

However, if you are using an external device plugin, it is up to that plugin to choose the resource name for the device it advertises.
In most cases, you can find the name by querying the node with `kubectl describe node [nodeName]`.
The device name can be found in the Capacity/Allocatable sections:

Allocatable:
  nvidia.com/TU104GL_Tesla_T4:   1

Let me know if you have any other questions.
Thanks,
Vladik



Vaibhav Raizada

Jun 21, 2021, 1:17:52 AM
to kubevirt-dev
Hi Vladik,

Thanks for the help. I followed the approach suggested in your mail but still ran into the same problem.
Here is what I did:

1. Found the device and vendor ID:

[root@node003 ~]# lspci -nnv|grep -i nvidia
01:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:20b0] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:144e]
        Kernel driver in use: nvidia

2. Modified kubevirt-cr.yaml

---
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  certificateRotateStrategy: {}
  configuration:
    permittedHostDevices:
      pciHostDevices:
      - pciVendorSelector: "10de:20b0"
        resourceName: "nvidia.com/A100-SXM4-40GB"
        externalResourceProvider: true
    developerConfiguration:
      featureGates: []
  customizeComponents: {}
  imagePullPolicy: IfNotPresent
  workloadUpdateStrategy: {}

3. Installed KubeVirt using the above CR definition and the default operator for v0.42.1
4. Fetched the status of the pods and described the pod:
[root@node003 ~]# kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
virt-launcher-vmi-gpu-sgwp6   0/2     Pending   0          7s

[root@node003 ~]# kubectl describe pod virt-launcher-vmi-gpu-sgwp6
Name:           virt-launcher-vmi-gpu-sgwp6
...............................................................................
...............................................................................
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  4s (x3 over 87s)  default-scheduler  0/1 nodes are available: 1 Insufficient nvidia.com/A100-SXM4-40GB.





Thanks,
Vaibhav

Fabian Deutsch

Jun 21, 2021, 4:24:24 AM
to Vaibhav Raizada, kubevirt-dev
Please try removing the `externalResourceProvider: true` line from the config (or setting it to false).
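For reference, a rough sketch of that section of the CR with the flag dropped (reusing the selector and resource name from your config):

configuration:
  permittedHostDevices:
    pciHostDevices:
    - pciVendorSelector: "10de:20b0"
      resourceName: "nvidia.com/A100-SXM4-40GB"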

Vaibhav Raizada

Jun 21, 2021, 6:28:21 AM
to kubevirt-dev
I tried after removing the suggested line, but I still hit the same issue.

[root@node003 ~]# kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
virt-launcher-vmi-gpu-8fw2x   0/2     Pending   0          11m

[root@node003 ~]# kubectl describe pod virt-launcher-vmi-gpu-8fw2x  
Warning  FailedScheduling  14s (x4 over 2m32s)  default-scheduler  0/1 nodes are available: 1 Insufficient nvidia.com/A100-SXM4-40GB.


Thanks,
Vaibhav

Vladik Romanovsky

Jun 21, 2021, 11:07:02 AM
to Vaibhav Raizada, kubevirt-dev

Hi Vaibhav,

To assign any PCI device to a virtual machine, the relevant device needs to be bound to the vfio-pci driver.
KubeVirt will therefore only look for, and start a device plugin for, host devices that are bound to the vfio-pci driver.

Our user guide has a section about how this can be achieved in a dynamic (non-persistent) way. I would suggest reading the whole document.
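Not from the guide itself, but purely as an illustrative sketch: one common non-persistent approach is to rebind a single device through sysfs, along these lines (the PCI address below is only an example and needs to match the actual GPU):

modprobe vfio-pci
# example address only; replace 0000:01:00.0 with the GPU's address from `lspci -nnk`
echo vfio-pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
echo 0000:01:00.0 > /sys/bus/pci/drivers_probe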

For a persistent configuration, the administrator can create a modprobe configuration file, listing the relevant devices, such as the following:
echo options vfio-pci ids=10de:20b0 > /etc/modprobe.d/vfio.conf
echo vfio-pci > /etc/modules-load.d/vfio-pci.conf

Once the devices are bound to the vfio-pci driver, KubeVirt will be able to discover them, and they will be listed in the Capacity/Allocatable sections of `kubectl describe node [node_name]`.
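For example, a quick check could look something like this (using the node name from your earlier output; the resource name depends on your configuration):

kubectl describe node node003 | grep -i nvidia.com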

Thanks,
Vladik



Vaibhav Raizada

Jun 23, 2021, 12:44:16 AM
to kubevirt-dev
Hi Vladik,

Thanks again. I have bound one of the GPUs to the vfio-pci driver. See the output of the command below:

[root@node003 ~]# lspci -nnk -d 10de:
01:00.0 3D controller [0302]: NVIDIA Corporation GA100 [A100 SXM4 40GB] [10de:20b0] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:144e]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
41:00.0 3D controller [0302]: NVIDIA Corporation GA100 [A100 SXM4 40GB] [10de:20b0] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:144e]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau, nvidia_drm, nvidia

Running "kubectl describe node" gives me the output below, which indicates that the device is available.

Allocatable:
  cpu:                              96
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  ephemeral-storage:                86973087600
  hugepages-1Gi:                    0
  hugepages-2Mi:                    0
  memory:                           1056587164Ki
  pods:                             110

Yet, when I launch the VM, I get the error below:

[root@node003 ~]# kubectl describe pod virt-launcher-vmi-gpu-bcg7n
Name:           virt-launcher-vmi-gpu-bcg7n
.................................................................................
Events:
  Type     Reason                    Age   From               Message
  ----     ------                    ----  ----               -------
  Normal   Scheduled                 32s   default-scheduler  Successfully assigned default/virt-launcher-vmi-gpu-bcg7n to node003
  Warning  UnexpectedAdmissionError  33s   kubelet            Allocate failed due to requested number of devices unavailable for nvidia.com/GA100_A100_SXM4_40GB. Requested: 1, Available: 0, which is unexpected

My kubevirt-cr.yaml has below changes for your reference:

---
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  certificateRotateStrategy: {}
  configuration:
    permittedHostDevices:
      pciHostDevices:
      - pciVendorSelector: "10de:20b0"
        resourceName: "nvidia.com/GA100_A100_SXM4_40GB"
    developerConfiguration:
      featureGates: []
  customizeComponents: {}
  imagePullPolicy: IfNotPresent
  workloadUpdateStrategy: {}


Thanks,
Vaibhav

Vaibhav Raizada

Jun 23, 2021, 1:24:58 AM
to kubevirt-dev
A quick update: after an hour or so, when I run "kubectl describe node", the GPU count is back to zero.

Allocatable:
  cpu:                              96
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  ephemeral-storage:                86973087600
  hugepages-1Gi:                    0
  hugepages-2Mi:                    0
  memory:                           1056587164Ki
  pods:                             110

I will check again after an hour.

Thanks,
Vaibhav

Vaibhav Raizada

Jun 23, 2021, 6:34:08 AM
to kubevirt-dev

I also want to add that I have deployed NVIDIA's kubevirt-gpu-device-plugin:

https://github.com/NVIDIA/kubevirt-gpu-device-plugin

Thanks,
Vaibhav 

Kedar Bidarkar

Jun 23, 2021, 7:33:26 AM
to Vaibhav Raizada, kubevirt-dev
On Wed, Jun 23, 2021 at 4:04 PM Vaibhav Raizada <writeto...@gmail.com> wrote:

I also want to add that I have also deployed Nvidia's Kubevirt-gpu-device-plugin. 

https://github.com/NVIDIA/kubevirt-gpu-device-plugin

1) The NVIDIA device plugin is not required from v0.36.0 onwards, unless there is some specific need for it; that is the release in which the "Generalize host devices assignment" feature was added.

Looking at this mail thread, it appears you are using v0.42.1, so AFAIK the NVIDIA device plugin should not be required.

2) If you are using https://github.com/NVIDIA/kubevirt-gpu-device-plugin for some specific purpose, then you may need to set "externalResourceProvider: true".

3) In both cases though, whether
a) using the NVIDIA device plugin or
b) using KubeVirt's built-in generic mechanism,
we would still need the same steps [1] for configuring the node first.

4) Configuring the node (assuming we need PCI passthrough here):
a) Enable IOMMU and blacklist the nouveau driver on the KVM host
b) echo "options vfio-pci ids=vendor-ID:device-ID" > /etc/modprobe.d/vfio.conf
c) echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

NOTE: Unless you intend to use the NVIDIA plugin for a specific purpose, avoid the "kubectl apply -f nvidia-kubevirt-gpu-device-plugin.yaml" command and let KubeVirt's built-in generic mechanism take over; a rough sketch of reverting it is below.


Best Regards,
Kedar Bidarkar

Fabian Deutsch

Jun 24, 2021, 8:51:17 AM
to Kedar Bidarkar, Vaibhav Raizada, kubevirt-dev
Thanks Kedar.

Vaibhav, were you able to use the GPU after dropping the NVIDIA device plugin?
