Not able to run VM in kubevirt - Fedora 36


khaled elgohary

Aug 31, 2022, 11:17:01 AM
to kubevirt-dev
Hello everyone,

I installed k8s v1.25.0 on Fedora Server 36 (single-node cluster), then installed KubeVirt v0.55.0 and tried to run the "quay.io/kubevirt/cirros-container-disk-demo" VM, but the VM status went to "ErrorUnschedulable". I would appreciate your help in finding the root cause of this failure.

Below are my observations and the details I have so far; please let me know if more details are required:
  • Virtualization is enabled:
virt-host-validate qemu
  QEMU: Checking for hardware virtualization                                 : PASS
  QEMU: Checking if device /dev/kvm exists                                   : PASS
  QEMU: Checking if device /dev/kvm is accessible                            : PASS
  QEMU: Checking if device /dev/vhost-net exists                             : PASS
  QEMU: Checking if device /dev/net/tun exists                               : PASS
  QEMU: Checking for cgroup 'cpu' controller support                         : PASS
  QEMU: Checking for cgroup 'cpuacct' controller support                     : PASS
  QEMU: Checking for cgroup 'cpuset' controller support                      : PASS
  QEMU: Checking for cgroup 'memory' controller support                      : PASS
  QEMU: Checking for cgroup 'devices' controller support                     : PASS
  QEMU: Checking for cgroup 'blkio' controller support                       : PASS
  QEMU: Checking for device assignment IOMMU support                         : PASS
  QEMU: Checking if IOMMU is enabled by kernel                               : PASS
  QEMU: Checking for secure guest support                                    : WARN (Unknown if this platform has Secure Guest support)

  • The VM YAML is the one from the KubeVirt tutorial site: https://kubevirt.io/labs/manifests/vm.yaml
  • The launcher pod "virt-launcher-testvm-n2b67" is stuck in Pending status; the output of "kubectl describe" for it is in the attached [launcher-description.txt]:
                  virt-launcher-testvm-n2b67         0/2     Pending   0             59m
  • When I described the node, I noticed that the KubeVirt schedulable label is set to false ("kubevirt.io/schedulable=false"). The complete output of "kubectl describe node" is attached [Node-description.txt].
  • Moreover, the handler is not able to register devices, as shown by the following logs from the virt-handler-XXXX pod.

{"component":"virt-handler","level":"info","msg":"SELinux is reported as 'permissive'","pos":"virt-handler.go:381","timestamp":"2022-08-31T13:17:52.173970Z"}
{"component":"virt-handler","level":"warning","msg":"Permissive mode, ignoring 'semodule' failure: out: \"Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/virt_launcher/cil:15\\nFailed to resolve AST\\n/sbin/semodule:  Failed!\\n\", error: exit status 1","pos":"labels.go:102","timestamp":"2022-08-31T13:17:56.058845Z"}

{"component":"virt-handler","level":"info","msg":"Starting virt-handler controller.","pos":"vm.go:1348","timestamp":"2022-08-31T13:17:56.088893Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: kvm","pos":"device_controller.go:56","timestamp":"2022-08-31T13:17:56.088990Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: tun","pos":"device_controller.go:56","timestamp":"2022-08-31T13:17:56.089077Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: vhost-net","pos":"device_controller.go:56","timestamp":"2022-08-31T13:17:56.089098Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: sev","pos":"device_controller.go:56","timestamp":"2022-08-31T13:17:56.089109Z"}
{"component":"virt-handler","level":"info","msg":"refreshed device plugins for permitted/forbidden host devices","pos":"device_controller.go:292","timestamp":"2022-08-31T13:17:56.089140Z"}
{"component":"virt-handler","level":"info","msg":"enabled device-plugins for: []","pos":"device_controller.go:293","timestamp":"2022-08-31T13:17:56.089192Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:457","timestamp":"2022-08-31T13:17:56.089208Z"}
{"component":"virt-handler","level":"info","msg":"disabled device-plugins for: []","pos":"device_controller.go:294","timestamp":"2022-08-31T13:17:56.089234Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:466","timestamp":"2022-08-31T13:17:56.089297Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:457","timestamp":"2022-08-31T13:17:56.089313Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:466","timestamp":"2022-08-31T13:17:56.089164Z"}
{"component":"virt-handler","level":"info","msg":"refreshed device plugins for permitted/forbidden host devices","pos":"device_controller.go:292","timestamp":"2022-08-31T13:17:56.089399Z"}
{"component":"virt-handler","level":"info","msg":"enabled device-plugins for: []","pos":"device_controller.go:293","timestamp":"2022-08-31T13:17:56.089421Z"}
{"component":"virt-handler","level":"info","msg":"disabled device-plugins for: []","pos":"device_controller.go:294","timestamp":"2022-08-31T13:17:56.089497Z"}
{"component":"virt-handler","level":"error","msg":"Error starting sev device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:07.092539Z"}
{"component":"virt-handler","level":"error","msg":"Error starting tun device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:07.092524Z"}
{"component":"virt-handler","level":"error","msg":"Error starting kvm device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:07.092626Z"}
{"component":"virt-handler","level":"error","msg":"Error starting vhost-net device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:07.098892Z"}
W0831 13:18:07.996142    5118 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
{"component":"virt-handler","level":"info","msg":"Updating cluster config from KubeVirt to resource version '422547'","pos":"configuration.go:320","timestamp":"2022-08-31T13:18:13.157955Z"}
{"component":"virt-handler","level":"info","msg":"refreshed device plugins for permitted/forbidden host devices","pos":"device_controller.go:292","timestamp":"2022-08-31T13:18:13.157994Z"}
{"component":"virt-handler","level":"info","msg":"enabled device-plugins for: []","pos":"device_controller.go:293","timestamp":"2022-08-31T13:18:13.158009Z"}
{"component":"virt-handler","level":"info","msg":"disabled device-plugins for: []","pos":"device_controller.go:294","timestamp":"2022-08-31T13:18:13.158021Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:466","timestamp":"2022-08-31T13:18:13.158038Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:457","timestamp":"2022-08-31T13:18:13.158051Z"}
{"component":"virt-handler","level":"error","msg":"Error starting sev device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:20.097032Z"}
{"component":"virt-handler","level":"error","msg":"Error starting tun device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:20.097074Z"}
{"component":"virt-handler","level":"error","msg":"Error starting kvm device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:20.097166Z"}
{"component":"virt-handler","level":"error","msg":"Error starting vhost-net device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:20.113480Z"}
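A quick way to summarize logs like the ones above is to count the error-level entries per message. The following is a minimal Go sketch and not part of KubeVirt: the `logEntry` type and the `errorCounts` function are hypothetical names introduced here, and the JSON field names are simply the ones visible in the log lines above. Non-JSON lines, such as the klog warning starting with "W0831", are skipped.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry mirrors the fields visible in the virt-handler log lines.
// (Hypothetical type, named here for illustration only.)
type logEntry struct {
	Component string `json:"component"`
	Level     string `json:"level"`
	Msg       string `json:"msg"`
	Reason    string `json:"reason"`
	Timestamp string `json:"timestamp"`
}

// errorCounts reads log output line by line and counts how often each
// error-level message occurs. Lines that are not valid JSON objects are
// ignored.
func errorCounts(logs string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "{") {
			continue // e.g. klog-style lines such as "W0831 ..."
		}
		var e logEntry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			continue
		}
		if e.Level == "error" {
			counts[e.Msg]++
		}
	}
	return counts
}

func main() {
	// A few lines copied from the log excerpt above.
	sample := `{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: kvm","pos":"device_controller.go:56","timestamp":"2022-08-31T13:17:56.088990Z"}
W0831 13:18:07.996142    5118 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
{"component":"virt-handler","level":"error","msg":"Error starting kvm device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:07.092626Z"}
{"component":"virt-handler","level":"error","msg":"Error starting kvm device plugin","pos":"device_controller.go:68","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-08-31T13:18:20.097166Z"}`

	for msg, n := range errorCounts(sample) {
		fmt.Printf("%d x %s\n", n, msg)
	}
}
```

Run against the full excerpt, a summary like this would show the same four device plugins (sev, tun, kvm, vhost-net) failing on every retry, which points at the registration step rather than at any single device.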






Node-description.txt
launcher-description.txt

Roman Mohr

Aug 31, 2022, 11:22:02 AM
to khaled elgohary, kubevirt-dev
Could you share the kubelet logs? virt-handler is waiting for the kubelet to connect to the device plugin sockets. The kubelet logs may have more hints.

Best regards,
Roman
 



khaled elgohary

Aug 31, 2022, 12:09:03 PM
to kubevirt-dev
Thanks for the prompt response. The logs are very long, so I only captured the last 300 lines with tail. I will go through them myself, and I am sharing them here as well in case you have any ideas.
kubelet.log

Daniel Hiller

Aug 31, 2022, 12:13:28 PM
to khaled elgohary, kubevirt-dev
Hey,

We are still in the process of creating the kubevirtci provider for 1.25 [1], so we don't have any data on whether KubeVirt works on 1.25. We are currently testing with k8s 1.22 to 1.24.




--
Best,
Daniel

khaled elgohary

Aug 31, 2022, 1:53:55 PM
to kubevirt-dev
Thanks for the notice. I will downgrade k8s to version 1.24 and then update you. Thanks.

khaled elgohary

Sep 2, 2022, 5:49:50 AM
to kubevirt-dev
Thanks all. After downgrading k8s to 1.24.3, I no longer get the errors I was seeing in the handler pod, and the node is now labeled "kubevirt.io/schedulable=true". The VM started as well.

Vladik Romanovsky

Sep 12, 2022, 9:20:48 AM
to khaled elgohary, Christopher Desiniotis, Roman Mohr, kubevirt-dev
I think this PR: https://github.com/kubevirt/kubevirt/pull/8451 will help.

+Christopher Desiniotis (thanks for the PR!) found that the expected order of device plugin registration somehow changed in 1.25.
We don't have a 1.25 provider yet to verify the PR, but at least we know that this change is backward compatible.


Kevin Klues

Sep 12, 2022, 12:17:44 PM
to kubevirt-dev
Hi all,

I'm the one who made the breaking change in the kubelet in this refactoring commit:

The change in semantics was not intentional, but it does bring the device plugin's custom registration process in line with the kubelet's standard plugin registration process.

The old flow had the following logic:
1. Register the plugin
2. Launch a goroutine to connect to the gRPC service being served by the plugin (with a timeout of 10s)
3. Return

This gave the plugin the opportunity to first register itself with the kubelet and then (within a 10s time window) start its gRPC server.

The new semantics are similar, except that no goroutine is launched to connect to the gRPC service. The connection is attempted synchronously, as part of the registration call itself. This means that the plugin must start serving its gRPC server before registering itself with the kubelet. Otherwise the registration call will fail.
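
For readers who want to see the ordering difference concretely, here is a minimal Go sketch of the two flows described above. It is a simplified model under stated assumptions, not the kubelet's or KubeVirt's actual code: a plain Unix socket stands in for the gRPC service, the timeouts are shortened, and the names `registerSync`, `registerAsync`, `serve`, and `demo` are all hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// registerSync models the new (1.25) semantics: the registration call
// itself connects to the plugin socket, so it fails if nothing is
// serving yet.
func registerSync(socket string, timeout time.Duration) error {
	conn, err := net.DialTimeout("unix", socket, timeout)
	if err != nil {
		return fmt.Errorf("failed to dial device plugin: %w", err)
	}
	return conn.Close()
}

// registerAsync models the old semantics: registration returns at once,
// and a goroutine keeps trying to connect until the timeout expires.
func registerAsync(socket string, timeout time.Duration) <-chan error {
	result := make(chan error, 1)
	go func() {
		deadline := time.Now().Add(timeout)
		for {
			conn, err := net.Dial("unix", socket)
			if err == nil {
				conn.Close()
				result <- nil
				return
			}
			if time.Now().After(deadline) {
				result <- fmt.Errorf("failed to dial device plugin: %w", err)
				return
			}
			time.Sleep(10 * time.Millisecond)
		}
	}()
	return result
}

// serve starts accepting connections on the plugin socket
// (a stand-in for starting the plugin's gRPC server).
func serve(socket string) (net.Listener, error) {
	l, err := net.Listen("unix", socket)
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			c, err := l.Accept()
			if err != nil {
				return // listener closed
			}
			c.Close()
		}
	}()
	return l, nil
}

// demo tries three orderings against a temporary socket and returns the
// result of each registration attempt.
func demo() (syncBefore, asyncRes, syncAfter error) {
	dir, err := os.MkdirTemp("", "plugin")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	socket := filepath.Join(dir, "kvm.sock")

	// New semantics, register before serving: fails.
	syncBefore = registerSync(socket, time.Second)

	// Old semantics, register before serving: succeeds as long as the
	// server comes up inside the timeout window.
	pending := registerAsync(socket, 2*time.Second)
	l, err := serve(socket)
	if err != nil {
		panic(err)
	}
	defer l.Close()
	asyncRes = <-pending

	// New semantics, register after serving: succeeds.
	syncAfter = registerSync(socket, time.Second)
	return
}

func main() {
	before, async, after := demo()
	fmt.Println("sync, register before serve:", before)
	fmt.Println("async, register before serve:", async)
	fmt.Println("sync, register after serve:", after)
}
```

The practical takeaway for a plugin author is the last case: start serving first, then register, which works under both the old and the new semantics.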

As mentioned before, this change was not necessarily intentional, but it's also not clear if it should be reverted. It brings the device plugin's custom registration process more in line with the kubelet's standard registration process and the order is actually a bit more intuitive. Why bother registering if you aren't able to start serving your API?

I have opened the following kubernetes issue to track this and help decide what the right course of action is:
https://github.com/kubernetes/kubernetes/issues/112395

Kevin