(10:47:51 AM) ylifshit: hi. can you please point me to the code that sets the number of kvm resources?
(10:48:18 AM) ylifshit: i had the same issue with my "random" resource, as it is also "infinite"
(11:04:35 AM) sgott: you mean to just set it to an arbitrary number?
(11:07:14 AM) sgott: https://github.com/kubevirt/kubevirt/pull/1099 does this. pkg/virt-handler/device-manager/generic_device.go is the file. basically in the initialization I just make an arbitrarily large number
(11:12:51 AM) ylifshit: i guess this is happening in the loop calling addNewGenericDevice()?
(11:17:36 AM) sgott: yes. that adds a new device to the internal list. this list is sent to the Kubernetes side in ListAndWatch
(11:17:49 AM) sgott: once at the start and then any time interesting changes occur
(11:17:49 AM) ylifshit: ok. got it.
(11:18:53 AM) ylifshit: btw, did you look at the device plugin manager framework that Petr prepared: https://github.com/kubevirt/device-plugin-manager ?
(11:19:03 AM) ylifshit: it removed some of the boilerplate code
(11:19:27 AM) sgott: and added a lot more in return! :D
(11:19:54 AM) sgott: I did not need all the smart goodness he created, since I don't need to dynamically allocate new kinds of plugins on the fly
(11:20:13 AM) ylifshit: yeah. i guess it's not worth the extra dependency
(11:20:20 AM) sgott: yes I did look at his work. It was very helpful in setting me on the right path
(11:22:26 AM) ylifshit: was wondering how we should do the generic ones for the vm
(11:23:11 AM) ylifshit: i mean, if the vm and not the launcher needs some device from the host
(11:23:36 AM) sgott: the VM and the launcher are the same in this respect
(11:24:14 AM) sgott: the device plugin framework can get devices from the host to the pod. if the VM needs those devices then it's up to qemu/libvirt to obtain them from the pod.
(11:24:15 AM) ylifshit: but there will be some code that may need to glue a non-standard device from the pod (that the DP created) into the VM?
(11:24:23 AM) sgott: yes
(11:24:29 AM) sgott: that "glue" would be up to Qemu
(11:24:39 AM) sgott: that's no different than any other passthrough
(11:25:04 AM) sgott: the part that would be interesting about that case is that it would need to be done dynamically. I'm not sure how the heck we could pull that off. 8)
(11:25:33 AM) ylifshit: yes. this is what i meant - we will have to write some code that does that
(11:26:13 AM) sgott: the trouble is, that virt-controller would need to know in advance that such a resource was going to be asked for (à la the VM manifest). then virt-handler would have to know even before that that such a device would be needed so it could provide it via DP
(11:26:31 AM) sgott: that's the part we can't know. crystal ball is broke... :D
(11:27:25 AM) ylifshit: why? the DP is going to be declared in the VM spec
(11:28:03 AM) sgott: so take a step back... let's look at KVM devices...
(11:28:25 AM) ylifshit: kvm is implicit for us
(11:29:22 AM) sgott: so right now virt-handler already knows (because it's hard coded) that it needs to look for and provide KVM devices, and TUN devices. kvm because we said so (hardware accelerated is good.) TUN because we made that up--it comes from the rule that the VM will be attached to the pod network--hence we will need a tun device.
(11:30:18 AM) sgott: when virt-handler boots up (I repeat: AT BOOT TIME), it looks for kvm and tun devices, sees that they're there and registers an arbitrary number of devices to be available in advance...
(11:30:55 AM) sgott: this way when virt-controller schedules a pod, it adds requests for KVM and TUN... kubernetes uses this info to know what nodes it's even possible to schedule VMs on.
(11:31:35 AM) ylifshit: ok
(11:31:54 AM) ylifshit: but it works, the requirement is always there
(11:32:04 AM) sgott: if the request for a device is read from the manifest... let's say that "/dev/random" was requested. great. virt-controller would add a marking for "devices.kubevirt.io/random" to the pod, and... the pod would hang forever in sadness, because nothing knew to provide the random device.
(11:33:23 AM) sgott: to make things work for "/dev/random" some mechanism would need to exist to prod virt-handler to know to volunteer random devices as the first step. then when virt-controller requested one for a pod, things could move forward.
(11:34:13 AM) sgott: we could use e.g. a configmap, but then literally every handler in the cluster would start offering up /dev/random devices (when only one was really needed)... gross. :)
(11:35:21 AM) sgott: so I don't know the answer, but that's the challenge to noodle on. there's probably an elegant way, but our current mechanism of "schedule pod, and that will tell virt-handler to act" is no longer valid--as the pod would hang forever due to unmet requirements
(11:35:48 AM) ylifshit: maybe we can just run the device plugin on the pod where the vm needs it?
(11:36:20 AM) ylifshit: it can happen after the vm was scheduled
(11:36:58 AM) ylifshit: currently, we run the DPs as daemon sets. but this is probably not mandatory
(11:37:02 AM) sgott: then the pod is providing its own DP... won't work... but... that's an interesting case. let's pretend you scheduled *a* pod (that's not virt-launcher) that made a device... it would be scheduled onto any random node that's legal ... and it would fire up the DP mechanism....
(11:37:03 AM) sgott: hrm...
(11:37:25 AM) sgott: that's correct. we could fire up a one-off device thingy
(11:38:01 AM) ylifshit: maybe the handler can do that?
(11:38:04 AM) sgott: there's something worth noting about that though ... devices shared via DP show up on the node. like literally in its yaml...
(11:38:16 AM) sgott: the handler is too late in the process
(11:38:21 AM) ylifshit: why?
(11:38:35 AM) ylifshit: the vm could be launched, will be pending until the device arrives
(11:38:35 AM) sgott: the handler only wakes up and acts once a kubevirt pod is scheduled onto its node.
(11:38:42 AM) sgott: see above for chicken and egg dilemma
(11:39:07 AM) ylifshit: hmm, i see, we need to schedule on the right node :-)
(11:39:14 AM) sgott: the pod is launched first. then the VM is inserted into it essentially
(11:40:13 AM) ylifshit: ok. so for k8s to make the right decision on where to run our pod, we must launch the DP first, and on all nodes
(11:41:47 AM) sgott: sort of. per a path you just started to explore a minute ago... you *could* do a one-off device in a pod, not a daemonset. this single-purpose pod would only exist to create a device through the plugin...
(11:42:12 AM) sgott: this pod cannot be a container inside the original pod because it's the belt and suspenders to make it even possible to schedule the pod
(11:42:21 AM) ylifshit: yes. but what if node1 had /dev/random and node2 does not have it
(11:43:01 AM) ylifshit: ok, i see
(11:43:16 AM) sgott: hrm. yuck. yeah that could get sticky. perhaps this does mean that we... gosh man. I dunno. it feels like we need a new controller here doesn't it? :)
(11:43:36 AM) sgott: "device-controller"... has a nice ring to it
(11:44:00 AM) ylifshit: +1
(11:44:18 AM) sgott: cause yeah it would get dirty fast if we were firing off random pods just to schedule devices when we could have just scheduled one big controller and told it to start providing things from the node
(11:44:40 AM) sgott: and like you pointed out, might be unlucky if for some reason a node didn't have a "random" device
(11:45:16 AM) ylifshit: i guess that for now, i would just assume there is manual configuration involved ahead of time
(11:45:44 AM) ylifshit: you need to run a DP daemon set for whatever devices you know your VMs are going to need
(11:46:08 AM) sgott: yes, for now it's a reasonable stipulation that "you will need to configure your cluster for _____ devices if you want to be able to use them in VMs"
(11:46:17 AM) sgott: crawl walk run as they say
(11:48:44 AM) ylifshit: probably good to capture that somehow. maybe shoot an email, nothing too actionable, but make sure that fabian has that on his radar?
(11:49:20 AM) sgott: I was just re-reading this chat transcript and wondering if it would be a good idea to just forward it to the mailing list
(11:49:34 AM) ylifshit: sure :-)