Hi,
We are working on a vhostuser network binding plugin for Kubevirt and wanted to share its design with the community.
Vhostuser interfaces are required to attach a VM to a userspace/DPDK dataplane such as OVS-DPDK or VPP.
The attached design proposal describes how we have implemented the vhostuser network binding plugin so far, and focuses on the issue of sharing the vhostuser Unix socket files between the virt-launcher pod's compute container, the Multus/CNI pod and the dataplane pod.
Today, we rely on the virt-launcher pod's "sockets" emptyDir, but a cleaner approach would be welcome.
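For illustration, this is roughly how the socket directory is located on the host side today - a minimal sketch, assuming the default kubelet root directory and the "sockets" emptyDir name (how the pod UID is obtained is left out):

package main

import (
	"fmt"
	"path/filepath"
)

// kubeletPodsDir assumes the default kubelet root directory; deployments
// using a custom --root-dir will differ.
const kubeletPodsDir = "/var/lib/kubelet/pods"

// socketsDirOnHost returns the host-side path of the virt-launcher pod's
// "sockets" emptyDir, where the vhostuser sockets currently end up.
func socketsDirOnHost(podUID string) string {
	return filepath.Join(kubeletPodsDir, podUID,
		"volumes", "kubernetes.io~empty-dir", "sockets")
}

func main() {
	// Example pod UID only; the real code obtains it from the pod itself.
	fmt.Println(socketsDirOnHost("0f9c2e4a-example-pod-uid"))
}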
We already had some discussions with Alice Frosi and Fabian Deutsch, and plan to attend the next community meeting on April 17th to discuss the proposal.
A more formal PR for this proposal will follow soon.
Regards,
Benoit.
On Mon, Apr 15, 2024 at 5:52 PM <benoit....@orange.com> wrote:
Thank you. This topic is extremely interesting - so much so that we've seen multiple efforts in the past to implement it.
Part of the reason none of them went anywhere is that there was no proposed way to integrate any sort of e2e testing. While I know it is not trivial, please make sure to avoid that pitfall and ensure your proposal covers it.
Also, would you quickly explain the differences between your proposal and [0] (for instance)?
Thank you Miguel for your feedback,
Answers inline.
Benoit.

On Monday 15 April 2024 at 18:06:24 UTC+2 Miguel Duarte de Mora Barroso wrote:
> Thank you. This topic is extremely interesting - so much so that we've seen multiple efforts in the past to implement it.

We're quite aware of the various implementations and PRs about that. We think the Network Binding Plugin framework was the missing piece for a clean implementation.

> Part of the reason none of them went anywhere is that there was no proposed way to integrate any sort of e2e testing. While I know it is not trivial, please make sure to avoid that pitfall and ensure your proposal covers it.

I agree on that point. We are running some CI tests on our implementation right now. They will surely need to be adapted to the KubeVirt e2e test environment; we are not yet familiar with it. I guess the challenge is to have the right configuration for hugepages and a working dataplane running. DPDK is not mandatory, and I don't think a specific NIC to attach to the dataplane is required either.

> Also, would you quickly explain the differences between your proposal and [0] (for instance)?

The main difference is that we implemented vhostuser as a Network Binding Plugin following this design [1]. The plugin runs as a sidecar container in the virt-launcher pod and modifies the domain XML to add and configure the vhostuser interfaces according to the VMI spec. Proposal [0] was quite invasive, required annotations specific to the CNI, and in the end it was proposed to implement it as a network binding plugin. So here we are ;)
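For illustration, here is a minimal Go sketch of the kind of mutation the sidecar performs. The socket path, MAC address and the server/client mode are placeholders; the real plugin derives them from the VMI spec and from what the dataplane expects:

package main

import (
	"fmt"
	"strings"
)

// addVhostUserInterface injects a vhostuser <interface> element just before
// the closing </devices> tag of a libvirt domain XML document. A real
// sidecar would also handle queues, multiple interfaces and error cases.
func addVhostUserInterface(domainXML, socketPath, mac string) (string, error) {
	iface := fmt.Sprintf(
		"  <interface type='vhostuser'>\n"+
			"    <mac address='%s'/>\n"+
			"    <source type='unix' path='%s' mode='server'/>\n"+
			"    <model type='virtio'/>\n"+
			"  </interface>\n  ", mac, socketPath)

	const closing = "</devices>"
	if !strings.Contains(domainXML, closing) {
		return "", fmt.Errorf("domain XML has no <devices> section")
	}
	return strings.Replace(domainXML, closing, iface+closing, 1), nil
}

func main() {
	dom := "<domain type='kvm'>\n  <devices>\n  </devices>\n</domain>"
	out, err := addVhostUserInterface(dom, "/var/run/vhost_sockets/net1", "52:54:00:12:34:56")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}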
Hi Miguel and Edward,

You are more familiar with Multus than me and I hope you can help us here.

As Benoit explained in the first email, they are currently finding the vhost-user socket by looking into the pod's filesystem (similar to how the handler connects to the launcher). However, as far as I understand, starting from the next Multus version the CNI plugin won't have access to the host filesystem anymore, therefore they won't be able to find the socket anymore (@Benoit please correct any wrong details here).

Generally, QEMU supports the vhost-user protocol, where the backend device handling is delegated to a third-party component. This is true for networking, but also for other types of devices, see [1].

I was wondering whether we could introduce a new plugin mechanism where we expose the content of a directory of virt-launcher to an external plugin. See the attachment for a picture of the flow.

I see a couple of advantages to this approach:
- it is generic and could potentially be reused for other device types
- it hides the KubeVirt implementation details. Currently, you need to know where the KubeVirt sockets are located in the virt-launcher filesystem; if we ever change the directory path for the sockets, this would break the CNI plugin
- it can isolate the resources dedicated to that particular plugin
On Tue, Apr 16, 2024 at 9:04 AM Alice Frosi <afr...@redhat.com> wrote:
> I was wondering whether we could introduce a new plugin mechanism where we expose the content of a directory of virt-launcher to an external plugin.

I have to say, at this abstraction level it seems to make sense (at least to me).
What I don't quite understand is how this is any different from mounting a volume from the launcher pod to the node?...
Hi Miguel,

On Tue, Apr 16, 2024 at 12:43 PM Miguel Duarte de Mora Barroso <mdba...@redhat.com> wrote:
> What I don't quite understand is how this is any different from mounting a volume from the launcher pod to the node?

Virt-launcher cannot use a hostPath volume because this requires the pod to be privileged.
Additionally, this way the CNI plugin only needs access to a single directory, under which all the directories dedicated to the different VMs will appear.
On Tue, Apr 16, 2024 at 1:02 PM Alice Frosi <afr...@redhat.com> wrote:
> Virt-launcher cannot use a hostPath volume because this requires the pod to be privileged.

I see. But *conceptually* it's like a volume... though, if I understand correctly, it would be virt-handler pulling the strings instead. Makes sense!
> Additionally, this way the CNI plugin only needs access to a single directory, under which all the directories dedicated to the different VMs will appear.

I fail to see the implication. Please elaborate :)
The CNI plugin pod will detect the sockets as long as virt-handler makes them available in the shared directory.
> The CNI plugin pod will detect the sockets as long as virt-handler makes them available in the shared directory.

I get that - it all boils down to having a way for the launcher pod to make the sockets available in the node's filesystem, AFAICT.
An alternative would be to ask QEMU (or libvirt?) to consume an existing socket file created by something else - like a device plugin. Just throwing it out there.
Hi All,
Since the last community meeting (thank you all for the discussion!), I had a look at several solutions to share the sockets:
- A new hostPath volume defined in the vhostuser binding plugin configuration: it obviously works, but it requires a privileged pod for both virt-launcher and the dataplane that consumes the socket.
- A PVC volume defined in the vhostuser binding plugin configuration: it must be a per-node volume, for example using the Hostpath Provisioner, mounted by both the virt-launcher and the dataplane pods. The issue with PVCs is that they are namespaced, so both pods would need to be in the same namespace: that will not happen.
- I finally looked at device plugins: we can think of the dataplane as a switch with port resources, and a VM would request one or several ports on it. The device plugin can define a list of host path mounts to be injected into the pods: the directories that will host the sockets (see the sketch below). We'll need to push some annotations to the virt-launcher pod in order to define the complete host paths. The CNI reads the pod annotations, names the socket (and pushes it back into the pod annotations?) and configures the dataplane. Then the vhostuser network binding plugin needs access to the pod annotations (downward API?) to get the socket names and modify the domain XML.
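To make the device plugin idea more concrete, here is a minimal sketch of what its Allocate handler could return - the /var/run/vhost_sockets directory and the per-device layout are placeholders, not a settled design:

package dpsketch

import (
	"context"
	"path/filepath"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// socketsRoot is a placeholder host directory where per-port socket
// directories would live; the real path is up to the deployment.
const socketsRoot = "/var/run/vhost_sockets"

// Allocate injects, for each requested "port" device, a host path mount
// pointing at a per-port directory where the vhostuser socket will be
// created, so the virt-launcher pod, the CNI and the dataplane all see
// the same path.
func Allocate(ctx context.Context, req *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for _, containerReq := range req.ContainerRequests {
		containerResp := &pluginapi.ContainerAllocateResponse{}
		for _, id := range containerReq.DevicesIDs {
			dir := filepath.Join(socketsRoot, id)
			containerResp.Mounts = append(containerResp.Mounts, &pluginapi.Mount{
				ContainerPath: dir,
				HostPath:      dir,
				ReadOnly:      false,
			})
		}
		resp.ContainerResponses = append(resp.ContainerResponses, containerResp)
	}
	return resp, nil
}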
Alice, maybe this device plugin approach is not reusable for other needs? It's also more complex and requires introducing a new component...
On Wed, Apr 24, 2024 at 5:00 PM Benoit Gaussen <benoit....@orange.com> wrote:
> I finally looked at device plugins: we can think of the dataplane as a switch with port resources, and a VM would request one or several ports on it.

Per what I managed to learn so far from this thread, to me this seems the only possible option.
The DP can set up the socket at the node level and make it available to the pod through a mount.
It is used for other network resources and is not dependent on components like Multus.
It will probably also solve the need to have annotations on the pod communicated from the CNI (which is nasty). (AFAIK the DP can pass environment variables into the container, communicating whatever information is needed.)
I think you will need to pass a socket and not a folder, but that is an implementation detail that can be sorted out later.
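Complementing the mount sketch earlier in the thread, a minimal sketch of passing that information through environment variables in the Allocate response (variable names and paths are placeholders):

package dpsketch

import (
	"fmt"
	"path/filepath"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// allocateEnvs shows how per-device information (here, the expected
// vhostuser socket path) could be handed to the container as environment
// variables instead of pod annotations.
func allocateEnvs(deviceIDs []string) *pluginapi.ContainerAllocateResponse {
	resp := &pluginapi.ContainerAllocateResponse{Envs: map[string]string{}}
	for i, id := range deviceIDs {
		key := fmt.Sprintf("VHOSTUSER_SOCKET_%d", i)
		resp.Envs[key] = filepath.Join("/var/run/vhost_sockets", id, "vhostuser.sock")
	}
	return resp
}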
If you have a draft design proposal available, it may be easier to communicate ideas through it.
Would it be possible to just define a single resource with an unlimited number of instances, and then create a DPDK vhost-user socket on request at pod creation?
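Something along these lines, sketched against the device plugin API - the pool size is arbitrary, since the API needs discrete device IDs to approximate an "unlimited" resource:

package dpsketch

import (
	"fmt"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// portPoolSize is an arbitrary upper bound standing in for "unlimited".
const portPoolSize = 128

// advertisePorts builds the device list a ListAndWatch stream would send
// to the kubelet; each ID later maps to one vhost-user socket created on
// demand when a pod is allocated one of these virtual ports.
func advertisePorts() *pluginapi.ListAndWatchResponse {
	devices := make([]*pluginapi.Device, 0, portPoolSize)
	for i := 0; i < portPoolSize; i++ {
		devices = append(devices, &pluginapi.Device{
			ID:     fmt.Sprintf("port-%03d", i),
			Health: pluginapi.Healthy,
		})
	}
	return &pluginapi.ListAndWatchResponse{Devices: devices}
}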
Alice
Hi Fabian,

Yes, I'll update the design proposal with the possible solutions (volumes and the device plugin proposal) and go through a PR in kubevirt/community.
I hope this can happen quickly.
Hi All,
We made progress on the vhostuser network binding plugin and now have a working implementation, along with a device plugin that enables sharing the vhostuser sockets between the virt-launcher pods and the dataplane.
This is documented in PR #294.
However, we encountered an issue with Live Migration. To summarize, since the migrated domain from the source pod overrides the domain created at the destination pod, the socket path must be the same at source and destination.