Clarifying L2 bridge connectivity for VMs and static IP / DHCP


keithdo...@gmail.com

Oct 26, 2018, 5:50:16 PM
to kubevirt-dev

Been following along and using the kubevirt project and had asked some questions around networking back in June.  Much has changed since then but I still need to double check that my understanding is correct and make sure I understand the project's direction.

Currently, I have a setup that is using the network-attachment-definition as a top-level CRD, multus, and the network/interface stanzas to get L2 connectivity to a VM.  I'm using the bridge plugin and am attaching interfaces into linux bridges on the host.  I can do this and create multiple interfaces in the VM.  All of that works nicely most of the time and has been great progress.  I currently have two problems:  one is that if DHCP isn't working on the L2 segment the VM will never be started, and the second is that sometimes even though the VM gets an assigned IP, the default route is not installed in the VM.   I'd like to make sure my understanding and setup is correct first though before trying to debug those.

1) IPAM - I get that k8s wants to know about the IPs assigned to pods and requires IPAM for service reasons.  But when starting VMs and using multus on secondary interfaces that services are not currently supported on, is this really required?   Do I have to have IPAM plugins on the networks used by the VM interfaces?  Currently I am using the DHCP IPAM plugin, which means that I have to have a DHCP daemon process running.   This entity then attempts to get DHCP leases before the pod has fully started and proxies the DHCP requests.  This explains why, when DHCP isn't working, things aren't started.  I don't think this is necessarily the behavior we want for VMs - many times it's easier to get them up and debug DHCP there.  Plus, maybe we want static IP config driven by the VM.
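For reference, the setup described above amounts to a bridge CNI config whose IPAM section delegates to the `dhcp` plugin (bridge name here is illustrative); that plugin only works when its companion daemon is running on the node, which is the process referred to above:

```json
{
  "cniVersion": "0.3.1",
  "type": "bridge",
  "bridge": "br1",
  "ipam": {
    "type": "dhcp"
  }
}
```

With the standard CNI plugins the daemon is typically started as `/opt/cni/bin/dhcp daemon`; if it is down, or no DHCP server answers on the segment, the CNI ADD call fails and the pod (and therefore the VM) never starts - matching the first problem described above.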

2) Kubevirt's DHCP proxy - It seems that Kubevirt is also installing a DHCP proxy in between when using a bridge to connect the VM interface to the pod interface.  I get that this was required before, but is it really required now that the virt-launcher owns the pod IP and the VM is getting other addresses?  Why a DHCP proxy there?  DHCP proxies worry me... there are so many esoteric options embedded in DHCP, and proxying or relaying them sounds easy but the last 2% of the scenarios can be tricky.

3) Pass-through - I know that in the specs there were plans to allow a "passthrough" mode to give the VM the virt-launcher interface directly.  In this scheme, no dhcp proxy, no bridge overhead in virt-launcher.  Is there a technical reason this can't be done or other problems to overcome or has it just not been gotten to?  Open to a patch in this area?  I may tinker with this some.


It just seems to me that if I were to remove the bridge IPAM portion and DHCP proxy there, plus the virt-launcher bridge and DHCP man-in-the-middle, the VM would have pure and direct L2 functionality.  Then if DHCP didn't work or the VM wanted static IPs, things would be fine.  But I also think some of the CNI infra is waiting for IPAM to complete and I'm not sure if/why it needs to know about the IPs on secondary interfaces.  That may or may not even be a question for kubevirt, but I suspect there are some here that have opinions and can clarify.   I've looked at PRs, issues, specs sent out, and the roadmaps but can't definitively answer these questions.  Thanks for the time.

-K

Yuval Lifshitz

Oct 28, 2018, 1:39:26 PM
to kubevirt-dev


On Saturday, 27 October 2018 00:50:16 UTC+3, keithdo...@gmail.com wrote:

Been following along and using the kubevirt project and had asked some questions around networking back in June.  Much has changed since then but I still need to double check that my understanding is correct and make sure I understand the project's direction.

Currently, I have a setup that is using the network-attachment-definition as a top-level CRD, multus, and the network/interface stanzas to get L2 connectivity to a VM.  I'm using the bridge plugin and am attaching interfaces into linux bridges on the host.  I can do this and create multiple interfaces in the VM.  All of that works nicely most of the time and has been great progress.  I currently have two problems:  one is that if DHCP isn't working on the L2 segment the VM will never be started, and the second is that sometimes even though the VM gets an assigned IP, the default route is not installed in the VM.   I'd like to make sure my understanding and setup is correct first though before trying to debug those.
 
I assume that you use the standard bridge CNI [1] ?
In this case, you can either define an IPAM object in the network attachment CRD, and then get the IP from the CNI, or leave it out. 
If you leave it out, but configure an interface in your VM to get its IP from DHCP, there has to be a DHCP server connected to that network for the VM to start properly (this also depends on the guest OS).
Could you please send over the VMI yaml file?
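As a sketch, a network attachment with an IPAM object might look like the following (name and subnet are made up); dropping the `ipam` key is what would make the attachment L2-only, subject to the CNI plugin supporting that:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: br1-net
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "bridge",
    "bridge": "br1",
    "ipam": { "type": "host-local", "subnet": "192.168.100.0/24" }
  }'
```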


1) IPAM - I get that k8s wants to know about the IPs assigned to pods and requires IPAM for service reasons.  But when starting VMs and using multus on secondary interfaces that services are not currently supported on, is this really required?   Do I have to have IPAM plugins on the networks used by the VM interfaces?  Currently I am using the DHCP IPAM plugin, which means that I have to have a DHCP daemon process running.   This entity then attempts to get DHCP leases before the pod has fully started and proxies the DHCP requests.  This explains why, when DHCP isn't working, things aren't started.  I don't think this is necessarily the behavior we want for VMs - many times it's easier to get them up and debug DHCP there.  Plus, maybe we want static IP config driven by the VM.


IPAM is optional. See [1]
 
2) Kubevirt's DHCP proxy - It seems that Kubevirt is also installing a DHCP proxy in-between when using a bridge to connect the VM interface to the pod interface.  I get that before this was required, but is it really required now that the virt-launcher owns the pod IP and the VM is getting other addresses?  Why DHCP proxy there?  DHCP proxies worry me....there are so many esoteric options embedded in DHCP and proxying or relaying them sounds easy but the last 2% of the scenarios can be tricky.


Kubevirt's DHCP server is used only in the case that the pod's interface did get an IP (could be primary or secondary network). If the pod just gets an L2 network, Kubevirt's DHCP server is not used, and the VM gets an IP address from the L2 network it is connected to (if DHCP exists there). Note that we don't proxy any DHCP requests through Kubevirt's DHCP server, since it is only used when the pod got an IP address on its interface through the CNI.
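In VMI terms, the behavior described above hinges on whether the pod interface behind a bridge-bound vNIC has an IP. A sketch of such an interface (names and the referenced attachment are illustrative; the `kubevirt.io/v1alpha2` API version reflects the era of this thread):

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vm-l2
spec:
  domain:
    devices:
      interfaces:
      - name: l2net
        bridge: {}          # bridge binding inside virt-launcher
  networks:
  - name: l2net
    multus:
      networkName: br1-net  # references a NetworkAttachmentDefinition
```

If the CNI assigned an IP to the pod-side interface, KubeVirt's DHCP server advertises that IP to the guest; if the attachment is L2-only, the guest talks DHCP straight to the external segment.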
 
3) Pass-through - I know that in the specs there were plans to allow a "passthrough" mode to give the VM the virt-launcher interface directly.  In this scheme, no dhcp proxy, no bridge overhead in virt-launcher.  Is there a technical reason this can't be done or other problems to overcome or has it just not been gotten to?  Open to a patch in this area?  I may tinker with this some.



Currently there is a PR for supporting SR-IOV [2] in kubevirt. I guess a passthrough of the entire interface could also be added; just note that in such a case, it will be completely taken away from the host. Not sure what the priority for this is.
 
It just seems to me if I were to remove the bridge IPAM portion and DHCP proxy there, the virt-launcher bridge and DHCP man-in-the-middle, the VM would have pure and direct L2 functionality. 

As mentioned before, IPAM is not mandatory for the bridge CNI, and in such a case Kubevirt's DHCP server will not be used.
Please consider looking into these blog posts on Kubevirt networking: [3], [4]
 
Then if DHCP didn't work or the VM wanted static IPs, things would be fine.  But I also think some of the CNI infra is waiting for IPAM to complete and I'm not sure if/why it needs to know about the IPs on secondary interfaces.  That may or may not even be a question for kubevirt, but I suspect there are some here that have opinions and can clarify.   I've looked at PRs, issues, specs sent out, and the roadmaps but can't definitively answer these questions.  Thanks for the time.

-K



Yuval Lifshitz

Oct 28, 2018, 4:57:32 PM
to kubevirt-dev
I apologize... IPAM *is* mandatory in the bridge CNI, at least until this PR [5] is merged.

In the meantime (or in general :-), feel free to use the ovs-bridge CNI [6], where IPAM is optional.
(Note that you will need Open vSwitch on the host)
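A minimal ovs-bridge attachment without any IPAM section might look like this sketch (bridge name illustrative; the `ovs` CNI plugin and an existing OVS bridge must be present on each node):

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ovs-l2
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "ovs",
    "bridge": "br1"
  }'
```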

Roman Mohr

Oct 29, 2018, 4:04:00 AM
to keithdo...@gmail.com, kubevirt-dev
On Fri, Oct 26, 2018 at 11:50 PM <keithdo...@gmail.com> wrote:
>
>
> Been following along and using the kubevirt project and had asked some questions around networking back in June. Much has changed since then but I still need to double check that my understanding is correct and make sure I understand the project's direction.
>
> Currently, I have a setup that is using the network-attachment-definition as a top-level CRD, multus, and the network/interface stanzas to get L2 connectivity to a VM.

Apart from Yuval's precise explanations, I would be very interested in
hearing what exactly you are doing so that you can't use the pod
network. Maybe there are other solutions as well to simplify your
setup.

Best Regards,
Roman

> I'm using the bridge plugin and am attaching interfaces into linux bridges on the host. I can do this and create multiple interfaces in the VM. All of that works nicely most of the time and has been great progress. I currently have two problems: one is that if DHCP isn't working on the L2 segment the VM will never be started, and the second is that sometimes even though the VM gets an assigned IP, the default route is not installed in the VM. I'd like to make sure my understanding and setup is correct first though before trying to debug those.
>
> 1) IPAM - I get that k8s wants to know about the IPs assigned to pods and requires IPAM for service reasons. But when starting VMs and using multus on secondary interfaces that services are not currently supported on, is this really required? Do I have to have IPAM plugins on the networks used by the VM interfaces? Currently I am using the DHCP IPAM plugin, which means that I have to have a DHCP daemon process running. This entity then attempts to get DHCP leases before the pod has fully started and proxies the DHCP requests. This explains why, when DHCP isn't working, things aren't started. I don't think this is necessarily the behavior we want for VMs - many times it's easier to get them up and debug DHCP there. Plus, maybe we want static IP config driven by the VM.
>
> 2) Kubevirt's DHCP proxy - It seems that Kubevirt is also installing a DHCP proxy in between when using a bridge to connect the VM interface to the pod interface. I get that this was required before, but is it really required now that the virt-launcher owns the pod IP and the VM is getting other addresses? Why a DHCP proxy there? DHCP proxies worry me... there are so many esoteric options embedded in DHCP, and proxying or relaying them sounds easy but the last 2% of the scenarios can be tricky.
>
> 3) Pass-through - I know that in the specs there were plans to allow a "passthrough" mode to give the VM the virt-launcher interface directly. In this scheme, no dhcp proxy, no bridge overhead in virt-launcher. Is there a technical reason this can't be done or other problems to overcome or has it just not been gotten to? Open to a patch in this area? I may tinker with this some.
>
>
> It just seems to me that if I were to remove the bridge IPAM portion and DHCP proxy there, plus the virt-launcher bridge and DHCP man-in-the-middle, the VM would have pure and direct L2 functionality. Then if DHCP didn't work or the VM wanted static IPs, things would be fine. But I also think some of the CNI infra is waiting for IPAM to complete and I'm not sure if/why it needs to know about the IPs on secondary interfaces. That may or may not even be a question for kubevirt, but I suspect there are some here that have opinions and can clarify. I've looked at PRs, issues, specs sent out, and the roadmaps but can't definitively answer these questions. Thanks for the time.
>
> -K
>
> --
> You received this message because you are subscribed to the Google Groups "kubevirt-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kubevirt-dev...@googlegroups.com.
> To post to this group, send email to kubevi...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kubevirt-dev/a13a6558-519c-44b3-87bb-3ed449a32f9b%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Keith Holleman

Oct 29, 2018, 7:14:30 PM
to yuv...@gmail.com, kubevi...@googlegroups.com

Thanks for taking the time to respond.  Some comments / responses in line.

On Sun, Oct 28, 2018 at 10:39 AM Yuval Lifshitz <yuv...@gmail.com> wrote:


On Saturday, 27 October 2018 00:50:16 UTC+3, keithdo...@gmail.com wrote:

Been following along and using the kubevirt project and had asked some questions around networking back in June.  Much has changed since then but I still need to double check that my understanding is correct and make sure I understand the project's direction.

Currently, I have a setup that is using the network-attachment-definition as a top-level CRD, multus, and the network/interface stanzas to get L2 connectivity to a VM.  I'm using the bridge plugin and am attaching interfaces into linux bridges on the host.  I can do this and create multiple interfaces in the VM.  All of that works nicely most of the time and has been great progress.  I currently have two problems:  one is that if DHCP isn't working on the L2 segment the VM will never be started, and the second is that sometimes even though the VM gets an assigned IP, the default route is not installed in the VM.   I'd like to make sure my understanding and setup is correct first though before trying to debug those.
 
I assume that you use the standard bridge CNI [1] ?
In this case, you can either define an IPAM object in the network attachment CRD, and then get the IP from the CNI, or leave it out. 
If you leave it out, but configure an interface on your VM to get their IP from DHCP, there has to be a DHCP server connected to that network for the VM to start properly (this is also dependent with guest OS).
Could you please send over the VMI yaml file?

Yes, I was using the standard CNI bridge you referenced and the DHCP IPAM plugin from the same repo.  Attached the VMI and network CRD, not sure how helpful it is.  I guess some OSs may be dependent on getting an IP but many are not.  If a DHCP server is not configured or not operational, especially if only on one interface, I would argue that some or many would want the VM to still come up.  

 
1) IPAM - I get that k8s wants to know about the IPs assigned to pods and requires IPAM for service reasons.  But when starting VMs and using multus on secondary interfaces that services are not currently supported on, is this really required?   Do I have to have IPAM plugins on the networks used by the VM interfaces?  Currently I am using the DHCP IPAM plugin, which means that I have to have a DHCP daemon process running.   This entity then attempts to get DHCP leases before the pod has fully started and proxies the DHCP requests.  This explains why, when DHCP isn't working, things aren't started.  I don't think this is necessarily the behavior we want for VMs - many times it's easier to get them up and debug DHCP there.  Plus, maybe we want static IP config driven by the VM.


IPAM is optional. See [1]
 
2) Kubevirt's DHCP proxy - It seems that Kubevirt is also installing a DHCP proxy in-between when using a bridge to connect the VM interface to the pod interface.  I get that before this was required, but is it really required now that the virt-launcher owns the pod IP and the VM is getting other addresses?  Why DHCP proxy there?  DHCP proxies worry me....there are so many esoteric options embedded in DHCP and proxying or relaying them sounds easy but the last 2% of the scenarios can be tricky.


Kubevirt's DHCP server is used only in the case that the pod's interface did get an IP (could be primary or secondary network). If the pod just gets an L2 network, Kubevirt's DHCP server is not used, and the VM gets an IP address from the L2 network it is connected to (if DHCP exists there). Note that we don't proxy any DHCP requests through Kubevirt's DHCP server, since it is only used when the pod got an IP address on its interface through the CNI.

I am definitely using a very overloaded term in my use of "proxy" here.  My point is that the kubevirt DHCP server is attempting to recreate a server config and DHCP reply based on looking at running state of an interface.  It's not a true relay of the original DHCP interaction and could miss many DHCP options.  

I thought the kubevirt DHCP server was used anytime the "bridge" stanza was included in the VMI file.   I didn't realize that this logic was keyed on whether an IP address was present or not.  I see that code now.  I had been unable to end up with an interface w/o an IP because I was forced to use a CNI IPAM.  I will try either the bridge patch or the OVS CNI or both - I was also unaware of these.  Thanks much for the pointers.
 
 
3) Pass-through - I know that in the specs there were plans to allow a "passthrough" mode to give the VM the virt-launcher interface directly.  In this scheme, no dhcp proxy, no bridge overhead in virt-launcher.  Is there a technical reason this can't be done or other problems to overcome or has it just not been gotten to?  Open to a patch in this area?  I may tinker with this some. 



Currently there is a PR for supporting SR-IOV [2] in kubevirt. I guess a passthrough of the entire interface could also be added; just note that in such a case, it will be completely taken away from the host. Not sure what the priority for this is.


That's not the pass-through that I was referring to - I'm not immediately interested in passing through an L2 host interface or SR-IOV, but will likely be in due time.  There is a notion of passthrough in the "Connections" section of the VMI interface device spec, and as I understood it, this only defines how the interface is connected from virt-launcher to the VM.  The network's CNI defines how the interface is attached to the physical interface - in the bridge case (and I assume the OVS-CNI case) a veth is created and given to the virt-launcher.  If you're not using IPAM, you don't need the DHCP server to move the IP and pass it to the VM, and then you don't really need the bridge created in virt-launcher at all.  The veth could be passed along directly.

 
It just seems to me if I were to remove the bridge IPAM portion and DHCP proxy there, the virt-launcher bridge and DHCP man-in-the-middle, the VM would have pure and direct L2 functionality. 

As mentioned before, IPAM is not mandatory for the bridge CNI, and in such a case Kubevirt's DHCP server will not be used.

Got it - will try this.  I have a path to pure L2 connectivity to the physical network w/o any DHCP interference.  Will be curious to see if my default route problem goes away with this, I suspect it will.
 
Please consider looking into these blog posts on Kubevirt networking: [3], [4]

Thanks, I hadn't seen those before but was aware of most of the content.  Is there a reference in the source tree for that?  Might be worth adding some references here:


Thanks again for the time and reply.

-K

vm-network-bridge.yaml
beiwe-l2.yaml

Keith Holleman

Oct 29, 2018, 7:29:02 PM
to rm...@redhat.com, kubevi...@googlegroups.com
Roman,

Not sure exactly what you mean by "you can't use the pod network."   Maybe you can clarify some?  

I am just trying to create a VM and give it direct L2 connectivity to a network reachable via the host (worker-node) and don't want anyone mucking with DHCP.  Some VMs may exist only in the k8s/pod-networking world and would need only a single IP address based on the pod network.  That VM is fine to define inbound services for and have its outbound connections NAT'd.  Some VMs might have two interfaces and serve as some sort of VNF between the pod network and a physical network.  But it could also be that someone wants to spin up a VM and connect it to a network the way any conventional hypervisor does today with a vswitch, and it has no connectivity into the k8s world at all.   Then the VM has full L2/L3 inbound and outbound connectivity on one or more interfaces (or networks).  K8s along with kubevirt is providing scheduling, storage, network (single or multiple interfaces) and HA across a multi-node compute cluster.

I'm trying to replicate the last VM model at this point - am I using this in an unintended way?  Or if there is an easier way I'm open to other solutions as well.

-K

Fabian Deutsch

Oct 30, 2018, 7:36:11 AM
to keithdo...@gmail.com, Mohr, Roman, kubevi...@googlegroups.com, Yuval Lifshitz, Ihar Hrachyshka
Hey Keith!

let me add some context to this networking topic.

KubeVirt's mission is to support traditional VM workloads as well as integrating with Kubernetes.

Due to this KubeVirt had to meet two goals on the networking side: a) enable classical virt networking (aka L2 connectivity, plus more optimized options like SR-IOV and device passthrough) and b) provide a way to integrate with Kubernetes in order to let users use KubeVirt VMs with Services, Ingress, etc. Please note that Kubernetes networking does do IPAM - every Pod is expected to have at least one IP. This is why there is no L2-only (thus interface _without_ an IP) connection method for VMs and Pods in Kubernetes. This is a core assumption in Kubernetes.

Due to this the following approach is taken:
By default, a KubeVirt VM will get a single vNIC, with an IP address provided by Kubernetes. This is pretty much the same behavior as a regular Kubernetes pod, which also gets a single NIC with an IP from Kubernetes.
Thus: The primary vNIC of a VM is rather intended to integrate nicely with Kubernetes, this is the reason why we do some "mucking" in order to have feature parity with Kubernetes pods.

If you don't want this hand-holding approach, then you can take an L2 interface provided by multus - which is "bypassing" the Kubernetes network model, and gives you much more freedom.
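A VMI spec along these lines would combine the two models (sketch only; interface/network names and the referenced attachment are illustrative): the primary vNIC keeps the Kubernetes-managed IP for Service/Ingress integration, while the multus-backed vNIC bypasses the Kubernetes model entirely:

```yaml
spec:
  domain:
    devices:
      interfaces:
      - name: default   # primary vNIC: pod network, Kubernetes-assigned IP
        bridge: {}
      - name: l2net     # secondary vNIC: multus-backed, plain L2
        bridge: {}
  networks:
  - name: default
    pod: {}
  - name: l2net
    multus:
      networkName: br1-net
```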

After these generic statements, let me reply inline :)

On Tue, Oct 30, 2018 at 12:29 AM Keith Holleman <keithdo...@gmail.com> wrote:
Roman,

Not sure exactly what you mean by "you can't use the pod network."   Maybe you can clarify some?  

I am just trying to create a VM and give it direct L2 connectivity to a network reachable via the host (worker-node) and don't want anyone mucking with DHCP.  Some VMs, may exist only in k8s/pod-networking world and would need only a single IP address based on the pod network.

These VMs will have to live with the constraints set up by Kubernetes (i.e. every vNIC attached to the pod network will have an IP).
 
That VM is fine to define inbound services for and have its outbound connections NAT'd.  Some VMs might have two interfaces and serve as some sort of VNF between the pod network and a physical network. 

Yep.
 
But it could also be that someone wants to spin up a VM and connect it to a network the way any conventional hypervisor does today with a vswitch and it has no connectivity into the k8s world at all. 

Correct - We have a way to completely disconnect a VM from the pod network.
 
 Then the VM has full L2/L3 inbound and outbound connectivity on one or more interfaces (or networks).  K8s along with kubevirt is providing scheduling, storage, network (single or multiple interfaces) and HA across a multi-node compute cluster.

This is true except for "network".

Kubernetes provides scheduling, compute, and storage.
But the networking model of Kubernetes would not be used. The Kubernetes networking model is today limited to "the pod network", which does guarantee integration with Services and Ingress, but does **not** allow plain L2 networking.

multus - even if it is designed for Kubernetes - is providing an additional way to do networking in Kubernetes - but I wouldn't phrase it as "Kubernetes networking".
The issue is that interfaces provided by multus are _not necessarily_ able to tie in with the rest of the Kubernetes networking concepts like Service and Ingress.
I don't think it is an issue for your use-case, I just wanted to clarify it :)


I'm trying to replicate the last VM model at this point - am I using this in an unintended way?  Or if there is an easier way I'm open to other solutions as well.

With multus - and Yuval, Sebastian, Ihar, Petr, please keep me honest - you should be able to attach one or more plain interfaces (in addition to the primary interface) from the host to the VM in a pretty optimal way.

Let me reply inline on the previous Qs.
 

-K

On Mon, Oct 29, 2018 at 1:04 AM Roman Mohr <rm...@redhat.com> wrote:
On Fri, Oct 26, 2018 at 11:50 PM <keithdo...@gmail.com> wrote:
>
>
> Been following along and using the kubevirt project and had asked some questions around networking back in June.  Much has changed since then but I still need to double check that my understanding is correct and make sure I understand the project's direction.
>
> Currently, I have a setup that is using the network-attachment-definition as a top-level CRD, multus, and the network/interface stanzas to get L2 connectivity to a VM.

Apart from Yuval's precise explanations, I would be very interested in
hearing what exactly you are doing so that you can't use the pod
network. Maybe there are other solutions as well to simplify your
setup.

Best Regards,
Roman

>  I'm using the bridge plugin and am attaching interfaces into linux bridges on the host.  I can do this and create multiple interfaces in the VM.  All of that works nicely most of the time and has been great progress.  I currently have two problems:  one is that if DHCP isn't working on the L2 segment the VM will never be started, and the second is that sometimes even though the VM gets an assigned IP, the default route is not installed in the VM.   I'd like to make sure my understanding and setup is correct first though before trying to debug those.


On which interface did you see this?
On the primary interface or on an additional multus backed interface?
 
> 1) IPAM - I get that k8s wants to know about the IPs assigned to pods and requires IPAM for service reasons.  But when starting VMs and using multus on secondary interfaces that services are not currently supported on, is this really required?   Do I have to have IPAM plugins on the networks used by the VM interfaces?  Currently I am using the DHCP IPAM plugin, which means that I have to have a DHCP daemon process running.   This entity then attempts to get DHCP leases before the pod has fully started and proxies the DHCP requests.  This explains why, when DHCP isn't working, things aren't started.  I don't think this is necessarily the behavior we want for VMs - many times it's easier to get them up and debug DHCP there.  Plus, maybe we want static IP config driven by the VM.
>
> 2) Kubevirt's DHCP proxy - It seems that Kubevirt is also installing a DHCP proxy in-between when using a bridge to connect the VM interface to the pod interface.  I get that before this was required, but is it really required now that the virt-launcher owns the pod IP and the VM is getting other addresses?  Why DHCP proxy there?  DHCP proxies worry me....there are so many esoteric options embedded in DHCP and proxying or relaying them sounds easy but the last 2% of the scenarios can be tricky.

This is only on the primary interface, in order to provide feature parity with Kubernetes pods.
This DHCP server is not present on the additional multus backed vNICs.
 
>
> 3) Pass-through - I know that in the specs there were plans to allow a "passthrough" mode to give the VM the virt-launcher interface directly.  In this scheme, no dhcp proxy, no bridge overhead in virt-launcher.  Is there a technical reason this can't be done or other problems to overcome or has it just not been gotten to?  Open to a patch in this area?  I may tinker with this some.

Ihar (CC'ed) is working on one aspect of this (SR-IOV).
SR-IOV is one way to do this. Would you need direct passthrough of one physical device?

Last thoughts are: it looks a bit like there was some confusion between how the primary vNIC and the additional multus backed vNICs behave.

- fabian
 
>
>
> It just seems to me that if I were to remove the bridge IPAM portion and DHCP proxy there, plus the virt-launcher bridge and DHCP man-in-the-middle, the VM would have pure and direct L2 functionality.  Then if DHCP didn't work or the VM wanted static IPs, things would be fine.  But I also think some of the CNI infra is waiting for IPAM to complete and I'm not sure if/why it needs to know about the IPs on secondary interfaces.  That may or may not even be a question for kubevirt, but I suspect there are some here that have opinions and can clarify.   I've looked at PRs, issues, specs sent out, and the roadmaps but can't definitively answer these questions.  Thanks for the time.
>
> -K
>


Keith Holleman

Oct 30, 2018, 1:30:17 PM
to Fabian Deutsch, rm...@redhat.com, kubevi...@googlegroups.com, ylif...@redhat.com, ihra...@redhat.com

Thanks for taking the time to write an extensive reply.  Agree with and follow everything you are saying except a few things.  Answers and comments in a trimmed version below.

 
This is true except for "network".

Kubernetes provides scheduling, compute, and storage.
But the networking model of Kubernetes would not be used. Kubernetes networking model is today limited to "the pod network". Which does guarantee integration with Services and Ingress, but does **not** allow plain L2 networking.

multus - even if it is designed for Kubernetes - is providing an additional way to do networking in Kubernetes - but I wouldn't phrase it as "Kubernetes networking".
The issue is that interfaces provided by multus are _not necessarily_ able to tie in with the rest of the Kubernetes networking concepts like Service and Ingress.
I don't think it is an issue for your use-case, I just wanted to clarify it :)

Okay, it's a grey area for sure.  You could argue whether kubernetes is supplying the network or not.  I get that multus and CNI are outside.   The fact that you have to refer to CRDs defined in kubernetes is an argument the other way.  But I think we understand what the other is saying here.
 
 I'm using the bridge plugin and am attaching interfaces into linux bridges on the host.  I can do this and create multiple interfaces in the VM.  All of that works nicely most of the time and has been great progress.  I currently have two problems:  one is that if DHCP isn't working on the L2 segment the VM will never be started, and the second is that sometimes even though the VM gets an assigned IP, the default route is not installed in the VM.   I'd like to make sure my understanding and setup is correct first though before trying to debug those.


On which interface did you see this?
On the primary interface or on an additional multus backed interface?

I saw this on the additional multus backed interface.
 
 
> 1) IPAM - I get that k8s wants to know about the IPs assigned to pods and requires IPAM for service reasons.  But when starting VMs and using multus on secondary interfaces that services are not currently supported on, is this really required?  Do I have to have IPAM plugins on the networks used by the VM interfaces?  Currently I am using the DHCP IPAM plugin, which means that I have to have a DHCP daemon process running.  This entity then attempts to get DHCP leases before the pod fully starts and proxies the DHCP requests.  This explains why, when DHCP isn't working, things aren't started.  I don't think this is necessarily the behavior we want for VMs - many times it's easier to get them up and debug DHCP there.  Plus, maybe we want static IP config driven by the VM.
>
> 2) Kubevirt's DHCP proxy - It seems that Kubevirt is also installing a DHCP proxy in-between when using a bridge to connect the VM interface to the pod interface.  I get that this was required before, but is it really required now that the virt-launcher owns the pod IP and the VM is getting other addresses?  Why DHCP proxy there?  DHCP proxies worry me... there are so many esoteric options embedded in DHCP, and proxying or relaying them sounds easy, but the last 2% of the scenarios can be tricky.

This is only on the primary interface in order to provide feature parity with Kubernetes pods.
This DHCP server is not present on the additional multus backed vNICs.

I don't think that's true.  This is what I see when I spin something up using the yaml files for my VMI provided before.  My pod network is 10.244.0.0/16.  Here I can see that eth0 has the pod network address and no DHCP server, but the additional multus interface does have the DHCP server.

[root@beiwe-local /]# ss -tuapn
Netid           State             Recv-Q            Send-Q                             Local Address:Port                         Peer Address:Port            
udp             UNCONN            17920             0                               0.0.0.0%k6t-net1:67                                0.0.0.0:*                users:(("virt-launcher",pid=18,fd=16))

[root@beiwe-local /]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether 0a:58:0a:f4:01:0b brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.11/24 scope global eth0
       valid_lft forever preferred_lft forever
5: net1@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master k6t-net1 state UP group default 
    link/ether 0a:58:0a:1a:14:9e brd ff:ff:ff:ff:ff:ff link-netnsid 0
6: k6t-net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 0a:58:0a:1a:14:9e brd ff:ff:ff:ff:ff:ff
    inet 169.254.75.10/32 brd 169.254.75.10 scope global k6t-net1
       valid_lft forever preferred_lft forever
7: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master k6t-net1 state UNKNOWN group default qlen 1000
    link/ether fe:58:0a:c1:fc:f7 brd ff:ff:ff:ff:ff:ff


I think as Yuval pointed out before, this DHCP behavior is determined by whether an IP is present or not.   So if you can get a CNI plugin to not use IPAM (he provided two options), this should be fine.  I'm trying to patch the bridge using the PR he pointed out to avoid IPAM, but having some issues that I need to work through more.
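For anyone following along, the change I'm attempting amounts to dropping the IPAM section from the bridge plugin config, roughly like this (assuming a bridge plugin patched per the PR Yuval pointed out, which accepts an empty ipam block; the bridge name is illustrative):

```json
{
  "cniVersion": "0.3.1",
  "type": "bridge",
  "bridge": "br1",
  "ipam": {}
}
```

With no IPAM, no IP gets assigned to the pod-side interface, which per Yuval's explanation should also keep virt-launcher from spinning up its DHCP server on that interface.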

>
> 3) Pass-through - I know that in the specs there were plans to allow a "passthrough" mode to give the VM the virt-launcher interface directly.  In this scheme, no dhcp proxy, no bridge overhead in virt-launcher.  Is there a technical reason this can't be done or other problems to overcome or has it just not been gotten to?  Open to a patch in this area?  I may tinker with this some.

Ihar (CC'ed) is working on one aspect of this (SR-IOV).
SR-IOV is one way to do this. Would you need direct passthrough of one physical device?

Did you see my follow-up that the pass-through I was referring to was in the virt-launcher, not from the host?  It's simply a way to avoid the additional bridge in virt-launcher for additional multus interfaces when using a CNI plugin like bridge or the ovs one.

-K

Keith Holleman

Oct 30, 2018, 7:47:47 PM
to Fabian Deutsch, rm...@redhat.com, kubevi...@googlegroups.com, ylif...@redhat.com, ihra...@redhat.com

> I think as Yuval pointed out before, this DHCP behavior is determined by whether an IP is present or not.   So if you can get a CNI plugin to not use IPAM (he provided two options), this should be fine.  I'm trying to patch the bridge using the PR he pointed out to avoid IPAM, but having some issues that I need to work through more.


Just to follow up, I did get the bridge plugin to avoid DHCP, and then as expected the DHCP ipam plugin wasn't needed on the host and the virt-launcher DHCP server was nowhere to be found.  This was the goal from the start - pure L2/DHCP to the VM - so thanks all for the help.  So far I haven't seen issues with the default route missing; will keep watching.
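In case it helps anyone reproduce this, the VMI wiring ends up looking roughly like the following (names are illustrative and the API group/version may differ depending on your KubeVirt release; the multus network references a NetworkAttachmentDefinition whose bridge config has no IPAM):

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vmi-l2                  # illustrative name
spec:
  domain:
    devices:
      interfaces:
      - name: default           # primary pod-network interface
        bridge: {}
      - name: l2net             # secondary L2 interface, no IPAM
        bridge: {}
  networks:
  - name: default
    pod: {}
  - name: l2net
    multus:
      networkName: br1-net      # illustrative NetworkAttachmentDefinition name
```

With this shape, the VM handles its own DHCP (or static config) directly on the L2 segment, which was the point of the exercise.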

The question about pass-through is still open though; it would be nice to avoid the bridge in the virt-launcher.  I may work on getting this pass-through to work at some point and submit a PR, but it won't be right now.  If anyone knows of technical reasons or challenges why this hasn't been done yet, please enlighten me.

-K

 