I'm happy to give a quick overview and demo of the BPF-based
kube-proxy alternative we have been working on. It might be an
alternative to IPVS. The base numbers are equal to or better than IPVS, and much better with DSR (direct server return) enabled. It works with IPv4 and IPv6.
| Metric                 | Number of Services | LVS (IPVS) | iptables       |
|------------------------|--------------------|------------|----------------|
| Time to access service | 1000               | 10 ms      | 7-18 ms        |
|                        | 5000               | 9 ms       | 15-80 ms       |
|                        | 10000              | 9 ms       | 80-7000 ms     |
|                        | 15000              | 9 ms       | Unresponsive   |
|                        | 50000              | 9 ms       | Unresponsive   |
| Memory usage           | 1000               | 386 MB     | 1.1 GB         |
|                        | 5000               | N/A        | 1.9 GB         |
|                        | 10000              | 542 MB     | 2.3 GB         |
|                        | 15000              | N/A        | Out of memory  |
|                        | 50000              | 1272 MB    | Out of memory  |
| CPU usage              | 1000               | 0%         | N/A            |
|                        | 5000               |            | 50%-85%        |
|                        | 10000              |            | 50%-100%       |
|                        | 15000              |            | N/A            |
|                        | 50000              |            | N/A            |
An issue recently came to light that IPVS doesn't support port ranges,
which many people have asked for. Keep it in mind as prototypes
emerge.
On Mon, Dec 12, 2016 at 10:22 AM, <tomasz.p...@intel.com> wrote:
> Folks,
>
> There's a discussion related to a new IPVS backend in kube-proxy. Our small
> research at Intel shows that it may really improve CPU utilization
> (especially in DR mode) when compared with the current iptables mode. I've
> noticed that work has already started [1], but it looks like the code is
> abandoned. Can we have this item on the agenda for the next sig-networking
> meeting so we can discuss a few different feature sets for an ipvs backend?
>
> Thanks in advance.
>
Folks,
The reason my PR https://github.com/kubernetes/kubernetes/pull/30134 is in WIP status is that it uses seesaw's ipvs package to sync the ipvs configuration, which has libnl compile-time and runtime dependencies, so we would need to update some build/deploy scripts.
I have recently been trying a pure Go netlink approach (github.com/vishvananda/netlink/nl) to talk to the ipvs kernel module. Unfortunately, I am not a netlink expert; if anyone is familiar with netlink (for example, knows how to construct an "ipvsadm --restore" netlink request), it would definitely help speed up the development process.
On a side note, libnetwork has a netlink-based ipvs package we could leverage; see https://github.com/docker/libnetwork/tree/master/ipvs . Unfortunately, it only has Create/Update/Delete methods for ipvs Services and Destinations and is missing the Get methods. The netlink request for a Get is not hard to construct, but parsing the netlink response is the challenging part.
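For anyone who wants to experiment with that package in the meantime, here is a rough sketch of programming one virtual service and one destination through it. This is only an illustration: the struct fields and constants are written from memory of the docker/libnetwork ipvs package and may differ between versions, and it needs root plus the ip_vs kernel module loaded.

```go
// Sketch only: field names follow the docker/libnetwork ipvs package as of
// late 2016 and may differ in other versions. Requires root and ip_vs loaded.
package main

import (
	"log"
	"net"
	"syscall"

	"github.com/docker/libnetwork/ipvs"
)

func main() {
	// An empty path means "operate in the current network namespace".
	handle, err := ipvs.New("")
	if err != nil {
		log.Fatalf("open ipvs handle: %v", err)
	}

	// Roughly the programmatic equivalent of: ipvsadm -A -t 10.0.0.1:80 -s rr
	svc := &ipvs.Service{
		AddressFamily: syscall.AF_INET,
		Protocol:      syscall.IPPROTO_TCP,
		Address:       net.ParseIP("10.0.0.1"),
		Port:          80,
		SchedName:     "rr", // round robin
	}
	if err := handle.NewService(svc); err != nil {
		log.Fatalf("create virtual service: %v", err)
	}

	// Roughly: ipvsadm -a -t 10.0.0.1:80 -r 10.244.1.5:8080
	dst := &ipvs.Destination{
		Address: net.ParseIP("10.244.1.5"),
		Port:    8080,
		Weight:  1,
	}
	if err := handle.NewDestination(svc, dst); err != nil {
		log.Fatalf("add destination: %v", err)
	}
}
```

A kube-proxy backend would do essentially this per Service endpoint, which is why the missing Get/list side matters: without it you cannot read back the kernel state to sync against.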
IPVS as another backend for kube-proxy would be pretty interesting.
However, it's not at all clear to me how the performance gains
described are a direct result of IPVS (vs. other network
optimizations).
My (incomplete) understanding of DPDK and other network optimizations
(VPP, etc.) is that they are most effective when used within a
dedicated networking device. How these optimizations benefit the more
general use case where hosts mix workloads is not obvious to me.
The performance gain comes from avoiding sequential rule lists. In the
iptables context, some of them can be avoided with ipset, but in particular
for DNAT many of them remain, which limits scale. IPVS, nftables, BPF, ...
work around this by providing more suitable data structures, such as hash
tables, for the lookup.
This is *not* about the fixed cost of the datapath itself and is thus
unrelated to other network optimizations.
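To make the data-structure point concrete, here is a tiny Go illustration (toy code, not kube-proxy or IPVS internals): matching a packet's destination against a per-service rule list, the way iptables DNAT chains work, costs time proportional to the number of services, while a hashed service table, as used by IPVS/nftables/BPF, costs roughly the same no matter how many services exist.

```go
// Toy illustration of the lookup-cost difference; the names and types here
// are invented for the example and are not from kube-proxy or the kernel.
package main

import "fmt"

type serviceKey struct {
	vip  string
	port uint16
}

type rule struct {
	key     serviceKey
	backend string
}

// iptables-style: every packet walks a sequential rule list until a match.
func lookupSequential(rules []rule, k serviceKey) (string, bool) {
	for _, r := range rules { // cost grows linearly with the number of services
		if r.key == k {
			return r.backend, true
		}
	}
	return "", false
}

// IPVS/BPF-style: a single hash lookup, independent of the number of services.
func lookupHashed(table map[serviceKey]string, k serviceKey) (string, bool) {
	backend, ok := table[k]
	return backend, ok
}

func main() {
	rules := []rule{{serviceKey{"10.0.0.1", 80}, "10.244.1.5:8080"}}
	table := map[serviceKey]string{{vip: "10.0.0.1", port: 80}: "10.244.1.5:8080"}

	fmt.Println(lookupSequential(rules, serviceKey{"10.0.0.1", 80}))
	fmt.Println(lookupHashed(table, serviceKey{"10.0.0.1", 80}))
}
```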
On Tuesday, December 13, 2016 at 4:23:21 PM UTC+1, Chris Marino wrote:
> IPVS as another backend for kube-proxy would be pretty interesting.
> However, it's not at all clear to me how the performance gains
> described are a direct result of IPVS (vs. other network
> optimizations).

The easiest way to get Kubernetes onto DPDK is OVS, but then we would need yet another backend for doing nat/snat the OVS way :P Performance gains would be noticeable on network-intensive workloads, but DPDK can also ensure very stable jitter, since packet switching is done in user space, and with the help of Intel RDT and cpusets you can guarantee dedicated CPU resources.
> My (incomplete) understanding of DPDK and other network optimizations
> (VPP, etc.) is that they are most effective when used within a
> dedicated networking device. How these optimizations benefit the more
> general use case where hosts mix workloads is not obvious to me.

It depends on how network-intensive your workload is :P For regular use cases (excluding load balancers and highly loaded memcache), replacing iptables with ipvsadm should be enough.

I will try to come up with a few slides showing potential use cases for DPDK and where it really benefits.

TP
I think we did find a 20%-ish improvement in our kernels when we turned
off kernel audit. But we never got anywhere near the Ubuntu kernel
numbers. So we still don't know what the difference is, and haven't really
done any digging....