Flannel Performance on AWS


Michael Hamrah

May 10, 2015, 9:32:49 AM5/10/15
to coreo...@googlegroups.com
I've been running some performance tests with Flannel on AWS and noticed an interesting result when switching from m3.large to c4.xlarge with enhanced networking. I'm on CoreOS stable running Flannel 0.3.0.

I have two m3.large nodes running in the same AZ, in a VPC, in us-west-2. When running a container with net=host using iperf I see 713 Mbits/sec with a ping latency of round-trip min/avg/max = 0.269/0.317/0.393 ms.

```
/ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.252.128.209 port 5001 connected with 10.252.128.104 port 40821
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   853 MBytes   713 Mbits/sec
```

Running in a container over the flannel network, this drops to 614 Mbits/sec, with a latency of round-trip min/avg/max = 0.404/0.455/0.547 ms.

```
/ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 172.21.102.27 port 5001 connected with 172.21.6.13 port 43429
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.4 sec   758 MBytes   614 Mbits/sec
```
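For context, the on-wire header cost of encapsulation alone is small. Assuming the commonly cited figures of 28 bytes of outer IP + UDP for flannel's udp backend, and 50 bytes of outer Ethernet + IP + UDP + VXLAN for vxlan, per full 1500-byte frame:

```shell
# Approximate per-frame header overhead of flannel's encapsulating backends.
# 28 B = outer IP (20) + UDP (8); 50 B = outer Ethernet (14) + IP (20) + UDP (8) + VXLAN (8).
awk 'BEGIN {
  mtu = 1500
  printf "udp backend:   %.1f%% of each full frame\n", 28 * 100 / mtu
  printf "vxlan backend: %.1f%% of each full frame\n", 50 * 100 / mtu
}'
```

Anything much beyond a few percent of loss is therefore coming from somewhere other than headers -- for the udp backend, most likely the extra user-space hop through flanneld.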

A ~14% drop seems to be in line with the flannel overhead figures I've been reading about. I also wanted to test EC2 with enhanced networking to see if the gap would narrow. The same setup running on c4.xlarge instances yielded 1.02 Gbits/sec with net=host, which is the same as running the Amazon Linux AMI.

```
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.252.129.146 port 5001 connected with 10.252.129.145 port 44632
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.19 GBytes  1.02 Gbits/sec
```

However, when running with flannel, I get a drastic reduction in throughput. I tried several runs and always came in around 130 Mbits/sec:

```
/ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 172.21.4.17 port 5001 connected with 172.21.71.2 port 46283
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.4 sec   171 MBytes   138 Mbits/sec
[  5] local 172.21.4.17 port 5001 connected with 172.21.71.2 port 46319
[  5]  0.0-10.1 sec   161 MBytes   134 Mbits/sec
[  4] local 172.21.4.17 port 5001 connected with 172.21.71.3 port 58268
[  4]  0.0-10.3 sec   188 MBytes   153 Mbits/sec
```
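Putting the four TCP runs side by side, a quick back-of-the-envelope comparison (throughput numbers taken from the iperf output above, with 1.02 Gbits/sec written as 1020):

```shell
# Percent of host-network TCP throughput retained under the flannel udp backend.
m3_host=713;  m3_flannel=614     # m3.large, Mbits/sec
c4_host=1020; c4_flannel=138     # c4.xlarge, Mbits/sec
echo "m3.large:  $(( m3_flannel * 100 / m3_host ))% retained"
echo "c4.xlarge: $(( c4_flannel * 100 / c4_host ))% retained"
```

The m3.large penalty (~14%) matches the overhead people usually report; the c4.xlarge penalty (~86%) clearly does not.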

I see a couple of possible reasons:

1) I'm doing something drastically wrong.
2) There's odd behavior when handling UDP packets on c4.xlarge instances (I should run an explicit UDP test, but I don't remember this being an issue in previous tests).
3) I got some wonky instances, or something else was going on with the nodes.

I want to try a few different backends, but I'm curious whether others have run into this issue, and whether anyone can offer advice for optimizing the flannel overlay network.

Thanks,

Mike

Alex Polvi

May 10, 2015, 11:59:59 PM5/10/15
to Michael Hamrah, coreos-user
Michael, you should try out the native aws-vpc backend for flannel and use the native AWS VPC fabric.

-Alex
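For anyone following along: backend selection happens in flannel's network config stored in etcd (by default under the /coreos.com/network/config key). A sketch of what Alex is suggesting -- the subnet here is illustrative, not from the thread:

```json
{
  "Network": "10.1.0.0/16",
  "Backend": {
    "Type": "aws-vpc"
  }
}
```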


Eugene Yakubovich

May 11, 2015, 1:54:23 PM5/11/15
to Alex Polvi, Michael Hamrah, coreos-user
Hi Mike,

Is this using the default (udp) backend or VXLAN? VXLAN should give you better performance. As Alex said, you can try the aws-vpc backend to avoid UDP encapsulation. It shipped in flannel 0.4.0; you can opt in by placing

Environment=FLANNEL_VER=0.4.0

into the flannel drop-in. Make sure to disable "Src/Dest Check" on the instances -- otherwise AWS networking will drop packets from unknown IPs.
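A sketch of that drop-in, assuming the standard flanneld unit on CoreOS (the file name is illustrative):

```ini
# /etc/systemd/system/flanneld.service.d/10-version.conf  (name illustrative)
[Service]
Environment=FLANNEL_VER=0.4.0
```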

However, I would also like to understand why c4.xlarge instances cause such performance degradation. Did you enable Enhanced Networking (both on the instance and by loading the driver)? If so, it might actually be what's causing the problems. AWS recommends ixgbevf version 2.14.2+ (for performance enhancements), but CoreOS Linux ships with 2.12.1-k (the version in the upstream kernel). Also, are you running both m3.large and c4.xlarge with the HVM image (to rule out PV vs. HVM differences)?

Thanks,
Eugene

Michael Hamrah

May 11, 2015, 5:19:03 PM5/11/15
to coreo...@googlegroups.com, mha...@gmail.com, alex....@coreos.com
I'm open to the AWS VPC backend; I was simply exploring the default. (On a side note, I do plan on running a multi-cloud cluster, so I'm wondering whether you can use the VPC backend if some instances aren't on AWS. Seems doubtful.)

I'm more interested in the c4.xlarge issue as well.

- Both instance types are using the HVM image (ami-37280207).
- Enhanced networking is actually off; the default vif driver is in use:

```
core@ip-10-252-128-32 ~ $ ethtool -i eth0
driver: vif
version:
firmware-version:
bus-info: vif-0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
```

I haven't tried installing the latest version of ixgbevf; I don't want to do something that intrusive to CoreOS. I'll try the alpha channel to see what version is there.
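Side note for anyone reproducing this: whether SR-IOV is enabled can also be checked from the AWS side (the instance ID below is a placeholder):

```shell
# Query the instance's SR-IOV attribute; a value of "simple" means
# enhanced networking is enabled for the instance.
aws ec2 describe-instance-attribute --instance-id i-1234567 --attribute sriovNetSupport
```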

eugene.y...@coreos.com

May 11, 2015, 5:36:11 PM5/11/15
to coreo...@googlegroups.com, mha...@gmail.com


On Monday, May 11, 2015 at 2:19:03 PM UTC-7, Michael Hamrah wrote:
> I'm open to the AWS VPC backend, I was simply exploring the default (on a side note, I do plan on using a multi-cloud cluster; wondering if you can use the VPC backend if instances aren't on AWS? seems doubtful).

It'll only work on AWS.
 

> I haven't tried installing the latest version of ixgbevf; I also don't want to do something that intrusive to CoreOS. I'll try an alpha channel to see what the version is there.

It'll be the same version there; for some reason, Intel has not upstreamed the newer versions. However, if SR-IOV is off, that actually reduces the surface area to search. I would still try the VXLAN backend, though, as it will provide another data point.

For default UDP backend, one thing that can help is setting CPU affinity to give iperf and flanneld their own core. Try using taskset to ensure they're on different CPUs and see if it makes a difference.
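Eugene's taskset suggestion might look like the following (the core numbers and the flanneld process lookup are illustrative):

```shell
# Pin the running flanneld to CPU 0, then run the iperf server on CPU 1,
# so the udp backend's user-space proxy doesn't compete with the benchmark.
sudo taskset -cp 0 "$(pgrep flanneld)"
taskset -c 1 iperf -s
```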

Michael Hamrah

May 11, 2015, 7:30:32 PM5/11/15
to eugene.y...@coreos.com, coreo...@googlegroups.com
Good to know re: CoreOS alpha. I'll start experimenting with different backends.

Here's an interesting data point. I ran an iperf test in UDP mode. Here's the result with net=host:

```
/ # iperf -c 10.252.128.32 -u -b 1000000000
------------------------------------------------------------
Client connecting to 10.252.128.32, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 10.252.128.33 port 54702 connected with 10.252.128.32 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   966 MBytes   810 Mbits/sec
[  3] Sent 688930 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec   701 MBytes   588 Mbits/sec   0.048 ms 189047/688929 (27%)
[  3]  0.0-10.0 sec  1 datagrams received out-of-order
```

I'm getting 588 Mbits/sec delivered with a 1 Gbit/sec target bandwidth (-b 1000000000).

Running on the flannel overlay network I get a similar result:

```
/ # iperf -c 172.21.46.14 -u -b 1000000000000
------------------------------------------------------------
Client connecting to 172.21.46.14, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 172.21.84.9 port 35303 connected with 172.21.46.14 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   966 MBytes   810 Mbits/sec
[  3] Sent 689036 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec   602 MBytes   505 Mbits/sec   0.099 ms 259280/689035 (38%)
[  3]  0.0-10.0 sec  1 datagrams received out-of-order
```

Not quite the same thing, but close. A lot better than TCP.
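Quantifying "close," using the delivered (server-report) UDP bandwidth from the two runs above, and assuming both runs are on the same c4.xlarge pair:

```shell
# Delivered UDP bandwidth from the iperf server reports, in Mbits/sec.
host=588; flannel=505
echo "flannel delivers $(( flannel * 100 / host ))% of the host-network UDP rate"
```

Compare that with the ~13% of host throughput that TCP retained on c4.xlarge, which is what makes the TCP result look like a pathology rather than plain encapsulation overhead.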

My guess is there's some buffer setting in how the UDP backend wraps TCP packets that's affecting performance. I don't know why it would be specific to c4.xlarge, unless the driver defaulted to something really low because the enhanced networking hardware wasn't set up correctly.

Mike 