Flannel Performance on AWS


Michael Hamrah

May 10, 2015, 9:32:49 AM5/10/15
to coreo...@googlegroups.com
I've been running some performance tests with Flannel on AWS and noticed an interesting result when switching from m3.large to c4.xlarge with enhanced networking. I'm on CoreOS stable running Flannel 0.3.0.

I have two m3.large nodes running in the same AZ, in a VPC, in us-west-2. When running a container with net=host using iperf I see 713 Mbits/sec with a ping latency of round-trip min/avg/max = 0.269/0.317/0.393 ms.

```
/ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.252.128.209 port 5001 connected with 10.252.128.104 port 40821
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   853 MBytes   713 Mbits/sec
```

Running in a container over the flannel network, this drops to 614 Mbits/sec, with a latency of round-trip min/avg/max = 0.404/0.455/0.547 ms.

```
/ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 172.21.102.27 port 5001 connected with 172.21.6.13 port 43429
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.4 sec   758 MBytes   614 Mbits/sec
```
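For context, the on-wire header cost of encapsulation alone is small. Assuming the commonly cited figures of 28 bytes of outer IP + UDP for flannel's udp backend, and 50 bytes of outer Ethernet + IP + UDP + VXLAN for vxlan, per full 1500-byte frame:

```shell
# Approximate per-frame header overhead of flannel's encapsulating backends.
# 28 B = outer IP (20) + UDP (8); 50 B = outer Ethernet (14) + IP (20) + UDP (8) + VXLAN (8).
awk 'BEGIN {
  mtu = 1500
  printf "udp backend:   %.1f%% of each full frame\n", 28 * 100 / mtu
  printf "vxlan backend: %.1f%% of each full frame\n", 50 * 100 / mtu
}'
```

Anything much beyond a few percent of loss is therefore coming from somewhere other than headers -- for the udp backend, most likely the extra user-space hop through flanneld.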

A ~14% drop seems to be in line with the flannel overhead figures I've been reading about. I also wanted to test EC2 with enhanced networking to see if the gap would narrow. The same setup running on c4.xlarge instances yielded 1.02 Gbits/sec with net=host, which is the same as running the Amazon Linux AMI.

```
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.252.129.146 port 5001 connected with 10.252.129.145 port 44632
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.19 GBytes  1.02 Gbits/sec
```

However, when running with flannel, I get a drastic reduction in throughput. I tried several runs and always came in around 130 Mbits/sec:

```
/ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 172.21.4.17 port 5001 connected with 172.21.71.2 port 46283
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.4 sec   171 MBytes   138 Mbits/sec
[  5] local 172.21.4.17 port 5001 connected with 172.21.71.2 port 46319
[  5]  0.0-10.1 sec   161 MBytes   134 Mbits/sec
[  4] local 172.21.4.17 port 5001 connected with 172.21.71.3 port 58268
[  4]  0.0-10.3 sec   188 MBytes   153 Mbits/sec
```
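Putting the four TCP runs side by side, a quick back-of-the-envelope comparison (throughput numbers taken from the iperf output above, with 1.02 Gbits/sec written as 1020):

```shell
# Percent of host-network TCP throughput retained under the flannel udp backend.
m3_host=713;  m3_flannel=614     # m3.large, Mbits/sec
c4_host=1020; c4_flannel=138     # c4.xlarge, Mbits/sec
echo "m3.large:  $(( m3_flannel * 100 / m3_host ))% retained"
echo "c4.xlarge: $(( c4_flannel * 100 / c4_host ))% retained"
```

The m3.large penalty (~14%) matches the overhead people usually report; the c4.xlarge penalty (~86%) clearly does not.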

I see a couple of possible reasons:

1) I'm doing something drastically wrong.
2) There's odd behavior when handling UDP packets on c4.xlarge instances (I should run an explicit UDP test, but I don't remember this being an issue in previous tests).
3) I got some wonky instances, or something else was going on with the nodes.

I want to try a few different backends, but I'm curious whether others have run into this issue, and whether anyone can offer advice for optimizing the flannel overlay network.

Thanks,

Mike

Alex Polvi

May 10, 2015, 11:59:59 PM5/10/15
to Michael Hamrah, coreos-user
Michael, you should try out the native aws-vpc backend for flannel and use the native AWS VPC fabric.

-Alex
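For anyone following along: backend selection happens in flannel's network config stored in etcd (by default under the /coreos.com/network/config key). A sketch of what Alex is suggesting -- the subnet here is illustrative, not from the thread:

```json
{
  "Network": "10.1.0.0/16",
  "Backend": {
    "Type": "aws-vpc"
  }
}
```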


Eugene Yakubovich

May 11, 2015, 1:54:23 PM5/11/15
to Alex Polvi, Michael Hamrah, coreos-user
Hi Mike,

Is this using the default (udp) backend or VXLAN? VXLAN should give you better performance. As Alex said, you can try the aws-vpc backend to avoid UDP encapsulation. It shipped in flannel 0.4.0; you can opt in by placing

Environment=FLANNEL_VER=0.4.0

into the flannel drop-in. Make sure to disable "Src/Dest Check" on the instances -- otherwise AWS networking will drop packets from unknown IPs.
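A sketch of that drop-in, assuming the standard flanneld unit on CoreOS (the file name is illustrative):

```ini
# /etc/systemd/system/flanneld.service.d/10-version.conf  (name illustrative)
[Service]
Environment=FLANNEL_VER=0.4.0
```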

However, I would also like to understand why c4.xlarge instances cause such performance degradation. Did you enable Enhanced Networking (both on the instance and by loading the driver)? If so, it might actually be what's causing the problems. AWS recommends ixgbevf version 2.14.2+ (for performance enhancements), but CoreOS Linux ships with 2.12.1-k (the version in the upstream kernel). Also, are you running both m3.large and c4.xlarge with the HVM image (to rule out PV vs. HVM differences)?

Thanks,
Eugene

Michael Hamrah

May 11, 2015, 5:19:03 PM5/11/15
to coreo...@googlegroups.com, mha...@gmail.com, alex....@coreos.com
I'm open to the AWS VPC backend; I was simply exploring the default. (On a side note, I do plan on running a multi-cloud cluster, so I'm wondering whether you can use the VPC backend if some instances aren't on AWS. Seems doubtful.)

I'm more interested in the c4.xlarge issue as well.

- Both instance types are using the HVM image (ami-37280207).
- Enhanced networking is actually off; the default vif driver is in use:

```
core@ip-10-252-128-32 ~ $ ethtool -i eth0
driver: vif
version:
firmware-version:
bus-info: vif-0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
```

I haven't tried installing the latest version of ixgbevf; I don't want to do something that intrusive to CoreOS. I'll try the alpha channel to see what version is there.
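Side note for anyone reproducing this: whether SR-IOV is enabled can also be checked from the AWS side (the instance ID below is a placeholder):

```shell
# Query the instance's SR-IOV attribute; a value of "simple" means
# enhanced networking is enabled for the instance.
aws ec2 describe-instance-attribute --instance-id i-1234567 --attribute sriovNetSupport
```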

eugene.y...@coreos.com

May 11, 2015, 5:36:11 PM5/11/15
to coreo...@googlegroups.com, mha...@gmail.com


On Monday, May 11, 2015 at 2:19:03 PM UTC-7, Michael Hamrah wrote:
> I'm open to the AWS VPC backend, I was simply exploring the default (on a side note, I do plan on using a multi-cloud cluster; wondering if you can use the VPC backend if instances aren't on AWS? seems doubtful).

It'll only work on AWS.
 

> I haven't tried installing the latest version of ixgbevf; I also don't want to do something that intrusive to CoreOS. I'll try an alpha channel to see what the version is there.

It'll be the same version there; for some reason, Intel has not upstreamed the newer versions. However, if SR-IOV is off, that actually reduces the surface area to search. I would still try the VXLAN backend, though, as it will provide another data point.

For default UDP backend, one thing that can help is setting CPU affinity to give iperf and flanneld their own core. Try using taskset to ensure they're on different CPUs and see if it makes a difference.
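Eugene's taskset suggestion might look like the following (the core numbers and the flanneld process lookup are illustrative):

```shell
# Pin the running flanneld to CPU 0, then run the iperf server on CPU 1,
# so the udp backend's user-space proxy doesn't compete with the benchmark.
sudo taskset -cp 0 "$(pgrep flanneld)"
taskset -c 1 iperf -s
```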

Michael Hamrah

May 11, 2015, 7:30:32 PM5/11/15
to eugene.y...@coreos.com, coreo...@googlegroups.com
Good to know re: CoreOS alpha. I'll start experimenting with different backends.

Here's an interesting data point. I ran an iperf test in UDP mode. Here's the result with net=host:

```
/ # iperf -c 10.252.128.32 -u -b 1000000000
------------------------------------------------------------
Client connecting to 10.252.128.32, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 10.252.128.33 port 54702 connected with 10.252.128.32 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   966 MBytes   810 Mbits/sec
[  3] Sent 688930 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec   701 MBytes   588 Mbits/sec   0.048 ms 189047/688929 (27%)
[  3]  0.0-10.0 sec  1 datagrams received out-of-order
```

I'm getting 588 Mbits/sec delivered with a 1 Gbit/sec target bandwidth (-b 1000000000).

Running on the flannel overlay network I get a similar result:

```
/ # iperf -c 172.21.46.14 -u -b 1000000000000
------------------------------------------------------------
Client connecting to 172.21.46.14, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 172.21.84.9 port 35303 connected with 172.21.46.14 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   966 MBytes   810 Mbits/sec
[  3] Sent 689036 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec   602 MBytes   505 Mbits/sec   0.099 ms 259280/689035 (38%)
[  3]  0.0-10.0 sec  1 datagrams received out-of-order
```

Not quite the same thing, but close. A lot better than TCP.
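Quantifying "close," using the delivered (server-report) UDP bandwidth from the two runs above, and assuming both runs are on the same c4.xlarge pair:

```shell
# Delivered UDP bandwidth from the iperf server reports, in Mbits/sec.
host=588; flannel=505
echo "flannel delivers $(( flannel * 100 / host ))% of the host-network UDP rate"
```

Compare that with the ~13% of host throughput that TCP retained on c4.xlarge, which is what makes the TCP result look like a pathology rather than plain encapsulation overhead.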

My guess is there's some buffer setting in how the UDP backend wraps TCP packets that's affecting performance. I don't know why it would be specific to c4.xlarge, unless the driver defaulted to something really low because the enhanced networking hardware wasn't set up correctly.

Mike 