HTTP/1 vs HTTP/2 performance


dtehr...@twilio.com

Aug 17, 2017, 2:05:16 PM
to envoy-users
Hi folks,

I'm seeing some unexpected performance results w/Envoy's HTTP/2 proxying that I'm not able to figure out.

In summary: I'm seeing higher latencies (mean, median, p95, p99) via HTTP/2 than with HTTP/1. I thought that the persistent connections w/H2 would improve things...

Test Setup:
  * Using two `c3.2xlarge`s within the same AZ in AWS. 8 CPUs, 1Gb network.
  * Using Vegeta as the client-side test-driver: https://github.com/tsenart/vegeta
  * Running Vegeta at 1,000 HTTP reqs/sec.
  * Using nginx as the server. The only thing the nginx server does is a `return 200;` on "/". Very simple and performant.
  * HTTP/1 setup: Envoy-to-nginx.
      * Admittedly there is one less hop (Envoy) in this configuration. Using this as the config because it mirrors how we're using HAProxy today.
  * HTTP/2: Envoy-to-Envoy in between Vegeta & nginx, doing HTTP/1.1->HTTP/2 proxying.
  * A very recent build of Envoy (c7004069ffd2654b5a7c7f76340f5266b12d8c9f/Clean/RELEASE)

Test Execution:
  * Running vegeta with `vegeta attack -duration=60s -rate 1000 -workers 10`
  * I observe via `netstat` that the persistent TCP connections are created for HTTP/2. Cool.
  * On the ingress box, I observe via `tcpdump` that ingress traffic on `eth0` is indeed HTTP/2, and also that traffic on `lo` is HTTP/1. Looks like everything is wired up correctly.
  * I've run this many times over the past 2 days so results are repeatable.


JSON configs and output from /stats & /clusters are attached. Also attached is a chart of test results for 1,000 reqs/sec; the charts for 5k & 10k rps have similar shapes. Only attaching 1K rps for now to avoid confusion.

Is my testing methodology wrong, or my Envoys misconfigured? I've tried tuning various Envoy buffer/circuit breaking/generate_request_id settings, with negligible improvements.

Thanks for any help,
Dan

chart.png
egress-clusters.txt
egress-envoy-config.json
egress-stats.txt
ingress-clusters.txt
ingress-envoy-config.json
ingress-stats.txt

Daniel Hochman

Aug 17, 2017, 2:35:48 PM
to dtehr...@twilio.com, envoy-users
Took a quick look.

In the HTTP/1 test, Envoy is running client (egress) side, is that correct? I'm not seeing stats for that portion of the test.
 
Because the request itself is so fast, there's probably not a lot of connection churn on HTTP/1 or concurrent requests even at 1K/s. I wonder if that's contributing to the results of this test.
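A rough way to sanity-check that point is Little's law (average requests in flight = arrival rate x average latency). The sketch below is only an illustration: the 1,000 reqs/sec figure comes from the test setup, while the latency values are hypothetical placeholders, not numbers read from the attached chart.

```python
# Little's law: average requests in flight = arrival rate * average time per request.
def in_flight(rate_per_sec: float, latency_sec: float) -> float:
    return rate_per_sec * latency_sec

RATE = 1000  # requests/sec, from the test setup

# Hypothetical round-trip latencies (assumptions, not measured values).
for latency_ms in (0.5, 1, 5):
    n = in_flight(RATE, latency_ms / 1000)
    print(f"{latency_ms} ms per request -> about {n:g} requests in flight")
```

If the real latencies are in this range, only a handful of requests overlap at any moment, so a small HTTP/1 connection pool would already cover the concurrency.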


Daniel Hochman
Engineer
Lyft

--
You received this message because you are subscribed to the Google Groups "envoy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to envoy-users+unsubscribe@googlegroups.com.
To post to this group, send email to envoy...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/envoy-users/94611f60-12b6-42ee-9a18-adbf9db683c6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matt Klein

Aug 17, 2017, 3:11:08 PM
to Daniel Hochman, Dan Tehranian, envoy-users
Yeah, per Daniel, I think the workload you have may be faster w/ HTTP/1. The reason is that you probably are not getting much benefit (if any) from header compression, HTTP/2 framing adds overhead, and with the low number of clients, Envoy is likely opening up enough concurrent HTTP/1.1 connections that everything is happening in parallel.
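To put a number on the framing part only: every HTTP/2 frame carries a fixed 9-byte header (RFC 7540, section 4.1). The sketch below packs such a header and shows what fraction of the bytes it represents for a few payload sizes. The payload sizes are hypothetical, and this models bytes on the wire only, not the CPU cost of framing or HPACK.

```python
import struct

FRAME_HEADER_LEN = 9  # fixed size of an HTTP/2 frame header (RFC 7540, section 4.1)

def frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Pack a frame header: 24-bit length, 8-bit type, 8-bit flags, reserved bit + 31-bit stream id."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    length_bytes = struct.pack(">I", length)[1:]  # low 3 bytes of a big-endian uint32
    rest = struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return length_bytes + rest

def overhead_fraction(payload_len: int) -> float:
    """Share of a single frame's bytes taken by the frame header."""
    return FRAME_HEADER_LEN / (FRAME_HEADER_LEN + payload_len)

# Hypothetical payload sizes, chosen only to show the trend.
for payload in (16, 256, 16384):
    print(f"{payload:>6} B payload: {overhead_fraction(payload):.1%} of the frame is header")
```

For tiny requests and responses like the `return 200;` test, the fixed header is a noticeable share of each frame; for larger payloads it becomes negligible.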

This is not really an Envoy issue per se; I think it's really an HTTP/1 vs. HTTP/2 issue. (Obviously there might be some issue here related to Envoy, but I kind of doubt it.)


--
Matt Klein
Software Engineer
mkl...@lyft.com

dtehr...@twilio.com

Aug 17, 2017, 4:08:17 PM
to envoy-users, dhoc...@lyft.com, dtehr...@twilio.com
Hi Daniel & Matt, thanks for the replies.

re: missing stats for HTTP/1 egress - Sorry about that. New file attached. 

re: header compression - Yeah, we were talking about that as a possibility as well. I will try to grab sample headers from our prod env and re-test with that.

Let me ask this: If one wanted to construct a test that allowed the benefits of Envoy's HTTP/2 proxy to truly shine, what should be added? Sounds like additional HTTP headers are needed. Maybe a POST payload and some JSON response?

Thanks,
Dan

egress-stats-2.txt

Matt Klein

Aug 17, 2017, 6:49:30 PM
to Dan Tehranian, envoy-users, Daniel Hochman
> Let me ask this: If one wanted to construct a test that allowed the benefits of Envoy's HTTP/2 proxy to truly shine, what should be added? Sounds like additional HTTP headers are needed. Maybe a POST payload and some JSON response?

For internal datacenter use cases, HTTP/2 will be better when you have a large mesh of many backends and high request volume. Instead of having to keep alive multiple connections to each backend, you will only need a single connection (balancing memory usage vs. blocking waiting for the connection pool). The other major benefit is that when there is a server error/timeout, the connection does not have to be killed: only the stream is reset and the connection stays alive. IMO this is the primary benefit. To expose this you would need to start injecting errors into the system and see how it affects overall latency.
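A back-of-the-envelope sketch of both points, under loudly stated assumptions: the mesh size, pool size, and error rate below are hypothetical, and the model simply takes the description above at face value (HTTP/1 keeps a pool of connections per backend and drops a connection on each error/timeout; HTTP/2 keeps one connection per backend and only resets the stream).

```python
def http1_connections(proxies: int, backends: int, pool_size: int) -> int:
    """Each proxy keeps `pool_size` keep-alive connections to every backend."""
    return proxies * backends * pool_size

def http2_connections(proxies: int, backends: int) -> int:
    """Each proxy multiplexes all streams over one connection per backend."""
    return proxies * backends

def reconnects(requests: int, error_every: int, error_kills_connection: bool) -> int:
    """Connections that must be re-established if every `error_every`-th request fails."""
    failures = requests // error_every
    return failures if error_kills_connection else 0

# Hypothetical mesh: 100 proxies, 50 backends, 8 pooled HTTP/1 connections per backend.
print("HTTP/1 connections:", http1_connections(100, 50, 8))
print("HTTP/2 connections:", http2_connections(100, 50))

# Hypothetical error injection: 60,000 requests with 1 in 100 failing.
print("HTTP/1 reconnects:", reconnects(60_000, 100, error_kills_connection=True))
print("HTTP/2 reconnects:", reconnects(60_000, 100, error_kills_connection=False))
```

In a real test, each of those reconnects would add TCP (and possibly TLS) setup time to some later request, which is where the latency difference would be expected to show up.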

I suspect there are some workloads where header compression will help within datacenter, but this is a balance of smaller data vs. compression/decompression time. I'm not an expert on this. There are people on this list that probably have a better idea of when/where HTTP/2 is going to be faster for a single request/response within DC.


