Performance on medium and high latency links?


John Rusk

unread,
Oct 4, 2016, 10:53:47 PM
to QUIC Prototype Protocol Discussion group
Hi,

I tried out QUIC on some high-bandwidth links of varying latencies, using the test client and server (quic_client and quic_server). I noticed that throughput decreased significantly as latency increased.

Stats from my tests, which all involved transferring a 1MB file:

    On localhost: 0.06 seconds, i.e. 60 milliseconds.

    Internet link with 7 ms latency: 0.8 seconds

    Internet link with 20 ms latency: about 2.5 seconds

    Internet link with 60 ms latency: about 7 seconds

Is such a strong relationship between latency and throughput normal with QUIC?

I believe I have successfully increased the default max cwnd, as recommended in a previous thread, although these results make me wonder whether I need to change other parameters too.

John

--

John Rusk
Software Engineer @ Microsoft

Ian Swett

unread,
Oct 5, 2016, 12:16:08 PM
to proto...@chromium.org
The default max CWND in the most recent code (i.e., M53) is 2,000 packets, so I wouldn't expect that to be the limiting factor in your use cases.

It looks like you're testing on such a high-bandwidth link that you might be limited by the speed of slow start, since the time to reach the right CWND grows roughly quadratically as latency increases.  To make matters worse, it's possible that hybrid slow start is causing you to exit slow start early due to a min_rtt increase.
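Ian's point about slow start can be illustrated with a rough back-of-the-envelope model (a sketch only: it assumes classic slow start with a 10-packet initial window doubling every RTT and ~1350-byte packets, not QUIC's exact behavior, and a hypothetical 200 Mbps link):

```python
import math

def slow_start_time(bw_mbps, rtt_ms, mss=1350, initial_cwnd=10):
    """Rough time (s) for classic slow start (cwnd doubles every RTT)
    to grow from initial_cwnd to the link's bandwidth-delay product."""
    bdp_packets = (bw_mbps * 1e6 / 8) * (rtt_ms / 1000) / mss
    rtts = max(0, math.ceil(math.log2(bdp_packets / initial_cwnd)))
    return rtts * rtt_ms / 1000

# The ramp-up cost grows with RTT: both the BDP to reach and the
# duration of each doubling step increase.
for rtt in (7, 20, 60):
    print(f"{rtt:>2} ms RTT: ~{slow_start_time(200, rtt):.2f} s to fill a 200 Mbps pipe")
```

The model ignores loss and hybrid slow start, but it shows why transfer time rises faster than linearly with RTT when a connection never leaves the ramp-up phase.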

Are these numbers fairly repeatable?  About how fast is the link?  What are comparable numbers for TCP?

--
You received this message because you are subscribed to the Google Groups "QUIC Prototype Protocol Discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to proto-quic+unsubscribe@chromium.org.
To post to this group, send email to proto...@chromium.org.
For more options, visit https://groups.google.com/a/chromium.org/d/optout.

Mamoun Mansour

unread,
Oct 5, 2016, 12:31:32 PM
to proto...@chromium.org
Hi, I have tested both TCP and QUIC under the same conditions as John, and found that TCP does better on high-latency, high-bandwidth links. Any explanation for these results?

Regards

Van Catha

unread,
Oct 5, 2016, 12:36:56 PM
to QUIC Prototype Protocol Discussion group
Could this be related to packet loss and perhaps the demo client/server do not use optimal algorithms for dealing with it?

Ian Swett

unread,
Oct 5, 2016, 12:59:22 PM
to proto...@chromium.org
At some bandwidth over 100 Mbit/s, I'd expect TCP to perform better.  In particular, quic_client and quic_server are not as fast as Chrome talking to our production server at Google.

However, I'd expect that to show up fairly consistently at higher bandwidths, and not to be RTT-dependent.

It should not be congestion control related, since Linux defaults to Cubic as well, but it's always possible our Cubic code has a bug.

If this is easy to repro with the quic client and server, I'd be happy to look at some code and diagnose what's happening.

On Wed, Oct 5, 2016 at 12:36 PM, Van Catha <van...@gmail.com> wrote:
Could this be related to packet loss and perhaps the demo client/server do not use optimal algorithms for dealing with it?


John Rusk

unread,
Oct 5, 2016, 5:20:01 PM
to QUIC Prototype Protocol Discussion group
Hi Ian,

Yes, it should be easy to reproduce.  I just ran the QUIC client and server between machines in different Azure data centres.  OS was Ubuntu 14.04. The behavior was consistent in my tests. I tested on two different days (once with standard max cwnd and once with a larger one).  Got similar results on both days, and in multiple runs on those days.

Given that this is data centre to data centre, the bandwidth is well over 100Mbits! 

I have not exactly replicated the tests with TCP, although in similar tests over the same links, TCP did degrade with latency, but not so severely.  (My TCP tests used a larger data file and happened to be on a different OS.)

John 



Jana Iyengar

unread,
Oct 6, 2016, 1:15:08 PM
to proto...@chromium.org
Hi John,

I would use a MUCH larger file to test -- 1 MB is ~1,000 packets, which you'll reach in under 10 RTTs on a high-bandwidth network. Startup effects dominate under these conditions: your connection will be in slow start, which will lead you to see variations in throughput with RTT. I'd need to check, but I think quic_client also runs QUIC's full handshake, which would also significantly affect your results when your transfer is a handful of RTTs, and would explain the correlation between RTT and transfer time.
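Jana's arithmetic can be sketched as follows (assumptions: a ~1350-byte payload per packet, a 10-packet initial window that doubles every RTT, and a link fast enough that bandwidth never limits the window -- a simplification, not QUIC's exact behavior):

```python
import math

def slow_start_rtts(size_bytes, mss=1350, initial_cwnd=10):
    """RTTs needed to send size_bytes entirely in slow start,
    assuming cwnd doubles each RTT and bandwidth never limits it."""
    packets = math.ceil(size_bytes / mss)
    sent, cwnd, rtts = 0, initial_cwnd, 0
    while sent < packets:
        sent += cwnd   # one round-trip's worth of packets
        cwnd *= 2      # classic slow-start doubling
        rtts += 1
    return rtts

for rtt_ms in (7, 20, 60):
    rtts = slow_start_rtts(1_000_000)
    print(f"1 MB at {rtt_ms} ms RTT: ~{rtts} RTTs, ~{rtts * rtt_ms / 1000:.2f} s "
          "(ignoring handshake and loss)")
```

Since the RTT count is fixed but each RTT is longer, total transfer time scales linearly with RTT for a small file -- exactly the kind of startup-dominated behavior described above.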

Since QUIC uses Cubic (as does Linux TCP), there really shouldn't be a substantial difference in throughput at large file sizes between TCP and QUIC. I did some tests a while ago, so this might be dated, but I've been able to hit ~100Mbps with quic_client + quic_server, and the perf bottleneck was CPU due to the unoptimized client/server. Can you run 'top' on both machines as you run these experiments? As Ian suggested, one change would be to run Chrome against the quic_server, since that would make at least one of the ends better optimized for CPU, if that happens to be the limit in your case.

In general, for benchmarking the protocol, I recommend using the client/server at bandwidths that don't peg your CPU.

- jana

John Rusk

unread,
Oct 6, 2016, 4:33:05 PM
to QUIC Prototype Protocol Discussion group
Thanks Jana,

Can you remember approximately what the latency of the link was, when you hit 100Mbps?

I did initially try a much larger file (1GB), and just gave up waiting for the transfer to complete when testing over the 60 ms link.  Maybe I should try something in the tens or hundreds of MB range.

John

John Rusk

unread,
Oct 6, 2016, 5:20:18 PM
to QUIC Prototype Protocol Discussion group
Hi again,

I just tested a 50MB file.  It took 5 minutes over the 60 ms link. Given the specs of the machines I'm using, and the fact that machines of the same spec transfer much faster over short latencies, I think we can be fairly confident that CPU is not the limiting factor.  (I didn't run "top", sorry.)
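For what it's worth, the effective throughput implied by those numbers is easy to work out:

```python
def throughput_mbps(size_mb, seconds):
    """Average throughput (Mbit/s) of a transfer of size_mb megabytes."""
    return size_mb * 8 / seconds

# 50 MB in ~5 minutes over the 60 ms link:
print(f"{throughput_mbps(50, 300):.2f} Mbit/s")  # ~1.33 Mbit/s
```

About 1.3 Mbit/s -- far below what a data-centre-to-data-centre link can carry, which supports the point that CPU is unlikely to be the bottleneck here.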

I'm guessing that's a big enough file, and transfer duration, to verify that either (a) there's a real problem or (b) there's something wrong with the way I'm testing ;-)

John

Jana Iyengar

unread,
Oct 7, 2016, 1:18:39 AM
to proto...@chromium.org
Sorry, John, I don't recall. As Ian said, we'll recreate this test and see what shows up.

Jim Roskind

unread,
Oct 8, 2016, 3:26:14 PM
to proto...@chromium.org
John: Given your description of your experiment, was the way you manipulated (selected) the RTT by selecting different pairs of Azure data centres?  If so, how did you control for possible differences in packet loss, or concurrent congestion, in these different scenarios?  Did you monitor packet loss?

You commented that you did not replicate the tests with TCP, which left me wondering about the results, and whether variations in packet loss played into them.  IMO it would be good not only to do the same experiments using TCP, but to do both sets of experiments somewhat contemporaneously.  For instance, alternate back and forth between running a TCP, then a QUIC, then a TCP experiment, etc. You should probably keep alternating until you see low variation for TCP and low variation for QUIC.  You probably can't run the two tests concurrently, or else congestion from one will impact the other <sigh>.
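That alternation idea could be scripted along these lines (run_tcp and run_quic are hypothetical callables -- e.g. wrappers that launch one TCP or one QUIC transfer and return its duration in seconds; the stopping rule uses the coefficient of variation as the "low variation" test):

```python
import statistics

def alternate_until_stable(run_tcp, run_quic, max_rounds=20, cv_target=0.1):
    """Alternate TCP and QUIC transfer runs until both show low variation
    (coefficient of variation below cv_target), so that transient
    congestion has a chance to affect both protocols equally."""
    def cv(samples):
        return statistics.stdev(samples) / statistics.mean(samples)

    tcp_times, quic_times = [], []
    for _ in range(max_rounds):
        tcp_times.append(run_tcp())    # seconds for one TCP transfer
        quic_times.append(run_quic())  # seconds for one QUIC transfer
        if (len(tcp_times) >= 3
                and cv(tcp_times) < cv_target
                and cv(quic_times) < cv_target):
            break
    return tcp_times, quic_times
```

Comparing the two resulting sample sets (rather than one run of each) makes it much harder for a burst of cross-traffic to masquerade as a protocol difference.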

It is also probably interesting to monitor packet loss, and make sure that UDP packets are not being mistreated on one of the paths.

Most commonly, when experiments like yours are done, they are performed in a very controlled (laboratory?) environment. When SPDY was being developed and tested, Mike Belshe even avoided using an intranet, and instead ran tests on a physical private network, avoiding ANY chance of noise from unexpected concurrent traffic. Mike worked very hard not only to reduce/remove external spurious traffic, but also to reduce (eliminate?) concurrent-process CPU costs on his stacked pile of machines, which I affectionately called the "Belshe-Borg!" When running over the Internet, unexpected variations in concurrent flows can wreak havoc on the consistency of results.

IMO, unless your goal is to debug performance problems in Azure (and potential Azure focused optimizations!), I'd steal a page from Belshe's playbook, and certainly avoid using cloud computing for this class of experimentation.  ...but... YMMV.

Thanks,

Jim

p.s., You might also want to monitor the RTT *during* your experiments.  If there is bufferbloat variation on one of the paths (due to larger buffers?), that can also have a large impact on the reported results.  Here again, I'd watch to see that UDP packets are not being treated differently from TCP packets (monitor RTT for both types of streams).  It is more than conceivable (for example) that an "attempt" at fair queuing in some router/switch misunderstands flows of UDP packets, resulting in more or less bufferbloat for TCP vs UDP.
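As a sketch of that p.s. suggestion, RTT could be sampled alongside an experiment with the system ping tool (assuming Linux iputils ping and its `time=... ms` output format; the target host and sampling cadence are placeholders):

```python
import re
import subprocess
import time

def parse_rtt_ms(ping_output):
    """Extract the RTT in ms from Linux iputils ping output, or None."""
    m = re.search(r"time=([\d.]+) ms", ping_output)
    return float(m.group(1)) if m else None

def sample_rtts(host, count=5, interval=1.0):
    """Ping host periodically during an experiment; returns RTTs in ms.
    Run once alongside the TCP transfer and once alongside the QUIC
    transfer to see whether the path treats the two differently."""
    rtts = []
    for _ in range(count):
        out = subprocess.run(["ping", "-c", "1", "-W", "1", host],
                             capture_output=True, text=True).stdout
        rtt = parse_rtt_ms(out)
        if rtt is not None:
            rtts.append(rtt)
        time.sleep(interval)
    return rtts
```

A rising RTT over the course of a transfer is the classic bufferbloat signature; a stable RTT with missing samples points at loss instead.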

Opinions expressed are my own, and not that of my company.

John Rusk

unread,
Oct 9, 2016, 5:31:56 PM
to proto...@chromium.org
Thanks Jim,

Yes, that is how I selected RTT.

Your suggestions make sense, and by and large I didn't do them ;-)  Except for observing that:
(a) performance of other protocols (e.g. UDT) remains relatively stable between given pairs of data centres, from which I infer that the potential issues you mention aren't affecting those protocols, and
(b) over long latencies those other protocols out-perform the QUIC toy client & server by order(s) of magnitude, from which I infer that whatever causes QUIC to be slow over longer latencies is more than just transient network congestion. (Or, if it is transient network congestion, the QUIC toy client and server handle it unusually badly.)

I am hoping that Ian or Jana will attempt to reproduce my result.  If they get completely different results, that will imply that I messed up my tests in some bigger, more fundamental way than simply lacking the isolation you suggest -- since I don't think the (very sensible) measures you describe can explain orders-of-magnitude differences in throughput.  On the other hand, Ian/Jana may reproduce my results, which would be useful information of a different kind...

John


张晗

unread,
Nov 12, 2016, 7:17:46 AM
to QUIC Prototype Protocol Discussion group, john...@gmail.com
Hi,
I tried out QUIC on two computers with the toy server and the Chrome browser, on Ubuntu 14.04.
I transferred a web page with 50 images (about 170 KB each).
Bandwidth 50 Mbps, RTT 220 ms, loss 2%: page load time 1.61 s
Bandwidth 10 Mbps, RTT 220 ms, loss 2%: page load time 1.65 s


On Wednesday, October 5, 2016 at 10:53:47 AM UTC+8, John Rusk wrote:

John Rusk

unread,
Nov 13, 2016, 4:48:39 PM
to QUIC Prototype Protocol Discussion group, john...@gmail.com
Hi 张晗,

Thanks for sharing your test results.  I notice that they are on links that are on the order of 10 or 100 times slower than those I tested on, i.e. yours have much lower bandwidths.  I wonder if that explains the difference in our results.  Or maybe it's because you transferred several files, each fairly small, whereas I transferred one giant file... Or, as I said above, maybe there was some undetected error in my test methodology.

In any case, thanks for sharing your results.

John

Diego de Araújo Martinez Camarinha

unread,
Nov 15, 2016, 5:46:54 PM
to proto...@chromium.org, john...@gmail.com
Hello all,

Continuing on the subject, I've been experimenting with QUIC, using the proto-quic source code.
It's important to say that I use Arch Linux, and building QUIC's client and server doesn't work well on that distribution. So I've been using two Docker containers (running Ubuntu 14.04) for the tests, one for the client and one for the server, both on localhost.

When I try to transfer a larger file (1 GB) between the containers, several minutes pass before it completes. The average transfer rate I'm getting is around 0.9 Mbit/s, which doesn't seem like much. I've also transferred a file of the same size between two containers using scp, and got an average transfer rate of 113 Mbit/s.

Jana and Ian have suggested that there may be a problem with CPU usage (which might get worse when using Docker). Do you think that with Chrome's optimizations I could achieve similar performance? Are there many differences between QUIC's toy server and what is used in production on Google's servers? Or maybe there's something wrong with my environment setup (maybe using Docker is not a good choice)?

Best regards,
--
Diego

Sent from ProtonMail, encrypted email based in Switzerland.



Ian Swett

unread,
Nov 15, 2016, 6:20:40 PM
to proto...@chromium.org, john...@gmail.com
The CPU issues we've seen are on transfers well above 100 Mbit/s, so unless something has regressed immensely in the toy server or client, it sounds like something is fairly broken if you're only getting 0.9 Mbit/s.

The toy server is definitely not as fast as Google's servers, but it shouldn't be that slow for a single flow.  It might be worth grabbing a profile of the client and server while they're transferring to see what's happening, though I can imagine all the time being consumed by something unrelated (e.g. not considering the transfer complete until the connection idle timeout fires).


phil Lin

unread,
Oct 9, 2019, 3:29:55 PM
to QUIC Prototype Protocol Discussion group, john...@gmail.com

Hi 张晗,
I want to ask: how did you get the RTT and loss figures?
Thanks.
On Saturday, November 12, 2016 at 1:17:46 PM UTC+1, 张晗 wrote: