HTTP today circumvents TCP Slow Start


Mike Belshe

Jan 21, 2010, 12:22:27
To spdy-dev
By using fewer TCP connections, SPDY is generally more efficient than HTTP.  However, for high-latency links (think satellite), TCP's slow-start can become the bottleneck.  Most troubling is that even though SPDY is doing all the right things to use fewer connections, send fewer bytes and send fewer packets, it can be slower than HTTP when downloading the same content.

Of course, this topic has come up many times over the last 20 years.  Slow start intentionally slows us down in order to keep all traffic flowing on the internet.  But what is new in this study is that Web Pages today have already worked around the Slow-Start bottleneck.  If the RFC says that slow start should start with an initial congestion window of 3-4 packets, but the world has already found ways to use initial congestion windows in excess of 50 packets, perhaps the specification is just wrong?
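The arithmetic behind that claim can be sketched with a toy model (my illustration, not from the deck: idealized slow start where cwnd doubles each round trip, a 1460-byte MSS, and no loss):

```python
import math

def rtts_to_send(size_bytes, initcwnd_segments, mss=1460):
    """Round trips needed to deliver size_bytes under idealized
    slow start: cwnd starts at initcwnd and doubles every RTT."""
    segments = math.ceil(size_bytes / mss)
    cwnd, sent, rtts = initcwnd_segments, 0, 0
    while sent < segments:
        sent += cwnd
        cwnd *= 2
        rtts += 1
    return rtts

# A 100 KB resource (~69 segments):
print(rtts_to_send(100_000, 4))   # 5 RTTs with the spec's initial window
print(rtts_to_send(100_000, 50))  # 2 RTTs with the window pages get today
```

On a high-latency link, each extra round trip is paid in full, which is why the gap between 4 and 50 segments matters so much for satellite users.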

Read the deck for more details.  Feedback is welcome.

Mike
An_Argument_For_Changing_TCP_Slow_Start.pdf

Mike Hearn

Jan 21, 2010, 13:11:04
To spdy...@googlegroups.com
Makes sense. My only thoughts:

- Many sites still do not use subdomain sharding: it's an advanced technique
which I suspect is only used by the hottest sites at the head of the web. It'd
be interesting to build a table of how many domains the most popular sites use.
So this might not be such a big deal in practice, except for people who sit
on Facebook/Maps/Youtube all day.

- Theoretically, Chrome could use raw sockets and its own TCP stack, right?
Of course that raises the bar to getting the full SPDY benefit considerably.

- Even in the SPDY world big sites will still use subdomain sharding anyway,
because often the subdomains aren't there purely to hack around browser
limits but because the content is being served from different locations.
E.g. most of the Facebook subdomains are Akamai, but www isn't.

Mike Belshe

Jan 21, 2010, 13:29:03
To spdy...@googlegroups.com
On Thu, Jan 21, 2010 at 10:11 AM, Mike Hearn <he...@google.com> wrote:
Makes sense. My only thoughts:

- Many sites still do not use subdomain sharding: it's an advanced technique
 which I suspect is only used by the hottest head sites of the web. It'd be
 interesting to build a table of how many domains the most popular sites use.
 So this might not be such a big deal in practice, except for people who sit
 on Facebook/Maps/Youtube all day.

It's certainly true that not every page jacks its cwnd up using that technique.  The connection limit of 6 does apply to nearly every web page (web pages average ~50 subresources, so 6 connections is the norm).

We're trying to run better models of how many subdomains-per-webpage are used.  Obviously, subdomains exist for many reasons, and ad networks are probably one of the biggest drivers.  Some data suggests the average # of subdomains per page is 6-8.

Also, the top-100 sites (maps sites, image sites, social networking sites, etc) do use sharding techniques, and these represent the bulk of web traffic (non video) today.
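Taken together, those averages imply an aggregate first-RTT window far beyond the spec's 3-4 segments. A back-of-the-envelope sketch (my numbers, using the figures quoted above and an initcwnd of 4 per connection):

```python
# Aggregate first-round-trip window for a sharded page.
INITCWND = 4          # segments per connection, per the RFC's 3-4 range
CONNS_PER_HOST = 6    # browser connection limit per hostname
SHARDED_HOSTS = 7     # midpoint of the ~6-8 subdomains-per-page average

effective_initcwnd = INITCWND * CONNS_PER_HOST * SHARDED_HOSTS
print(effective_initcwnd)  # 168 segments sendable in the first RTT
```

In other words, sharded HTTP already behaves as if the initial congestion window were two orders of magnitude larger than a single well-behaved connection's.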

 

- Theoretically, Chrome could use raw sockets and its own TCP stack, right?
 Of course that raises the bar to getting the full SPDY benefit considerably.

Not really - that requires admin access to deploy and would generally not be good for users.  Browsers could use UDP based protocols.  I don't really want to do that.  I'd rather fix TCP.

 

- Even in the SPDY world big sites will still use subdomain sharding anyway,
 because often the subdomains aren't there purely to hack around browser
 limits but because the content is being served from different locations.
 Eg most of the facebook subdomains are akamai but www isn't

Agree.  But I do think removing the need for sharding and enabling multiplexing will change this - sites like facebook will be encouraged to load content from a small set of domains for better speed rather than a large set of domains for better speed.

Mike
 

Mark Nottingham

Jan 22, 2010, 19:42:03
To spdy...@googlegroups.com
Very interesting stuff.

Have you thought about going to the IETF with this? I imagine OS vendors will be reluctant to change something like this without their blessing (although that's not always -- or maybe even usually -- the case ;).

Maybe a good starting place (as in, what you propose might be out of scope, but at least the right people will be in the room):
http://www.ietf.org/dyn/wg/charter/tcpm-charter.html

This puts HTTP-over-SCTP in an interesting light as well, of course. Or, as you mention, building something on top of UDP (although I agree this wouldn't be a great outcome).

Cheers,


--
Mark Nottingham http://www.mnot.net/

Mike Belshe

Jan 22, 2010, 19:46:25
To spdy...@googlegroups.com
On Fri, Jan 22, 2010 at 4:42 PM, Mark Nottingham <mn...@mnot.net> wrote:
Very interesting stuff.

I thought so too :-)

 

Have you thought about going to the IETF with this? I imagine OS vendors will be reluctant to change something like this without their blessing (although that's not always -- or maybe even usually -- the case ;).

Yes, Google is going to the IETF in March to discuss this issue.  The data here is a client-side analysis; other folks have server-side data as well.

 

Maybe a good starting place (as in, what you propose might be out of scope, but at least the right people will be in the room):
 http://www.ietf.org/dyn/wg/charter/tcpm-charter.html

This puts HTTP-over-SCTP in an interesting light as well, of course. Or, as you mention, building something on top of UDP (although I agree this wouldn't be a great outcome).

I actually don't think SCTP helps - it uses the same slow start that TCP does (across all streams, I believe!).  I know there are some SCTP experts on this list - maybe they will chime in?

Mike

Jon Leighton

Jan 23, 2010, 13:02:06
To spdy-dev
On Jan 22, 7:46 pm, Mike Belshe <mbel...@google.com> wrote:

> On Fri, Jan 22, 2010 at 4:42 PM, Mark Nottingham <m...@mnot.net> wrote:
> > Very interesting stuff.
>
> I thought so too :-)
>
>
>
> > Have you thought about going to the IETF with this? I imagine OS vendors
> > will be reluctant to change something like this without their blessing
> > (although that's not always -- or maybe even usually -- the case ;).
>
> Yes, Google is going to the IETF in March to discuss this issue.  This data
> here is a client side analysis.  Other folks have server-side data as well.
>
>
>
> > Maybe a good starting place (as in, what you propose might be out of scope,
> > but at least the right people will be in the room):
> >  http://www.ietf.org/dyn/wg/charter/tcpm-charter.html
>
> > This puts HTTP-over-SCTP in an interesting light as well, of course. Or, as
> > you mention, building something on top of UDP (although I agree this
> > wouldn't be a great outcome).
>
> I actually don't think SCTP helps - it uses the same slow start that TCP
> does (across all streams, I believe!).  I know there are some SCTP experts
> on this list - maybe they will chime in?

SCTP streams are logical streams over a single association (SCTP's
term for a connection). The cwnd evolves based on the cumulative
bytes sent out and acknowledged over the association, not based on an
individual stream (so yes - across all streams). SCTP's capacity to
get data out quickly during slow start is no different than TCP's -
but if you can justify increasing the initial cwnd for TCP the same
arguments will be equally valid for SCTP.
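Jon's point can be made concrete with a toy model (mine, not his: idealized, loss-free, initcwnd in segments): N streams multiplexed over one SCTP association share a single cwnd, while N separate TCP connections each get their own.

```python
def first_rtt_segments(n_streams, initcwnd=4, shared=True):
    """Segments sendable in the first RTT. With a shared cwnd
    (one SCTP association) all streams split one window; with
    separate TCP connections each stream has its own."""
    return initcwnd if shared else n_streams * initcwnd

print(first_rtt_segments(6, shared=True))   # 4  - one association, 6 streams
print(first_rtt_segments(6, shared=False))  # 24 - 6 independent connections
```

This is the same trade-off SPDY faces over a single TCP connection, which is why raising the initial cwnd matters for any multiplexed transport.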

One other note of interest - FreeBSD 6.1 sends extra ACKs in the form
of window updates, which artificially accelerate cwnd growth during
slow start. Perhaps some other OSes do similar things to gain an
edge?

- Jon

Mark Nottingham

Feb 22, 2010, 16:04:40
To spdy...@googlegroups.com
Picking up an old thread...

On 22/01/2010, at 5:29 AM, Mike Belshe wrote:

>> - Theoretically, Chrome could use raw sockets and its own TCP stack, right?
>> Of course that raises the bar to getting the full SPDY benefit considerably.
>>
> Not really - that requires admin access to deploy and would generally not be good for users. Browsers could use UDP based protocols. I don't really want to do that. I'd rather fix TCP.

What about inflating the window size by requesting a bigger socket buffer to start with? iperf seems to use this to good effect; see
http://iperf.svn.sourceforge.net/viewvc/iperf/trunk/src/tcp_window_size.c

AIUI you still may hit OS limits, but it's better than nothing. Not sure if this would work on Windows, however...

Mike Belshe

Feb 22, 2010, 16:42:59
To spdy...@googlegroups.com
On Mon, Feb 22, 2010 at 1:04 PM, Mark Nottingham <mn...@mnot.net> wrote:
Picking up an old thread...

On 22/01/2010, at 5:29 AM, Mike Belshe wrote:

>>  - Theoretically, Chrome could use raw sockets and its own TCP stack, right?
>>  Of course that raises the bar to getting the full SPDY benefit considerably.
>>
> Not really - that requires admin access to deploy and would generally not be good for users.  Browsers could use UDP based protocols.  I don't really want to do that.  I'd rather fix TCP.

What about inflating the window size by requesting a bigger socket buffer to start with? iperf seems to use this to good effect; see
 http://iperf.svn.sourceforge.net/viewvc/iperf/trunk/src/tcp_window_size.c

Tweaking window size has its own set of issues, but in this case, I'm referring to the congestion window, not the receive window.

For more info - read up on slow start here:  http://www.ietf.org/rfc/rfc2581.txt

Basically, it doesn't matter how big your window is; TCP senders are initially bound by slow start.
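The distinction can be shown in one line (a sketch of the RFC 2581 rule; byte counts are illustrative): the sender may have at most min(cwnd, rwnd) unacknowledged bytes in flight, so a huge socket buffer does nothing while cwnd is still small.

```python
def sendable(cwnd, rwnd):
    # RFC 2581: the sender is limited by min(cwnd, rwnd).
    return min(cwnd, rwnd)

# First RTT: cwnd is ~4 * MSS regardless of the receive buffer.
print(sendable(cwnd=4 * 1460, rwnd=64 * 1024))    # 5840 bytes
print(sendable(cwnd=4 * 1460, rwnd=1024 * 1024))  # 5840 - bigger buffer, no gain
```

The big buffer only pays off later, once slow start has grown cwnd past the receive window.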

Mike

Patrick Meenan

Feb 23, 2010, 19:03:17
To spdy-dev
I totally agree that the current state of TCP congestion avoidance
(and slow start in particular) is very much a bottleneck for HTTP,
with its short-lived connections and access pattern. It bites mostly
on the sending side, so I expect it has more impact on a SPDY server
than on Chrome itself.

What I'd really like to see would be for OS vendors to make the
initial window a tunable parameter (and, for bonus points, make it
settable on a per-connection basis with setsockopt()). That way we
could do fancy things like tune it per application: say 90% of my
responses are 10k or under, I could tune it for 10k and serve the
bulk of my responses in a single round trip. You could even run a
feedback loop that automatically decreases it if you start seeing
the retransmit level rise across your server farm, and increases it
until you start to see problems. I've been wanting to write a patch
for the Linux kernel to make it tunable but I haven't had the time
to look at it yet.
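Patrick's 10k example reduces to a ceiling division (a sketch assuming a 1460-byte MSS; per-connection setsockopt() control of initcwnd is his proposal here, not an existing API):

```python
import math

def initcwnd_for_one_rtt(response_bytes, mss=1460):
    """Smallest initial cwnd (in segments) that delivers the
    whole response in a single round trip."""
    return math.ceil(response_bytes / mss)

print(initcwnd_for_one_rtt(10_000))  # 7 segments cover a 10 KB response
```

So a server whose responses are mostly under 10k would want an initcwnd of about 7, well above the RFC's 3-4 but far below what sharded pages already achieve in aggregate.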

That way the algorithms don't have to have a single setting that works
well for all installs and applications.

-Pat

Patrick Meenan

Feb 23, 2010, 19:08:40
To spdy-dev
BTW, it's probably worth mentioning that a good number of load-balancing
devices already use a much larger initial cwnd (they usually advertise
it as "TCP acceleration"), so even in the straight-HTTP world the
initial cwnds are a LOT higher than the 108 in the deck for some
sites :-)

Simon Watts

Aug 2, 2013, 09:41:17
To spdy...@googlegroups.com
Just FYI: most modern satellite links use PEPs (performance-enhancing proxies), which terminate the TCP connection at each end, generate local ACKs, and run flow control over the satellite hop using an internal SACK-based protocol.  This means slow start is not usually an issue.

Cheers
Simon