-Patrick
2012/05/18 6:19 "Patrick McManus" <mcm...@ducksong.com>:
Do you mean use only per-connection flow control?
Then +1.
It achieves control over the total amount of buffering, which is the very issue flow control was introduced to address.
TCP's flow-control alone would allow HOL blocking to occur in cases where the proxy->server connections were full for only some streams, because the userspace process of the server doesn't control advertisements of its ingress window.

Essentially, the proxy needs a way to advertise the amount of room it has in its outgoing buffers, which it can't easily do with TCP's flow-control.

-=R
"Essentially, the proxy needs a way to advertise the amount of room it has in its outgoing buffers, which it can't easily do with TCP's flow-control."Yes it can. When a proxy server's buffer grows to a certain threshold, it just stops posting buffers to the receive socket connected to the web server. When this happens the TCP stack buffer fills up (since no packets are being pulled off by the user mode process), and TCP starts sending smaller receive windows in its ACKs to the web server, which slows down its send rate. This seems fine to me.I'm still not understanding what problem we are trying to solve by adding another layer of flow control to the TCP session.
Hi,
Was not me, was Peter :)
On Fri, May 18, 2012 at 6:33 PM, Ryan Hamilton <r...@google.com> wrote:
> Perhaps I was confused. I was responding to your question:
>
>> I'm still not understanding what problem we are trying to solve by adding
>> another layer of flow control to the TCP session.
-Patrick
On Fri, 2012-05-18 at 18:44 +0200, Simone Bordet wrote:
> I do not clearly see the case for a SPDY session level flow control,
> in addition to SPDY stream level flow control.

My understanding is that giving every stream a window equal to session-buffers-available represents a scary overcommitment for the server if every stream decided to utilize it at the same time. It also creates an unwelcome incentive to minimize the number of parallel streams.
Likewise, dividing the session buffers available into small shares for
each stream often results in some streams easily running out of window
space while other streams waste their allocations.
You'll see that google.com has been using small 12KB initial windows -
which are too small to allow full rate uploads for many common BDPs. The
flow control proposal arises out of the need to fix that bottleneck
while still informing the client of the server's available bufferspace
(which is presumably > 12KB).
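(Back-of-the-envelope, with illustrative numbers of my own: a fixed window caps upload throughput at roughly window / RTT, so a 12KB window over a 100 ms round trip tops out around 120 KB/s - about 1 Mbps - no matter how fat the pipe is.)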
On Fri, 2012-05-18 at 10:39 -0700, William Chan (陈智昌) wrote:
>
> Does this work? How do you know they've received the xoff
> notification?

You can't verify it, other than at some extreme point as the total session window provides the real backstop. But when do you really care? Do you really have a partitioned set of resources available to some streams but not others (in which case I agree, we need per stream windows), or do you just want to push back and stop the sending asap (in which case xoff ought to be sufficient)?
With all the problems it carries (e.g. bufferbloat), and additional ones (how to share bandwidth among streams)?
I realize that this email thread is primarily about adding another layer of flow control to SPDY that operates at the SPDY session level, but I'm still trying to understand if even the per stream flow control is necessary.

So it seems like the main problem per stream flow control is meant to solve is head of line blocking. To be specific, the problem is when you have a low and a high priority stream active, the low priority stream is transferring a lot of data from a fast content server, and the high priority stream is only sending intermittently. In this case, there can be a full 64KB of low priority data in flight across the TCP connection when the high priority data needs to be sent over the TCP connection. If the first packet of the 64KB gets dropped, then the high priority data has to wait at least until the low priority block gets re-transmitted.
Per stream flow control mitigates (but does not solve) this by preventing the low priority data from ever using the full TCP connection, so that the number of low priority bytes in front of the high priority data is smaller. The drawback is that the low priority stream never gets to use the full connection, even when there is no high priority data to send. Is this description correct? And is this the primary use case for per stream flow control?

Has it been considered to use multiple SPDY sessions, one TCP connection per priority level at any given time, so that two streams with different priorities are never competing for bandwidth over the same TCP connection?
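(As a rough illustration of the cost - numbers are mine, not from the thread: if a full 64KB of low priority data is queued ahead of a high priority frame on a 10 Mbps link, just serializing those bytes takes about 50 ms, and a drop at the front of that block adds at least another RTT for the retransmission before the high priority data can be delivered in order.)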
Not asking anyone in particular. Just trying to think through the problem. My apologies if I'm taking the discussion backwards a bit.
Hi,

To add more clarity, here is an example. A large file is being uploaded. Let's say the server is not able to store the file fast enough. Then with a standard HTTP transmission the receive buffer fills up and the upload waits for more buffer space. With SPDY this means the server stops reading the buffer, and then all requests will be blocked. This leaves few options:
1. Handle infinite buffering (not really doable)
2. Have some way to block further requests on the stream by notifying all reads blocked/unblocked (on/off switch which could trigger a new connection)
3. Have some way to block buffering for a stream (moves buffering/blocking from TCP to the protocol handling level)
Having 2 levels of flow control lets you separate the concerns: the
session value is about total buffers available, the per-stream value is
about letting different streams proceed at different rates (and that's
why I think it can be done with xon/xoff in the presence of a session
window).
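(A minimal sketch of how those two levels might interact on the sender side, assuming a session window plus a per-stream xon/xoff bit - the frame names and fields here are my own, not from any SPDY draft:)

    # Hypothetical sender-side sketch: session window caps total buffering,
    # per-stream xon/xoff lets individual streams be paused independently.
    class Session:
        def __init__(self, session_window: int):
            self.session_window = session_window   # bytes we may still send overall
            self.xoff = set()                      # stream ids currently paused

        def writable(self, stream_id: int) -> bool:
            return self.session_window > 0 and stream_id not in self.xoff

        def send(self, stream_id: int, data: bytes) -> int:
            if not self.writable(stream_id):
                return 0
            n = min(len(data), self.session_window)
            self.session_window -= n
            # ... emit a DATA frame carrying data[:n] ...
            return n

        # Receiver signals: XOFF/XON for a stream, WINDOW_UPDATE for the session.
        def on_xoff(self, stream_id): self.xoff.add(stream_id)
        def on_xon(self, stream_id): self.xoff.discard(stream_id)
        def on_session_window_update(self, delta): self.session_window += delta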
Detail about the buffer is what flow control exposes, though, right?

The client always chooses what to send and what not to send anyway. A receiver never has control over the sender; they can merely give them policy and hope the sender acts upon it.
We'll always need per-stream flow control. No avoiding that. Anything which accurately signals the number of bytes per stream which it is safe to send, and which provides a constant and predictable overall memory usage on the receiver for all streams will do.
I'm having a hard time reconciling your suggestion with this. If we do the per-stream flow control, and can't do 2 levels of flow control, what data would we send which wouldn't BE flow control that would allow the server to be assured (with a well behaved client) that it uses no more buffer than it wishes to use?
If I'm too confusing, my proposal is:
- remove the current per stream window
- server will declare its total buffer (with a reasonable default - 64k or 1M), and use a per-connection window.
- server will send flow control packets with a list of stream IDs and how many bytes are buffered for each stream. There are a few options on which streams to include in the flow control packet, and when to send this packet.
- client will not send more than [total buffer] until it receives a flow control packet with a stream-based window update. When it receives a flow control packet it can choose which stream to send based on free space, priorities, etc - as long as total in-flight bytes fit the total buffer.
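(If I read the proposal right, the flow control packet would carry something like the sketch below. This is only my interpretation, with made-up names:)

    # Hypothetical encoding of the proposed per-connection flow control packet:
    # the receiver advertises its total buffer plus how much of it each stream
    # currently occupies, and the sender derives what it may still send.
    from dataclasses import dataclass, field
    from typing import Dict

    @dataclass
    class BufferStatus:
        total_buffer: int                       # e.g. 64 * 1024 or 1 * 1024 * 1024
        buffered: Dict[int, int] = field(default_factory=dict)  # stream_id -> bytes buffered

        def free_space(self) -> int:
            return self.total_buffer - sum(self.buffered.values())

    # Sender side: pick a stream by priority and send at most free_space() bytes,
    # counting its own unacknowledged in-flight bytes against that budget.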
This doesn't make sense to me. You say "remove the current per stream window" and then you say "it's still a combination of connection and stream flow control".
On Mon, May 21, 2012 at 2:06 PM, Costin Manolache <cos...@gmail.com> wrote:
> If I'm too confusing, my proposal is:
> - remove the current per stream window
> - server will declare its total buffer (with a reasonable default - 64k or 1M), and use a per-connection window.
> - server will send flow control packets with a list of stream IDs and how many bytes are buffered for each stream. There are a few options on which streams to include in the flow control packet, and when to send this packet.

This sounds like per stream window updates. Why do you say "remove the current per stream window"?
> - client will not send more than [total buffer] until it receives a flow control packet with a stream-based window update. When it receives a flow control packet it can choose which stream to send based on free space, priorities, etc - as long as total in-flight bytes fit the total buffer.

With a per-stream window and per-session window, a sender could still do this. When space opens up in the per-session window, the sender is obviously allowed to choose which stream to send data over, assuming that per-stream window has space.
On Mon, May 21, 2012 at 2:09 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
> This doesn't make sense to me. You say "remove the current per stream window" and then you say "it's still a combination of connection and stream flow control".

It replaces the current definition of per-stream window - right now each stream window/flow is sent individually, all streams have the same initial window, and the information sent back is how much it's allowed to send for the stream. It still has per-stream flow control - in the sense that you send back how much is buffered for each stream (all, or only streams that have >x bytes or % buffered).

> This sounds like per stream window updates. Why do you say "remove the current per stream window"?

The window updates will be per session. The 'per stream window' is replaced by info about how much is currently buffered for the stream, combined with info about total session buffer available.

> With a per-stream window and per-session window, a sender could still do this. When space opens up in the per-session window, the sender is obviously allowed to choose which stream to send data over, assuming that per-stream window has space.

Yes, but I guess it's a more direct decision when you know the total buffer and how much each stream has buffered. And when you start new streams - they won't be limited by a small per-stream window, but by the per-session window.
On Mon, May 21, 2012 at 2:41 PM, Costin Manolache <cos...@gmail.com> wrote:
> The window updates will be per session. The 'per stream window' is replaced by info about how much is currently buffered for the stream, combined with info about total session buffer available.
>
> Yes, but I guess it's a more direct decision when you know the total buffer and how much each stream has buffered. And when you start new streams - they won't be limited by a small per-stream window, but by the per-session window.

Don't you know how much each stream has buffered via per-stream flow control windows? And the existence of a per-session window obviates the need for small initial per-stream windows. You can make them larger now. 64k is a reasonable default still (if you disagree, let's fix it then), and the SETTINGS frame lets you adjust it to a more appropriate value within one RTT for the server rwin, and zero RTTs for the client rwin.
I guess I don't understand how communicating a percentage of stream buffer size is better than a per-stream flow control window size. The latter approach lets you dynamically adjust the window size on a per-stream basis in a more elegant manner IMO.
I think they both achieve the same goal. The server can lie about the amount of buffer it has available to accomplish any of the goals that it would have done. The question thus resolves to: which is easier to implement.
I'd guess that the current window-based way is easier and cheaper since the window-updates are delta-encodings for the state changes.
On Mon, May 21, 2012 at 2:58 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
> Don't you know how much each stream has buffered via per-stream flow control windows? And the existence of a per-session window obviates the need for small initial per-stream windows. You can make them larger now. 64k is a reasonable default still (if you disagree, let's fix it then), and the SETTINGS frame lets you adjust it to a more appropriate value within one RTT for the server rwin, and zero RTTs for the client rwin.

Would 64K be the default for both session and stream windows?
I guess my proposal is almost equivalent - if you know the stream window and client window you can calculate how much is buffered. After you fill the session window you need to wait for session and stream window updates in both cases.

> I guess I don't understand how communicating a percentage of stream buffer size is better than a per-stream flow control window size. The latter approach lets you dynamically adjust the window size on a per-stream basis in a more elegant manner IMO.

I think it's more direct to tell the client the relevant information - how big the proxy buffer is and how much of each stream is buffered. The 'percentage' was an optimization - you don't need to send flow control for streams that go through / have very low buffers, or if the proxy buffer has plenty of space. The per-stream flow will only kick in when it's needed. You can still do delta-encoding for stream and session.
What would be the meaning of 'per stream window'? It's no longer the amount you are allowed to send for that stream - you need to consider the session window and all other streams' windows. It's this computation that I think would be cleaner if you work in reverse, with how much is buffered instead of the 'window', which is no longer a direct indication of how much to send.
But if you can find a good explanation of the stream window and how it interacts with the session window it would be less confusing.
If we wanted to allow dynamically resizing the session window size apart from per stream windows, then we'd have to introduce a separate SESSION_WINDOW_UPDATE or something. Or add a new field to the WINDOW_UPDATE frame for the amount to adjust the session window.
On Mon, May 21, 2012 at 4:17 PM, Costin Manolache <cos...@gmail.com> wrote:
> Would 64K be the default for both session and stream windows?

I suspect we can make the session window larger than 64K. I'm open to suggestions here.
> What would be the meaning of 'per stream window'? It's no longer the amount you are allowed to send for that stream - you need to consider the session window and all other streams' windows. It's this computation that I think would be cleaner if you work in reverse, with how much is buffered instead of the 'window', which is no longer a direct indication of how much to send.
>
> But if you can find a good explanation of the stream window and how it interacts with the session window it would be less confusing.

I guess I'm not sure what's confusing about the interaction with the stream window and the session window. When writing stream data in a naive implementation, the amount you'd write is amount_to_write = min(stream_data_length, stream_window_size, session_window_size). Then stream_window_size -= amount_to_write and session_window_size -= amount_to_write. A more advanced implementation may examine the remaining session window size to see how much space is left and the stream priority, and based on info like that, may opt to write less (or zero) stream data if the session window is small and/or the stream priority is low.
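(As a sketch only - my own Python rendering of that naive rule, not anyone's implementation:)

    # Naive per-write logic, as described above: the write is bounded by the
    # data available, the stream window, and the session window, and both
    # windows are debited by the amount actually written.
    def write_stream_data(data: bytes, stream_window: int, session_window: int):
        amount_to_write = min(len(data), stream_window, session_window)
        stream_window -= amount_to_write
        session_window -= amount_to_write
        frame = data[:amount_to_write]    # becomes the DATA frame payload
        return frame, stream_window, session_window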
> I guess I'm not sure what's confusing about the interaction with the stream window and the session window. When writing stream data in a naive implementation, the amount you'd write is amount_to_write = min(stream_data_length, stream_window_size, session_window_size). Then stream_window_size -= amount_to_write and session_window_size -= amount_to_write. A more advanced implementation may examine the remaining session window size to see how much space is left and the stream priority, and based on info like that, may opt to write less (or zero) stream data if the session window is small and/or the stream priority is low.

A non-naive implementation would look at all outgoing streams, sorted by priority, and attempt to divide the session_window_size somehow - maybe give priority to SYN_STREAM packets, etc. And it may consider which streams are 'stale' == no change in window size, i.e. the server is stuck and the proxy needs to cache it.

> If we wanted to allow dynamically resizing the session window size apart from per stream windows, then we'd have to introduce a separate SESSION_WINDOW_UPDATE or something. Or add a new field to the WINDOW_UPDATE frame for the amount to adjust the session window.

The use case would be a proxy server that may cache more if the load is low. But the main issue (IMHO) is how to choose the amount to send for each stream, based on stream window, priority and remaining buffer space. The proxy buffer can be considered fixed size when the decision is made; the 'stream window' is a proxy for how much of that stream is buffered - and that indicates how likely that stream is to clog the pipe.
I don't think the 'naive' implementation will work so well if you have many streams going slowly (a few uploads plus some new requests), unless you somehow reduce the window for each of the slow streams - but that won't work for the initial window size. So you need to reduce the initial window.
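(A rough sketch of what such a non-naive, priority-aware sender might do, dividing the remaining session window across streams - entirely my own illustration, including the weighting scheme:)

    # Hypothetical priority-weighted split of the remaining session window.
    # Higher weight = higher priority; stale streams get skipped entirely.
    def allocate(session_window: int, streams: list) -> dict:
        # streams: list of (stream_id, pending_bytes, weight, stale)
        active = [s for s in streams if s[1] > 0 and not s[3]]
        total_weight = sum(s[2] for s in active) or 1
        allocation = {}
        for stream_id, pending, weight, _ in sorted(active, key=lambda s: -s[2]):
            share = min(pending, session_window * weight // total_weight)
            allocation[stream_id] = share
        return allocation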
2 - The PDF above makes specific remarks and improvement recommendations based on internal design decisions and source code in OpenSSH. They are not strictly enforced by the SSH protocol and other implementations will have different algorithms to adjust the window size.
On Tue, 22 May 2012, Greg Wilkins wrote:

Yes, SSH does indeed have more or less the exact same windowing that SPDY has, which in turn seems to mimic what TCP itself offers.
Some more reading of interest is http://www.ietf.org/rfc/rfc4254.txt. This is the SSH connection protocol that supports multiplexed channels using individual window sizes - more or less what SPDY/3 has. It also has an optimisation to support xon/xoff of individual channels.
There is also a paper about how these window sizes can cause significant
performance problems and how dynamically resizing the windows can improve
throughput
http://www.psc.edu/networking/projects/hpn-ssh/papers/hpnssh-postertext.pdf
cheers
Costin
cheers
Nice writeup! I think that is a great way to rate and consider solutions!
I think that bufferbloat (which should be a real worry) might want an explicit mention. It sorta fits into your categorizations/goals but has implications for latency jitter and measurement (and fairness, but you have a category/goal for that).
A proposed flow-control window update solution:
Whenever a stream has more data that it wishes to send right now, the TCP connection is writable, and the stream is restricted from sending by flow control, send an update for the stream indicating this to the other side.
This update could be a size update of size 0, or it could be another frame.
On the receiver side, receipt of this would indicate the sender is throttled by SPDY's flow control.
Figuring out the appropriate window for that stream would be implementation dependent, but, as a guideline, it should probably be the maximum window size of any stream you've recently increased, or 50% more, whichever is greater. Of course, window updates should be done only so long as the receiver has space in its buffers.
This is meant to be a simple guideline that will hopefully work fairly well.
If good estimates for rtt and bw are available, use that to bound the max possible send size. If the size of the tcp window is known, use that as a max upper bound.
-=R
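(Here is how I read that proposal, sketched in Python; the STREAM_BLOCKED frame name and the sizing inputs are my own placeholders, not part of any spec:)

    # Sender side: if SPDY flow control (not TCP) is what is stopping us, say so.
    def maybe_signal_blocked(stream, tcp_writable: bool, send_frame):
        if stream.pending_bytes > 0 and tcp_writable and stream.window == 0:
            # Could equally be a WINDOW_UPDATE of size 0 on this stream.
            send_frame("STREAM_BLOCKED", stream_id=stream.id,
                       wanted=stream.pending_bytes)

    # Receiver side: grow the window for a stream that reports being blocked,
    # bounded by available buffer space and an estimate of the path's BDP.
    def on_stream_blocked(stream, wanted, buffer_space, bdp_estimate, send_frame):
        grant = min(wanted, buffer_space, bdp_estimate)
        if grant > 0:
            stream.window += grant
            send_frame("WINDOW_UPDATE", stream_id=stream.id, delta=grant)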
Also, are you saying you want the client to retransmit to the intermediary? SPDY is over a "reliable" connection, so any dropped packets should already be retransmitted. Can you clarify further?
Costin
cheers
> Also, are you saying you want the client to retransmit to the intermediary? SPDY is over a "reliable" connection, so any dropped packets should already be retransmitted. Can you clarify further?

Yes, TCP is "reliable" and has its own flow control, including the ability to drop packets when it needs to. SPDY frames are over TCP - so the frames can't be dropped; once you receive a frame you must keep it in (bounded - 1.4) memory until the actual endpoint consumes it. We are copying 1/2 of TCP - the flow control window - but are missing the other 1/2, dropping packets and re-transmission. IP packets are dropped because they get "lost", but also if router buffers are full or they time out.
Costin
cheers
I do not believe we should allow dropping frames and retransmitting them individually.
HTTP intermediaries do not allow for dropping payloads once the TCP layer has already acknowledged receipt, so I do not see why we need support for this within a SPDY stream.
Costin
cheers
The receiver can send its buffer size - but you can still get into a state where a lot of streams are slow and use the max buffer size they are allowed to use. If the total per-connection memory is filled with slow streams, at some point the good streams no longer have space.
> I do not believe we should allow dropping frames and retransmitting them individually.

Not individual frames - but a proxy for example should be able to send a RST_STREAM or similar message indicating that a stream is too slow and is using too much of the buffer space. Ideally this would not be a simple RST_STREAM followed by another full upload attempt of the entire stream - you could improve on this by sending an indication of how much has been consumed, and the client could re-start.
> HTTP intermediaries do not allow for dropping payloads once the TCP layer has already acknowledged receipt, so I do not see why we need support for this within a SPDY stream.

HTTP intermediaries don't multiplex - each TCP stream can be as slow as it wants without affecting other streams.
I think documenting that RST_STREAM can be used for flow control would make us more similar to HTTP, where stuck TCP connections may be aborted if the proxy needs the memory.
On 29 May 2012 22:44, Costin Manolache <cos...@gmail.com> wrote:
> I think documenting that RST_STREAM can be used for flow control would make us more similar to HTTP, where stuck TCP connections may be aborted if the proxy needs the memory.

We just have to be very careful about idempotent streams being reset, else we end up in the current HTTP pipeline problems. But if the problem of idempotency can be solved, then RST is probably the only way we can reduce buffers allocated to a stuck stream. Although it will still rely on HTTP retry semantics and user interactions.
cheers
Simone and I have been debating offline various flow control algorithms and have not reached any significant conclusions. However I have come up with some formalisation of how we can rate various flow control proposals.
IMNSHO the primary factors that should rate flow control algorithms are:
1.1 Uses all available bandwidth
The whole point of SPDY is to make the web faster, so if there is a willing stream producer, a willing stream consumer and available capacity in the pipe between SPDY implementations, then we should allow that capacity to be used up to and possibly even slightly beyond TCP/IP flow control.
1.2 No infinite postponement of a stream
The behaviour of one stream should not be able to infinitely postpone the delivery of data on another stream. We can control the behaviour of compliant SPDY senders and receivers, but we cannot control the behaviour of stream consumers and producers (the application). Thus we have to consider that stream producers and consumers may act maliciously (either by design, by error or by force majeure (eg DNS resolution suddenly blocks)). This does not mean that we have to avoid congesting the TCP pipe, just so long as we know the other end (the SPDY receiver) will eventually uncongest the pipe even if a stream consumer is not reading data - ie that the SPDY receiver has sufficient buffers.
1.3 Little incentive to use multiple connections
One of the problems with HTTP that has driven the development of SPDY is the incentive for clients to use multiple connections, and we want to remove that incentive. Any per connection limit (such as the TCP/IP slow start window) is an incentive to have multiple connections. For example if we allocate a default initial per connection window, then a client that opens 2 connections will get twice that limit and better initial performance. Worse still, because of TCP/IP slow start, once you open 2 connections, you'll probably open 6 or 8 to get multiple slow start window allocations as well.
1.4 No infinite buffering
When an endpoint accepts a connection and/or a stream, it must be able to know the maximum memory that it has to commit to provide in order to satisfy the specification. If accepting a connection/stream can result in unlimited memory commitments then we are open to denial of service attacks and unpredictable application performance.
The secondary factors that we should consider are many, but I think they should include:
2.1 Complexity
Avoiding complexity is a motherhood statement. Yes we want to be as simple as possible, but not to the point where we significantly compromise the primary factors. At the end of the day, there are likely to be perhaps a few hundred or a few thousand SPDY implementations that will provide infrastructure for millions of developers - better we suck up some complexity rather than fail a primary objective.
2.2 Fairness
Another motherhood statement. Yes we want to be fair between streams, but fair is really subjective and difficult to quantify. Is taking frames from streams in a round robin manner fair? some would say yes, but others would say that a stream that has not participated in any recent rounds should have more of a share than one that has sent in every round. I think the primary concern in the protocol spec is to avoid the possibility of infinite postponement and then we can mostly leave fairness measures to be implementation details/features.
2.3 Priorities
See fairness. If we can't work out what fair is, then it is even harder to work out what priorities mean. I think priorities are a nice to have feature, but should not compromise the primary objectives.
The other thing I've concluded is that there is no perfect solution. Even if the flow controlling algorithm was to ask an all knowing connection GOD if a frame could be sent or not, if that GOD gives free will to the stream consumers then GOD is fallible. GOD might see a single stream that is flowing well and allow it to keep sending frames right up to the capacity of a fat pipe, but then just as another stream is opened the consumer of the first stream may stop reading and all the data in the fat pipe will have to be buffered so that the new stream can send some frames. But that buffering may consume almost all the memory reservation of the receiver, so that now GOD cannot allow the new stream to go at a full rate because the receivers buffers are almost full and he cannot risk the new stream suddenly stopping consuming like the first did.
Once we accept that an all knowing algorithm can still be fallible in the face of streams with free will, then we just have to accept that we are looking for the best approximation of a perfect solution.
So the current 64KB window per stream actually rates pretty well on most of these concerns, except for one:
1.1 A 64KB window on a fast pipe will limit bandwidth utilisation if there is any significant latency;
1.2 A stream cannot be blocked by another stream not being consumed;
1.3 Resources are allocated per stream, so there is little incentive to use multiple connections, and with TCP slow start there is an incentive to use an already warmed up SPDY connection over a new one;
1.4 Implementations know the commitment is 64KB to accept a new stream;
2.1 It's moderately simple;
2.2 So long as the sender does not create long frame queues, it can be fair in a round robin sense;
2.3 Priorities could be implemented to affect round robin fairness.
So the only thing we are really missing is the ability to use the full capacity of a TCP connection. The 64KB window has already been demonstrated to slow throughput.
The proposals to introduce a per connection window or burst allowance do look attractive, but my concern is that they sacrifice 1.4 to meet 1.1. Ie any per connection limit will create an incentive to open multiple connections in order to obtain the benefits of multiple per connection allocations of resources.
Instead, I think that we should look at a system that allows a connection to grow the per stream window if the pipe supports the extra bandwidth. Moreover, new streams created on that connection should be allocated the same grown window (perhaps adjusted down a little for the number of streams), so that they do not need to slow start and there is an incentive to open a new stream on an existing connection rather than create a new one. Growing the initial window size does not violate 1.4 as the size is known when a stream is accepted (and perhaps can be adjusted down if resources are short).
So how can we detect if a stream window can be grown? Sending the entire window before receiving a window update is not sufficient, as that can equally indicate a slow consumer or a fast pipe. Perhaps sending the entire window without seeing any TCP/IP flow control is one way? Ie we can grow our stream windows until we reach either a limit or we see TCP/IP flow control. Assuming we can come up with a way to decide when to grow, then I think this style of flow control rates OK:
1.1 Windows can grow to TCP capacity;
1.2 Streams cannot block other streams;
1.3 Incentive to use a warmed up connection;
1.4 Memory requirements known when accepting a stream;
2.1 Moderate complexity;
2.2 Writer can implement fairness in frame selection;
2.3 Priorities can be used to influence fair frame selection.
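(Purely as a sketch of that detection idea - my own Python, under the assumption that the sender can observe whether the TCP socket ever refused a write while the stream window was exhausted:)

    # Hypothetical window-growth heuristic: if we drained the whole stream
    # window without the TCP socket ever pushing back, the pipe (not the
    # window) had spare capacity, so grow the window up to a hard limit.
    def maybe_grow_window(initial_window: int, window_exhausted: bool,
                          saw_tcp_backpressure: bool,
                          limit: int = 1024 * 1024) -> int:
        if window_exhausted and not saw_tcp_backpressure:
            return min(initial_window * 2, limit)
        return initial_window

    # New streams on the same connection would start with the grown value,
    # giving an incentive to reuse the warmed-up connection.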
Anyway... let me not get ahead of myself proposing solutions... what do people think for the criteria for rating flow control algorithms?
cheers
Hi,

On Wed, May 30, 2012 at 6:12 AM, Costin Manolache <cos...@gmail.com> wrote:
> One relatively easy solution - used on older versions of android - is to
> have the client hold on to the data it sent, and the proxy or server to
> indicate how much was consumed, with negative numbers indicating that the client
> needs to resend some data.
>
> For example you want to upload a 1M file, you start sending frames up to the
> window size (say 128k), but you don't delete/GC the frames you sent until
> you get the next window update from the server. The window update will have an
> extra field indicating how much was consumed.

You don't want to send window updates until you're sure data has been consumed by the application. Otherwise you're duplicating what the TCP ACK is saying, and it is of no interest to the sender.
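(For clarity, here is how I understand the hold-until-consumed scheme, sketched in Python with invented names; nothing here is from the spec:)

    # Hypothetical sender that keeps sent frames until the peer reports how
    # much was consumed; a negative "consumed" value asks for a resend.
    from collections import deque

    class HoldingSender:
        def __init__(self, window: int = 128 * 1024):
            self.window = window          # bytes we may have outstanding
            self.unacked = deque()        # frames sent but not yet consumed

        def send(self, frame: bytes, transmit) -> bool:
            if len(frame) > self.window:
                return False              # wait for a window update
            transmit(frame)
            self.unacked.append(frame)
            self.window -= len(frame)
            return True

        def on_window_update(self, consumed: int, delta: int, transmit):
            if consumed < 0:
                # Peer dropped some data: simplest recovery is to retransmit
                # everything still held (a real implementation would rewind
                # only the indicated amount).
                for frame in list(self.unacked):
                    transmit(frame)
            else:
                while consumed > 0 and self.unacked:
                    consumed -= len(self.unacked.popleft())
            self.window += delta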
On Wed, May 30, 2012 at 3:47 AM, Simone Bordet <sbo...@intalio.com> wrote:
> You don't want to send window updates until you're sure data has been
> consumed by the application.

Why? Window updates indicate there is space for more bytes in the buffers, not that the bytes have been consumed.

> Otherwise you're duplicating what the TCP ACK is saying, and it is of
> no interest to the sender.

The problem is duplicating only part of TCP flow control. ACK is a part of flow control, just like the window update. TCP relies on packet drops and ACKs to determine what to send and how fast. Even in HTTP the sender has to be able to deal with drops and re-transmits. The status code is a form of ACK, and plenty of problems have been caused by not dealing properly with drops and retries in HTTP.

Bufferbloat is mentioned quite a bit - maybe we should look at ECN (congestion notification), which is the alternative to ACKs and dropping packets. SPDY duplicates stuff from lower layers - multiplexing, a part of flow control. It's likely to duplicate some of the problems and make others worse by not duplicating enough :-)
The receiver, upon receipt of such a frame, could increase the various window sizes as indicated by the frames which tell the receiver the number (and possible duration) of the blocked bytes (hopefully up to a maximum as estimated by the BDP).
This scheme doesn't require much additional complexity, and it meets all of the ratings targets proposed by Greg earlier. This scheme rapidly converges on the appropriate window size without too much overshoot.
> The receiver, upon receipt of such a frame, could increase the various window sizes as indicated by the frames which tell the receiver the number (and possible duration) of the blocked bytes (hopefully up to a maximum as estimated by the BDP).

I think I need more clarification on the server-side motivations here. Are you saying you want the client to provide information as to which streams need more buffers? Just to be clear, in this proposal, are we trying to address a deficiency compared to HTTP over TCP connections, or are we trying to provide better facilities to do better buffer management than is possible with HTTP over TCP?
FWIW, I lean towards Mike's POV more, although I do concede a need for flow control (and per-session windows in addition to per-stream windows). But I think that these windows should be sized so they only come into play in the less common cases (most streams are short-lived), and I would like to see Chromium and Firefox and other SPDY clients agree on minimum sizes to require, so we prevent stupid servers from making things unnecessarily slow. And I think per-session+per-stream windows give enough knobs for the server to manage things appropriately, and I don't really see a need for further knobs.
Hi,
I don't follow this point.
On Wed, May 30, 2012 at 7:43 PM, Roberto Peon <fe...@google.com> wrote:
> In natural language:
> if a sender is ever blocked, then it should send a frame to the receiver
> indicating the stream ID which is blocked, with the amount of bytes it
> would wish to send, but couldn't because of flow-control.
If the sender is blocked, how can it send a frame to the receiver?
And even supposing that the receiver somehow receives it, what use can
it make of it?
If a receiver is bound to a slow application, it knows that (it reads
more for that stream than the application consumes), and I can't
imagine what it can do with the information that it's flow controlled
- it probably already knows that on its own.
...
If the proxy has space in its buffers, it indicates such by increasing the per-connection window size. If the proxy is blocked for a particular stream, it doesn't update the window size for that stream, but otherwise the per-stream window size should be the per-connection window size.
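(A minimal sketch of that receiver-side policy as I read it - the function and field names are mine:)

    # Hypothetical proxy-side window updates: advertise freed buffer space on
    # the connection window, and mirror it per stream except for streams whose
    # upstream is currently blocked.
    def issue_window_updates(freed_bytes: int, streams, send_frame):
        if freed_bytes > 0:
            send_frame("WINDOW_UPDATE", stream_id=0, delta=freed_bytes)  # connection-level
        for stream in streams:
            if not stream.upstream_blocked:
                send_frame("WINDOW_UPDATE", stream_id=stream.id, delta=freed_bytes)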
This thread has gotten confusing, maybe someone can make a concrete proposal as to what we're talking about at this point?
My current summary:

I liked Greg's summary of the goals of flow control.

I believe that we have a tradeoff of full-pipe performance vs buffering which the current SPDY spec makes.

I think Roberto is proposing something, but I'm not sure what it is. I am very much against more complexity, because I believe the entire problem is 100% contrived and unreal - simply not worth the complexity. The protocol can already deal with an over-buffer situation even without *any* flow control - just kill the stream. (See below for justification.) I'd rather remove all flow control from SPDY than add more complexity.
Justification:
a) The client isn't going to throttle the downlink - it wants data as fast as it can get it.
b) The server doesn't get huge uploads very often; so there isn't much to throttle here anyway.
d) If your backend server is down causing backlogs in your proxy, you can write code to deal with that (e.g. failover) or nuke the stream. Why expose it out to the whole protocol?
On Wed, May 30, 2012 at 3:04 PM, Mike Belshe <mbe...@chromium.org> wrote:
> I am very much against more complexity, because I believe the entire problem is 100% contrived and unreal - simply not worth the complexity. The protocol can already deal with an over-buffer situation even without *any* flow control - just kill the stream. I'd rather remove all flow control from SPDY than add more complexity.

+1 (on removing all flow control and killing streams that misbehave).