Stalled HTTP/2 stream blocks entire session.


Bence Béky

Apr 16, 2015, 12:00:05 PM
to net-dev
Hi,

I'd like to solicit opinions about the following HTTP/2 issue. (I use
the words session and connection interchangeably.)

Problem: some SpdyBuffer consumers pause download by simply not
consuming data. For example, a user-paused download [1], or
downloading media [2, 3]. However, the server keeps sending data
until the stream level flow control window is exhausted. At that
point, SpdySession holds on to the data. In order to limit memory
usage, these data count towards the session level flow control window
as well, which is the same size, so it is also exhausted. This stalls
the entire connection.

Proposed short term solution: currently both stream and session level
maximum receiving flow control windows are 10 MB [4]. I propose
different values for the two. For example, 6 MB for stream level, and
15 MB for session. This way, even two unconsumed streams would not
block the session. This can be done in a single CL. This is a
trade-off between a stream level window large enough to utilize
bandwidth even with a large bandwidth delay product and the client in
a CPU-starved environment, a session level window small enough to
limit memory consumption, and a ratio between the two large enough not
to let a small number of streams block the entire connection.
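The arithmetic behind the proposed split can be sketched like this (a toy model, illustrative only; a "stalled" stream is one whose consumer stops reading, so its unconsumed bytes stay counted against the session-level window):

```python
# Toy model of the receive-side window split proposed above (sizes in MB).

def session_stalled(stream_window, session_window, num_stalled):
    """True if the stalled streams alone can pin the whole session window."""
    return num_stalled * stream_window >= session_window

# Today: stream and session windows are both 10 MB, so a single
# stalled stream exhausts the session window.
assert session_stalled(10, 10, num_stalled=1)

# Proposed: 6 MB stream / 15 MB session. Two stalled streams pin at
# most 12 MB, leaving 3 MB for other streams; it takes three stalled
# streams to stall the whole connection.
assert not session_stalled(6, 15, num_stalled=2)
assert session_stalled(6, 15, num_stalled=3)
```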

One proposed long term solution: it is clearly not optimal that the
only way a consumer can pause the stream is to stop reading. One
could implement PauseStream/ResumeStream methods that would disable
and re-enable SpdySession to send WINDOW_UPDATE frames for data
received on the stream, but would still send data on the session on a
paused stream once the data are consumed. It would be the consumer's
responsibility to continue consuming data until the entire stream
level window is exhausted, so that there would be no data taking up
memory indefinitely. The continued WINDOW_UPDATE frames on the
session would guarantee the session not stalling because of the paused
stream. (In case you are worried that stream level windows should
always sum up to some magic number, like the session level window, and
this invariant would be ruined by this change, I can assure you that
this is not the case: creating a stream already increases the sum of
stream level receiving windows without changing the session level
window. In fact, this is the very feature for which session level
flow control was added: that maximum buffering need on the receiving
side can be capped at a limit smaller than the sum of the stream level
windows, which has to be large if one wants to have many streams and
be ready to utilize bandwidth regardless of what stream the server
starts sending data on.) The advantage of this approach is that
data to the extent of the outstanding receive window size are
downloaded anyway, therefore there is no point in keeping those
data in memory instead of consuming them (which in case of download
means writing to disk, thus freeing memory that other streams on the
same session can use). The disadvantage is that the user might pause
download because they need bandwidth immediately for some other,
low-latency activity, so continuing their download for another 10 MB
is not a good user experience. Note that this would not be a
regression, because that is exactly what Chromium is doing right now,
but AFAIK it does not show in the progress bar at the moment.
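A sketch of how such an interface might look (hypothetical names and a heavily simplified model of the window accounting, not the actual SpdySession API):

```python
class Session:
    """Minimal stand-in for session-level flow control accounting."""

    def __init__(self):
        self.window_updates_sent = 0  # bytes re-opened at session level

    def send_window_update(self, nbytes):
        self.window_updates_sent += nbytes


class Stream:
    """One stream; pausing withholds stream-level WINDOW_UPDATEs only."""

    def __init__(self, session):
        self.session = session
        self.paused = False
        self.stream_updates_sent = 0  # bytes re-opened at stream level
        self.deferred = 0             # bytes consumed while paused

    def consume(self, nbytes):
        # Consuming data (for a download: writing it to disk) always
        # re-opens the session window, so other streams keep flowing.
        self.session.send_window_update(nbytes)
        if self.paused:
            # Withhold the stream-level update: the server stops
            # sending on this stream once its window drains to zero.
            self.deferred += nbytes
        else:
            self.stream_updates_sent += nbytes

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False
        # Re-open the stream window for everything consumed meanwhile.
        self.stream_updates_sent += self.deferred
        self.deferred = 0


# A paused stream that keeps consuming: the session window is
# replenished, the stream window is not until resume().
session = Session()
stream = Stream(session)
stream.pause()
stream.consume(1000)
assert session.window_updates_sent == 1000
assert stream.stream_updates_sent == 0
stream.resume()
assert stream.stream_updates_sent == 1000
```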

Alternative long term solution: when the user pauses the download,
reset the stream. This corresponds best to what the user wants: no
more data are downloaded after a roundtrip. However, connection
resumption is a very tricky task, and I understand that there are
engineers working on it right now.

Yet another alternative long term solution: implement an HTTP/2
extension where the receiver can say "stop right there, whatever you
have sent is sent already, fine, but whatever is not should not be,
and set the window size to zero", and the server can respond "the
current window size was this, I set it to zero and stop sending data,
decrease your window size by the appropriate delta, and if you have
not sent WINDOW_UPDATE frames since you sent this request and you keep
updating your window size as data come in, then window size on your
side should go down to zero when you receive all data that I have sent
so far". This would make up for the fact that a WINDOW_UPDATE delta
can only be positive (obviously, because the sender doesn't know how
much data are in flight).

To be clear, I believe that the short term solution is needed even if
a decision on a long term solution is made, and I volunteer to make
that one change, but I do not currently have the resources to
implement any of the long term solutions. I am only writing to raise
awareness and start a discussion.

Thank you,

Bence

[1] https://crbug.com/473259
[2] https://crbug.com/464875
[3] https://crbug.com/162627
[4] https://code.google.com/p/chromium/codesearch#chromium/src/net/http/http_network_session.cc&l=55

Ryan Sleevi

Apr 16, 2015, 1:51:18 PM
to Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 8:59 AM, Bence Béky <b...@chromium.org> wrote:
Alternative long term solution: when the user pauses the download,
reset the stream.  This corresponds best to what the user wants: no
more data are downloaded after a roundtrip.  However, connection
resumption is a very tricky task, and I understand that there are
engineers working on it right now.

Why is this not the right answer? Why do we not stop the download, then resume (with Range requests)? It's an HTTP/2 client - we can presume that much of the stupidity of servers is fixed.

To be honest, I don't understand entirely why we don't do this for HTTP/1.1 clients. I suspect it was to work around some broken implementations?
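For reference, pause-and-resume over Range requests needs no protocol changes at all, just standard HTTP headers (a sketch; a real implementation would also have to validate that the resource hasn't changed, which is what If-Range is for):

```python
def resume_request_headers(bytes_received, etag=None):
    """Build request headers to resume a download at an offset.

    If the server honors Range it replies 206 Partial Content with
    the missing bytes; If-Range makes it fall back to a full 200
    response when the resource has changed in the meantime, instead
    of serving bytes that don't match what we already have.
    """
    headers = {"Range": "bytes=%d-" % bytes_received}
    if etag is not None:
        headers["If-Range"] = etag
    return headers


# Resuming after the first mebibyte of a download:
assert resume_request_headers(1048576) == {"Range": "bytes=1048576-"}
```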

Ryan Hamilton

Apr 16, 2015, 2:49:46 PM
to Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 10:51 AM, Ryan Sleevi <rsl...@chromium.org> wrote:
On Thu, Apr 16, 2015 at 8:59 AM, Bence Béky <b...@chromium.org> wrote:
Alternative long term solution: when the user pauses the download,
reset the stream.  This corresponds best to what the user wants: no
more data are downloaded after a roundtrip.  However, connection
resumption is a very tricky task, and I understand that there are
engineers working on it right now.

Why is this not the right answer? Why do we not stop the download, then resume (with Range requests)? It's an HTTP/2 client - we can presume that much of the stupidity of servers is fixed.

Agreed. This seems like the right solution.

To be honest, I don't understand entirely why we don't do this for HTTP/1.1 clients. I suspect it was to work around some broken implementations?

--
You received this message because you are subscribed to the Google Groups "net-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to net-dev+u...@chromium.org.
To post to this group, send email to net...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/net-dev/CACvaWvZZbRa4dwRUOck%3DtBv%3DFRXQJ7q0o7O9yL%3DwDHdruEvsjw%40mail.gmail.com.

Alexey Baranov

Apr 16, 2015, 2:53:27 PM
to rsl...@chromium.org, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
Downloads are content-initiated and thus require a live renderer to be associated with them. This is used by certain throttles (Safe Browsing) to show something on that tab, some policy checks, finding a browser window to show the download shelf, etc. If the WebContents is still alive, one can resume an interrupted download in Chrom(ium) given the --enable-download-resumption switch. There is a lot of logic out there (in DownloadItemImpl) to handle server errors, range errors and so on. So the major stopper, to my mind, is where to get the WebContents. Or not to get it, and dissociate the downloading process from the web contents after initiation. But then the problem arises of where to show the download shelf (active window? not at all?), etc. Plus, detaching downloads from web contents is a non-trivial architectural effort by itself.
 
We implemented more or less full-featured download resumption in Yandex Browser more than a year ago using existing Chromium code + some hacks (for instance, treating a content length mismatch as an error for downloads). We have a little bit different UI for the download manager and Safe Browsing warnings, and it was easier to integrate resumption there. We also created fake web contents which does nothing, does not create a renderer and so on, and associate downloads with it if the original tab has gone. This is very hacky but works. The hardest part was creating auto-re
 
16.04.2015, 20:51, "Ryan Sleevi" <rsl...@chromium.org>:

Alexey Baranov

Apr 16, 2015, 2:58:35 PM
to rsl...@chromium.org, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
Oops early submit...
 
To cut a long story short, I meant that download resumption code exists, but there are problems with originating web contents.
From our data it works pretty well (with some fixes), but that may not apply to Chrome, since it has worldwide usage, meaning it has to interoperate well with a much wider variety of server-side software, and there might be issues I am not aware of.
 
But still, the problem with media streams remains. I guess it is much more frequent in the wild than paused downloads.
 
16.04.2015, 21:53, "Alexey Baranov" <baran...@yandex-team.ru>:

David Benjamin

Apr 16, 2015, 3:03:04 PM
to Alexey Baranov, rsl...@chromium.org, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
Issues with originating WebContents for browser-owned requests are a pretty bad wart around the ResourceLoader stack right now. I have some plans to refactor that and take the renderer out of the equation. PlzNavigate also rather needs this.

Chris Bentzel

Apr 16, 2015, 3:04:52 PM
to Alexey Baranov, rsl...@chromium.org, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
asanka@ plans to revisit download resumption within Chromium as well when he returns from a break. Looks like David, Alexey, and Asanka should sync up on design plans then. Thanks, Alexey, for letting us know what Yandex has done in this area.

On Thu, Apr 16, 2015 at 2:58 PM Alexey Baranov <baran...@yandex-team.ru> wrote:

Randy Smith

Apr 16, 2015, 3:11:40 PM
to Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 2:49 PM, Ryan Hamilton <r...@chromium.org> wrote:
> On Thu, Apr 16, 2015 at 10:51 AM, Ryan Sleevi <rsl...@chromium.org> wrote:
>>
>> On Thu, Apr 16, 2015 at 8:59 AM, Bence Béky <b...@chromium.org> wrote:
>>>
>>> Alternative long term solution: when the user pauses the download,
>>> reset the stream. This corresponds best to what the user wants: no
>>> more data are downloaded after a roundtrip. However, connection
>>> resumption is a very tricky task, and I understand that there are
>>> engineers working on it right now.

I want to call out that even if this is implemented, it isn't a
complete panacea for the download case; specifically, we can't do this
with POSTs.

>>
>>
>> Why is this not the right answer? Why do we not stop the download, then
>> resume (with Range requests)? It's an HTTP/2 client - we can presume that
>> much of the stupidity of servers is fixed.
>
>
> Agreed.
>
> This seems like the right solution.

I'm a bit surprised to hear you say that, so I'll push back to get
more information. The problem is that the network stack API allows
consumers to interfere with one another by stopping reading, filling
up the session flow control window and blocking other streams on the
same session. This is currently triggered by downloads, but fixing
downloads doesn't change the underlying vulnerability. If you want to
make it a consumer responsibility not to do this stupid thing, that's
fine, but I at least want to call out that the current API doesn't
discourage this behavior (not that I can think of an API change that
would) and I'm not sure how to frame that responsibility in an API
contract fashion.

I also want to call out that Bence's email suggested that it wasn't
only downloads that were triggering this behavior at the moment; media
streaming does as well. So if we're going to solve this at the
consumer level, we need to make sure to change both the downloads
system and whatever system does the media streaming (hopefully not
flash plugins :-J).

-- Randy

>
>> To be honest, I don't understand entirely why we don't do this for
>> HTTP/1.1 clients. I suspect it was to work around some broken
>> implementations?

Ryan Hamilton

Apr 16, 2015, 3:28:32 PM
to Randy Smith, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
There's a fundamental balancing act between:
* Transferring data at line rate
* Avoiding head of line blocking
* Finite buffer
You can get any 2, but not 3. Infinite buffering is really bad because that can hose the whole browser. So that means we need to put limits on the amount of data that we'll buffer for a connection. Figuring out what those limits should be is hard. We cap the total data for the connection at C and the total data for a given stream at S. If S == C, then a single paused stream can consume the whole buffer, and that leads to head-of-line blocking. That's not awesome. If we, say, set S = C/100, then it takes 100 paused streams before we block the connection. That's great. However, it also means that each stream's window is (probably) so small that it can't go at line rate. That's bad.

Now we can and should be more clever with the window size we use. Similar to TCP's receive window auto-tuning, we could aim to keep the per stream limits proportional to how much data we're actually able to process in an RTT. This would help us strike a better balance.
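A back-of-the-envelope sketch of such per-stream auto-tuning (purely illustrative numbers; real receive-window tuning is considerably more careful):

```python
def tuned_stream_window(bytes_consumed_last_rtt,
                        min_window=64 * 1024,
                        max_window=6 * 1024 * 1024):
    """Size a stream's receive window to ~2x what its consumer
    actually processed in the last RTT, clamped to sane bounds.

    A fast consumer gets a window large enough to run at line rate;
    a slow or paused one shrinks toward min_window, so it can pin
    only a tiny fraction of the session window.
    """
    return max(min_window, min(max_window, 2 * bytes_consumed_last_rtt))


# A paused stream consumed nothing, so its window collapses:
assert tuned_stream_window(0) == 64 * 1024
# A consumer processing 1 MB per RTT gets a 2 MB window:
assert tuned_stream_window(1024 * 1024) == 2 * 1024 * 1024
```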

But to be clear, slow streams are a problem for HTTP/1.1 too. We have a hard cap on the number of connections to a server (6?). This means that once we have 6 requests in flight, we get head-of-line blocking. Pause a download, add some hanging GETs, and you can hit that limit pretty easily. :/

That being said, I have lots of opinions and only some of them are correct, so take my opinions with a grain of salt!
 
I also want to call out that Bence's email suggested that it wasn't
only downloads that were triggering this behavior at the moment; media
streaming does as well.  So if we're going to solve this at the
consumer level, we need to make sure to change both the downloads
system and whatever system does the media streaming (hopefully not
flash plugins :-J).

*nod*

Alexey Baranov

Apr 16, 2015, 4:33:08 PM
to Ryan Hamilton, Randy Smith, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
Just thinking of the more or less ideal solution for the world of ponies and rainbows...
 
 - update HTTP/2 spec as bnc@ suggested...
but that would effectively make stream-level flow control sort of advisory. After zeroing the window, the peer must still accept some data from the wire (could be something around the BDP), and it is up to the implementation how much data to allow and in what timeframe, probably tracking RTT... That would make the protocol even more complex.
 
 - high-performance servers that care about memory footprint (I mean nginx) buffer data to disk after some threshold (usually several memory pages). This is again what bnc@ suggested - the network stack could do the same thing (a file-backed SpdyBuffer class, for instance).
but Chromium is a bit more complicated here than nginx:
 - separate threads (not the IO thread) should be used to write/read buffered data from disk. That could be the cache thread, another separate thread, or a thread pool.
 - unlike a web server, the browser cannot assume the disk I/O will be configured for the best performance. Not to mention that storage on mobile devices may be slow by its physical nature.
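A minimal sketch of that file-backed buffer idea (a hypothetical FileBackedBuffer, not the real SpdyBuffer; the spill threshold is arbitrary, and in Chromium the disk write would have to happen off the IO thread as noted above):

```python
import tempfile


class FileBackedBuffer:
    """Keep a bounded amount of data in memory; spill the rest to disk."""

    SPILL_THRESHOLD = 64 * 1024  # arbitrary; nginx uses a few pages

    def __init__(self):
        self.mem = bytearray()
        self.spill = None      # temp file, created lazily
        self.spilled_bytes = 0

    def append(self, data):
        if len(self.mem) + len(data) <= self.SPILL_THRESHOLD:
            self.mem += data
        else:
            # Memory budget exceeded: write this chunk to disk.
            if self.spill is None:
                self.spill = tempfile.TemporaryFile()
            self.spill.write(data)
            self.spilled_bytes += len(data)

    def in_memory(self):
        return len(self.mem)


buf = FileBackedBuffer()
buf.append(b"x" * 1000)           # small: stays in memory
buf.append(b"y" * (64 * 1024))    # would exceed the budget: spilled
assert buf.in_memory() == 1000
assert buf.spilled_bytes == 64 * 1024
```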
 
16.04.2015, 22:28, "Ryan Hamilton" <r...@chromium.org>:

Randy Smith

Apr 16, 2015, 4:41:59 PM
to Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
So that all makes sense, but doesn't strike me as addressing why the
solution of "In the consumer, don't pause streams, just kill and
restart them" is preferred over the other two options Bence laid out
(for the eventually really long-term solution). Have I missed
something?

Now both of Bence's other solutions mean that someone has to buffer
everything that's in flight before the time when the flow-control
modification makes it back to the server, *and* both of Bence's
solutions rely on the consumer to behave properly in this case of
stopping reading data, so neither is particularly ideal. In fact, I
think the only advantage of those solutions over consumer restart is
that those solutions will work for POST downloads and consumer restart
won't, and I'm very open to the argument that simplicity trumps
handling POSTs -> just kill the darn connection if you have to pause
it. But I'd like to be explicit about the tradeoff we're making.

(I keep trying to come up with a solution that doesn't require
consumer involvement, but the only thing I come up with is switching
to a push API (so the consumer has to explicitly signal pauses), which
I think is a bad idea for other reasons.)

-- Randy

Randy Smith

Apr 16, 2015, 4:47:45 PM
to Alexey Baranov, Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 4:33 PM, Alexey Baranov
<baran...@yandex-team.ru> wrote:
> Just thinking of the more or less ideal solution for the world of ponies and
> rainbows...
>
> - update HTTP/2 spec as bnc@ suggested...
> but that effectively would make stream-level flow control sort of advisory.
> After zeroing window, the peer must still accept some data from the wire
> (could be smth around BDP), and it is up to implementation how much data to
> allow and in what timeframe, probably track rtt... That would make the
> protocol even more complex.
>
> - high-performant servers caring about memory footprint (I mean nginx)
> buffers data to the disk after some threshold (usually several memory
> pages). This is again what bnc@ suggested - the network stack could do the
> same thing (file-backed SpdyBuffer class, for instance).
> but chromium is a bit more complicated here then nginx:
> - separate threads (not IO) should be used to write/read buffered data from
> disk. That could be cache thread, another separate thread, or a thread pool.
> - unlike the web server, browser cannot assume the disk io will be
> configured for the best performance. Not to mention, that storage on mobile
> devices may be slow from its physical nature.

If we can't find some way to make the proper behavior happen even if
the consumer of the network stack API is clueless or out to lunch, I'm
not sure it's worth a lot of protocol hacking to allow a somewhat
quicker response to a pause on behalf of the consumer.

Having said that, the alternative Bence called out that didn't require
protocol changes (an interface to tell the network stack to stop
extending the server's stream window, and the consumer draining the
data until it did stop) would, I think, be useful for downloads for
handling the POST case (or for handling any case until we get
resumption working, which, based on how long we've been trying to get
it working, may be a while).

-- randy

David Benjamin

Apr 16, 2015, 4:51:06 PM
to Randy Smith, Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas

Requiring consumer involvement also causes problems as the web platform tries to grow more and more interesting fetch APIs. Because now we may have:

a) We would like JS code to participate in pushback when consuming streams.
b) We don't trust that JS code not to hang a stream.

Like Ryan says, this is a problem for HTTP/1.1 on the socket limit too. Since we globally pool and impose limits on sockets across mutually distrusting source origins (i.e. my request to google.com may go through the same socket as your request to google.com, not that requests to google.com and example.com share sockets. The latter is obviously crazy.), we need to act as arbiter on this somehow.

I don't have a good answer for this, but untrusted consumers throw yet another wrench into the mix.

David
 
>
>>
>> I also want to call out that Bence's email suggested that it wasn't
>> only downloads that were triggering this behavior at the moment; media
>> streaming does as well.  So if we're going to solve this at the
>> consumer level, we need to make sure to change both the downloads
>> system and whatever system does the media streaming (hopefully not
>> flash plugins :-J).
>
>
> *nod*
>


Bence Béky

Apr 16, 2015, 4:52:44 PM
to Alexey Baranov, Ryan Hamilton, Randy Smith, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 4:33 PM, Alexey Baranov
<baran...@yandex-team.ru> wrote:
> - update HTTP/2 spec as bnc@ suggested...
> but that effectively would make stream-level flow control sort of advisory.
> After zeroing window, the peer must still accept some data from the wire
> (could be smth around BDP), and it is up to implementation how much data to
> allow and in what timeframe, probably track rtt... That would make the
> protocol even more complex.

One can get around this problem this way: say the client has a window
of 10 MB and sends a stop frame to the server. When this frame reaches the
server, it has a window of 6 MB, because there is 4 MB already in
flight. Server says "I am decrementing my window by 6 MB to zero" in
a stop-ack frame. Client gets 4 MB of data, decrements window, then
gets stop-ack frame, decrements window by further 6 MB, sees that
window is zero and is happy. If window goes below zero, that's still
a protocol error.
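The accounting in the walkthrough above can be checked mechanically (hypothetical "stop"/"stop-ack" frames, just verifying the arithmetic):

```python
MB = 1024 * 1024

# Client's advertised receive window when it sends the stop frame.
client_window = 10 * MB
# Data already in flight when the stop frame reaches the server.
in_flight = 4 * MB

# Server's view on receiving "stop": its copy of the window is already
# down by the in-flight data; it zeroes the remainder and reports the
# delta in a stop-ack frame.
stop_ack_delta = client_window - in_flight  # "I decremented by 6 MB"

# Client: receives the 4 MB in flight, then applies the stop-ack delta.
client_window -= in_flight       # 10 MB - 4 MB = 6 MB
client_window -= stop_ack_delta  # 6 MB - 6 MB = 0

assert client_window == 0  # lands exactly at zero
# Going below zero would still be a protocol error, as noted above.
```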

That being said, I agree that this is fairly complex. And what about
servers not implementing this extension?

Bence

Alexey Baranov

Apr 16, 2015, 5:36:14 PM
to Bence Béky, Ryan Hamilton, Randy Smith, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas
 
 
16.04.2015, 23:52, "Bence Béky" <b...@chromium.org>:
I guess fall back to your first solution of updating only the session-level window. Maybe gradually adjust the initial stream window too. And probably go and store buffered data to disk.




Alexey Baranov

Apr 16, 2015, 5:36:44 PM
to David Benjamin, Randy Smith, Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
 
 
16.04.2015, 23:51, "David Benjamin" <davi...@chromium.org>:
Given that HTTP/2 is an https-only feature (JS code is code the site trusts), can we assume that a website is not interested in shooting itself in the foot by spawning tons of requests and bloating the connection window? Even now (as with plain socket limits) JS can open the max streams per session and hang the whole origin.
 
The other thing we could do is gradually decrement the initial stream window by sending SETTINGS frames, if the browser gets bloated with unconsumed streams.
 
That all being said, I guess we have a handful of options for what to do, but I think first we need to agree upon what guarantees the network stack as a platform should provide to its clients (browser web pages or Cronet users, and so on) for this case.

 
>
>>
>> I also want to call out that Bence's email suggested that it wasn't
>> only downloads that were triggering this behavior at the moment; media
>> streaming does as well.  So if we're going to solve this at the
>> consumer level, we need to make sure to change both the downloads
>> system and whatever system does the media streaming (hopefully not
>> flash plugins :-J).
>
>
> *nod*
>


David Benjamin

Apr 16, 2015, 5:42:10 PM
to Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 5:36 PM Alexey Baranov <baran...@yandex-team.ru> wrote:
16.04.2015, 23:51, "David Benjamin" <davi...@chromium.org>
Requiring consumer involvement also causes problems as the web platform tries to grow more and more interesting fetch APIs. Because now we may have:
 
a) We would like JS code to participate in pushback when consuming streams.
b) We don't trust that JS code not to hang a stream.

Like Ryan says, this is a problem for HTTP/1.1 on the socket limit too. Since we globally pool and impose limits on sockets across mutually distrusting source origins (i.e. my request to google.com may go through the same socket as your request to google.com, not that requests to google.com and example.com share sockets. The latter is obviously crazy.), we need to act as arbiter on this somehow.

I don't have a good answer for this, but untrusted consumers throw yet another wrench into the mix.

David
Given that HTTP2 is https-only feature (JS code is the code the site trusts), can we assume that website is not interested in shooting into its own leg by spawning tons for requests and bloating the connection window? Even now (like with plain socket limits) js can open max streams per session and hang the whole origin.

That's not sufficient because of CORS. https://some-common-origin.com/blah sends CORS headers and then https://a.com/ and https://b.com/ can both be trying to fetch from it.

Plus, as you mention, this problem isn't limited to HTTP/2. We have an analogous issue with HTTP/1.1.
 

Alexey Baranov

Apr 16, 2015, 6:13:01 PM
to David Benjamin, Randy Smith, Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
 
 
17.04.2015, 00:42, "David Benjamin" <davi...@chromium.org>:
On Thu, Apr 16, 2015 at 5:36 PM Alexey Baranov <baran...@yandex-team.ru> wrote:
16.04.2015, 23:51, "David Benjamin" <davi...@chromium.org>
Requiring consumer involvement also causes problems as the web platform tries to grow more and more interesting fetch APIs. Because now we may have:
 
a) We would like JS code to participate in pushback when consuming streams.
b) We don't trust that JS code not to hang a stream.

Like Ryan says, this is a problem for HTTP/1.1 on the socket limit too. Since we globally pool and impose limits on sockets across mutually distrusting source origins (i.e. my request to google.com may go through the same socket as your request to google.com, not that requests to google.com and example.com share sockets. The latter is obviously crazy.), we need to act as arbiter on this somehow.

I don't have a good answer for this, but untrusted consumers throw yet another wrench into the mix.

David
Given that HTTP2 is https-only feature (JS code is the code the site trusts), can we assume that website is not interested in shooting into its own leg by spawning tons for requests and bloating the connection window? Even now (like with plain socket limits) js can open max streams per session and hang the whole origin.

That's not sufficient because of CORS. https://some-common-origin.com/blah send CORS headers and then https://a.com/ and https://b.com/ can both be trying to fetch from it.
True. But that also means that some-common-origin.com chose explicitly to trust a.com and b.com. Unless CORS is something like * :)


Plus, as you mention, this problem isn't limited to HTTP/2. We have an analogous issue with HTTP/1.1.
The difference I feel here is that the number of requests in HTTP/1.1 is a very explicit thing the developer can control, the limits are not hidden, and there are a lot of articles on the web about that limit. For HTTP/2, receive/send windows as well as max stream limits are internal details of HTTP/2 and, what is more, they may change dynamically over the lifetime of the connection. The web developer cannot know all this, unless she wants to mess with net-internals, but that tool is much less widely known. Perhaps those parameters should be visible in dev tools. And the performance guides for HTTP/2 yet to come will, I hope, explain all these issues.

David Benjamin

Apr 16, 2015, 6:19:21 PM
to Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, Bence Béky, net-dev, Asanka Herath, Ricardo Vargas
On Thu, Apr 16, 2015 at 6:13 PM Alexey Baranov <baran...@yandex-team.ru> wrote:
Given that HTTP2 is https-only feature (JS code is the code the site trusts), can we assume that website is not interested in shooting into its own leg by spawning tons for requests and bloating the connection window? Even now (like with plain socket limits) js can open max streams per session and hang the whole origin.

That's not sufficient because of CORS. https://some-common-origin.com/blah send CORS headers and then https://a.com/ and https://b.com/ can both be trying to fetch from it.
True. But that also means that some-common-origin.com chose explicitly to trust a.com and b.com. Unless CORS is smth like * :)

Right, * is precisely the case I'm thinking about. Access-Control-Allow-Origin: * on its own is a very reasonable header to add. Those requests are uncredentialed, so it's not "CSRF me" but "this is not an intranet site". If your service is some kind of REST endpoint that other web apps are meant to pull from, that's the right header to use.

https://annevankesteren.nl/2012/12/cors-101 

Plus, as you mention, this problem isn't limited to HTTP/2. We have an analogous issue with HTTP/1.1.
The difference I feel here is that number of requests in HTTP/1.1  is very explicit thing the developer can control, the limits are not hidden, and there are a lot of articles in the web about that limit. For HTTP/2, recv/send windows as well as max stream limits are the internal details of HTTP2 and what is more, they may dynamically change over the lifetime of the connection. But the web developer cannot know it all, unless she wants to mess with net-internals, but this tool is much less widely known. Perhaps that parameters should be visible in dev tools. And performance guides for HTTP/2 yet to come I hope will explain all that issues.

In the case of multiple untrusting origins hitting a common public server, the web developer can't control the number of requests because the other origin isn't under their control.
 

Bence Béky

unread,
May 18, 2015, 11:17:37 AM5/18/15
to David Benjamin, Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas
Hi,

Thank you all for the very helpful feedback. While finding a
long-term solution will require quite some work, I propose to also
implement the short term solution as suggested in my first e-mail,
that is, setting the session level flow control window to 15 MB and
the stream level one to 6 MB. Please feel free to comment here or on
the CL in progress: https://crrev.com/1136293003.

Thank you,

Bence
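A back-of-the-envelope sketch of why the proposed split helps (this is just illustrative arithmetic, not Chromium code): with both windows at 10 MB, a single stalled stream can buffer a full stream window and exhaust the session window; with a 6 MB stream window and a 15 MB session window, even two stalled streams leave 3 MB of session window for everyone else.

```python
MB = 1024 * 1024

def session_window_left(session_window, stream_window, stalled_streams):
    """Session-level receive window remaining after each stalled stream
    buffers a full stream window of unconsumed data."""
    return session_window - stalled_streams * stream_window

# Current configuration: both windows are 10 MB, so one stalled
# stream exhausts the session window and blocks every other stream.
assert session_window_left(10 * MB, 10 * MB, stalled_streams=1) == 0

# Proposed configuration: two stalled streams still leave 3 MB
# of session window for the remaining streams.
assert session_window_left(15 * MB, 6 * MB, stalled_streams=2) == 3 * MB
```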

Yutaka Hirano

unread,
Aug 26, 2015, 9:18:28 AM8/26/15
to Bence Béky, David Benjamin, Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas, Takeshi Yoshino, Domenic Denicola
Hi,

I'm sorry that I was not aware of this discussion when I was implementing backpressure on the fetch API, but as David said, it may lead to the same problem.
Have you found a good solution to this problem, or are you still looking for one?
For HTTP/1.1, maybe I should disable the backpressure mechanism.

Thanks,
 

David Benjamin

unread,
Aug 26, 2015, 12:09:31 PM8/26/15
to Yutaka Hirano, Bence Béky, Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas, Takeshi Yoshino, Domenic Denicola
Disabling back-pressure should be a last-ditch solution, and only as a temporary workaround along the way to a real solution. We shouldn't buffer unbounded data in memory in an API expressly intended to avoid this. Any case where back-pressure is disabled should be treated as a bug and have a plan for resolving it.

The network is not reliable, so I think we have a number of different knobs to play with. Thinking aloud:

- We could account for things at the bandwidth level. One of the many things socket limits are actually trying to enforce is probably bandwidth usage. But if a socket is actually paused, we're not actually using that bandwidth, so perhaps it shouldn't be treated as as expensive as a live socket.

- Of course, if we then allow more sockets, perhaps all the paused ones could start reading, so we'd have to track things further. Maybe requests from the active tab get first priority and we only service reads from other requests as bandwidth allows.

- Presumably we still want a socket limit even if I decide to open 1000s of connections to foo{1,2,3,...}.example.com and never read them, so they shouldn't actually count as zero. We can have per-tab or per-origin limits.

- At the end of the day, all that just kicks the can down the road and improves arbitration, but hitting limits is still possible. Perhaps we start shooting down requests that are stalled, or from origins or tabs that have too many active requests open. The network is unreliable, so sites already have to be able to handle transient network errors. Just like servers probably already kill sockets if they're stalling without consuming data.

We're the arbitrator between mutually untrusting consumers of a limited shared resource. Like any other operating system, it's our job to apportion things properly.
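One of the knobs above could be sketched as a per-origin cost budget in which a paused socket counts as cheaper than a live one. This is a hypothetical policy illustration, not Chromium code; the weights and the limit are invented.

```python
# Hypothetical per-origin socket accounting: paused sockets still cost
# something (they hold memory), but less than active ones (no bandwidth).
ACTIVE_COST = 1.0
PAUSED_COST = 0.25
ORIGIN_BUDGET = 6.0

def can_open_socket(active, paused):
    """Return True if an origin with `active` live sockets and `paused`
    stalled sockets may open one more active socket within the budget."""
    cost = active * ACTIVE_COST + paused * PAUSED_COST
    return cost + ACTIVE_COST <= ORIGIN_BUDGET

# Six active sockets exhaust the budget...
assert not can_open_socket(active=6, paused=0)
# ...but six paused ones leave room for several more live requests.
assert can_open_socket(active=0, paused=6)
```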

Takeshi Yoshino

unread,
Aug 27, 2015, 2:44:37 AM8/27/15
to David Benjamin, Yutaka Hirano, Bence Béky, Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas, Domenic Denicola
Regarding fetch() + Streams over HTTP/2, we could coordinate fetch()'s flow control with HTTP/2's stream-wise flow control.

- We'll have an interface between content/ and net/ where content/ gives quota to net/.
- net/ consumes the quota when pushing data to content/.
- It's content/'s duty to accept data pushed from net/ without blocking.
  - This is needed only if net/ shares the buffer with other HTTP/2 streams, and so we need to drain it asap, or if we need to reduce memory pressure in the browser process. The issue of occupying the HTTP/2 session flow control window can be resolved even without buffer draining.
- net/ makes sure not to load too much data (up to the quota) from the network, so that received data is smoothly drained by content/.
  - For HTTP/1.1 over TCP, this is done roughly, by not invoking read(2).
  - For HTTP/2, this is done precisely by controlling HTTP/2 flow control.
- content/ controls quota replenishment based on data consumption status of fetch().

Of course, the net stack may place a limit that overrides the flow control signal from fetch() (e.g. a window of up to 6 MB).

When fetch() is invoked, content/ gives some initial quota to net/ to fill the pipe. This could be calculated from the highWaterMark of the ReadableByteStream created for the fetch() operation. If ReadableByteStream.read() is not invoked for a while, the queue in the ReadableByteStream fills up with received data, net/ runs out of quota because content/ does not replenish it, and net/ stops loading more data from the wire (the received data in net/ is drained by content/, and the window of the HTTP/2 stream for the fetch() stays at 0).

The quota should be gradually increased up to

    quota = (fetch() consumer's data consumption rate) * RTT

or a little more to avoid running out of data when the network gets congested.

Having the browser calculate an appropriate value automatically (with some heuristics on the history of the ReadableByteStream's queue size) is ideal, but I think leaving this to the user of fetch() + Streams is also a reasonable option to explore.
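The quota formula above is just the bandwidth-delay product of the consumer, plus some headroom. A tiny illustration (the rate, RTT, and headroom factor are made-up example numbers):

```python
def target_quota(consumption_rate_bytes_per_s, rtt_seconds, headroom=1.25):
    """Replenishment target: consumer drain rate times RTT, padded a
    little ("or a little more") to ride out transient congestion."""
    return consumption_rate_bytes_per_s * rtt_seconds * headroom

# A consumer draining 8 Mbit/s (1,000,000 bytes/s) over a 100 ms RTT
# needs about 125 KB of outstanding quota to avoid stalling.
quota = target_quota(8_000_000 / 8, 0.100)
assert quota == 125_000.0
```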

Takeshi

Yutaka Hirano

unread,
Sep 1, 2015, 12:55:38 PM9/1/15
to Takeshi Yoshino, David Benjamin, Bence Béky, Alexey Baranov, Randy Smith, Ryan Hamilton, Ryan Sleevi, net-dev, Asanka Herath, Ricardo Vargas, Domenic Denicola
Using HTTP/2 flow control sounds reasonable, but we need to buffer the data somewhere.
Currently, browser process memory (6 MB per stream) is used to store the data. Would it be a good idea to use renderer process memory instead in such cases? Unfortunately, doing that leads to an extra data copy.
Another option is setting a quota per tab or origin (as David says), but I'm not sure we can pick a value that works across various environments.