Proposal for stream dependencies in SPDY

Michael Piatek

Oct 26, 2012, 5:14:45 PM
to spdy...@googlegroups.com

Hi all,

I'd like to share a draft proposal for adding a notion of stream dependencies to SPDY. Stream dependencies are intended to improve performance by providing hints to the server about which streams are most important to the client (and the relationships between those streams).

A doc detailing the proposal:

Comments welcome.

Thanks!

-Michael

Roberto Peon

Oct 27, 2012, 1:50:08 AM
to spdy...@googlegroups.com

As a bonus, in case you didn't figure it out already, sending the dependencies from browser to server means that servers will get far, far better information on which to make decisions about server push, especially when information about page visibility ends up being encoded in the ordering...

As usual, the design intent is to facilitate people writing their pages and letting the rest of the system do what needs be done to get the page rendered efficiently.

I have every hope that Jetty will continue to lead the way there ;)

-=R

Michael Piatek

Oct 31, 2012, 4:51:31 PM
to spdy...@googlegroups.com

Any thoughts? We're inclined to interpret the quiet as a sign that no one disagrees :)

Simone Bordet

Nov 1, 2012, 5:30:25 AM
to spdy...@googlegroups.com
Hi,

On Wed, Oct 31, 2012 at 9:51 PM, Michael Piatek <pia...@google.com> wrote:
>
> Any thoughts here? We're inclined to interpret the quiet here as a sign that
> no one disagrees :)

Just super-busy.
I skimmed your proposal, but I should get some time next week to look
at it in detail.

Simon
--
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless. Victoria Livschitz

Simone Bordet

Nov 6, 2012, 3:31:04 PM
to spdy...@googlegroups.com
Hi,

On Sat, Oct 27, 2012 at 7:50 AM, Roberto Peon <grm...@gmail.com> wrote:
> As a bonus, in case you didn't figure it out already, sending the
> dependencies from browser to server means that servers will get far, far
> better information on which to make decisions about server push, especially
> when information about page visibility ends up being encoded in the
> ordering...
>
> As usual, the design intent is to facilitate people writing their pages and
> letting the rest of the system do what needs be done to get the page
> rendered efficiently.

100% agree, but then I am not sure I get the proposed document.
More on another email.

Simone Bordet

Nov 6, 2012, 4:47:12 PM
to spdy...@googlegroups.com
Hi,
tl;dr: I don't think the use cases listed in the document are valid or
widespread enough to justify changing the SPDY protocol to support them.

The long story.

document.write() is considered so evil that all the JavaScript
libraries I know of, and all the presentations, sessions and
conferences I have attended or watched, have either removed support
for it entirely or, if the developer insists on using it, just bail
out with a "now you're on your own" (for example:
http://www.infoq.com/presentations/The-Once-And-Future-Script-Loader).
The use case presented of:

document.write('<script src="b.js"></script>');

will horrify all JS people I know :)

Besides, JavaScript being the language it is, I think most developers
don't use it raw, but through libraries such as Dojo or jQuery or
others. Very few write that snippet nowadays, and to my knowledge the
libraries don't.
Having said that, I'll be glad to be corrected if Google has gathered
data showing that's not the case, but my understanding is that
document.write() is an evil that every JS library implementer is busy
eradicating.

The use case of a user flipping between loading tabs and
reprioritizing resources seems to me a very marginal one.
The tab contents should be resources that the browser does not have in
cache; the user still has a view of only one tab (and therefore never
perceives that the other tab got slower, or that the current one got
faster: you can't compare by watching just one tab), and the server
may have finished sending the data (now in flight) by the time the
reprioritization arrives.
I don't think this use case is worth a new SPDY frame, but again, if
there is data and evidence to the contrary, I'll be glad to hear it.

The use case of display: none is interesting. However, I think we need
data to know whether this is common practice.
SPDY has always been concerned about roundtrips, NPN and SPDY push
being the two primary examples, and in our experience (push
especially) they are worth the effort.
In the case of display: none, the browser fetches (or prefetches) the
image *before* knowing that it is not to be displayed (otherwise it
could defer its load).
Then the stylesheet applies, and the browser would have to send a
reprioritization (involving a roundtrip) to tell the server to
decrease the priority.
This only needs to be done if the image is not already in the cache.
And if the image is being pushed meanwhile, since the reprioritization
still takes a roundtrip, the image may have arrived in the meantime.
I am also guessing that if the image is big, there is probably little
chance it is not displayed (being a "real" image), while 1x1-pixel
images may be hidden but are not worth reprioritizing.
I am just speculating here, but I think the use case, while
interesting, is marginal.

Using priorities to specify sequential resource ordering is weird to
me; perhaps I did not understand.
SPDY being multiplexed, I don't want any "sequentiality", otherwise I
might as well go back to uniplex HTTP, no?

However, I agree that the idea of having the browser specify resource
dependency is very appealing.

We have done this in Jetty with HTTP's Referer header, but we had to
apply a number of corrections to avoid polluting the push cache with
irrelevant information.
For example, we only establish a dependency for GET requests, without
a query string, with specific resource extensions (e.g. *.jpg, *.css,
etc.), with specific content types (e.g.
http://en.wikipedia.org/wiki/File:Google_Chrome_screenshot.png is
actually an HTML page despite the extension), within a specific period
from the primary resource request, etc.

It would be so much easier if the browser could tell the server: I am
requesting a secondary resource, and it is related to this primary
resource.
The browser has much if not all of this information, and feeding it to
the server would make SPDY push a lot more efficient.
Priorities can be figured out smartly by the browser; it's true there
would be no priority changes in reaction to document parsing (but
document.write() is evil), user behavior (switching between tabs is
not a common use case IMHO), or stylesheet application (how common is
that?), but I think a standard push mechanism on the client would be a
nice step forward.
If this is implemented by browsers and servers, we can crunch the
numbers and see what other low-hanging fruit can be picked (unless
Google already has this data, and the proposed document reflects the
reality of that data).
Furthermore, priority changes will only be effective if a connection
has two concurrent streams active, and those are still active when the
reprioritization arrives. I am not sure how common that case is (slow
connections?).

Note that we can implement resource dependency with an HTTP header,
along the lines of mod_spdy (although that's on the server side):

GET /main.html

GET /style.css
X-SPDY-Referrer: /main.html

GET /background.png
X-SPDY-Referrer: /style.css

GET /image.png
X-SPDY-Referrer: /main.html
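A server could aggregate hints like these into a push map. Here is a minimal sketch, assuming the hypothetical `X-SPDY-Referrer` header from the example above; the data structure and function names are my own illustration, not from Jetty or mod_spdy:

```python
from collections import defaultdict

# primary resource -> secondary resources to push alongside it
push_map = defaultdict(list)

def record_request(path, spdy_referrer=None):
    """Record a request; if it declares a referrer, remember the dependency."""
    if spdy_referrer and path not in push_map[spdy_referrer]:
        push_map[spdy_referrer].append(path)

# Replay the requests from the example above.
record_request("/main.html")
record_request("/style.css", spdy_referrer="/main.html")
record_request("/background.png", spdy_referrer="/style.css")
record_request("/image.png", spdy_referrer="/main.html")

# Next time /main.html is requested, the server knows what to push:
print(push_map["/main.html"])   # ['/style.css', '/image.png']
```

The transitive link (/background.png via /style.css) would let the server push the whole chain once /main.html arrives.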

Finally, while I think the idea is appealing, I think the use cases
presented are not worth the effort, unless there is backing data.
I would prefer to have SPDY push standardized (on the clients),
implemented by most/all servers, and widely deployed, and only then
gather data about what we can improve with dynamic reprioritization.

Thanks,

Michael Piatek

Nov 6, 2012, 5:38:20 PM
to spdy...@googlegroups.com
Hi Simone,

Thanks for the detailed feedback! Responses inline.

On Tue, Nov 6, 2012 at 1:47 PM, Simone Bordet <sbo...@intalio.com> wrote:

<snip>
> tldr; I don't see the use cases listed in the document to be valid or
> widespread enough to change the SPDY protocol to support them.

I agree that the bar for protocol changes should be high. Broadly,
stream dependencies are designed to improve *every* pageload, not just
those with contention across tabs, push, display:none, or
document.write(). Details below.

(I see now that the doc might be tweaked to de-emphasize the toy
examples. The document.write() example in particular is designed
primarily to demonstrate the workings of the mechanism rather than
motivate it. Even so, the use of document.write() remains prevalent on
many sites, so we can't write it off completely.)

<snip>
> Using priorities to specify sequential resource ordering is weird to
> me; perhaps I did not understand.
> SPDY being multiplexed, I don't want any "sequentiality" otherwise
> I'll go back to uniplex HTTP, no ?
>
> However, I agree that the idea of having the browser specify resource
> dependency is very appealing.
<snip>

I agree. I expect the first bullet point in the doc---specifying a
transfer ordering---to be the dominant use case by far. Nearly every
pageload is a mix of relatively small resources critical to
time-to-interaction (JS, CSS, etc.) and bulk data that's less
important (images, e.g.). The browser wants the JS/CSS first,
serialized, and then the images later. Further, the browser will often
change its mind, and we need to preempt the existing ordering to
reflect that. That's not possible in HTTP today.

Ordering transfers can have a tremendous impact on performance,
particularly for mobile devices. We haven't done a comprehensive study
of the impact, but I'll give you a preview of some lab results we
collected for one very popular website with a mix of large images and
JS.

Onload times for a Nexus S:
- Normal HTTP: 4.7s
- SPDY w/ JS first: 4.4s
- SPDY w/ *serialized* transfer of the JS; i.e., no sharing of
capacity between the JS transfers: 4.1s

Why the reduction? For this page, like many others on mobile, the
performance bottleneck is a mix of limited CPU and slow network (here,
artificially shaped for controlled experimentation.) Using HTTP as-is
wastes CPU capacity --- the JS is delayed by competing with the image
transfers, delaying parsing and execution, which actually contributes
significantly to overall load time. Transferring the JS first improves
matters; serializing the multiple JS files reduces load times further
still.

So, in short, we saw a 600ms reduction in load time on a baseline 4.7s
mobile pageload with no change in network capacity. And, this was
considering Javascript only -- I'm optimistic that further gains are
possible by applying the same rationale to CSS transfers since CSS
often delays layout computation.
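The intuition behind those numbers can be reproduced with a toy model: under fair sharing no script finishes until all of them are nearly done, so parsing cannot start, while serialization overlaps the CPU cost of early scripts with the download of later ones. A rough sketch; the sizes, bandwidth, and per-file CPU cost below are invented for illustration and are not the numbers from the experiment:

```python
def load_time(serialized, size_kb=100, bw_kbps=100, cpu_s=0.5, n=2):
    """Time until n equal-size scripts are both downloaded and executed."""
    if serialized:
        # Full bandwidth to one file at a time: file i arrives at (i+1)*size/bw,
        # and the CPU parses each file as soon as its bytes and the CPU are free.
        cpu_free = 0.0
        for i in range(n):
            arrive = (i + 1) * size_kb / bw_kbps
            cpu_free = max(arrive, cpu_free) + cpu_s
        return cpu_free
    else:
        # Fair sharing: all n files complete together at n*size/bw,
        # then the CPU must process them back-to-back.
        return n * size_kb / bw_kbps + n * cpu_s

print(load_time(serialized=False))  # prints 3.0
print(load_time(serialized=True))   # prints 2.5
```

Even with identical total bytes and bandwidth, the serialized schedule finishes earlier because script 1's parse/execute time hides behind script 2's download.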

Suppose we find many such speedups for many real sites -- would that
justify the change in your view?

> We have done this in Jetty with HTTP's Referer headers, but we had to
> employ a number of corrections to avoid polluting the push cache with
> non relevant information.
> For example, we only establish a dependency for GET requests, without
> a query string, with specific resource extensions (e.g. *.jpg, *.css,
> etc.), with specific content-type (e.g.
> http://en.wikipedia.org/wiki/File:Google_Chrome_screenshot.png is
> actually an HTML page despite the extension), within a specific period
> from the primary resource request, etc.
>
> It would be so incredibly easier if the browser could tell the server:
> I am requesting a secondary resource, and it's related to this primary
> resource.

This sure sounds like a dependency to me :).

> The browser has much if not all of this information, and feeding the
> server will just make SPDY push a lot more efficient.

Agreed!

<snip>
> Furthermore, priority changes will only be effective if a connection
> has two concurrent streams active and those are still active when a
> reprioritization arrives. Not sure how common is this case (slow
> connections ?).

There's another case: I want to preempt a priority I expressed
earlier. For example, I started an image transfer and now realize I
need to request more Javascript, which should preempt that current
transfer. I expect that to be far more common based on our individual
case studies.

> I would prefer to have SPDY push standardized (on the clients) and
> implemented by most/all servers, and widely deployed, and only then
> gather the data about what we can improve with dynamic
> reprioritization.

Hopefully we can do both :). Making push more effective is a use-case
for dynamic dependencies. At present, making push work well requires a
very sophisticated server. For example, sometimes pushing images is
good -- for example, small images that should be inlined for high RTT
clients. But, sometimes pushing images is bad -- for example, if the
image is large and/or competing with CSS/JS transfers. Getting these
decisions right without client feedback is tricky, and a bad decision
can actually hurt performance rather than help. As you point out,
having the browser express dependencies is an incredibly useful tool
along these lines.

Again, thanks for the comments!

-Michael

Simone Bordet

Nov 7, 2012, 9:15:23 AM
to spdy...@googlegroups.com
Hi,

On Tue, Nov 6, 2012 at 11:38 PM, Michael Piatek <pia...@google.com> wrote:
> Onload times for a Nexus S:
> - Normal HTTP: 4.7s
> - SPDY w/ JS first: 4.4s
> - SPDY w/ *serialized* transfer of the JS; i.e., no sharing of
> capacity between the JS transfers: 4.1s
>
> Why the reduction? For this page, like many others on mobile, the
> performance bottleneck is a mix of limited CPU and slow network (here,
> artificially shaped for controlled experimentation.) Using HTTP as-is
> wastes CPU capacity --- the JS is delayed by competing with the image
> transfers, delaying parsing and execution, which actually contributes
> significantly to overall load time. Transferring the JS first improves
> matters; serializing the multiple JS files reduces load times further
> still.

Can you expand on this example?
You mention "JS first": is that one JS file only, with high priority?
What exactly do you mean by "serialized transfer of JS"? Is it a
document.write() that injects a <script> to load another JS?
How many JS files and how many other resources are we talking about here?

I still think that even for this example, the browser can assign CSS,
JS and images different, fixed priorities.
On slow networks, the second JS request triggered by a
document.write() will have high priority, and will effectively suspend
the image download (which has a lower priority by default) to favor
the JS. Jetty does implement priorities.

So I agree on your points, but I think SPDY already supports this use case.
It is just a matter of making the browser smarter, and having the
server implement priorities right.
Perhaps some guidelines to browser implementers, suggesting priorities such as:

html: 0 - highest priority
js: 1
css: 2
css resources: 3
other resources: 5
favicon: 7 - lowest priority

The browser can be smart and use priority 7 (instead of 5) for images
that have a "display: none" CSS rule, etc.
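The suggested static mapping could be sketched like this; the priority values are the ones from the list above, while the function and the display:none demotion rule are just my illustration of the idea:

```python
# Static priority table from the guidelines above (0 = highest, 7 = lowest).
DEFAULT_PRIORITY = {
    "html": 0,
    "js": 1,
    "css": 2,
    "css-resource": 3,
    "other": 5,
    "favicon": 7,
}

def spdy_priority(resource_type, hidden=False):
    """Pick a SPDY priority; demote resources the browser knows are hidden."""
    prio = DEFAULT_PRIORITY.get(resource_type, 5)
    if hidden:
        prio = 7  # e.g. an image under a "display: none" CSS rule
    return prio

print(spdy_priority("js"))                  # prints 1
print(spdy_priority("other", hidden=True))  # prints 7
```

The point being that all of this is computed statically by the browser at request time, with no protocol change.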

So, I am still not convinced we need to change SPDY (the protocol), as
I see the functionality for your case already present.

In your example I would see:

1. browser requests index.html, prio=0;
2. browser receives index.html content, start parsing.
3. browser sees a.js, requests it with prio=1
4. browser prefetches img.png with prio=5;
5. browser executes a.js which does document.write("<script src=b.js>");
6. browser requests b.js with prio=1; img.png download is preempted by
server to favor b.js
7. browser executes b.js that does document.write("<img src=dyn.jpg>")
8. browser needs to retrieve dyn.jpg, but document.write() is
blocking, so it decides to raise dyn.jpg's priority, requesting it with prio=4
9. img.png is again preempted by dyn.jpg
10. etc.

Makes sense ?

> There's another case: I want to preempt a priority I expressed
> earlier. For example, I started an image transfer and now realize I
> need to request more Javascript, which should preempt that current
> transfer. I expect that to be far more common based on our individual
> case studies.

Why not use "standard" or "sensible" priorities, dependent on the
request type, instead?
What cases do you have where you want an image's priority to be
*dynamically* greater than that of a CSS or JS file?

> Hopefully we can do both :). Making push more effective is a use-case
> for dynamic dependencies. At present, making push work well requires a
> very sophisticated server. For example, sometimes pushing images is
> good -- for example, small images that should be inlined for high RTT
> clients. But, sometimes pushing images is bad -- for example, if the
> image is large and/or competing with CSS/JS transfers. Getting these
> decisions right without client feedback is tricky, and a bad decision
> can actually hurt performance rather than help. As you point out,
> having the browser express dependencies is incredibly useful tool
> along these lines.

Agreed on expressing dependencies, but not on implementing them with
dynamic priorities.
IMHO these are two orthogonal concepts.

Just to reiterate: we agree there are semantic differences between resources.
JS and CSS have effects on the layout, rendering and execution of the
page that plain images don't.
I am more inclined to suggest default priorities to implementers:
resources with "side effects", like CSS and JS, get a higher priority
(say 0 or 1), and other resources get lower priorities (say 3+).
And smart browsers can play with those priorities even more (but not
dynamically).
I am not inclined to change the SPDY protocol until I see this
implemented by browsers and servers, or data that shows that a big
performance boost for common cases can only be achieved via dynamic
reprioritization (requiring your suggested protocol changes).

Michael Piatek

Nov 7, 2012, 10:35:44 AM
to spdy-dev
On Wed, Nov 7, 2012 at 6:15 AM, Simone Bordet <sbo...@intalio.com> wrote:

<snip>
> Can you expand on this example ?
> You mention "JS first": is that 1 JS file only with high priority ?
> What do you exactly mean by "serialized transfer of JS" ? It's a
> document.write() that injects a <script> to load another JS ?
> How many JS files and how many other resources are we talking here ?

For this site, there were ~9 JS files. By JS first, I mean splitting
server bandwidth capacity among all of those JS transfers first before
other resources receive any allocation.

By serialized transfer of JS, I mean the server prefers to send all of
js1 before all of js2 before all of js3, etc. Instead of splitting
capacity, the server allocates as much capacity as possible to a
single JS transfer to hasten its completion.

> I still think that even for this example, the browser can assign CSS,
> JS and images a different, fixed, priority.
> On slow networks, the second JS request triggered by a
> document.write() will have high priority, and will effectively suspend
> the image download (by default with a lower priority) to favor the JS.
> Jetty does implement priorities.

document.write() is not a significant factor in this case. Rather, the
idea is that even if there are two JS files back-to-back in the HTML
itself, the browser benefits from receiving -all- of the first one
before -any- of the second (assuming the overall transfer rate is the
same). This allows the browser to start parsing and executing the first
transfer while the second is transferring. If the two transfers share
capacity, CPU time is wasted because of the contention.

> So I agree on your points, but I think SPDY already supports this use case.
> It is just matter of making the browser smarter, and having the server
> implement priorities right.
> Perhaps some guidelines to browser implementers, suggesting priorities as:
<snip>
> Makes sense ?

Indeed -- I agree that better use of the priorities along the lines
you describe would be very helpful, but it isn't a complete solution
for two reasons:

1). The number of priorities is small, and to express their desired
ordering, pages may need arbitrarily many priorities. To give a
concrete example, suppose a page is transferring a large file
composed of many chunks. It wants all of chunk1 before all of chunk2
before all of chunk3, and so on, for many many chunks. This is an
extreme example, but the same is true for pages with many resources
and resource dependencies. At some point, the browser will simply run
out of priorities.
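To make the exhaustion concrete: SPDY/3 carries stream priority in a 3-bit field, so there are only 8 levels. A quick illustration (my own sketch) of what happens when chunks outnumber priorities:

```python
NUM_PRIORITIES = 8  # SPDY/3: 3-bit priority field, 0 (highest) to 7 (lowest)

def chunk_priorities(n_chunks):
    """Assign one priority per chunk to force strict ordering; clamp at 7."""
    return [min(i, NUM_PRIORITIES - 1) for i in range(n_chunks)]

# With 12 chunks, chunks 7..11 all collapse onto priority 7
# and their relative ordering is lost.
print(chunk_priorities(12))  # prints [0, 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7]
```

Anything past the eighth chunk can no longer be ordered by priority alone, which is exactly the limit dependencies are meant to remove.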

2). Priorities today are static. In the example you described above,
this isn't a problem because you had gaps between priorities that
allowed you insert new requests with the desired ordering. In general,
I don't see how the browser can choose gaps to ensure sufficient
wiggle room for later reordering. And, some kinds of reordering simply
can't be predicted (e.g., switching among tabs with active transfers.)

<snip>
> Just to reiterate: we agree there are semantic differences in resources.
> JS and CSS have effects on the layout, rendering and execution of the
> page that plain images don't.
> I am more inclined to suggest to implementers default priorities for
> resources having "side-effect", like CSS and JS, assigned a higher
> priority (say 0 or 1) and other resources assigned lower priorities
> (say 3+).
> And smart browsers can play with those priorities even more (but not
> dynamically).
> I am not that inclined to change the SPDY protocol until I see this
> implemented by browsers and servers, or data that shows that a big
> performance boost for common cases can only be achieved via dynamic
> reprioritization (requiring your suggested protocol changes).

It sounds like we're largely in agreement. The main question is what's
the performance gap between the optimal use of the priorities we have
and the optimal use of dependencies. That might be small, or not :)
But, there are some cases I still don't see how to support with
priorities alone, e.g., serializing video chunk transfers.

-Michael

Thomas Becker

Nov 7, 2012, 10:43:04 AM
to spdy...@googlegroups.com, Michael Piatek
On 11/7/12 4:35 PM, Michael Piatek wrote:
> It sounds like we're largely in agreement. The main question is what's
> the performance gap between the optimal use of the priorities we have
> and the optimal use of dependencies. That might be small, or not :)
> But, there are some cases I still don't see how to support with
> priorities alone, e.g., serializing video chunk transfers.
One of the strengths of SPDY is its simplicity. It's pretty
straightforward to implement on both the server and the client, so we
have to be careful here. Adding dependencies as you describe will
definitely make priority handling on the server/client more
complicated, and my first guess is that it's not only a tad more
complicated to implement. So we have to be really sure that the gains
we'll achieve will outweigh the new complexity.

Cheers,
Thomas

William Chan (陈智昌)

Nov 7, 2012, 10:57:26 AM
to spdy...@googlegroups.com
I'm going to break the proposal down into two components and motivate why I think they're necessary:
* Dynamic reprioritization
* Dependencies

I think dynamic reprioritization of some sort is strictly necessary to fix bugs. Here are my bugs:
* Two tabs to two pages on the same origin. One page hosts a large logfile (let's say the webserver log or something). This will be served at the highest SPDY priority, since it's the main document. The other page will have one main document which is served at equal priority and will eventually be served, but all of its subresources will be of lower priority and will be starved forever until the first tab's main document (which could be many megabytes) finishes downloading. Either you have to relax the strict starvation (and lose sequentiality) or you have to just give up on these edge cases.
* Now, if we have a forward proxy with a SPDY connection to an origin server, we have the analog of the two tab single user case. One client of the proxy may have the potential to request large, high priority resources that starves another user's low priority resources. That's bad.

I think dependencies fix a bug that is suboptimally addressed in SPDY today: sequential ordering. You *could* simply cycle through all priority levels (and potentially increase the number of priorities). That's totally valid, if ugly and brittle. For an audio/video stream where the content is chunked, and you want each chunk to arrive in sequential order, you *could* use a priority value for each chunk. That's clearly possible but suboptimal, and you would have to induce delays to wrap the priority around once you run out of priority values to use.


Bryan McQuade

Nov 7, 2012, 11:59:11 AM
to spdy...@googlegroups.com
Thanks Simone for your feedback! I've tried to go into more detail about why fixed priorities per resource type are not optimal below.

We spent some time poring over the HTML5 spec to make sure that SPDY prioritization enables the browser to render web pages as quickly as possible. It turns out that the prioritization scheme devised in earlier versions of SPDY (fixed priority per resource type) is not particularly well suited to rendering HTML(5) web pages as quickly as possible.

It is not even clear that it's optimal for HTML to have highest priority. Consider the case where an HTML resource is large-ish (e.g. 100kB). HTML is parsed and rendered sequentially as it arrives. But making it pri 0 means the entire HTML must be downloaded before any subresources can be downloaded. Since many pages block their initial paint on JS and CSS in the head of the document, requiring download of all the HTML before downloading e.g. blocking CSS/JS in the head is very harmful for first paint.

If we consider the following case:
<html>
<head>
<script src="foo.js"></script>
<link rel="stylesheet" href="foo.css">
</head>
... additional bytes of content here...
</html>

The optimal transfer order to render the content to the screen as quickly as possible is:
1. HTML up through the script tag
2. all of foo.js
3. HTML up through the stylesheet tag
4. all of foo.css
5. the rest of the HTML, possibly interleaving additional blocking subresources as they are encountered.

This is an oversimplification, since we likely want to give the preload scanner insight into what is coming soon in the HTML so it can stay 1 RTT ahead of what is being transferred from the server, but the key takeaway is that sending all HTML up front is not optimal, assuming our goal is to get the page rendering to the screen as quickly as possible.

This actually suggests that if anything, HTML should possibly be lower priority than CSS and JS. It might be possible to address this by simply reordering the existing static priorities we have today, but there are all sorts of additional complexities introduced by non-blocking scripts, non-matching sheets, etc.

Further, since JS and CSS are (usually) parser/renderer blocking, i.e. CSS and JS must be processed by the parser/renderer in the order they are declared, it is not beneficial to send all outstanding JS files concurrently at the same priority. Consider:

<html>
<head>
<script src="1.js"></script>
<script src="2.js"></script>
// possibly any number of other scripts/stylesheets
</head>
...

SPDY as currently implemented will send 1.js and 2.js concurrently since all JS has a fixed priority. This means that 2.js is competing for bandwidth with 1.js, even though the renderer can't do anything with 2.js until 1.js finishes executing. For 1.js to execute as quickly as possible (in other words, for the renderer to make progress as quickly as possible), it is better to devote all available bandwidth to 1.js first, and then send 2.js immediately after.

This pretty quickly devolves away from static priorities per resource type into an ordered list of resources. Images are likely different and should be sent concurrently but since they don't block the renderer I'm less concerned about them. The goal here is that we have a prioritization scheme in SPDY that enables the renderer to make progress as quickly as possible, and since parsing/rendering of HTML is highly serialized (i.e. scripts and sheets must be processed one after the other in the order they are encountered by the parser in the document), it's more efficient to communicate resource prioritization as a list (which generalizes into a DAG for cases like images and multiple CSS files referenced sequentially in an HTML file) than as a fixed priority per resource.
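The "ordered list generalizing to a DAG" idea can be sketched as a topological scheduler: a stream becomes eligible for bandwidth only once everything it depends on has completed. This is my own sketch of the concept, not the scheme from the proposal doc:

```python
def transfer_order(deps):
    """deps: {stream: [streams it depends on]}. Return a serialized order.

    A stream is eligible only when all its dependencies are done; ties are
    broken alphabetically here just to keep the output deterministic.
    """
    order, done, pending = [], set(), set(deps)
    while pending:
        ready = sorted(s for s in pending if all(d in done for d in deps[s]))
        if not ready:
            raise ValueError("dependency cycle")
        s = ready[0]
        order.append(s)
        done.add(s)
        pending.remove(s)
    return order

# 1.js must fully precede 2.js; both scripts and the image hang off the HTML.
deps = {"index.html": [], "1.js": ["index.html"],
        "2.js": ["1.js"], "img.png": ["index.html"]}
print(transfer_order(deps))  # prints ['index.html', '1.js', '2.js', 'img.png']
```

A real server would allocate bandwidth to all currently eligible streams (e.g. images concurrently) rather than fully serializing, but the DAG is the structure that makes both behaviors expressible.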

You'd asked why not just use existing HTTP. HTTP without pipelining is undesirable due to the round-trip between each resource. HTTP with pipelining is also suboptimal since the client must commit to an order for resources to be transferred up front, and it is not uncommon (the rule, rather than the exception) for the order that the renderer needs resources to be different from the order that the preload scanner discovers resources, due to e.g. document.write, CSS @import, CSS @font-face, child iframes (which are their own sub-contexts that don't block the main context) etc. Also, pipelining cannot send data for other resources to the client if the resource at the head of the pipeline stalls. So we still benefit greatly from the multiplexed nature of SPDY while transferring resources in a way that is optimal for parsing/rendering web pages as specified by the HTML5 specification.

Greg Wilkins

Nov 7, 2012, 8:05:55 PM
to spdy...@googlegroups.com
I'm intrigued by the concept of request dependencies because I think it may also contain the seed of a solution to another concern I have with SPDY.

SPDY allows unlimited requests to be issued in parallel, so for example a page with 80 resources could issue all 80 requests in parallel and the server can serve them in any order it likes (or just fastest first).

In Jetty, we have initially implemented support for this by dispatching the handling of all requests to the normal thread pool.  Potentially for our example above this could mean 80 dispatched tasks - one for each stream/request.

My concern is that if the processing of these requests contains any per-user logic (e.g. authentication/authorization) that accesses memory particular to that one user, then we will cause server inefficiencies, as that memory is likely to be accessed by every core on the server and cache memory will frequently be invalidated.

So if you have 16 cores, then each is likely to process around 5 streams and if each one hits the session objects for that user, then that memory could bounce around all the CPU caches on the server.

It would be far more efficient from a server point of view to introduce some serialisation between those 80 requests, so that only a few cores would be used to service them and the chances of cache hits would be greatly increased. The problem is striking a balance between serialising a connection onto a few processors and avoiding head-of-line blocking. This is further complicated by proxies that may multiplex different users onto the same connection.

So rather than the server try to limit the cores used by applying arbitrary resource restrictions on a per connection basis - perhaps request dependencies could be used to induce a bit more serialisation within the server and achieve a similar effect.

Note that I'm really just thinking out loud at this stage and have no real data to say that this is a real issue and I've not thought very deeply about solutions yet.   I just wanted to throw the thought out there to see if it merits further consideration.

cheers




--
Greg Wilkins <gr...@intalio.com>
http://www.webtide.com
Developer advice and support from the Jetty & CometD experts.

Simone Bordet

Nov 8, 2012, 3:48:12 AM11/8/12
to spdy...@googlegroups.com
Hi,

On Wed, Nov 7, 2012 at 5:59 PM, Bryan McQuade <bmcq...@google.com> wrote:
> Further, since JS and CSS are (usually) parser/renderer blocking, i.e. CSS
> and JS must be processed by the parser/renderer in the order they are
> declared, it is not beneficial to send all outstanding JS files concurrently
> at the same priority. Consider:
>
> <html>
> <head>
> <script src="1.js"></script>
> <script src="2.js"></script>
> // possibly any number of other scripts/stylesheets
> </head>
> ...
>
> SPDY as currently implemented will send 1.js and 2.js concurrently since all
> JS has a fixed priority. This means that 2.js is competing for bandwidth
> with 1.js, even though the renderer can't do anything with 2.js until 1.js
> finishes executing. For 1.js to execute as quickly as possible (in other
> words, for the renderer to make progress as quickly as possible), it is
> better to devote all available bandwidth to 1.js first, and then send 2.js
> immediately after.

But this is not a problem of the SPDY protocol.
This is a problem of the user agent. That Chrome right now sends 1.js
and 2.js concurrently does not mean that other browsers do the same,
or that this behavior is carved in stone.
If Chrome figures out that sending 1.js and 2.js sequentially will
improve its internal rendering, then Chrome just has to queue those
requests and send them sequentially, instead of concurrently.

> This pretty quickly devolves away from static priorities per resource type
> into an ordered list of resources. Images are likely different and should be
> sent concurrently but since they don't block the renderer I'm less concerned
> about them. The goal here is that we have a prioritization scheme in SPDY
> that enables the renderer to make progress as quickly as possible, and since
> parsing/rendering of HTML is highly serialized (i.e. scripts and sheets must
> be processed one after the other in the order they are encountered by the
> parser in the document), it's more efficient to communicate resource
> prioritization as a list (which generalizes into a DAG for cases like images
> and multiple CSS files referenced sequentially in an HTML file) than as a
> fixed priority per resource.

What I understand you want is the ability for a browser to tell
the server:

1. Send me a first chunk of index.html with high priority.
2. Send me 1.js with high priority, and reduce priority of the
index.html resource IFF by any chance you have not already sent it
all.
3. Now I got all 1.js, increase again index.html priority
4. Oh no, I got to load 2.js, so send me that with high priority and
reduce again index.html priority.

You want to be able to say that 1.js is related to index.html, and
that IFF index.html is still in the process of being sent, then the
request of 1.js should take priority over index.html.
You can already do this with SPDY by requesting index.html with prio=1
and if it turns out that 1.js is more important, request it with
prio=0.
Smart servers can avoid starving lower priority streams via smart
scheduling algorithms.

I'm sorry to play the devil's advocate role, but I guess it fosters
the discussion :)

Simone Bordet

Nov 8, 2012, 4:07:03 AM11/8/12
to spdy...@googlegroups.com
Hi,

On Wed, Nov 7, 2012 at 4:57 PM, William Chan (陈智昌)
<will...@chromium.org> wrote:
> * Now, if we have a forward proxy with a SPDY connection to an origin
> server, we have the analog of the two tab single user case. One client of
> the proxy may have the potential to request large, high priority resources
> that starves another user's low priority resources. That's bad.

That's unavoidable right now.

What seems to take shape from these discussions is the existence of a
"super-stream" concept that aggregates requests that depend on each
other.
So we have a super-stream for index.html+1.js+2.js for userA, and
another super-stream for content.js+image.png for userB.
We want to be able to somehow play with priorities *within* the
super-stream, but not across super-streams in order to give every user
its share of bandwidth.
Within this model, I am still convinced that the browser can play well
enough with static priorities without the need of dynamic priorities.

> I think dependencies fixes a bug that is suboptimally addressed in SPDY
> today: sequential ordering. You *could* simply cycle through all priority
> levels (and potentially increase the number of priorities). That's totally
> valid, if ugly and brittle. For an audio/video stream where the audio/video
> is chunked, and you want each chunk to come in sequential order, you *could*
> use a priority value for each audio/video chunk. That's clearly possible but
> suboptimal, and you would have to induce delays to reset/wrap the
> priority once you run out of priority values to use.

Sorry I am not clear on this one.
Why would you want to dynamically change priorities in such a case?
Want to listen to music using a background tab, but load fast pages in
other tabs?
Is this not addressed by assigning a lower static priority to video
and audio streams?

Michael Piatek

Nov 8, 2012, 10:33:34 AM11/8/12
to spdy-dev
On Thu, Nov 8, 2012 at 12:48 AM, Simone Bordet <sbo...@intalio.com> wrote:

<snip>
> But this is not a problem of the SPDY protocol.
> This is a problem of the user agent. That Chrome right now sends 1.js
> and 2.js concurrently does not mean that other browsers do the same,
> or that this behavior is carved in stone.
> If Chrome figures out that sending 1.js and 2.js sequentially will
> improve its internal rendering, then Chrome just have to queue those
> requests and send them sequentially, instead of concurrently.

Concurrent requests are needed to achieve good utilization between
client and server. Sending each request stop-and-wait will degrade
performance if the round trip time is high and the objects are small
(the common case). Pipelining is a work-around, it's not widely
supported, and even if it were, there's no way to reprioritize a
transfer in progress.

<snip>
> You want to be able to say that 1.js is related to index.html, and
> that IFF index.html is still in the process of being sent, then the
> request of 1.js should take priority over index.html.
> You can already do this with SPDY by requesting index.html with prio=1
> and if it turns out that 1.js is more important, request it with
> prio=0.
> Smart servers can avoid starving of lower priority streams via smart
> scheduling algorithms.
<snip>

You can easily get boxed into a corner when using priorities like
this. For example, suppose that the browser chooses priority 1 for
index.html as suggested, but then realizes it wants to preempt that with
1.js and 2.js transferred back-to-back.

William Chan (陈智昌)

Nov 17, 2012, 5:09:21 PM11/17/12
to spdy...@googlegroups.com
I think you may be misunderstanding the new semantics offered by the proposal. Let me try to clarify.
Existing priority mechanism: static, per-stream integer priorities
Proposed new priority mechanisms:
* Grouping streams for prioritization (as you call them, superstreams)
  - Each group is a dependency tree.
  - Within a tree, parents will dominate children in terms of priority
  - Across trees, use integer priority values to control weighted scheduling.
* Dynamic reprioritization
  - Reparenting a stream
  - Reassigning priority to a stream (only pertinent for root nodes)
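The two mechanisms listed above can be sketched together as a toy server-side scheduler (my own illustration; the dict layout and function names are assumptions, not the proposal's wire format): within a tree, a stream is only eligible once its parent has finished, so the parent dominates; across trees, the root integer priority decides who goes first.

```python
# Toy scheduler for the proposal's semantics: parents dominate children
# within a dependency tree; integer priorities arbitrate across trees.
def next_stream(streams):
    """streams: dict id -> {'parent': id|None, 'prio': int, 'done': bool}.
    Return the id of the stream the server should serve next."""
    eligible = [
        sid for sid, s in streams.items()
        if not s["done"] and (s["parent"] is None or streams[s["parent"]]["done"])
    ]
    # Lower integer = higher priority; ties broken by stream id.
    return min(eligible, key=lambda sid: (streams[sid]["prio"], sid), default=None)

if __name__ == "__main__":
    streams = {
        1: {"parent": None, "prio": 0, "done": False},  # index.html
        3: {"parent": 1, "prio": 0, "done": False},     # 1.js depends on it
    }
    assert next_stream(streams) == 1   # parent dominates its child
    streams[1]["done"] = True
    assert next_stream(streams) == 3   # child becomes eligible
```

Reprioritization then amounts to mutating `parent` (reparenting) or `prio` (for root nodes) between calls.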

On Thu, Nov 8, 2012 at 1:07 AM, Simone Bordet <sbo...@intalio.com> wrote:
Hi,

On Wed, Nov 7, 2012 at 4:57 PM, William Chan (陈智昌)
<will...@chromium.org> wrote:
> * Now, if we have a forward proxy with a SPDY connection to an origin
> server, we have the analog of the two tab single user case. One client of
> the proxy may have the potential to request large, high priority resources
> that starves another user's low priority resources. That's bad.

That's unavoidable right now.

What seems to take shape from these discussions is the existence of a
"super-stream" concept that aggregates requests that depend on each
other.
So we have a super-stream for index.html+1.js+2.js for userA, and
another super-stream for content.js+image.png for userB.
We want to be able to somehow play with priorities *within* the
super-stream, but not across super-streams in order to give every user
its share of bandwidth.
Within this model, I am still convinced that the browser can play well
enough with static priorities without the need of dynamic priorities.

Superstreams do not exist yet. That's why this proposal includes dependency trees to provide this concept. But that does not remove the need for dynamic reprioritization. Let's say one user requests a long html document that is 100MB large. Within this document, it references image resources and others that fit within the viewport. Beyond a certain point, the rest of the many megabytes of the html document are not as important as the image resources that would be rendered in the viewport. With static priorities, even though the html document is originally higher priority than the subresources, the browser has no way of conveying that the html document is no longer high priority since the rest of the content is not in the viewport. That's simply a bug that SPDY would allow high priority resources to completely starve low priority resources like this.
 

> I think dependencies fixes a bug that is suboptimally addressed in SPDY
> today: sequential ordering. You *could* simply cycle through all priority
> levels (and potentially increase the number of priorities). That's totally
> valid, if ugly and brittle. For an audio/video stream where the audio/video
> is chunked, and you want each chunk to come in sequential order, you *could*
> use a priority value for each audio/video chunk. That's clearly possible but
> suboptimal, and you would have to induce delays to reset/wrap the
> priority once you run out of priority values to use.

Sorry I am not clear on this one.
Why you want to dynamically change priorities in such case ?
Want to listen to music using a background tab, but load fast pages in
other tabs ?
Is not this addressed by assigning a lower static priority to video
and audio streams ?

I failed to explain myself properly. Let's keep this simple, only talking about the video case. Let's say you have something like:
GET videoframe1
GET videoframe2
GET videoframe3
GET videoframe4
GET videoframe5
...
GET videoframeN

You do *not* want to interleave responses for videoframe1 and videoframe2. You want videoframe1 to starve videoframe2...videoframe2 is dependent on videoframe1. If you use the same priority levels, then the server will interleave them. The SPDY client could request videoframe1 with a higher priority than videoframe2 which then also needs a higher priority than videoframe3 and so on. Eventually you run out of priority values.
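To make the contrast concrete (my sketch, not from the thread): with a fixed number of priority levels the per-frame values eventually collide, whereas a dependency chain expresses strict ordering for any number of frames.

```python
# Expressing "frame i strictly before frame i+1" two ways.
PRIORITY_LEVELS = 8  # SPDY/3 priorities are a 3-bit field

def static_priorities(n_frames):
    # frame i gets priority i, but values collide once i >= PRIORITY_LEVELS,
    # at which point the server starts interleaving frames again
    return [i % PRIORITY_LEVELS for i in range(n_frames)]

def dependency_chain(n_frames):
    # frame i simply depends on frame i-1; there are no values to exhaust
    return [None] + list(range(n_frames - 1))

if __name__ == "__main__":
    prios = static_priorities(10)
    print(prios)                 # frames 8 and 9 collide with frames 0 and 1
    print(dependency_chain(4))   # [None, 0, 1, 2]
```

The chain is just the degenerate linked-list form of the proposal's dependency tree.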

Simone Bordet

Nov 19, 2012, 6:52:28 AM11/19/12
to spdy...@googlegroups.com
Hi,

On Sat, Nov 17, 2012 at 11:09 PM, William Chan (陈智昌)
<will...@chromium.org> wrote:
> Superstreams do not exist yet. That's why this proposal includes dependency
> trees to provide this concept. But that does not remove the need for dynamic
> reprioritization. Let's say one user requests a long html document that is
> 100MB large. Within this document, it references image resources and others
> that fit within the viewport. Beyond a certain point, the rest of the many
> megabytes of the html document are not as important as the image resources
> that would be rendered in the viewport. With static priorities, even though
> the html document is originally higher priority that the subresources, the
> browser has no way of conveying that the html document is no longer high
> priority since the rest of the content is not in the viewport.

Well, in the old days the browser could stop reading the HTML stream,
which would then have been stalled by flow control.
It is only that now the flow control window size is so big that you
can't rely on flow control to do this anymore.
But that would have been "the" way to control this use case.

> That's simply
> a bug that SPDY would allow high priority resources to completely starve low
> priority resources like this.

But in the example about videoframes below you say that you *want* to
starve videoframe2.

> I failed to explain myself properly. Let's keep this simple, only talking
> about the video case. Let's say you have something like:
> GET videoframe1
> GET videoframe2
> GET videoframe3
> GET videoframe4
> GET videoframe5
> ...
> GET videoframeN
>
> You do *not* want to interleave responses for videoframe1 and videoframe2.
> You want videoframe1 to starve videoframe2...videoframe2 is dependent on
> videoframe1. If you use the same priority levels, then the server will
> interleave them. The SPDY client could request videoframe1 with a higher
> priority than videoframe2 which then also needs a higher priority than
> videoframe3 and so on. Eventually you run out of priority values.

I still don't get this example.
If you want videoframe1 *before* videoframe2, you don't ask for
videoframe2 until you have got the whole of videoframe1.
What's the point of making a request for videoframe2 but then wanting
to starve it completely until videoframe1 is completed?

Just to make it clear, I understand the use case of the 100 MiB HTML
and the images on the viewport, and I agree it's a valid use case.

I'd say "use flow control to do that", but somehow now flow control
windows have become so big that we need another flow control mechanism
in the form of reprioritization.

I'm just discussing whether we're not complicating the protocol too
much to get things done.
Perhaps if we get flow control right, we don't need reprioritization;
perhaps there is no way to get flow control better and we need 2 flow
control mechanisms; perhaps we just need to introduce superstreams
explicitly with their own flow control; perhaps we improve the current
flow control so that window updates can be sent explicitly to stall a
stream that would have window space left but the client decided
otherwise; etc.

I see the value of this proposal and the problems it is addressing,
but I'm not totally sold on modifying the protocol yet.
Are we sure we're not adding stuff on top (the proposed changes to
SYN_STREAM and the new REPRIO frame) forgetting to fix broken bits at
the bottom that would have given us a better/simpler solution ?

Thanks,

Bryan McQuade

Nov 19, 2012, 9:43:06 AM11/19/12
to spdy...@googlegroups.com
This is a good point. The issue here is somewhat subtle, and probably should be addressed more clearly in the doc. Ideally, the client would not have to issue a request for videoframe2 until videoframe1 was complete. The issue here is that if the client waits like this, there will be an idle RTT between the end of vf1 being received by the client and the first byte of vf2 being received by the client. It'll go something like this:

vf1 finishes
client sends request for vf2
1/2 rtt later server receives request for vf2
server begins streaming vf2, hopefully w/o any server-side processing delay (which would add additional latency)
1/2 rtt later the first byte of vf2 arrives at the client

This is undesirable and SPDY is working to avoid these idle RTTs where bandwidth goes unused.

There are a couple ways to solve this problem. One is to send the request for vf2 to the server well before vf1 is complete, indicating that vf2 should not be sent until vf1 is finished (current proposal). Then, as soon as the server finishes streaming the last byte of vf1, it knows to immediately begin streaming of vf2, without any idle time. In this case we are making optimal use of available bandwidth, sending data prioritized in the intended order, without having to wait for any idle RTTs.

Alternatively, the client could try to anticipate when the server was about to finish vf1, and send the request for vf2 when vf1 was 1 RTT away from completion. The problem with this is that it's very hard/impossible for the client to estimate when vf1 (or more generally, any HTTP response) is 1 RTT from completion. To estimate this, we need to know the RTT, the available bandwidth, the size of the outstanding response(s), and have a model of the underlying transport (TCP). We can estimate RTT, but the others are very much non-trivial and the client would be guessing. This is especially true for chunk-encoded responses, for which the client does not know the size of the response until the transmission of that response is complete.

Taking all this into account, the only effective scheme I see for sending the frames in order without incurring network delays is for the client to send information about future resources it will need up to the server, as is being proposed currently.
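The cost of each idle round trip is easy to put numbers on (the figures below are illustrative assumptions, not measurements from the thread): the capacity wasted per stop-and-wait gap is roughly RTT times bandwidth.

```python
# Back-of-the-envelope cost of one idle RTT between chained responses.
def idle_rtt_waste(rtt_s, bandwidth_bps):
    """Bytes of link capacity left unused during one idle round trip."""
    return rtt_s * bandwidth_bps / 8

if __name__ == "__main__":
    # 250 ms mobile RTT on a 5 Mbit/s link: ~156 KB of capacity lost per
    # request gap, multiplied across every sequentially-fetched resource.
    print(idle_rtt_waste(0.250, 5_000_000))  # 156250.0 bytes
```

This is why the proposal sends the dependent request early with ordering attached, rather than having the client wait for the preceding response to finish.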

William Chan (陈智昌)

Nov 19, 2012, 11:26:30 AM11/19/12
to spdy...@googlegroups.com
On Mon, Nov 19, 2012 at 3:52 AM, Simone Bordet <sbo...@intalio.com> wrote:
Hi,

On Sat, Nov 17, 2012 at 11:09 PM, William Chan (陈智昌)
<will...@chromium.org> wrote:
> Superstreams do not exist yet. That's why this proposal includes dependency
> trees to provide this concept. But that does not remove the need for dynamic
> reprioritization. Let's say one user requests a long html document that is
> 100MB large. Within this document, it references image resources and others
> that fit within the viewport. Beyond a certain point, the rest of the many
> megabytes of the html document are not as important as the image resources
> that would be rendered in the viewport. With static priorities, even though
> the html document is originally higher priority that the subresources, the
> browser has no way of conveying that the html document is no longer high
> priority since the rest of the content is not in the viewport.

Well, in the old days the browser could stop reading the HTML stream
that would have been stalled by flow control.
It is only that now the flow control window size is so big that you
can't rely on flow control to do this anymore.
But that would have been "the" way to control this use case.

Do you prefer the flow control mechanism? I argue that flow control should not be used for prioritization, but should be used for buffer management.
 

> That's simply
> a bug that SPDY would allow high priority resources to completely starve low
> priority resources like this.

But in the example about videoframes below you say that you *want* to
starve videoframe2.

Yep :) I think Bryan's explained that well now, thanks Bryan! In the previous case (large HTML), I would like to reprioritize the formerly high priority resource (the huge HTML doc) so that other resources can get served. In Bryan's case, it's not really that frame 1 is higher priority than frame 2, but that I explicitly want to specify an ordering, so we don't have to have any idle time.
These are very good questions. I've been planning on sending out a flow control proposal whitepaper for SPDY/4. It rehashes previous discussions on this mailing list and tries to cover the various use cases. I'll get that out ASAP. But I personally think we should try to keep flow control and prioritization separate.

Simone Bordet

Nov 19, 2012, 11:55:47 AM11/19/12
to spdy...@googlegroups.com
Hi,

On Mon, Nov 19, 2012 at 5:26 PM, William Chan (陈智昌)
<will...@chromium.org> wrote:
> Do you prefer the flow control mechanism? I argue that flow control should
> not be used for prioritization, but should be used for buffer management.

Can you elaborate on why flow control should not be used for reprioritization?
Isn't having the client send a REPRIO for stream #1 to favor
stream #3 semantically equivalent to the client stalling stream #1 to
favor stream #3?

I am asking from the point of view of the semantics rather than that of
the implementation (I mean, perhaps now it's not practical to use flow
control, but let's forget about that for a moment to concentrate on
what the feature provides - a way for clients to control what to read
more urgently).

I'm stubborn, I know :)

Hasan Khalil

Nov 19, 2012, 12:02:04 PM11/19/12
to spdy...@googlegroups.com
If one of the objectives of prioritization is for the server to be able to learn more about rendering order, sending a REPRIO is much more advantageous than implying this information.

Also, taking into account proxies, we'll always want to be able to separate flow control from priority.

    -Hasan

Michael Piatek

Nov 19, 2012, 12:45:20 PM11/19/12
to spdy-dev
1) Even though it requires more protocol mechanism, being explicit
about dependencies is actually simpler. Tweaking flow control
parameters to jointly optimize both priority *and* rate allocation is
tricky. Flow control is really about rates, fairness, starvation, and
so on. Dependencies are really about (re)ordering transfers. Trying to
decide how best to optimize all of these with only flow control
windows seems difficult to me, whereas it's easy for a browser to
express dependencies explicitly. And, we understand how to write a
rate controller if all we want to do is keep the pipes full with
little buffering.

2) You can't specify an a priori order with flow control that avoids
round trips; e.g., I want resource C after resource B after resource
A, but not before. The client would need to stall as flow control
updates propagate or accept an imprecise ordering.

Working through the examples in the doc with only flow control update
messages is illustrative: approximating the optimized order requires
more round trips and messages.

Roberto Peon

Nov 19, 2012, 2:52:42 PM11/19/12
to spdy...@googlegroups.com
On Mon, Nov 19, 2012 at 3:52 AM, Simone Bordet <sbo...@intalio.com> wrote:
Hi,

On Sat, Nov 17, 2012 at 11:09 PM, William Chan (陈智昌)
<will...@chromium.org> wrote:
> Superstreams do not exist yet. That's why this proposal includes dependency
> trees to provide this concept. But that does not remove the need for dynamic
> reprioritization. Let's say one user requests a long html document that is
> 100MB large. Within this document, it references image resources and others
> that fit within the viewport. Beyond a certain point, the rest of the many
> megabytes of the html document are not as important as the image resources
> that would be rendered in the viewport. With static priorities, even though
> the html document is originally higher priority that the subresources, the
> browser has no way of conveying that the html document is no longer high
> priority since the rest of the content is not in the viewport.

Well, in the old days the browser could stop reading the HTML stream
that would have been stalled by flow control.
It is only that now the flow control window size is so big that you
can't rely on flow control to do this anymore.
But that would have been "the" way to control this use case.

You'd still have bytes sent until the RCV window filled up. On a high-latency, high-bandwidth link, this could be a substantial amount of data.
The idea is to always keep the pipeline full. Imagine if we had 250ms RTTs (common for mobile).
You really don't want the channel underutilized for the 250ms it'd take to get the next request.
You probably also don't want to second-guess by scheduling requests to arrive just as the previous one completes-- you're guessing and you will get it wrong. Also, it is substantially more complicated for the client than just expressing the dependency.
  

Just to make it clear, I understand the use case of the 100 MiB HTML
and the images on the viewport, and I agree it's a valid use case.

I'd say "use flow control to do that", but somehow now flow control
windows have become so big that we need another flow control mechanism
in the form of reprioritization.

Flow control is solving a different problem-- it is solving the proxy or server's desire to have finite memory requirements for what it is doing.
Prioritization is solving the client's problem.
Unfortunately, we haven't discovered a good mechanism for conflating these without undue compromise.
 

Also, flow control is REALLY EASY to get WRONG :)
Just look at SPDY3. Bleh.
As a base principle, we want to always be using all of the bandwidth for the channel if we have anything to send. Flow control always requires at least one RTT in order to influence the sender. Often that means you'd have stalled the transfer and you're underutilizing the channel for that RTT (and with large RTTs, the duty cycle of underutilization gets very high).
Essentially, using flow-control for prioritization requires predicting the future (one RTT's worth), and that is a very, very difficult unsolved problem :)
-=R

Patrick McManus

Nov 20, 2012, 1:29:52 PM11/20/12
to spdy...@googlegroups.com
Hola spdy-dev friends.

I wanted to comment on Michael's Stream Dependency proposal. First, thanks for doing this! Work totally needs to be done in this area. Second, my apologies for not commenting sooner.. The blame there is mostly my own work load and the complexity of the proposal, but it's also what always happens when something is worked on in great detail like this and then dumped over the public wall..

My number one goal in this space is to support re-prioritization. People have talked about tab switching, which I agree is important. Perhaps as important is simply scrolling the viewport and changing which images it is you want to see. And changing the receiving priority of whatever server push has decided to send is key too. So I like that the proposal has a mechanism for doing that.

The paper makes a pretty convincing argument that some kind of sequencing is important to allow ordered pipelining of some objects (video frames being a clear example).

On the other hand the proposal is pretty complex, with its notion of multiple dependency trees and garbage collection and optional implementations.

The key question in my mind is whether you can accomplish these goals without the dependency logic - i.e. a simple 32-bit priority field and a reprioritization frame.

As Will points out, that gives you plenty of room to establish a "media priority range" and set frame1 at N, frame2 at N+1, etc. The fact that you do eventually run out of room is a minor demerit for the scheme (I would think it would happen pretty infrequently.. if each existing priority were given an equal-sized range, that would be 512 million streams for each class), but it can be fixed without a pause by just reprioritizing all outstanding streams back to a base of N when you're about to wrap.
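A sketch of this wrap-and-rebase scheme (my own illustration; the class name, range bounds, and in-memory bookkeeping are assumptions): hand out sequential priorities inside a media range, and when the range is about to wrap, reassign all outstanding streams back down to the range base, preserving their relative order.

```python
# Sequential priorities within a "media priority range", rebased on wrap.
class MediaPriorityRange:
    def __init__(self, base, limit):
        self.base, self.limit = base, limit
        self.next = base
        self.outstanding = {}  # stream id -> current priority

    def assign(self, stream_id):
        if self.next >= self.limit:
            self._rebase()
        self.outstanding[stream_id] = self.next
        self.next += 1
        return self.outstanding[stream_id]

    def _rebase(self):
        # Re-issue priorities for outstanding streams starting at base,
        # preserving relative order (this is where REPRIO-style frames
        # would be sent to the server).
        by_prio = sorted(self.outstanding, key=self.outstanding.get)
        for offset, sid in enumerate(by_prio):
            self.outstanding[sid] = self.base + offset
        self.next = self.base + len(self.outstanding)

if __name__ == "__main__":
    r = MediaPriorityRange(base=100, limit=104)
    for sid in (1, 3, 5, 7):
        r.assign(sid)
    r.outstanding.pop(1); r.outstanding.pop(3)  # streams 1 and 3 finished
    print(r.assign(9))  # wrap forces a rebase; stream 9 gets 102
```

The demerit Patrick notes is visible in `_rebase`: every outstanding stream must be reprioritized on the wire, whereas a dependency chain never needs it.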

I'm concerned about what happens if the server doesn't effectively handle dependencies (i.e. it is timing them out too fast, or the client hasn't read the settings frame due to a race condition, or "too fast" is subject to variations in RTT that are hard to predict).. When that happens the client sends SYN-STREAMs with effectively no prioritization information in them at all (because they contain useless dep info where a raw prio could be). Likewise, when the client figures out the server won't be doing dependencies it has to fall back to implementing a basic prioritization scheme anyhow - so part of me says we should just see if that alone can solve the problem.

I'm concerned that the deps are complicated enough that we won't see servers implementing them widely, where they might do simple priorities. That effectively bakes an optional-cannot-depend-on-it feature into the protocol and I think we all want to avoid that.

One thing a simple prioritization scheme doesn't solve - a proxy trying to merge two spdy sessions. Assuming they are each using the wrapping "media priority range" approach, those values are basically relative within their own session but aren't mergeable between sessions as raw values anymore. That doesn't bug me much, but I suspect it will bother someone else :)

From the discussion, not the document - I also don't think prioritization/sequencing should have anything to do with flow control. Such tactics are reserved for more desperate situations than a clean slate protocol.

Is this a fair summary?

-Patrick

(Happy and restful US thanksgiving to those happy to receive such wishes.)

Roberto Peon

Nov 20, 2012, 1:46:55 PM11/20/12
to spdy...@googlegroups.com
Yup, pretty good summary.
A few additions/notes:

The server gets to learn from the data provided, which in the case of dependencies is substantially more explicit and thus easier to act upon in the future (e.g. if doing heuristic push, which I suspect will be the way to go).
It takes more operations to change priority if using something which isn't dependency based, and you end up having to predict the future.

We know there are 4 resources we want to load: a,b,c,d. We can tell that b refers to a, and that d refers to c (thus a should load before b, and c should load before d). We later load the css, and discover that c,d are in a non-visible subtree of the DOM. Crap. Now we have to change the priority for c,d. That means we need to *find* the correct priority for c,d, which in turn means we have to track what priorities we gave out to various things so we can discover the appropriate insertion point.
I assert that ends up being more complicated than simply stating the dependency.

Though things would work better if one does implement the full dep tree, one could simply implement a linked list instead. This should be pretty simple.

In the case where the server is timing out things too quickly it does degrade to a basic priority scheme, assuming that there is contention between streams.... on the other hand, if it is timing things out that quickly, it means that things are being transferred in less than an RTT, and only the first prioritization could have mattered... so it doesn't matter what scheme you use, you'll always be too late (and that is probably a good thing.. we want things transferred that fast when possible!).

-=R

William Chan (陈智昌)

Nov 20, 2012, 2:21:28 PM11/20/12
to spdy...@googlegroups.com
This is a wonderful summary, thanks Patrick.

On Tue, Nov 20, 2012 at 8:29 AM, Patrick McManus <mcm...@ducksong.com> wrote:
Hola spdy-dev friends.

I wanted to comment on Michael's Stream Dependency proposal. First, thanks for doing this! Work totally needs to be done in this area. Second, My apologies for not commenting sooner.. The blame there is mostly my own work load and the complexity of the proposal, but its also what always happens when something is worked on in great detail like this and then dumped over the public wall..

I sense the subtle critique :) This gives me an opportunity to say again that I'm working on flow control whitepaper that I hope will elucidate why a session level flow control mechanism is needed in addition to the stream based flow control in SPDY/3, by going through all the use cases. Our previous discussion months back devolved and we felt the need to write up the use cases in a doc before resuming the discussion.

Greg Wilkins

Nov 28, 2012, 6:53:32 PM11/28/12
to spdy...@googlegroups.com
On 21 November 2012 05:46, Roberto Peon <fe...@google.com> wrote:
> We know there are 4 resources we want to load: a,b,c,d. We can tell that b
> refers to a, and that d refers to c (thus a should load before b, and c
> should load before d). We later load the css, and discover that c,d are in a
> non-visible subtree of the DOM. Crap. No we have to change the priority for
> c,d. That means we need to *find* the correct priority for c,d, which in
> turn means we have to track what priorities we gave out to various things so
> we can discover the appropriate insertion point.
> I assert that ends up being more complicated that simply stating the
> dependency.

I agree that a dependency tree is more expressive than just pure
priorities. But I am not sure that we need to tell the server about
it.

Whatever data structure is used to represent the relationships between
the streams, the sender has to eventually reduce the available content
to send to a linear list of frames... which is essentially a priority
ordered list. So what is the benefit of the client telling the server
the dependency tree rather than just the dynamic priorities derived
from such a tree? Is there any latency/round trips avoided by doing
so?
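The reduction Greg describes — collapsing a dependency tree into a linear, priority-ordered list — could be sketched as a breadth-first walk. This is hypothetical illustration code, not part of the proposal; the `Stream` class and the stream IDs are invented:

```python
from collections import deque

class Stream:
    def __init__(self, stream_id, children=None):
        self.stream_id = stream_id
        self.children = children or []

def flatten(root):
    """Reduce a dependency tree to a linear, priority-ordered list:
    parents before children, siblings kept together as peers."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.stream_id)
        queue.extend(node.children)
    return order

# html -> {css, js}; css -> img
img = Stream(7)
css = Stream(3, [img])
js = Stream(5)
html = Stream(1, [css, js])
print(flatten(html))  # [1, 3, 5, 7]
```

The question in the thread is whether the client should do this flattening and send only the result, or send the tree itself.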
Any discovery made in the CSS of a non-visible subtree still needs to
be sent to the server, and it should not matter if it is sent as a
discovered tree or as resolved relative dynamic priorities.

Generally speaking, the client is going to have a lot more CPU
available to calculate such stuff than the server, plus it can
probably cope a lot better with any garbage generated from those
calculations. So if we can offload all the complex calculations of
dependencies to the client and just let the server deal with relative
dynamic priorities between the streams, then that is a big win.

cheers



--
Greg Wilkins <gr...@intalio.com>
http://www.webtide.com
Developer advice and support from the Jetty & CometD experts.

Roberto Peon

unread,
Nov 28, 2012, 7:21:15 PM11/28/12
to spdy...@googlegroups.com
On Wed, Nov 28, 2012 at 3:53 PM, Greg Wilkins <gr...@intalio.com> wrote:
On 21 November 2012 05:46, Roberto Peon <fe...@google.com> wrote:
> We know there are 4 resources we want to load: a,b,c,d. We can tell that b
> refers to a, and that d refers to c (thus a should load before b, and c
> should load before d). We later load the css, and discover that c,d are in a
> non-visible subtree of the DOM. Crap. Now we have to change the priority for
> c,d. That means we need to *find* the correct priority for c,d, which in
> turn means we have to track what priorities we gave out to various things so
> we can discover the appropriate insertion point.
> I assert that ends up being more complicated than simply stating the
> dependency.

I agree that a dependency tree is more expressive than just pure
priorities. But I am not sure that we need to tell the server about
it.

I think the server would want to know (at least the one I am in charge of would!).
One can do far better learning from this explicit information than one can with referrer or relative priorities.
This allows for the server to do a much better job of doing server push for future sessions.
 

Whatever data structure is used to represent the relationships between
the streams, the sender has to eventually reduce the available content
to send to a linear list of frames... which is essentially a priority
ordered list.

We do end up figuring out an ordering for frames, yes; however, the client is setting policy for streams, not frames.

So what is the benefit of the client telling the server
the dependency tree rather than just the dynamic priorities derived
from such a tree?   Is there any latency/round trips avoided by doing
so?

The client can encode some level of uncertainty by having resources as peers of each other.
The server then does what is best for those resources (e.g. if it sees that they're progressive jpgs, send some amount of data for each, etc.), and when in doubt, interleaves.
This would be hard and inefficient to do with a list of relative priorities.

In the case of a client which switches tabs, lowering the priority of a single node is quite a bit cheaper than lowering the priority of all individual streams/resources that might make up that page.
Even more fun is the case where the server is pushing a resource. What priority should it use for that resource? In this case, it just attaches the pushed stream to the associated node, and we have a good idea of its importance even when the client changes priorities for other objects.
This is a fun case because there is a race here. The client doesn't even yet know that the stream exists and it is still adjusting priorities for other things. It will be able to respond to this in (at minimum) one RTT, and if it has deprioritized the resource that caused the push, then, with any scheme other than a dependency-tracking one, we'll be doing suboptimal things for at minimum 1 RTT.
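The tab case can be illustrated with a per-tab placeholder node. This is a hypothetical sketch — `Node`, `reparent`, and the tab names are invented for illustration — but it shows why moving one node re-prioritizes every stream beneath it:

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.parent = parent
        if parent:
            parent.children.append(self)

def reparent(node, new_parent):
    """One move on the wire; every stream under `node` follows."""
    node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)

def depth(node):
    # Deeper nodes are served later; depth stands in for importance here.
    return 0 if node.parent is None else 1 + depth(node.parent)

session = Node("session")
fg_tab = Node("foreground-tab", session)
bg_tab = Node("background-tab", session)
img = Node("img", bg_tab)  # one of many streams in the background tab

print(depth(img))         # 2
reparent(bg_tab, fg_tab)  # single message: the whole tab now defers to the foreground tab
print(depth(img))         # 3
```

With flat priorities, the same demotion would require touching every stream in the tab individually.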

 
Any discovery made in the CSS of a non-visible subtree still needs to
be sent to the server, and it should not matter if it is sent as a
discovered tree or as resolved relative dynamic priorities.

Doing so without something like a list or a tree requires a lot more IO, as one must communicate a lot more about what resources/streams to move in priority.
The complexity of a list is very close to the complexity of a tree. In the case of a list, you'll also require more IO to accomplish the same thing. 


Generally speaking, the client is going to have a lot more CPU
available to calculate such stuff than the server, plus it can
probably cope a lot better with any garbage generated from those
calculations.   So if we can offload all the complex calculations of
dependencies to the client and just let the server deal with relative
dynamic priorities between the streams, then that is a big win.


IO is probably an order of magnitude more expensive than doing a lookup and swapping a couple of pointers. We should optimize for less of that instead if we're worried about the comparative CPU cost.
-=R

Greg Wilkins

unread,
Nov 29, 2012, 5:43:19 PM11/29/12
to spdy...@googlegroups.com
On 29 November 2012 11:21, Roberto Peon <fe...@google.com> wrote:
> IO is probably an order of magnitude more expensive than doing a lookup and
> swapping a couple of pointers. We should optimize for less of that instead
> if we're worried about the comparitive CPU cost.

I'm not convinced that size is a significant issue here.

Firstly sending the initial tree (each stream declares a parent) or
priority (each stream declares a priority) is going to be pretty much
equivalent.

For dynamic changes, many of the use cases appear to be the temporary
elevation of a single stream, so sending a new priority for that is
equivalent to sending a new dependency.

In the case where two disjoint trees are joined, it may be
necessary to adjust the priorities of less than half of the streams
(if more than half, just adjust the others in the other direction).
Typically we are talking about 10s of streams in parallel from a
browser, but even if this grows to 100s, then we are still talking
about less than 1 mtu going in the uncongested direction.

Granted that the savings in CPU/memory of priorities vs trees are also
probably marginal - but I do think there are good savings in complexity
with just priorities. A complex tree algorithm optionally
implemented by the server is going to create a lot more variance in
how resources are served, making it less worthwhile for browsers to do
the work of adjusting priorities in the first place.

So implementation efficiencies aside, my question is are there any
functional benefits of the server knowing the dependency tree rather
than just a set of priorities that may be derived from a tree in the
client?

Roberto Peon

unread,
Nov 29, 2012, 6:33:31 PM11/29/12
to spdy...@googlegroups.com
On Thu, Nov 29, 2012 at 2:43 PM, Greg Wilkins <gr...@intalio.com> wrote:
On 29 November 2012 11:21, Roberto Peon <fe...@google.com> wrote:
> IO is probably an order of magnitude more expensive than doing a lookup and
> swapping a couple of pointers. We should optimize for less of that instead
> if we're worried about the comparitive CPU cost.

I'm not convinced that size is a significant issue here.

Firstly sending the initial tree (each stream declares a parent) or
priority (each stream declares a priority) is going to be pretty much
equivalent.

For dynamic changes, many of the use cases appear to be the temporary
elevation of a single stream, so sending a new priority for that is
equivalent to sending a new dependency.

One of the use-cases is tab prioritization, which would be changing potentially large numbers of stream priorities.
 

In the case where a two disjoint trees are joined, then it may be
necessary to adjust the priorities of less than half of the streams
(if more than half, just adjust the others in the other direction).
Typically we are talking about 10s of streams in parallel from a
browser, but even if this grows to 100s, then we are still talking
about less than 1 mtu going in the uncongested direction.

Agreed, at least for a single client. If a proxy experiences a lot of this, however, it is a bit more bothersome as it is multiplicative with the number of clients.
 

Granted that the savings in CPU/memory of priorities vs trees are also
probably marginal - but I do think there is good savings in complexity
with just priorities. A complex tree algorithm optionally
implemented by the server is going to create a lot more variance in
how resources are served, making it less worthwhile for browsers to do
the work of adjusting priorities in the first place.


I think that the protocol may look marginally less complex, but the implementation of having to track the ordering of priorities ends up being far more complicated than the tree.
I think we're after the same goal (less complexity), but have different ideas about whether or not it will be more complex :/
 
How would you do the reprioritization using just priorities? When I've thought about it, I've always ended up doing the dep tracking and then having to do an additional translation and state tracking. In actual implementation, thus, my feeling is that it is worse.

As an example,
I have:
  a -> b -> c -> d -> e

I want to change this to be (i.e. swapping the priority of b and c):
  a -> c -> b -> d -> e

with a tree (or at least a list), I'd remove the nodes from where they are and attach them to where they're going.
This is completely general, and would work for any reprioritization:

def ReprioritizeDep(node_to_move, new_parent):
  node_a = FindNode(node_to_move)  # O(lg N)
  RemoveNode(node_a)  # O(1)
  node_b = FindNode(new_parent)  # O(lg N)
  AddChild(node_b, node_a)  # O(1)

def RemoveNode(node):
  old_parent = node.parent
  old_parent.children.remove(node)
  for child in node.children:
    child.parent = old_parent
    old_parent.children.append(child)

def AddChild(parent, node):
  parent.children.append(node)
  node.parent = parent
  EMIT_MOVE_TO_WIRE_HANDLER(node.stream_id, parent.stream_id)

The general reprioritization function for something doing it by priority, on the other hand, would be more like this:

def ReprioritizeNode(node_to_move, new_parent):
  p_a = FindPriority(node_to_move)  # O(lg n)
  MarkPriorityUnused(p_a)  # O(1)
  p_b = FindPriority(new_parent)  # O(lg n)
  next_unused_priority, distance = FindFirstUnusedPrioritiesAfter(p_b)  # O(n)
  if distance == 1:
    AssignPriority(node_to_move, next_unused_priority)  # O(1)
  else:
    MoveAllPrioritiesAfterXBy(p_b, 1)  # O(n); delta could be > 1 to help avoid future problems
    AssignPriority(node_to_move, p_b + 1)  # O(1)

def MarkPriorityUnused(p):
  del priorities[p]

def MoveAllPrioritiesAfterXBy(start_p, delta):
  for element in priorities_list:
    priority = element[0]
    if priority > start_p:
      element[0] += delta

def AssignPriority(node, p):
  priorities[p] = node
  EMIT_MOVE_TO_WIRE_HANDLER(node.stream_id, p)
This is more complicated, and likely more costly to implement.
Reprioritization code (for the case where we use only priorities) gets even more complicated if we attempt a function that moves multiple nodes at the same time. The only truly simple operation is a swap...

Do you see this working some other way?

So implementation efficiencies aside, my question is are there any
functional benefits of the server knowing the dependency tree rather
than just a set of priorities that may be derived from a tree in the
client?


  server push learning is a nice benefit of having the dependencies, and it provides proxies with a nice mechanism for changing priorities more elegantly (since these connections are multiplexed again to a completely different entity)

-=R

Greg Wilkins

unread,
Dec 3, 2012, 4:30:27 PM12/3/12
to spdy...@googlegroups.com
On 30 November 2012 10:33, Roberto Peon <fe...@google.com> wrote:
>
> Do you see this working some other way?

I think I had been considering that frequently only a few streams
would need to be re-prioritized for things like tab switching.

I was thinking that you would typically have a 3 deep dependency tree
- HTML to a bunch of resources and some of those resources are CSS
with their own bunch of resources.
Because the dependent resources are discovered as the parent resource
is parsed, then adjusting the priority of the parent should be
sufficient.

However - I now realize the flaw in my thinking. The dependency tree
is only discovered on the client - the server can know the dependency
tree in advance due to whatever push algorithm is used to work it
out. Thus the server has a greater knowledge of the tree (or at
least more knowledge in advance), so my desire of offloading work from
the server to the client is unlikely to be workable as the client has
less knowledge.

Thus I can now see that having a dependency tree on the server side is
a good idea. The browser might only know that it is loading
resource A in one tab and resource B in another tab and be oblivious
to the tree of dependent resources that might be pushed for A and/or
B. Thus the browser sending A>B or A<B when tabs are switched can be
used to re-prioritise all the dependent resources that might be about
to be pushed, unbeknownst to the client.

Actually, it is likely to be a directed acyclic graph of resources on
the server side rather than a tree - either way I'm now convinced that
it is a good idea to have this on the server.

Michael Piatek

unread,
Dec 28, 2012, 2:46:10 PM12/28/12
to spdy...@googlegroups.com

A first cut at updating the spec to reflect dependencies (largely identical to the proposal discussed)

Comments welcome. Thanks!

Ilya Grigorik

unread,
Dec 28, 2012, 7:53:28 PM12/28/12
to spdy...@googlegroups.com
Michael, looks great. A couple of notes and questions as I'm reading through it: 
  • The exclusive nature of priority and deps *is* a tad confusing
  • Is there no case at all when you want to weight multiple children at different priorities? I understand that the hints are advisory, but perhaps we shouldn't eliminate the case entirely? Admittedly, the reuse of the same field is a cute hack.. ;-)
  • "Both priorities and stream dependencies are advisory hints" - I understand why that's phrased as it is, but I'm wondering if this should be made stronger to push people to support it.. What is optional will be left out and forgotten.
  • The timeout MS estimation seems like a chicken and the egg problem. We don't know PLT until we load it for that client.. and it's pretty much arbitrary given all the varying network conditions. Given that, do we even need the MS resolution? Seems like an LRU limit can do the job just fine. And I think we should specify a default recommended value.
ig

Michael Piatek

unread,
Dec 28, 2012, 8:43:38 PM12/28/12
to spdy-dev
Thanks for the feedback. Comments inline.

On Fri, Dec 28, 2012 at 4:53 PM, Ilya Grigorik <igri...@gmail.com> wrote:

> The exclusive nature of priority and deps *is* a tad confusing
> Is there no case at all when you want to weight multiple children at
> different priorities?

I wouldn't say there's no use case, just no compelling example that
we've thought of yet. One example is images above and partially below
the fold, but this is fairly contrived. Any others?

> "Both priorities and stream dependencies are advisory hints" - I understand
> why that's phrased as it is, but I'm wondering if this should be made
> stronger to push people to support it.. What is optional will be left out
> and forgotten.

Processing the messages is not optional. But, as a practical matter,
servers have the final say over their bandwidth allocation policy. I
welcome suggestions for how to reword to encourage productive use of
the information.

> The timeout MS estimation seems like a chicken and the egg problem. We don't
> know PLT until we load it for that client.. and it's pretty much arbitrary
> given all the varying network conditions.

Yes, it's a bit like choosing your TCP timeout. I suspect a
conservative default will cover most cases.

> Given that, do we even need the MS
> resolution? Seems like an LRU limit can do the job just fine. And I think we
> should specify a default recommended value.

Clients do need some signal indicating if the server supports
dependency scheduling. But, you're right that LRU + max nodes covers
most use cases. One exception is if the server supports dependency
scheduling but needs to garbage collect quickly, e.g., in response to
a transient load spike. I'm not sure that's sufficiently compelling,
but others do, so it seems worth discussing. Thoughts?

Ilya Grigorik

unread,
Dec 28, 2012, 10:07:44 PM12/28/12
to spdy...@googlegroups.com
I wouldn't say there's no use case, just no compelling example that
we've thought of yet. One example is images above and partially below
the fold, but this is fairly contrived. Any others?

a) Progressive loading of images, with priority to fill A faster than B? .. That's effectively the same case though.
b) Priorities for long lived streams, ala WebSocket, Server-Sent Events pipe? This one seems far more interesting...
 
> "Both priorities and stream dependencies are advisory hints" - I understand
> why that's phrased as it is, but I'm wondering if this should be made
> stronger to push people to support it.. What is optional will be left out
> and forgotten.

Processing the messages is not optional. But, as a practical matter,
servers have the final say over their bandwidth allocation policy. I
welcome suggestions for how to reword to encourage productive use of
the information.

Need to review the wording, but this comment was in part as a reaction to Will's recent post:

If the browser relies on priorities, and the server doesn't support them, we can actually make things worse. But if the client now has to hedge against the server not supporting them, then it becomes a much more complicated dance, which will (likely, I think) make it much, much less effective in the long run.
 
> Given that, do we even need the MS
> resolution? Seems like an LRU limit can do the job just fine. And I think we
> should specify a default recommended value.

Clients do need some signal indicating if the server supports
dependency scheduling. But, you're right that LRU + max nodes covers
most use cases. One exception is if the server supports dependency
scheduling but needs to garbage collect quickly, e.g., in response to
a transient load spike. I'm not sure that's sufficiently compelling,
but others do, so it seems worth discussing. Thoughts?

In such cases, the server should be able to just evict at will. We shouldn't make any hard promises on "X records will always be there". Plus, having two mechanisms only makes things more complicated: would it keep X records up to Y ms, what happens if I'm under X but over Y ms? vice versa?
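An LRU cap of the kind suggested here fits in a few lines. This is an illustrative sketch only — `PriorityStore` and `max_nodes` are assumed names, not anything from the spec:

```python
from collections import OrderedDict

class PriorityStore:
    """Keep priority/dependency records for at most max_nodes streams,
    evicting the least recently touched record when over the cap."""
    def __init__(self, max_nodes=256):
        self.max_nodes = max_nodes
        self.records = OrderedDict()

    def touch(self, stream_id, record):
        self.records.pop(stream_id, None)   # re-inserting marks it most recent
        self.records[stream_id] = record
        while len(self.records) > self.max_nodes:
            self.records.popitem(last=False)  # evict least recently used

store = PriorityStore(max_nodes=2)
store.touch(1, "root")
store.touch(3, "child of 1")
store.touch(5, "child of 1")  # stream 1's record is evicted
print(list(store.records))    # [3, 5]
```

Under this model the server simply evicts at will and makes no hard promise that any record survives, which is the point being argued above.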

ig

Roberto Peon

unread,
Jan 7, 2013, 6:08:45 PM1/7/13
to spdy...@googlegroups.com
I think we can be a bit less confusing in describing priorities.

There are two obvious options.
1) Larger numbers indicate higher importance (and just say so).
2) Leave the numbers as is and only change the wording to something like:

"Lower numbers in the priority field indicate increased importance. A value of '0' in the priority field indicates highest importance"


Some of the text in other sections needs modification, e.g.:

"The user-agent is free to prioritize requests"
I'd change this to:
"The user-agent is free to prioritize or reprioritize"

More in the section containing the above (HTTP Request/Response under Request) needs changing.


I'd suggest that a simple python implementation for server-side responding to the prioritization be included.


Some additional text describing some baseline recommendations for how browsers are supposed to use priorities is probably warranted.
As an example, a browser SHOULD structure dependencies such that elements in a tab can have their priorities changed with one message, browsers SHOULD prioritize above-the-fold resources higher, and browsers SHOULD prioritize resources in the visible tabs higher than those in hidden tabs.

"created with a priority and not a parent"
should probably read
"which has never had a parent"


          <t>Server scheduling should reflect guidance from dependencies, but it need not be strict.  If all streams in a dependency structure have data available to write at the server, writes should be serviced first for root nodes, then children, then grandchildren, and so on.  But, children that are ready to write should not starve to enforce a scheduling dependency.  In other words, scheduling dependencies should not lead servers to waste capacity.  If data is not available to continue writing the root, for example, a child ready to write should do so.</t>

I'd restate:
The server should always send data from the highest importance installed stream. This may not be the highest importance stream, as that stream may be stalled by flow control or because the server has yet to produce data.
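A minimal sketch of that policy (hypothetical code — the stream fields and function names are invented, not from the spec): walk the dependency levels and write from the first stream that is neither stalled nor out of data.

```python
class Stream:
    def __init__(self, stream_id, children=None, has_data=True, stalled=False):
        self.stream_id = stream_id
        self.children = children or []
        self.has_data = has_data   # server has produced bytes for this stream
        self.stalled = stalled     # blocked by flow control

def next_stream_to_write(roots):
    """Serve roots first, then children, but never let a dependency
    starve a stream that could make progress: skip streams that are
    flow-control stalled or have no data buffered."""
    frontier = list(roots)
    while frontier:
        ready = [s for s in frontier if s.has_data and not s.stalled]
        if ready:
            return ready[0]  # peers at the same level could also be interleaved
        # nothing at this level can progress; descend to the children
        frontier = [c for s in frontier for c in s.children]
    return None

# The root is stalled by flow control, so its child is served instead.
child = Stream(3)
root = Stream(1, [child], stalled=True)
print(next_stream_to_write([root]).stream_id)  # 3
```

This is one reading of "highest importance unstalled stream"; a real scheduler would also re-check the set as window updates arrive.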

"the use of dependencies as a DoS vector"
DoS should be spelled out as denial of service.

"servers are free to drop dependency or priority data at any time without sacrificing correctness."
I like adding a blurb in there:
"since prioritization merely indicates an ordering which serves to improve user experience, servers are free..."

The section beginning "Otherwise we envision servers..." is probably describing the problem backwards.

A race exists between the expiration of priority data about a stream on the server and when the client sends repri frames for that stream.
When this happens, we potentially slow the transfer of items (assuming upload bandwidth is in short supply) and we certainly waste both bandwidth and server CPU time. The race itself should be described explicitly instead of implied.


-=R


On Fri, Oct 26, 2012 at 2:14 PM, Michael Piatek <pia...@google.com> wrote:

Hi all,

I'd like to share a draft proposal for adding a notion of stream dependencies to SPDY. Stream dependencies are intended to improve performance by providing hints to the server about which streams are most important to the client (and the relationships between those streams).

A doc detailing the proposal:

Comments welcome.

Thanks!

-Michael


Michael Piatek

unread,
Jan 8, 2013, 4:57:05 PM1/8/13
to spdy-dev
Thanks for the comments -- edits are reflected here:
https://github.com/CSEMike/SPDY-Specification/commit/f324c8f30ef99b0dbe255cab032e43dbd8faae56

I've also moved the prioritization changes to a feature branch based
on gh-pages to ease future merging.

Detailed responses inline --

On Mon, Jan 7, 2013 at 3:08 PM, Roberto Peon <fe...@google.com> wrote:

> 2) Leave the numbers as is and only change the wording to something like:
>
> "Lower numbers in the priority field indicate increased importance. A value
> of '0' in the priority field indicates highest importance"

Done

> Some of the text in other sections needs modification, e.g.:
>
> "The user-agent is free to prioritize requests"
> I'd change this to:
> "The user-agent is free to prioritize or reprioritize"

Done.

> More in the section containing the above (HTTP Request/Response under
> Request) needs changing.

I tweaked the discussion of prioritization to reflect the new changes
and policy discussion in Section 6. What else did you have in mind?

> I'd suggest that a simple python implementation for server-side responding to
> the prioritization be included.

You mean a mock scheduler? I think we'll need to chat a bit about what
goes into this.

> Some additional text describing some baseline recommendations for how
> browsers are supposed to use priorities is probably warranted.
> As an example, a browser SHOULD structure dependencies such that elements in
> a tab can have their priorities changed with one message, browsers SHOULD
> prioritize above-the-fold resources higher, and browsers SHOULD prioritize
> resources in the visible tabs higher than those in hidden tabs.

Do others agree? I felt like discussion of the corner cases we treat
in the prioritization doc (e.g., above the fold) might wash out the
main point: serialize transfers in browser parse order.

> "created with a priority and not a parent"
> should probably read
> "which has never had a parent"

Done.

> <t>Server scheduling should reflect guidance from dependencies,
> but it need not be strict. If all streams in a dependency structure have
> data available to write at the server, writes should be serviced first for
> root nodes, then children, then grandchildren, and so on. But, children
> that are ready to write should not starve to enforce a scheduling
> dependency. In other words, scheduling dependencies should not lead servers
> to waste capacity. If data is not available to continue writing the root,
> for example, a child ready to write should do so.</t>
>
> I'd restate:
> The server should always send data from the highest importance installed
> stream. This may not be the highest importance stream, as that stream may be
> stalled by flow control or because the server has yet to produce data.

I'm not sure I understand the restatement -- is an installed stream
defined somewhere? (perhaps in yet to be published flow control
edits?)

> "the use of dependencies as a DoS vector"
> DoS should be spelled out as denial of service.

Done.

>
> "servers are free to drop dependency or priority data at any time without
> sacrificing correctness."
> I like adding a blurb in there:
> "since prioritization merely indicates an ordering which serves to improve
> user experience, servers are free..."

Done.

> The section beginning "Otherwise we envision servers..." is probably
> describing the problem backwards.
>
> A race exists between the expiration of priority data about a stream on the
> server and when the client sends repri frames for that stream.
> When this happens, we potentially slow the transfer of items (assuming
> upload bandwidth is in short supply) and we certainly waste both bandwidth
> and server CPU time. The race itself should be described explicitly instead
> of implied.

Done.

Roberto Peon

unread,
Jan 8, 2013, 5:44:20 PM1/8/13
to spdy...@googlegroups.com
On Tue, Jan 8, 2013 at 1:57 PM, Michael Piatek <pia...@google.com> wrote:
Thanks for the comments -- edits are reflected here:
https://github.com/CSEMike/SPDY-Specification/commit/f324c8f30ef99b0dbe255cab032e43dbd8faae56

I've also moved the prioritization changes to a feature branch based
on gh-pages to ease future merging. 

Excellent :)
 

Detailed responses inline --

On Mon, Jan 7, 2013 at 3:08 PM, Roberto Peon <fe...@google.com> wrote:

> 2) Leave the numbers as is and only change the wording to something like:
>
> "Lower numbers in the priority field indicate increased importance. A value
> of '0' in the priority field indicates highest importance"

Done

> Some of the text in other sections needs modification, e.g.:
>
> "The user-agent is free to prioritize requests"
> I'd change this to:
> "The user-agent is free to prioritize or reprioritize"

Done.

> More in the section containing the above (HTTP Request/Response under
> Request) needs changing.

I tweaked the discussion of prioritization to reflect the new changes
and policy discussion in Section 6. What else did you have in mind?

I think I saw some descriptions in that section which conflicted with what you had in the new prioritization section. I'll go over it again to be sure and point it out more clearly :)
 

> I'd suggest that a simple python implementation for server-side responding to
> the prioritization be included.

You mean a mock scheduler? I think we'll need to chat a bit about what
goes into this.

Yup. Essentially, a simple scheduler showing how things are linked/unlinked, etc.
Happy to chat :)
installed was a typo for unstalled (or un-stalled) :/
 
-=R

Peter Lepeska

unread,
Jul 8, 2013, 7:43:58 PM7/8/13
to spdy...@googlegroups.com
Did this proposal end up getting adopted in any form? I'm not seeing the parent dependency information in SYN_STREAM in the latest SPDY draft. 

I really like the idea of the browser sending dependency information that will allow proxies to re-construct the full tree of a page starting with the initial click URL.

Thanks,

Peter


On Sat, Oct 27, 2012 at 1:50 AM, Roberto Peon <grm...@gmail.com> wrote:

As a bonus, in case you didn't figure it out already, sending the dependencies from browser to server means that servers will get far, far better information on which to make decisions about server push, especially when information about page visibility ends up being encoded in the ordering...

As usual, the design intent is to facilitate people writing their pages and letting the rest of the system do what needs be done to get the page rendered efficiently.

I have every hope that Jetty will continue to lead the way there ;)

-=R

Michael Piatek

unread,
Jul 8, 2013, 7:47:13 PM7/8/13
to spdy-dev
We're discussing the latest revision of this here:
https://github.com/CSEMike/SPDY-Specification/commit/485331a8c05820b07586955c018b2ec6eee12962

comments welcome

Peter Lepeska

unread,
Jul 8, 2013, 9:14:38 PM7/8/13
to spdy...@googlegroups.com
Thanks. I'm glad to see this is still being discussed. Have you considered ideas to expand it further to enable better page modelling and therefore smarter PUSH algorithms in SPDY proxies?

If I am a SPDY proxy and I am trying to perform a URL learning scheme, I can use stream dependencies along with referrer headers to attempt to figure out which child streams are associated with which page, or parent URL, which is the URL that initiated the page load. If I subsequently see that parent URL requested again, I can then have a better idea of which child resources to PUSH. But this approach is imperfect because I could have two page loads going to the same origin server happening concurrently in two different tabs. In this case, figuring out which child objects should be associated with which tab/page is tricky. Also, a SPDY proxy will typically only want to push resources that are needed to load the initial screenshot of the page and not resources that are subsequently requested after the page is loaded. But there is no way for the SPDY proxy to distinguish between objects loaded before and after the page has been rendered.
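The learning step described above might be sketched like this (purely illustrative; a real proxy would need eviction, per-origin limits, and awareness of what the client already has cached):

```python
from collections import defaultdict

class PushLearner:
    """Toy model of a proxy learning push candidates: when a stream
    declares a parent page (via dependency info or a Referer header),
    remember the child URL so it can be pushed on the next visit."""
    def __init__(self):
        self.children_by_page = defaultdict(set)

    def observe(self, page_url, child_url):
        self.children_by_page[page_url].add(child_url)

    def push_candidates(self, page_url):
        return sorted(self.children_by_page[page_url])

learner = PushLearner()
learner.observe("https://example.com/", "https://example.com/app.css")
learner.observe("https://example.com/", "https://example.com/app.js")
print(learner.push_candidates("https://example.com/"))
# ['https://example.com/app.css', 'https://example.com/app.js']
```

The ambiguities Peter raises — two concurrent loads of the same page, and post-render requests — are exactly what this naive model cannot distinguish.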

To address these shortcomings, have you considered something along the lines of the following?

1) adding a "page ID" to each SYN_STREAM, populated by the browser to indicate the web page with which the stream is associated, and
2) some way for the browser to indicate that the page has been fully rendered? Perhaps a "fully rendered" event message in a HEADERS frame?

Perhaps the Jetty team has some thoughts along these lines...

Thanks,

Peter 

Michael Piatek

unread,
Jul 8, 2013, 11:35:58 PM7/8/13
to spdy-dev
Speaking only for myself, I'm hopeful that benefits from push won't
require further protocol changes. Some thoughts inline below.

On Mon, Jul 8, 2013 at 6:14 PM, Peter Lepeska <bizzb...@gmail.com> wrote:

> To address these shortcomings, have you considered something along the lines
> of the following?
>
> 1) adding a "page ID" that will be added to each SYN_STREAM that will be
> populated by the browser to indicate the web page with which the stream is
> associated and

This strikes me as quite similar to the referrer header already
provided by browsers.

> 2) some way for the browser to indicate that the page has been fully
> rendered? Perhaps a "fully rendered" event message in a HEADERS frame?

The concept of fully rendered tends to be specific to a given page.
onload isn't quite right; e.g., some JS activity often occurs only
after onload fires.

Roberto Peon

unread,
Jul 9, 2013, 1:15:56 PM7/9/13
to spdy...@googlegroups.com
Providing more information to the server so that it can infer better PUSH orderings was one of the design goals.
page ID, etc. was not something we considered adding, nor a fully rendered flag, as the definitions for either of these can get complicated :)

I've had thoughts about how nice it would be if one could ask the browser to POST timing information about the page after some amount of time had passed (e.g. the waterfall data), but that is a low-priority document and I haven't had time for it.

-=R