Full page load test in stone ridge

Nick Hurley

Nov 30, 2012, 5:04:21 PM

to dev-tech...@lists.mozilla.org

All,

I'm wondering just how strict we want to be in getting the stone ridge
pageload test to be exactly what we would do in the browser. I've been working
under the assumption that we want to measure as little non-necko code as
possible, so I've come up with a scheme to just create channels for each
resource. This scheme would look like this:

1. Create a channel for the top level page (index.html or whatever it is)
2. When onStopRequest fires in that channel, kick off channels for each of
the subresources (images, stylesheets, scripts, iframes, ...)
3. If there are any subresources that have subresources of their own
(iframes come to mind), this process continues recursively.
4. Once every resource is loaded, the test is done.
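
A minimal sketch of the four steps above, assuming the recorded pageload has already been reduced to a map from each URL to the subresources discovered once it finishes loading. The map contents, `load_page`, and the `fetch` callback are hypothetical stand-ins; `fetch` represents "open a channel and wait for onStopRequest" and is not a necko API. The sketch loads resources one at a time for clarity, whereas real channels for sibling subresources would run concurrently.

```python
from collections import deque

# Hypothetical dependency map (illustration only): each URL maps to the
# subresources that become known once that URL has finished loading.
SUBRESOURCES = {
    "index.html": ["style.css", "app.js", "frame.html"],
    "frame.html": ["frame.css", "logo.png"],
}


def load_page(top_level, subresources, fetch):
    """Load top_level, then each resource's subresources once it completes.

    fetch(url) is a placeholder for opening a channel and blocking until
    its onStopRequest fires. Returns the URLs in the order requested.
    """
    requested = []
    seen = {top_level}
    pending = deque([top_level])
    while pending:
        url = pending.popleft()
        requested.append(url)
        fetch(url)  # when this returns, "onStopRequest" has fired for url
        for child in subresources.get(url, []):
            if child not in seen:  # skip duplicates and guard against cycles
                seen.add(child)
                pending.append(child)
    # The queue is empty: every resource is loaded and the test is done.
    return requested


order = load_page("index.html", SUBRESOURCES, fetch=lambda url: None)
print(order)
```

The request order here depends only on the map, which is exactly the limitation discussed next: document placement and prioritization play no part.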

I have everything ready client-side to do this kind of test (including a
script to take a pageload recorded with web-page-replay and figure out what
resources need to be loaded when).

The problem that I see with this approach is that it misses some behaviors
that may be important to page load time that necko may or may not be in
control of: resource prioritization, the actual order in which the
resources are requested (due to placement in the document), etc. The
easy(?) way to get this would be to load a page the way we normally do,
just without any browser chrome (nsIDocshell? I honestly don't know what
the right place to do that would be). Of course, this has the problem of
adding in quite a few non-necko bits that could throw off our timing.

We could also try to simulate the important things that are missing from
the first approach, but this both makes the initial implementation more
difficult and potentially turns the page load test(s) into a
maintenance nightmare if we change things with resource prioritization,
etc. in the future.

I'm inclined to continue down the path I'm already taking (bunch of
channels fired off marginally intelligently) until we find significant
issues with that approach, but I'm asking the intelligent minds on this
list for opinions so that my preconceptions and/or misconceptions don't get
us into trouble sooner rather than later.

-Nick

Patrick McManus

Dec 1, 2012, 10:52:17 AM

to Nick Hurley, dev-tech...@lists.mozilla.org

Nick, great mail. Thanks.

It sounds like you're asking if we care more about network performance of
firefox or network performance of standalone necko.

While both are interesting, I feel strongly that's a no-brainer: I care
about firefox a lot more than the necko library. You cite a bunch of
reasons (and I can cite more) why the interaction matters a lot.
Efforts to replicate that interaction are, as you mention, likely going to
be fragile and inaccurate and in the end don't really save anything. All
we end up doing is trading the testing of something that we care about
(firefox content/layout) for something we don't (stoneridge docshell
emulation code).

Hopefully 792438 illustrates strongly how much the interaction between
content and necko matters to overall performance. I don't think there is
any way of getting around this.

I acknowledge the obvious downside is that a regression (or improvement) in
this test is not necessarily a necko regression, and that creates bugs
that are harder to chase (even when they are necko bugs). Some thought on
how to make diagnosing these things easier from people better schooled in
testing would be welcome (maybe the answer is to subsequently add a layer of
necko-only tests that can be used as diagnostics but not tracked as
metrics), but I'm pretty confident what we need to be optimizing for are
real use cases on realistic networks, and right now firefox doesn't have any
of that at all. So that's what I would want to prioritize.

Even tp5 numbers, using existing talos drivers, would be a huge step
forward if we actually measured them on a variety of networks.

Let's use 792438 as an example. This work was basically blocked on
stoneridge until it became undeniable we had to do something sooner. From
anecdotal tests I am more than convinced that doing that work is the right
thing to do - but how do we use stoneridge to pursue that in a more
disciplined manner? What are the missing pieces to do that meaningfully
compared to what we have in stoneridge now? I don't think it is a big list,
but all of them are critical to having a metric that is worth optimizing
for on that particular use case:

1. Firefox's loading pattern, including the speculative stylesheet
loader
2. a better metric than "page loaded" (i.e. page usable)
3. a network model with configurable IW sizes
4. a network that models queue sizes for induced latency and loss rather
than just stochastic jitter and loss.
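
To illustrate item 4, here is a minimal sketch of a drop-tail FIFO link in which latency and loss are induced by the queue filling up rather than drawn at random per packet. The function name, parameters, and numbers are all assumptions made for illustration; this is not how stoneridge or any particular network emulator models a link.

```python
def queued_link(arrival_times_ms, service_ms, queue_limit, base_delay_ms):
    """Simulate a single drop-tail FIFO bottleneck.

    arrival_times_ms: non-decreasing packet arrival times.
    service_ms: time the link needs to transmit one packet.
    queue_limit: max packets in the system (queued plus in service).
    base_delay_ms: fixed propagation delay added to every delivered packet.
    Returns one entry per packet: its total delay in ms, or None if dropped.
    """
    results = []
    departures = []  # departure times of packets still in the system
    for t in arrival_times_ms:
        # Forget packets that have already left the link by time t.
        departures = [d for d in departures if d > t]
        if len(departures) >= queue_limit:
            results.append(None)  # queue full: the packet is dropped
            continue
        # Transmission starts when the packet ahead of it finishes.
        start = departures[-1] if departures else t
        done = start + service_ms
        departures.append(done)
        results.append(done - t + base_delay_ms)
    return results


# A burst of five packets sees growing delay, then loss.
burst = queued_link([0, 0, 0, 0, 0], service_ms=10, queue_limit=3, base_delay_ms=50)
# The same link with well-spaced packets sees neither.
paced = queued_link([0, 100, 200], service_ms=10, queue_limit=3, base_delay_ms=50)
print(burst, paced)
```

In a model like this, how bursty the sender is (for example, because of a larger initial window) changes the delay and loss it experiences, which a purely stochastic model cannot capture.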

And of course without push-to-stoneridge there is no way to do that kind
of metric-driven development.

I'm very concerned about boiling the ocean and ending up with nothing. But
in many cases (pinterest, 136, cnn, etc.) those things all matter greatly
to perceived performance.

Honza Bambas

Dec 1, 2012, 1:52:07 PM

to dev-tech...@lists.mozilla.org

Patrick, thanks, good answer.

I also lean toward talos-like testing on multiple network
configs (which stoneridge has). Trying to simulate parallel loads with
some hard-coded test is simply wasted effort to me. We also depend
strongly on how things are organized on the main thread, and there
are lots of rendering and layout operations that may block necko
callbacks. That is significant and obviously can't be simulated
by just some small js test.

Page usable is a good idea. I think we can get information from layout
that everything visible, such as images, has been rendered. That is a start IMO.

-hb-

> _______________________________________________
> dev-tech-network mailing list
> dev-tech...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tech-network
>

Jason Duell

Dec 4, 2012, 1:29:52 PM

to dev-tech...@lists.mozilla.org

> I have everything ready client-side to do this kind of test (including a
> script to take a pageload recorded with web-page-replay and figure out what
> resources need to be loaded when).

I agree with Patrick and Honza that the most important priority is
emulating real firefox behavior. But if you've really got the
dumb-script approach working already, that would be a useful data point
to have too. It could be useful to have that to help us analyze if a
regression is purely within necko or not.

Jason

Nick Hurley

Dec 4, 2012, 4:41:48 PM

to dev-tech...@lists.mozilla.org

OK, that all makes sense. So for the first iteration of this, I'll go ahead
and target an honest to ${DEITY} full pageload. If anyone knows of any way
to do this headless, I'd love to hear it, otherwise we're going to have to
go ahead and use a modified Talos pageloader extension to do what we want
(which requires some way to display the browser, which adds new headaches
to the infrastructure).

Also, since we're doing full page loads, we can have any of the data points
from the navigation timing API in our results. Which ones do we want to
have? I'm thinking, at a minimum, the "global picture" (loadEventEnd -
navigationStart), but we can break it down further (for reference:
available points, according to the spec, are at
http://www.w3.org/TR/navigation-timing/#processing-model - if we don't
implement all of those points, then obviously the unimplemented ones aren't
available to us). Thoughts?
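
As a rough sketch of what such a breakdown could look like, assuming the test harness hands back the Navigation Timing attributes as a plain mapping of name to millisecond timestamp: the attribute names below come from the spec linked above, while `timing_breakdown`, the choice of intervals, and the sample values are made up for illustration.

```python
def timing_breakdown(t):
    """Derive durations (ms) from Navigation Timing attribute values."""
    return {
        # The "global picture" mentioned above.
        "total": t["loadEventEnd"] - t["navigationStart"],
        # Finer-grained intervals; which of these are worth tracking
        # is the open question.
        "dns": t["domainLookupEnd"] - t["domainLookupStart"],
        "connect": t["connectEnd"] - t["connectStart"],
        "time_to_first_byte": t["responseStart"] - t["requestStart"],
        "response": t["responseEnd"] - t["responseStart"],
        "dom_content_loaded": t["domContentLoadedEventEnd"] - t["navigationStart"],
    }


# Made-up sample values, not measurements.
sample = {
    "navigationStart": 1000,
    "domainLookupStart": 1005,
    "domainLookupEnd": 1025,
    "connectStart": 1025,
    "connectEnd": 1065,
    "requestStart": 1066,
    "responseStart": 1166,
    "responseEnd": 1216,
    "domContentLoadedEventEnd": 1500,
    "loadEventEnd": 2200,
}
breakdown = timing_breakdown(sample)
print(breakdown)
```

Note that these attributes describe the top-level document only; per-subresource timings would need a different source.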

Patrick McManus

Dec 6, 2012, 12:50:50 PM

to Nick Hurley, dev-tech...@lists.mozilla.org

might want to consider other metrics being proposed in
https://bugzilla.mozilla.org/show_bug.cgi?id=818711