Redesigning the docshell/loadgroup/document interaction

Boris Zbarsky

unread,

May 20, 2016, 10:56:19 AM5/20/16

to

Background: We have a problem right now where the thing representing a
"collection of loads" (a loadgroup) is attached to a docshell, not an
individual document. This causes issues like loads started from unload
events being blamed on the new page and whatnot, though it's possible
that loadinfo will help with some of those.

Proposal: We already have a concept of "document loadgroup"; right now
it's just the one for the docshell the document is in. I'm proposing
that we make it a separate loadgroup, whose parent is the docshell
loadgroup. That is, as you navigate a docshell, each document will get
its own loadgroup, all of these will be contained in the docshell's
loadgroup, and subresources loaded for each document will be placed in
the _document_'s loadgroup.

There is one conceptual problem here, which is where to put the document
request itself. I think there are two options:

1) Put it in the docshell loadgroup. This is probably the simplest
thing to do, but it does have one significant drawback: it means onload
handling has to either stay in the docshell loadgroup or needs hacks to
keep track of the document request if it's moved to the document
loadgroup, since the document request itself is not in the document
loadgroup.

2) Create the loadgroup when we're starting the document request. The
channel would just keep it alive while it's active; if we end up
creating a document, the document can grab it from the channel and store
it. This might just work out OK. It will require a bit more up-front
work to handle onload correctly.

In both cases, we will likely need to think a bit about the various
notification-firing docloader does (web progress listeners, basically)
and how to keep that working reasonably.

I do think we want to keep the docshell-level loadgroup, so we can do
things like cancel everything for all documents when the docshell is
torn down. Not that there should be much in the way of loads for
non-active documents, but I'm not _that_ confident in this.

I believe this setup can be implemented somewhat incrementally, in that
we can land an initial change and then start simplifying various stuff
separately...

Thoughts? Any obvious problems with this plan?

-Boris

Ben Kelly

unread,

May 20, 2016, 11:04:41 AM5/20/16

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On Fri, May 20, 2016 at 10:56 AM, Boris Zbarsky <bzba...@mit.edu> wrote:

> Thoughts? Any obvious problems with this plan?
>

We have the concept of network requests made from workers not associated
with a single document. For example, SharedWorker attached to multiple
documents or a ServiceWorker servicing a push event. In theory we want the
same load grouping behavior to abort requests if those workers terminate,
etc.

If we are refactoring things, can we include that concept as well?

Thanks.

Ben

smaug

unread,

May 20, 2016, 11:28:58 AM5/20/16

to Boris Zbarsky

Sounds good.

(1) vs (2) isn't quite clear to me though. Do you mean with (2) that docshell wouldn't have any loagroup, but a channel?
But then you say there would be docshell-level loadgroup... so doesn't that mean that document's loadgroup would be in it, so (1).

I assume we'd reuse loadgroup in case same inner window is used for many documents (initial about:blank), right?

-Olli

Boris Zbarsky

unread,

May 20, 2016, 1:11:36 PM5/20/16

to

On 5/20/16 11:04 AM, Ben Kelly wrote:
> We have the concept of network requests made from workers not associated
> with a single document. For example, SharedWorker attached to multiple
> documents or a ServiceWorker servicing a push event. In theory we want the
> same load grouping behavior to abort requests if those workers terminate,
> etc.
>
> If we are refactoring things, can we include that concept as well?

Hmm. So the idea here is that each _worker_ has a group of requests
associated with it, right?

That seems somewhat orthogonal to how we handle documents, at first
blush, and seems like we can just do it: associate a loadgroup on the
main thread with each worker.

-Boris

Boris Zbarsky

unread,

May 20, 2016, 1:14:54 PM5/20/16

to

On 5/20/16 11:28 AM, smaug wrote:
> (1) vs (2) isn't quite clear to me though. Do you mean with (2) that
> docshell wouldn't have any loagroup, but a channel?

No. I mean that the docshell would have a loadgroup. When you navigate
a docshell it would create the new channel, create a _new_ loadgroup,
put the channel in the new loadgroup, put the new loadgroup in the
docshell loadgroup.

If we end up creating a document from the channel, that document would
grab the channel's loadgroup and use it as its own loadgroup.

> But then you say there would be docshell-level loadgroup... so doesn't
> that mean that document's loadgroup would be in it, so (1).

In both proposals the document's loadgroup is in the docshell's
loadgroup. The only difference is which loadgroup the document channel
itself lives in.

> I assume we'd reuse loadgroup in case same inner window is used for many
> documents (initial about:blank), right?

Yeah, this part could be complicated. :( Especially if we want to put
the document channel in the document loadgroup: we'd have to move it in
the reuse case or something.

-Boris

Jonas Sicking

unread,

May 20, 2016, 3:13:15 PM5/20/16

to Ben Kelly, Boris Zbarsky, dev-pl...@lists.mozilla.org

On Fri, May 20, 2016 at 8:04 AM, Ben Kelly <bke...@mozilla.com> wrote:
> On Fri, May 20, 2016 at 10:56 AM, Boris Zbarsky <bzba...@mit.edu> wrote:
>

>> Thoughts? Any obvious problems with this plan?
>

> We have the concept of network requests made from workers not associated
> with a single document. For example, SharedWorker attached to multiple
> documents or a ServiceWorker servicing a push event. In theory we want the
> same load grouping behavior to abort requests if those workers terminate,
> etc.
>
> If we are refactoring things, can we include that concept as well?

I *think* the current setup that we have dedicated workers is that
each worker gets its own loadgroup, and that all
XHR/fetch()/importScripts() loads are added to that loadgroup. Ideally
that loadgroup should be added as a child loadgroup of the owning
document, or the owning worker. But I'm not sure that is the case.

For SharedWorkers/ServiceWorkers, we should just skip the last step.
I.e. we should not make the loadgroup be a child of any other
loadgroup.

Is that not what we're currently doing?

/ Jonas

Ben Kelly

unread,

May 20, 2016, 3:28:59 PM5/20/16

to Jonas Sicking, Boris Zbarsky, dev-pl...@lists.mozilla.org

On Fri, May 20, 2016 at 3:12 PM, Jonas Sicking <jo...@sicking.cc> wrote:

> On Fri, May 20, 2016 at 8:04 AM, Ben Kelly <bke...@mozilla.com> wrote:
> > On Fri, May 20, 2016 at 10:56 AM, Boris Zbarsky <bzba...@mit.edu>
> wrote:
> >

> >> Thoughts? Any obvious problems with this plan?
> >

> > We have the concept of network requests made from workers not associated
> > with a single document. For example, SharedWorker attached to multiple
> > documents or a ServiceWorker servicing a push event. In theory we want
> the
> > same load grouping behavior to abort requests if those workers terminate,
> > etc.
> >
> > If we are refactoring things, can we include that concept as well?
>
> I *think* the current setup that we have dedicated workers is that
> each worker gets its own loadgroup, and that all
> XHR/fetch()/importScripts() loads are added to that loadgroup. Ideally
> that loadgroup should be added as a child loadgroup of the owning
> document, or the owning worker. But I'm not sure that is the case.
>
> For SharedWorkers/ServiceWorkers, we should just skip the last step.
> I.e. we should not make the loadgroup be a child of any other
> loadgroup.
>

We don't use "child" load groups at all AFAIK. We inherit the document
LoadGroup for dedicated workers:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#4356

I thought we also inherited the load group from a parent worker, but I
don't see the code that does it.

For SharedWorkers or workers that don't get a LoadGroup (service workers)
we override with a new load group:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#4461

We then add the tab child when a document attaches to the SharedWorker:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#3603

We then do special logic to get the first live tab child we see for the
SharedWorker out:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#2071

This tab child stuff was to support security checking on b2g I think. I'm
not sure we strictly need it any more.

Maybe all of this is fine. The behavior of SharedWorker and ServiceWorker
seem a bit fragile, though. And it seems we get broken network stuff on
workers periodically for things that assume a document. So I just wanted
to avoid any more document specific behavior if we can. Maybe thats not a
concern here.

Jonas Sicking

unread,

May 20, 2016, 6:14:24 PM5/20/16

to Ben Kelly, Boris Zbarsky, dev-pl...@lists.mozilla.org

On Fri, May 20, 2016 at 12:28 PM, Ben Kelly <bke...@mozilla.com> wrote:
> On Fri, May 20, 2016 at 3:12 PM, Jonas Sicking <jo...@sicking.cc> wrote:
>>
>> On Fri, May 20, 2016 at 8:04 AM, Ben Kelly <bke...@mozilla.com> wrote:
>> > On Fri, May 20, 2016 at 10:56 AM, Boris Zbarsky <bzba...@mit.edu>
>> > wrote:
>> >

>> >> Thoughts? Any obvious problems with this plan?
>> >

>> > We have the concept of network requests made from workers not associated
>> > with a single document. For example, SharedWorker attached to multiple
>> > documents or a ServiceWorker servicing a push event. In theory we want
>> > the
>> > same load grouping behavior to abort requests if those workers
>> > terminate,
>> > etc.
>> >
>> > If we are refactoring things, can we include that concept as well?
>>
>> I *think* the current setup that we have dedicated workers is that
>> each worker gets its own loadgroup, and that all
>> XHR/fetch()/importScripts() loads are added to that loadgroup. Ideally
>> that loadgroup should be added as a child loadgroup of the owning
>> document, or the owning worker. But I'm not sure that is the case.
>>
>> For SharedWorkers/ServiceWorkers, we should just skip the last step.
>> I.e. we should not make the loadgroup be a child of any other
>> loadgroup.
>
>
> We don't use "child" load groups at all AFAIK. We inherit the document
> LoadGroup for dedicated workers:
>
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#4356

That doesn't sound good. We should give each worker its own loadgroup.
Independent of if it's a dedicated, shared or service worker.

Or is there a reason to share loadgroup with the document that I'm missing?

> I thought we also inherited the load group from a parent worker, but I don't
> see the code that does it.
>
> For SharedWorkers or workers that don't get a LoadGroup (service workers) we
> override with a new load group:
>
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#4461
>
> We then add the tab child when a document attaches to the SharedWorker:
>
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#3603

This looks like the worker loadinfo, which is different from the
loadgroup (and also different from nsILoadInfo, despite the same name
sadly).

> https://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#2071
>
> This tab child stuff was to support security checking on b2g I think. I'm
> not sure we strictly need it any more.
>
> Maybe all of this is fine. The behavior of SharedWorker and ServiceWorker
> seem a bit fragile, though. And it seems we get broken network stuff on
> workers periodically for things that assume a document. So I just wanted to
> avoid any more document specific behavior if we can. Maybe thats not a
> concern here.

It's generally a concern that we still have a lot of code which rely
on loads happening through a document.

In part because it's convenient to have a single object through which
you can get all needed information when doing "networky stuff".

The best solution i've been able to think of so far is to rely less on
nsILoadContext (which is strongly tied to docshells, which workers
don't always have) and more on nsILoadInfo (which is information
sitting directly on channels). We're at the point now where we should
be able to start deprecating stuff off of nsILoadContext.

Something like this would also be very helpful:
https://bugzilla.mozilla.org/show_bug.cgi?id=1259873

/ Jonas

Jonas Sicking

unread,

May 20, 2016, 7:41:42 PM5/20/16

to Ben Kelly, Boris Zbarsky, dev-platform

On Fri, May 20, 2016 at 4:04 PM, Ben Kelly <bke...@mozilla.com> wrote:

> On May 20, 2016 6:14 PM, "Jonas Sicking" <jo...@sicking.cc> wrote:
>> That doesn't sound good. We should give each worker its own loadgroup.
>> Independent of if it's a dedicated, shared or service worker.
>>
>> Or is there a reason to share loadgroup with the document that I'm
>> missing?
>

> Not sure. I think it's been that way for a long time. I just added the
> code to create a load group if we didn't get one from the document.

Yeah, I think this was done the wrong way from the beginning. It
didn't matter that much when we just had dedicated workers. But now
that we have ServiceWorkers/SharedWorkers, I think it's making our
world more complicated.

/ Jonas

Ben Kelly

unread,

May 20, 2016, 8:55:16 PM5/20/16

to Jonas Sicking, Boris Zbarsky, dev-pl...@lists.mozilla.org

On May 20, 2016 6:14 PM, "Jonas Sicking" <jo...@sicking.cc> wrote:
> That doesn't sound good. We should give each worker its own loadgroup.
> Independent of if it's a dedicated, shared or service worker.
>
> Or is there a reason to share loadgroup with the document that I'm
missing?

Not sure. I think it's been that way for a long time. I just added the
code to create a load group if we didn't get one from the document.

I thought the load group was related to the tab child because the group has
notification callbacks, etc.

Ben

Anne van Kesteren

unread,

Jul 7, 2016, 6:33:44 AM7/7/16

to Boris Zbarsky, dev-platform

On Fri, May 20, 2016 at 4:56 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
> 2) Create the loadgroup when we're starting the document request. The
> channel would just keep it alive while it's active; if we end up creating a
> document, the document can grab it from the channel and store it. This
> might just work out OK. It will require a bit more up-front work to handle
> onload correctly.

How does this approach work for <embed> or <object> where we fetch
first (presumably as part of the parent group) and then decide whether
it becomes its own thing the moment we have all HTTP headers?

--
https://annevankesteren.nl/

Boris Zbarsky

unread,

Jul 7, 2016, 9:45:30 AM7/7/16

to

The way it works right now is that the <embed> or <object> load is part
of the loadgroup for the parent document until we decide it's creating a
child browsing context (docshell). At that point we create the child
browsing context and move the load into the loadgroup associated with
that browsing context.

So in the new world, we'd keep doing the same, but in addition to
creating the child docshell and its loadgroup at the point when we
detect that we need a child browsing context we would also create the
child document loadgroup and move the embed/object load into _that_.

-Boris

Henri Sivonen

unread,

Jul 15, 2016, 6:33:39 AM7/15/16

to dev-platform

On Fri, May 20, 2016 at 5:56 PM, Boris Zbarsky <bzba...@mit.edu> wrote:
> Background: We have a problem right now where the thing representing a
> "collection of loads" (a loadgroup) is attached to a docshell, not an
> individual document.

Slightly off-topic but:

The notion of attaching load or loadingness into the docshell instead
of the document has been a problem even for understanding the status
of the document itself, so attaching stuff to the doc instead of the
docshell seems generally good.

While I've been looking at other things, have we already gained an
object that tracks the load of a document all the way from when the
navigation starts at link click through redirects before the document
object exists? (Basically what's described at
https://github.com/servo/servo/pull/3714#issuecomment-67650099 )

--
Henri Sivonen
hsiv...@hsivonen.fi
https://hsivonen.fi/

Boris Zbarsky

unread,

Jul 15, 2016, 9:52:14 AM7/15/16

to

On 7/15/16 6:33 AM, Henri Sivonen wrote:
> While I've been looking at other things, have we already gained an
> object that tracks the load of a document all the way from when the
> navigation starts at link click through redirects before the document
> object exists?

Sort of. There's the LoadInfo.

-Boris