On 5/28/12 1:17 AM, Bill McCloskey wrote:
> We log this to the error console under "Mark Roots:". On my 6-year-old iMac I'm seeing 8ms for 11 tabs, which is a little higher than I expected, but it is an older computer. However, that includes all root marking, which encompasses a lot of things. I'm not sure what percentage of that time is for XPConnect roots. That would be interesting to check.
Indeed. My "Mark Roots" times, on an MBP that's less than 2 years old,
are in the 20ms range, but I have probably around 20-30 active tabs and
another 50-60 not-yet-loaded ones. But again, it's not clear what
fraction of that is the XPConnect roots.
> I guess it might be useful to know what a common worst-case scenario would be. For example, how much more memory would we use when loading a big static table?
That depends on how big the JSObjects end up being, in practice.
Basically at least 4 words plus 8 bytes (assuming one reserved slot),
plus whatever the amortized cost of shapes, etc, etc is, right?
> I don't even have a ballpark estimate for how many DOM objects a typical page uses per table cell. Is it closer to 1 or 10 or 50?
Actual DOM nodes, 1 per cell, plus one per row, plus 2 for the table. I
_think_ most of the other things involved lazily allocate their entire
DOM reflection, so wouldn't have DOM objects at all by default.
So figure 1 DOM object per cell.
Yes, I'm well aware. ;)
> In the common case (loading from fixed slots), it's 3 loads from (likely) 2 distinct cache lines.
Yep.
> If we knew that we were accessing a fixed slot, which seems quite feasible, then it could be reduced to a single 8-byte load.
That would help significantly, yes. Would still be a noticeable perf
hit in some cases for 32-bit builds due to the increased cache pressure
(e.g. see
https://twitter.com/bz_moz/status/73784940755566592 for a case
where that effect is visible with 64-bit vs 32-bit Gecko builds), but
would make this a lot more palatable.
> For stores we would still have to worry about a write barrier, but we're trying to enforce those anyway.
And stores are a lot less common anyway.
> Well, I think this is something we could do gradually. For example, I looked through about half the NS_IMPL_CYCLE_COLLECTION_TRACE_BEGIN implementations. They all seem to trace a single field of a C++ object. So it seems like a lot of stuff is of the simple variety.
Indeed. Is there a benefit to doing this partway? Seems like if we
need extra complexity to support the unconverted cases then we need it
anyway, and the other approach trades off cycle-collection complexity in
the C++ for complexity in implementing members...
>> We might be able to expose the needed number of reserved slots as a
>> static method on the underlying object class. Except in some cases we
>> have different C++ implementations, with different sets of members, but
>> sharing a single API, for the same JSClass.
>
> I don't understand very well how the new DOM bindings are generated.
The input is an IDL file.
The output is the following things:
1) A JSClass.
2) A bunch of JSNatives that know how to get the C++ object from
|this|, convert arguments to C++ things, and call a function on the
C++ objects.
The rest of the work is done in the C++ object. The functions called in
#2 above might be virtual functions, with different implementations in
different objects. Multiple different C++ classes can share a single
JSClass as described above, as long as they have a common superclass.
In fact, there are various cases in the DOM specs in which two objects
are required to have the same prototype (and hence the same JSNatives as
described above) and have totally different behavior for those
methods/properties. Right now we implement this via polymorphism in
C++. It _could_ be done with a single implementation class and branches
on something, in theory, but in practice trying to use a single impl
class for both nsComputedDOMStyle and DOMCSSDeclarationImpl, say, would
be ... difficult.
> It seems to me like code generation is a *perfect* way to address these issues: it gives us a simple way to do experiments by changing the implementation for all the new DOM objects at once
We don't code-generate the implementation. We code-generate glue code
that calls into already-existing implementations. That's why they're
called "bindings". ;)
> However, it sounds like a mixture of code generation and normal human-coded C++ is used. What's the dividing line?
The code generation is used to implement WebIDL, basically. It handles
conversion from JSAPI stuff to WebIDL-like types and invocation of the
actual implementation methods, using the IDL files as input and
processing them according to the WebIDL rules.
Since the behavior of the implementation methods themselves is typically
described in specification prose, not in a machine-readable format, it's
rather impossible to codegen those....
-Boris