RuleProcessorData and performance

RuleProcessorData and performance Boris Zbarsky 10/18/09 10:37 PM
The RuleProcessorData constructor pretty routinely shows up in profiles
as a few % of total time.  For profiles of things like querySelector or
frame construction it can be closer to 10%.

The fundamental design assumption with RuleProcessorData is that a
single element is compared to a bunch of selectors, so that it makes
sense to pre-grab all sorts of information about the element and then
have fast access to it.

But in practice, not all RuleProcessorDatas are created equal.  The
querySelector ones only ever get compared to one selector.  The ones
created by the LHS selector of various combinators might not be compared
to as many selectors as the RHS.

I did some measuring (excluding XUL elements for now due to bug 523047,
but I should consider measuring even with those rules in), and after a
bit of gmailing I have some data in histogram form (sadly the histogram
bar sizes seem to be the log_2 of the number, not the number itself).
"RHS selectors" refers to rule processor data that was not created due
to needing to test an ancestor or prev sibling due to a combinator.
"LHS selectors" refers to data created due to combinators.  "Matches
against ___ selectors" are the number of SelectorMatches calls for data
of that type.  "Matches against non-tag ___ selectors" are numbers of
such calls that make it past the tag/namespace check up front.  "Matches
against non-tag non-class non-id ___ selectors" are numbers of such
calls that make it past the tag, class, and id checks (I moved the class
and id checks earlier in SelectorMatches).
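
For readers decoding the histograms in this message, the bucket
boundaries and bar lengths are consistent with a power-of-two scheme.
Below is a minimal C++ sketch of that reading, assuming each bar has
ceil(log2(count)) stars and each bucket past [2] doubles in width.  The
names CeilLog2, BucketIndex, and Bar are hypothetical; this is not the
measurement code that produced the data.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Smallest k such that 2^k >= n (returns 0 for n <= 1).
// The k < 63 guard keeps the shift from overflowing for huge n.
int CeilLog2(std::uint64_t n) {
  int k = 0;
  std::uint64_t power = 1;
  while (power < n && k < 63) {
    power <<= 1;
    ++k;
  }
  return k;
}

// Assumed bucket layout: [0], [1], [2], [3,4], [5,8], ..., [257,512]+.
// The last bucket (index 10) is open-ended.
constexpr int kLastBucket = 10;

int BucketIndex(std::uint64_t matches) {
  if (matches == 0) {
    return 0;
  }
  const int index = 1 + CeilLog2(matches);
  return index < kLastBucket ? index : kLastBucket;
}

// Assumed bar rendering: one star per bit of ceil(log2(count)).
std::string Bar(std::uint64_t count) {
  return std::string(static_cast<std::size_t>(CeilLog2(count)), '*');
}
```

Under these assumptions a count of 21314 gets a 15-star bar and a count
of 60864 gets a 16-star bar, so a bar that is one star longer stands for
up to twice as many RuleProcessorDatas.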

Summary: for the "LHS" case, 88% of the RuleProcessorData need nothing
beyond tag/namespace/id/class to rule out every selector they are tested
against, and 95% are matched against at most one selector that includes
anything other than tag/namespace/id/class.  If I only look at
tag/namespace, then the numbers are 2% and 11% respectively.  For the
"RHS" case, the tag/namespace/id/class numbers are 70% and 71%
respectively.  The tag/namespace numbers are 26% and 50%.  Note that in
general there are 3x as many "LHS" SelectorMatches calls as there are
"RHS" ones on gmail.
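
The 88% and 95% figures for the LHS case can be reproduced from the
bucket counts in the histogram data later in this message.  The sketch
below is one way to do that arithmetic in C++; FractionUpTo and the
constant name are hypothetical, and the counts are copied from the
"Matches against non-tag non-class non-id LHS selectors" histogram.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Fraction of all datas whose match count falls in buckets 0..lastBucket.
double FractionUpTo(const std::vector<std::uint64_t>& counts,
                    std::size_t lastBucket) {
  const std::uint64_t total =
      std::accumulate(counts.begin(), counts.end(), std::uint64_t{0});
  if (total == 0) {
    return 0.0;
  }
  const std::size_t end = std::min(lastBucket + 1, counts.size());
  const std::uint64_t head = std::accumulate(
      counts.begin(), counts.begin() + static_cast<std::ptrdiff_t>(end),
      std::uint64_t{0});
  return static_cast<double>(head) / static_cast<double>(total);
}

// gmail, LHS datas: number of datas per count of SelectorMatches calls
// that got past the tag, class, and id checks (buckets 0, 1, ..., 10+).
const std::vector<std::uint64_t> kGmailLhsPastIdClass = {
    224166, 17347, 6340, 1045, 4040, 608, 47, 0, 0, 0, 16};
```

FractionUpTo(kGmailLhsPastIdClass, 0) is 224166 / 253609, or about 0.88,
and FractionUpTo(kGmailLhsPastIdClass, 1) is 241513 / 253609, or about
0.95.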

I'm going to try to get some numbers from our Tp set as well, but at
first glance it looks like at least for LHS selectors we should consider
lazy-initializing everything other than tag/namespace/id/class.  For RHS
selectors it might even be worth lazy-initializing id/class, since a
quarter of the selectors don't need even those, and they aren't that
cheap to initialize...

Histogram data:

mean Matches against LHS selectors 12.516, std. deviation 10.1721, max 50
         [     0]:        0
         [     1]:    21314 ***************
         [     2]:    60864 ****************
[     3,      4]:    31481 ***************
[     5,      8]:    67105 *****************
[     9,     16]:    64733 ****************
[    17,     32]:     8112 *************
[    33,     64]:        0
[    65,    128]:        0
[   129,    256]:        0
[   257,    512]+        0

mean Matches against non-tag LHS selectors 10.3095, std. deviation 9.08071, max 47
         [     0]:     4244 *************
         [     1]:    22969 ***************
         [     2]:    68128 *****************
[     3,      4]:    50734 ****************
[     5,      8]:    42523 ****************
[     9,     16]:    56968 ****************
[    17,     32]:     8043 *************
[    33,     64]:        0
[    65,    128]:        0
[   129,    256]:        0
[   257,    512]+        0

mean Matches against non-tag non-class non-id LHS selectors 0.20821,
std. deviation 0.699949, max 10
         [     0]:   224166 ******************
         [     1]:    17347 ***************
         [     2]:     6340 *************
         [     3]:     1045 ***********
         [     4]:     4040 ************
         [     5]:      608 **********
         [     6]:       47 ******
         [     7]:        0
         [     8]:        0
         [     9]:        0
         [    10]+       16 ****

mean Matches against RHS selectors 16.7178, std. deviation 35.2128, max 501
         [     0]:    22216 ***************
         [     1]:    21129 ***************
         [     2]:        2 *
[     3,      4]:      171 ********
[     5,      8]:    21572 ***************
[     9,     16]:     9827 **************
[    17,     32]:     7552 *************
[    33,     64]:      718 **********
[    65,    128]:     3460 ************
[   129,    256]:      160 ********
[   257,    512]+        0

mean Matches against non-tag RHS selectors 8.09371, std. deviation
15.0481, max 75
         [     0]:    60925 ****************
         [     1]:      454 *********
         [     2]:        4 **
[     3,      4]:     2015 ***********
[     5,      8]:     4961 *************
[     9,     16]:    10514 **************
[    17,     32]:     7471 *************
[    33,     64]:      463 *********
[    65,    128]:        0
[   129,    256]:        0
[   257,    512]+        0

mean Matches against non-tag non-class non-id RHS selectors 7.94906,
std. deviation 15.0236, max 75
         [     0]:    61272 ****************
         [     1]:      107 *******
         [     2]:      190 ********
[     3,      4]:     3136 ************
[     5,      8]:     3816 ************
[     9,     16]:    10376 **************
[    17,     32]:     7447 *************
[    33,     64]:      463 *********
[    65,    128]:        0
[   129,    256]:        0
[   257,    512]+        0

-Boris

Re: RuleProcessorData and performance Boris Zbarsky 10/19/09 7:07 PM
On 10/19/09 1:37 AM, Boris Zbarsky wrote:
> I'm going to try to get some numbers from our Tp set as well

tp for non-XUL nodes histograms are below.  Summary:  For the "LHS"
case, 80% of RuleProcessorData need only tag/namespace/id/class to
establish that nothing matches.  Another 12% are matched against only
one selector after matching on tag/namespace/id/class.

For the "RHS" case, the numbers are 72% and 73% respectively.  In the
RHS case, tag+namespace alone also accounts for a pretty high
proportion of non-matches (60%).

There are 2.25 times as many "LHS" datas as "RHS" ones.

Since RHS datas always need the tag/id/classes due to ContentEnumFunc
needing that information, I propose we stick to eagerly getting those in
the RuleProcessorData ctor for now but switch to lazily getting
mIsLink/mLinkState and mEventState.  Will file a bug on this.
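
As an illustration of the eager/lazy split proposed above, here is a
minimal C++ sketch.  It is not Gecko code: FakeElement, LazyRuleData,
and the Query* methods are hypothetical stand-ins, mLinkState is left
out for brevity, and std::optional (C++17) is just one convenient way to
record whether a value has been fetched yet.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a content node; not a Gecko type.
struct FakeElement {
  std::string tag;
  std::string id;
  std::vector<std::string> classes;
  bool isLink = false;
  std::uint32_t eventState = 0;
  mutable int stateQueries = 0;  // counts the "expensive" lookups

  bool QueryIsLink() const {
    ++stateQueries;
    return isLink;
  }
  std::uint32_t QueryEventState() const {
    ++stateQueries;
    return eventState;
  }
};

class LazyRuleData {
 public:
  // Tag, id, and classes are grabbed eagerly (copied here for
  // simplicity), since every data needs them.
  explicit LazyRuleData(const FakeElement& aElement)
      : mElement(aElement),
        mTag(aElement.tag),
        mID(aElement.id),
        mClasses(aElement.classes) {}

  const std::string& Tag() const { return mTag; }
  const std::string& ID() const { return mID; }
  const std::vector<std::string>& Classes() const { return mClasses; }

  // Link and event state are fetched on first use, then cached.
  bool IsLink() {
    if (!mIsLink.has_value()) {
      mIsLink = mElement.QueryIsLink();
    }
    return *mIsLink;
  }
  std::uint32_t EventState() {
    if (!mEventState.has_value()) {
      mEventState = mElement.QueryEventState();
    }
    return *mEventState;
  }

 private:
  const FakeElement& mElement;
  std::string mTag;
  std::string mID;
  std::vector<std::string> mClasses;
  std::optional<bool> mIsLink;
  std::optional<std::uint32_t> mEventState;
};
```

With this shape, a data that is rejected on tag/namespace/id/class never
touches the link or event state lookups, and a data that does need them
pays for each lookup once.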

Raw data:

mean Matches against LHS selectors 37.6586, std. deviation 81.5598, max 933
         [     0]:        0
         [     1]:  2828614 **********************
         [     2]:  1750936 *********************
[     3,      4]:  1960725 *********************
[     5,      8]:  1862980 *********************
[     9,     16]:  2247511 **********************
[    17,     32]:  3441357 **********************
[    33,     64]:  1546569 *********************
[    65,    128]:   622194 ********************
[   129,    256]:   123740 *****************
[   257,    512]+   104106 *****************

mean Matches against non-tag LHS selectors 22.8737, std. deviation 40.857, max 738
         [     0]:  1337259 *******
         [     1]:  2985507 *******
         [     2]:  1246616 *******
[     3,      4]:  1984317 *******
[     5,      8]:  2052932 *******
[     9,     16]:  2990938 *******
[    17,     32]:  2641725 *******
[    33,     64]:   998051 ******
[    65,    128]:   142351 ******
[   129,    256]:   100812 ******
[   257,    512]+     8224 ****

mean Matches against non-tag non-class non-id LHS selectors 0.713781,
std. deviation 3.45092, max 130
         [     0]: 13125146 ************************
         [     1]:  2022830 *********************
         [     2]:   431570 *******************
         [     3]:   186678 ******************
         [     4]:   161101 ******************
         [     5]:    97153 *****************
         [     6]:    62568 ****************
         [     7]:    37605 ****************
         [     8]:    62699 ****************
         [     9]:    45887 ****************
         [    10]+   255495 ******************

mean Matches against RHS selectors 19.8253, std. deviation 48.354, max 986
         [     0]:  1389274 *******
         [     1]:  1838839 *******
         [     2]:   155109 ******
[     3,      4]:   168683 ******
[     5,      8]:  2058315 *******
[     9,     16]:   732783 ******
[    17,     32]:   438646 ******
[    33,     64]:   293408 ******
[    65,    128]:   192554 ******
[   129,    256]:    41034 *****
[   257,    512]+     6984 ****

mean Matches against non-tag RHS selectors 15.104, std. deviation
45.6789, max 960
         [     0]:  4403536 *******
         [     1]:   576197 ******
         [     2]:   149844 ******
[     3,      4]:    63938 *****
[     5,      8]:   528818 ******
[     9,     16]:   711039 ******
[    17,     32]:   430642 ******
[    33,     64]:   278387 ******
[    65,    128]:   134934 ******
[   129,    256]:    31510 *****
[   257,    512]+     6784 ****

mean Matches against non-tag non-class non-id RHS selectors 14.416, std.
deviation 45.6187, max 960
         [     0]:  5252708 *******
         [     1]:    74599 *****
         [     2]:    19193 *****
[     3,      4]:    76845 *****
[     5,      8]:   337087 ******
[     9,     16]:   687484 ******
[    17,     32]:   417248 ******
[    33,     64]:   277267 ******
[    65,    128]:   137064 ******
[   129,    256]:    29350 *****
[   257,    512]+     6784 ****

-Boris

Re: RuleProcessorData and performance Boris Zbarsky 10/19/09 7:29 PM
On 10/19/09 10:07 PM, Boris Zbarsky wrote:
> tp for non-XUL nodes histograms are below.

Just did this measurement for Txul including XUL nodes and the patch
from bug 523047.  Results:

2/3 of the LHS datas only needed tag/classes/id.  Same for about 40% of
the RHS datas.

So I think we're safe here too.

-Boris

Re: RuleProcessorData and performance Boris Zbarsky 10/19/09 9:17 PM
On 10/19/09 10:07 PM, Boris Zbarsky wrote:
> Will file a bug on this.

https://bugzilla.mozilla.org/show_bug.cgi?id=523288

-Boris