x86_64 JIT enabled by default now

David Anderson

Sep 15, 2009, 10:02:49 PM
I have just turned on the x86_64 JIT by default in the TraceMonkey
tree (see bug 489146 and related dependencies).

For the JS team I think we want to consider this as a tier 1 platform
now - even if moz-central won't pick up on it for a while. We're
committed to portability and making sure the x64 backend+engine stays
at x86 parity. We don't have automated test coverage yet but I'm
working on it. Adobe has been testing the common Nanojit backend on
Windows+Linux+Mac. For now I've only tested on Linux and Mac.

For anyone interested in portability problems, or hacking on TM and
making sure Nanojit LIR is safe and whatnot, I've made a short writeup
on MDC:

https://developer.mozilla.org/En/SpiderMonkey/Internals/64-bit_Compatibility

Comments and bug reports welcome!

David Anderson

Sep 15, 2009, 10:47:59 PM
I forgot to add that both the x64 interpreter and JIT were about 20%
faster than their x86 counterparts, on my Linux laptop, running SunSpider.
The gain from interp->JIT was slightly less on x64 (something like
2.2X faster on x86 to 2.1X faster on x64).

Nicholas Nethercote

Sep 16, 2009, 1:30:23 AM
to David Anderson, dev-tech-...@lists.mozilla.org
On Wed, Sep 16, 2009 at 12:02 PM, David Anderson <dva...@alliedmods.net> wrote:
> I have just turned on the x86_64 JIT by default in the TraceMonkey
> tree (see bug 489146 and related dependencies).

This is great stuff. I just played with it. Some observations:

- SunSpider on my linux box takes 671ms; the 32-bit version takes
854ms. That's a 1.27x speed-up. This makes me wonder how much faster
a 64-bit Firefox would be overall, and whether the rather stately
(ahem) progress towards shipping a 64-bit Firefox should be
accelerated.

- The trace-test basic/testBug507425.js causes my machine to thrash
and become barely usable. This is because this test takes a string
and doubles its length 80 times. At least, it tries to and runs out
of memory. On my 32-bit build this didn't slow things down too much
but on the 64-bit box it does, presumably because of the much larger
address space.

- I get another trace-test failure in debug builds, for
basic/testEliminatedGuardWithinAnchor.js. The output is:

Trace stats check failed: got 1, expected 3 for sideExitIntoInterpreter


Nick
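
As a quick check of the SunSpider arithmetic in the message above, here is a
minimal Python sketch. The two timings are the ones quoted (671ms and 854ms);
the variable names are made up for illustration. It also shows the same result
expressed as a reduction in run time, which is a different figure from the
speed-up ratio.

```python
# Timings quoted in the message above, in milliseconds.
# Variable names are hypothetical, chosen for this sketch only.
time_32bit_ms = 854
time_64bit_ms = 671

# Speed-up ratio: how many times faster the 64-bit build is.
speedup = time_32bit_ms / time_64bit_ms

# Fraction of the 32-bit run time that the 64-bit build saves.
time_reduction = 1 - time_64bit_ms / time_32bit_ms

print(f"speed-up: {speedup:.2f}x")
print(f"run time reduced by: {time_reduction:.1%}")
```

The ratio rounds to the 1.27x quoted above, and the run time drops by a bit
over a fifth.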

David Anderson

Sep 16, 2009, 1:48:36 AM
On Sep 15, 10:30 pm, Nicholas Nethercote <n.netherc...@gmail.com>
wrote:

> - SunSpider on my linux box takes 671ms;  the 32-bit version takes
> 854ms.  That's a 1.27x speed-up.

Awesome!

> - The trace-test basic/testBug507425.js causes my machine to thrash
> and become barely usable.

Yeah, I pass -x to trace-test.py to exclude this test. Once bug 513348
lands maybe it won't be a problem.

>   Trace stats check failed: got 1, expected 3 for sideExitIntoInterpreter

This might be because we don't do integer speculation for div/mod on
x64 yet. Filed bug 516898.

-dvander

Nicholas Nethercote

Sep 16, 2009, 1:50:06 AM
to David Anderson, dev-tech-...@lists.mozilla.org
More interesting stuff... here are Cachegrind results for 32-bit and
64-bit incarnations running 3d-raytrace.js (chosen for no particular
reason).

32-bit:

==26154== I refs: 212,297,439
==26154== I1 misses: 1,388,516
==26154== L2i misses: 7,236
==26154== I1 miss rate: 0.65%
==26154== L2i miss rate: 0.00%
==26154==
==26154== D refs: 112,639,965 (69,920,753 rd + 42,719,212 wr)
==26154== D1 misses: 319,653 ( 217,925 rd + 101,728 wr)
==26154== L2d misses: 79,475 ( 25,712 rd + 53,763 wr)
==26154== D1 miss rate: 0.2% ( 0.3% + 0.2% )
==26154== L2d miss rate: 0.0% ( 0.0% + 0.1% )
==26154==
==26154== L2 refs: 1,708,169 ( 1,606,441 rd + 101,728 wr)
==26154== L2 misses: 86,711 ( 32,948 rd + 53,763 wr)
==26154== L2 miss rate: 0.0% ( 0.0% + 0.1% )
==26154==
==26154== Branches: 28,749,208 (28,472,679 cond + 276,529 ind)
==26154== Mispredicts: 1,981,659 ( 1,872,096 cond + 109,563 ind)
==26154== Mispred rate: 6.8% ( 6.5% + 39.6% )

64-bit:

==26210== I refs: 196,746,804
==26210== I1 misses: 1,693,068
==26210== L2i misses: 7,210
==26210== I1 miss rate: 0.86%
==26210== L2i miss rate: 0.00%
==26210==
==26210== D refs: 73,555,363 (47,043,770 rd + 26,511,593 wr)
==26210== D1 misses: 599,522 ( 408,367 rd + 191,155 wr)
==26210== L2d misses: 142,213 ( 45,597 rd + 96,616 wr)
==26210== D1 miss rate: 0.8% ( 0.8% + 0.7% )
==26210== L2d miss rate: 0.1% ( 0.0% + 0.3% )
==26210==
==26210== L2 refs: 2,292,590 ( 2,101,435 rd + 191,155 wr)
==26210== L2 misses: 149,423 ( 52,807 rd + 96,616 wr)
==26210== L2 miss rate: 0.0% ( 0.0% + 0.3% )
==26210==
==26210== Branches: 27,197,460 (26,608,640 cond + 588,820 ind)
==26210== Mispredicts: 2,089,896 ( 1,831,248 cond + 258,648 ind)
==26210== Mispred rate: 7.6% ( 6.8% + 43.9% )

Interesting that the number of instructions (I refs) is down about 16
million, but the number of data memory accesses (D refs) is down
almost 40 million. I guess less spilling (both in GCC-generated code
and TM-generated code) accounts for a big chunk of that difference.

Nick
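
For reference, the differences mentioned above can be recomputed directly from
the two Cachegrind summaries. This is a small Python sketch using only the
"I refs" and "D refs" totals quoted in the message; the dictionary and function
names are invented for illustration.

```python
# Totals copied from the two Cachegrind summaries above.
# Names are hypothetical, chosen for this sketch only.
i_refs = {"32-bit": 212_297_439, "64-bit": 196_746_804}
d_refs = {"32-bit": 112_639_965, "64-bit": 73_555_363}


def drop(counts):
    """Return (absolute drop, fractional drop) going from 32-bit to 64-bit."""
    before, after = counts["32-bit"], counts["64-bit"]
    return before - after, (before - after) / before


i_drop, i_frac = drop(i_refs)
d_drop, d_frac = drop(d_refs)

print(f"I refs down {i_drop:,} ({i_frac:.1%})")
print(f"D refs down {d_drop:,} ({d_frac:.1%})")
```

In relative terms the instruction count falls by roughly 7% while data
accesses fall by roughly a third, which is the gap the message attributes,
as a guess, to less spilling.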

Benjamin Smedberg

Sep 16, 2009, 9:44:49 AM
On 9/16/09 1:30 AM, Nicholas Nethercote wrote:

> - SunSpider on my linux box takes 671ms; the 32-bit version takes
> 854ms. That's a 1.27x speed-up. This makes me wonder how much faster
> a 64-bit Firefox would be overall, and whether the rather stately
> (ahem) progress towards shipping a 64-bit Firefox should be
> accelerated.

*Who* shipping? The distros already do. mozilla.org does Linux releases
primarily for testing purposes: we intend that end-users get Firefox
primarily from their distro, not from mozilla.org. Adding another
architecture to the set of mozilla.org releases requires a lot of QA
resources which I think could best be spent elsewhere.

--BDS
