Is --trace-deopt actually usable? If so, how are you supposed to use it?


Kevin Gadd

Jul 17, 2012, 2:34:33 PM
to v8-u...@googlegroups.com
I've spent the last couple hours trying to actually get anything useful out of --trace-deopt. Unfortunately, I've had no success. I'm primarily going off the information from http://floitsch.blogspot.com/2012/03/optimizing-for-v8-inlining.html, as it's the only detailed info I've found on the internet about --trace-deopt.

From what I can tell, the only way to use this feature seems to be in a debug build of d8, and to map the offsets (I think they're offsets? I was unable to find information on this) in the deopt spew back to the generated assembly from the JIT, using the --print-code option, i.e.:

**** DEOPT: MouseCursor_get_ClientBoundsWidth at bailout #7, address 0x0, frame size 12
            ;;; @36: deoptimize.

In this spew I *think* @36 refers to instruction #36 in the generated IR from the JIT? It's unclear whether this is the high level IR or the low level IR (hydrogen.cfg, when you can actually get c1visualizer to load it, claims there are two kinds of IR - more on that later).

So, right off the bat, this seems less than helpful: --print-code generates an absolutely titanic dump of all sorts of data, and none of it is correlated - nothing in the output ASM maps it back to the source JS, and the IR (the IR that shows up in hydrogen.cfg) doesn't even seem to be there. It's unclear what the @36 in this case would actually point to, or how, once I had located 36, I would map it back to a defect in my original source JavaScript or even to a particular operation in the IR. Mapping my JavaScript to V8's IR seems like something I can manage if necessary - most of the opcodes are KIND OF self-explanatory if you spend forever understanding V8. But mapping the raw JIT assembly back to JS is just plain nuts - there's way too much going on there to even understand what a given deoptimization means if this is the only information I have.

Really, all I need here is a rough mapping that tells me what part of a function is triggering a deoptimization. If V8 can't give me a reason (this seems to almost universally be the case), then fine, I'll just figure it out with brute force - but without knowing roughly what part of the function is responsible it's impossible to do any real debugging or optimization here (trying to binary search the function manually won't work, because changing the size and structure of the function will change whether it deoptimizes and where it deoptimizes).

http://floitsch.blogspot.com/2012/03/optimizing-for-v8-hydrogen.html suggests that it's possible to get at the IR using some debug flags, and when I tried them out they indeed generate an enormous file named hydrogen.cfg. Unfortunately, the tool the post suggests using - c1visualizer - is either broken or does not support the version of the .cfg format Chrome now spits out, because it fails to load almost every meaningfully large .cfg file I've ever managed to get. When it does successfully load one, most of the functions I care about are missing, which suggests that the file is incomplete or their parser stopped early. The file is large and noisy enough that I don't think I'd be able to make sense of it by hand without a visualizer tool. Is there a working replacement for c1visualizer that the Chrome team uses now? I searched for one but couldn't find anything. Do I have to write my own visualizer?

Even if the enormous spew of raw assembly from d8 with --print-code and --trace-deopt were usable (at present it doesn't seem usable without more complete information), it doesn't feel like enough to solve real performance issues. I'm looking at deoptimizations right now that only seem to occur when actually running an application in Chrome (i.e. interacting with APIs like WebGL and Canvas), which is where I actually care about performance - I can't simply reduce all of these deoptimized functions into test cases because they don't work without access to the rest of their dependencies.

It seems like maybe if I built a debug version of Chromium myself, I'd be able to pass *it* --print-code, but then I'd be missing codecs like MP3, and I'd still have to deal with the enormous spew of data and raw assembly there.

All I really want is line numbers. Is this possible? Could I possibly get it by hand-patching v8 in the right place and building my own Chromium?

Also, I could rant about how cryptic --trace-bailout is, but that feature at least seems to work and provide actionable data (if you're willing to grep through the v8 source code and try and understand what the terminology means and set breakpoints to follow consequence chains and understand *why* a particular bailout actually occurred). Really right now the deoptimizations are my biggest concern because hundreds of them are happening every second versus a very small number of bailouts.

Thanks,
-kg

Jakob Kummerow

Jul 17, 2012, 6:32:35 PM
to v8-u...@googlegroups.com
Seems like you're finding out the hard way what a complicated beast a modern JS engine is ;-)

--trace-deopt is certainly usable; but it is not geared towards ease-of-use for JavaScript developers, that's for sure.

Helpful flags are --print-opt-code and --code-comments. Those two together will print the disassembled code for *optimized* functions, interleaved with some comments, such as the IDs of the HIR instructions that generated the code (which is what the "@36" in your example refers to), and the deoptimization IDs ("bailout #7" in your example). And yes, there can be a lot of disassembled code that gets printed when you feed a lot of JavaScript code to V8 -- redirecting output to a file and opening that with an editor makes this easier to manage.

Since disassembly is involved, --print-opt-code only works when V8 has been built with disassembler support. That's the case in a debug mode build, or in release mode when you specify GYPFLAGS="-Dv8_enable_disassembler=1" (when you build d8 with GYP/make, we have a convenience option for that: "make disassembler=on ia32.release"). If you build your own Chromium anyway, you can also change the flag's default value in src/v8/build/common.gypi, if you find that easier than setting an environment variable.

Due to the optimizations that the optimizing compiler does, there is no mapping from assembly instructions (or deopt points) to line numbers. I'm not sure if and with how much effort it'd be possible to hack up support for that. I agree that it would be great if Chrome's dev tools could show you where deopts happened, and why...

c1visualizer is still state of the art to visualize what the optimizing compiler does. Yes, it's a somewhat sorry state of things, but it can be very helpful. Probably more helpful for debugging the compiler than for debugging JavaScript, though. Upping the max memory helps a lot when loading large hydrogen.cfg dumps.

There aren't all that many reasons for deopts, and it's relatively easy to learn which JS constructs can cause a deopt at all: mainly stuff that hasn't happened earlier, e.g. accessing a property of an object with a new type, or a double value suddenly being undefined, or code after a long-running loop that's never been executed before, or an array access out of bounds that was within bounds earlier, and so on. So with some experience, and assuming you're not running into bugs/deficiencies of V8, staring at assembly code won't even be necessary. Also, if you find that by refactoring your function (e.g. splitting it into two smaller functions) you can prevent the deopt, that's not really a problem, is it? It's kind of the solution you were looking for in the first place, right?
[Btw, there's a pretty recent video of a talk that explains some of this, and mentions some common pitfalls to avoid: http://www.youtube.com/watch?v=UJPdhx5zTaw ]
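[To make the "stuff that hasn't happened earlier" cases concrete, here is a minimal sketch (function and property names are invented for illustration, not taken from the thread). A property access site that has only ever seen one object shape gets optimized for that shape; a differently-shaped object arriving later is exactly the "property of an object with a new type" case:

```javascript
// An access site optimized after seeing only one hidden class.
function readX(point) {
  return point.x;
}

// Warm up with a single shape: {x, y}.
for (var i = 0; i < 100000; i++) {
  readX({ x: i, y: i });
}

// A new shape ({x, y, z}) at the same site can trigger a deopt of the
// optimized code for readX. The call still returns the right answer;
// execution just falls back to unoptimized code.
var afterNewShape = readX({ x: 1, y: 2, z: 3 });
```

Running d8 with --trace-deopt on something like this should show a deopt attributed to readX once it has been optimized.]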

Ranting about flags or their output being cryptic may help you let off some steam, but beyond that doesn't get you anywhere. V8's command-line flags let you peek at V8's inner workings, and making sense of their output requires some understanding of these internals. Nobody ever claimed anything else.

Kevin Gadd

Jul 17, 2012, 6:42:14 PM
to v8-u...@googlegroups.com
Thanks for the link to that video, I'll give it a look. Based on your
suggestion I'll try doing a custom build of Chromium and see if the
disassembly will let me make sense of things.

The reason this is a real problem for me (and why I find the lack of
documentation for this stuff in chromium frustrating) is that I'm
machine-translating code from other languages to JS - hand-editing it
to make it faster is something I can do for code I'm writing myself,
but I can't do it in a compiler. The current nature of the performance
feedback from V8 makes it more or less a black box and this is
worsened by the fact that a large amount of the documentation I've
found out there that claims to describe V8 performance characteristics
is either wrong or outdated. When you profile an application in V8 and
the web inspector says you're spending 50% of your time in a simple
function, your only choice is to dig deeper to try and understand why
that function is slow. You could solve this by offering line-level
profiling data in your profiler, but I think that's probably a ton of
work, so I'm not asking for that. ;)

To provide one example: I did some spelunking around with
--trace-deopt and --trace-bailout and found that in my codebase,
basically any use of the 'arguments' object - even just checking
'arguments.length' - causes the entire function to be deoptimized. Of
course, there isn't a ton of documentation here, but
http://s3.mrale.ph/nodecamp.eu/#57 along with other sources claim that
this is not the case. So, either something creepy is happening in my
test cases - more verbose feedback from V8 here, or concrete docs from
the devs themselves would help - or the information being given to the
public is wrong. Non-v8/non-chrome devs saying false things about V8
performance isn't your fault, but it wouldn't hurt to try and prevent
that by publishing good information in textual form on the web.

I hope I'm not giving the impression that I think V8 is the only
problem here either; JS performance in general is a labyrinth. Based
on my experiences however, the best way to get info about V8
performance tuning is to talk to a Chromium dev directly or hunt down
YouTube videos of hour long presentations. This is pretty suboptimal
for a developer who's trying to tackle a performance issue in the
short term - Google is surprisingly bad at finding either of those two
info sources when you dig around or search for diagnostic messages.

I've personally been documenting everything concrete I learn about
this stuff on my wiki, but once I stop doing this day-to-day, all that
information will get outdated and mislead future developers. I think
that sucks, and the only good solution is for JS runtime devs to
provide concrete information on performance pitfalls and such in an
easily found location and keep it somewhat up to date. I don't think
you guys necessarily need to make --trace-deopt or --trace-bailout
super user friendly, but if you don't, there needs to be some basic
guidance out there for developers so that JS flags aren't their only
option for understanding Chrome performance.

Thanks for the info about c1visualizer - I bet the memory limit was
probably responsible for the flakiness and if I fiddle with JVM
parameters it might work. I'll give it another try later on.

-kg

Vyacheslav Egorov

Jul 18, 2012, 2:52:48 AM
to v8-users
Hi Kevin,

To be absolutely honest, all these flags were historically made by V8
developers for V8 developers. You usually can't interpret what they
print without an understanding of how V8 works internally, what the
optimizing compiler's IR looks like, etc. We advocate them for JS
developers only because there is nothing else available at the moment.

[I was always convinced that V8 needs a more GUI-ish thing that would
overlay events from the optimizing compiler over the source of a
function, but that is not so easy. I was playing with some prototypes
but at some point I gave up... It requires attaching source position
information to individual IR instructions (plus merging this
information somehow when we optimize code and remove redundant
instructions), and to make it worse, the AST does not even have span
information attached to each node... you can't say that the expression
a + b starts at position X and ends at position Y to correctly
highlight the whole offending expression.]

The deoptimization that you are mentioning in the first message
indicates one of two things: either execution reached a part of the
function that was optimized before type feedback for it was available
[this happens a lot for big functions or functions with complicated
control flow and rarely executed parts], or you have a polymorphic
property access site that had a small degree of polymorphism at the
moment of compilation but has now seen some new hidden class.
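[A minimal sketch of the first case, with invented names: the cold branch below has never run when V8 optimizes hot(), so there is no type feedback for it and the optimizer plants a bare "deoptimize" instruction there, like the one in your spew.

```javascript
// The else-path is never taken during warm-up, so the optimized code
// for hot() contains an unconditional "deoptimize" for that branch.
function hot(n, rare) {
  if (!rare) {
    return n * 2;        // the only path exercised before optimization
  }
  return String(n);      // cold path: no type feedback when compiled
}

for (var i = 0; i < 100000; i++) {
  hot(i, false);         // warm up the common path only
}
var cold = hot(42, true); // first visit to the cold path => deopt
```
]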

> To provide one example: I did some spelunking around with
> --trace-deopt and --trace-bailouts and found that in my codebase,
> basically any use of the 'arguments' object - even just checking
> 'arguments.length' - causes the entire function to be deoptimized.

Can you provide more information about this? What kind of --trace-
deopt/trace-bailout output made it look like arguments.length causes
deoptimization?

> Non-v8/non-chrome devs saying false things about V8
> performance isn't your fault

To be precise, the presentation you are linking to was made by me, and
I am a V8 dev.

> Thanks for the info about c1visualizer - I bet the memory limit was
> probably responsible for the flakiness and if I fiddle with JVM
> parameters it might work. I'll give it another try later on.

C1Visualizer has a major problem with its memory consumption. Big IR
dumps usually have to be either split into separate files (I do it
with a small script) or minimized by applying --hydrogen-filter=foo
flag to block optimization of all functions that are not called foo.

--
Vyacheslav Egorov




Kevin Gadd

Jul 18, 2012, 12:06:41 PM
to v8-u...@googlegroups.com
Thanks for the detailed response. Unfortunately I didn't write down
the example I saw with arguments.length causing it - it may have been
me misreading the output, or perhaps it was from inlining? However,
there are certainly a bunch of uses of fn.apply(this, arguments),
which the presentation also said would be fine, and those are bailing
out. Here are two examples (both generated by code at runtime, so if I
can change the generated code to fix this, I'd love to know about it
:D)

return (function JSIL_ArrayEnumerator() {
  return state.ctorToCall.apply(this, arguments);
});

Bailout in HGraphBuilder: @"JSIL_ArrayEnumerator": bad value context for arguments value

return (function System_Threading_Interlocked_CompareExchange() {
  var argc = arguments.length;
  if (argc === 4) {
    return thisType["CompareExchange`1$559[!!0],!!0,!!0=!!0"].apply(this, arguments);
  }

  throw new Error('No overload of ' + name + ' can accept ' + (argc - offset) + ' argument(s).');
});

Bailout in HGraphBuilder: @"System_Threading_Interlocked_CompareExchange": bad value context for arguments value

In total I see something like 30 'value context for arguments value'
bailouts when starting this test case and almost all of them look like
they should be okay based on that slide, so I must either have
misinterpreted the slide or it's not correct anymore.

Your explanation on why the no-message deopts occur is helpful; if I
assume that they indicate polymorphism maybe I can use that
information to try and zero in on locations within the function where
polymorphism might be occurring and make some headway that way.
Thanks.

--hydrogen-filter sounds like *exactly* what I need, so thank you very
much for mentioning that. :D

-kg

Vyacheslav Egorov

Jul 18, 2012, 12:26:15 PM
to v8-u...@googlegroups.com
>
> return (function JSIL_ArrayEnumerator() {
> return state.ctorToCall.apply(this, arguments);
> });
>
> Bailout in HGraphBuilder: @"JSIL_ArrayEnumerator": bad value context
> for arguments value

Interesting. There is a "small" detail that my slides do not mention:
the .apply must be the built-in apply function, and the expression
should be monomorphic.

Monomorphic example that will be optimized:

function apply() { arguments[0].apply(this, arguments); }

function foo() { }
function bar() { }

apply(foo);
apply(foo);
apply(bar);
apply(bar);
// Both foo and bar have same hidden classes.

Polymorphic example that will not be:

function apply() { arguments[0].apply(this, arguments); }

function foo() { }
function bar() { }

bar.foo = "aaa"; // After this point foo and bar have different hidden classes.

apply(foo);
apply(foo);
apply(bar);
apply(bar);

// Now the .apply expression inside apply is not monomorphic and the
// compiler will say "bad value context for arguments value".

Did you patch Function.prototype.apply or add properties to your
functions? This might explain why .apply optimization gets confused.

> Bailout in HGraphBuilder: @"System_Threading_Interlocked_CompareExchange": bad value context for arguments value

This one might be tricky. Assumptions V8 makes during compilation are
all based on type feedback. If argc was never equal to 4 before V8
tried to optimize System_Threading_Interlocked_CompareExchange, V8
just will not know that the .apply there is the built-in
Function.prototype.apply, so it will bail out. I suggest avoiding
.apply on rarely executed branches in hot functions if possible.

Of course there might still be a possibility that the .apply access is
polymorphic as described above.
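[One hedged way to act on that advice, sketched with invented names (this is not code from the thread, just a workaround pattern under the assumption that only arguments.length and arguments[i] are the "safe" uses): copy the arguments into a real array before calling .apply, so the dispatcher never passes the raw arguments object around.

```javascript
// Stand-in for the generated overload being dispatched to.
function compareExchange4(a, b, c, d) {
  return a === b ? c : d;
}

function compareExchange() {
  var argc = arguments.length;
  if (argc === 4) {
    // arguments.length and arguments[i] are the unproblematic uses;
    // copying into a plain array avoids handing `arguments` itself to
    // .apply on a branch that may lack type feedback.
    var args = new Array(argc);
    for (var i = 0; i < argc; i++) args[i] = arguments[i];
    return compareExchange4.apply(this, args);
  }
  throw new Error('No overload can accept ' + argc + ' argument(s).');
}
```
]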

> Your explanation on why the no-message deopts occur is helpful

To be precise, I was referring to the deoptimization that happens on
the "deoptimize" instruction you quoted in your first mail.

[please do not hesitate to ask more questions!]

--
Vyacheslav Egorov

Kevin Gadd

Jul 18, 2012, 12:33:46 PM
to v8-u...@googlegroups.com
Oh, if functions have hidden classes and changing them prevents the
fast-path for .call and .apply, then setting debugName and displayName
on all my functions isn't a very good idea... thanks. That makes the
slide's advice make more sense, and it also explains why my attempts
to move the bailout into a child function weren't a success.

I was
under the impression that bailouts were based on the shape of the
code and deopts were based on type information - if the bailout can
also occur because it doesn't have IC type information to show that
.apply is builtin and the fn is a standard Function, that explains how
I'm getting it in some of these contexts. I will test this out some
and if I get good results I'll definitely write it up on my
optimization page.

https://github.com/kevingadd/JSIL/wiki/JavaScript-Performance-For-Madmen

If any of the information on the above about V8 is wrong, please let
me know so I can fix it :)

P.S. Every example I've ever seen for Hidden Classes uses Objects. I
foolishly assumed that as a result, they only applied to user-created
objects - do they apply to anything that can have properties (strings,
functions, etc) as well? Does modifying an object's prototype cause
its hidden class to change and deopt any functions that use it - like
if I were to alter String.prototype or Number.prototype after some
code had JITted? Is a function's hidden class just based on whether
you've made changes, or do, say, .bind() functions have a different
hidden class from native ones like console.log?

Thanks,
-kg

Vyacheslav Egorov

Jul 18, 2012, 1:26:33 PM
to v8-u...@googlegroups.com
> Oh, if functions have hidden classes and changing them prevents the
> fast-path for .call and .apply, then setting debugName and displayName
> on all my functions isn't a very good idea... thanks.

Well, actually, if you set the same fields in the same order on _all_
functions that come into this .apply site, then it should be fine
(unless you set too many fields, more than 14, or delete properties,
which would cause the property backing store to be normalized) =>
they will all have the same hidden class.
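[A sketch of that pattern (the helper name is invented): stamp the same extra fields, in the same order, onto every generated function, so they all share one hidden class at the .apply call site.

```javascript
// Always add the same properties in the same order, so every
// decorated function ends up with the same hidden class.
function withDebugNames(fn, name) {
  fn.debugName = name;     // first property, always
  fn.displayName = name;   // second property, always
  return fn;
}

var foo = withDebugNames(function () { return "foo"; }, "foo");
var bar = withDebugNames(function () { return "bar"; }, "bar");

// Mirrors the earlier monomorphic example: this .apply site only ever
// sees one hidden class, so the optimization should still kick in.
function callIt() { return arguments[0].apply(this, arguments); }
```
]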

> I was under the impressions that bailouts were based on the shape of the
> code and deopts were based on type information

Yep, we have some corner cases where compile-time optimization is
limited to certain good cases that can be detected by looking at type
feedback. So if the type feedback does not look good, we just bail out.

> do they apply to anything that can have properties (strings, functions, etc) as well?

Well... how should I answer without being too confusing :-) The short
answer is yes: objects, functions, and value wrappers (String, Number,
etc.) have hidden classes that change when you add/remove properties;
primitive values that don't carry properties, like strings and
numbers, don't have one (or rather they don't change it, because you
can't add/remove properties on them).

The long answer is: strictly speaking, _every_ object in the V8 heap
has a thing called a Map that describes its layout. Objects that can
carry around properties (inheriting from v8::internal::JSObject:
https://github.com/v8/v8/blob/master/src/objects.h#L57-71) _might_
have their map changed when you add and remove properties. It does not
always happen, because not every map describes a "fast" properties
layout.

You can sometimes see deoptimizations that mention the check-map
instruction. It's the one that checks object layout by comparing the
object's map to an expected map.

> Does modifying an object's prototype cause
> its hidden class to change and deopt any functions that use it - like
> if I were to alter String.prototype or Number.prototype after some
> code had JITted?

If you add a property to a prototype, then the JS object that
represents the prototype will get a new hidden class. If some
optimized code was checking this prototype and expecting a certain
map, that check will fail when executed and the code will deopt. If
some inline cache stub was checking it, that check will fail when the
IC is used and the IC will miss.
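[A sketch of that scenario with invented names: optimized code that relied on the prototype's old map deopts when a method is added later, but keeps producing correct results afterwards.

```javascript
function Point(x) { this.x = x; }
Point.prototype.getX = function () { return this.x; };

function sumX(points) {
  var total = 0;
  for (var i = 0; i < points.length; i++) total += points[i].getX();
  return total;
}

var pts = [new Point(1), new Point(2)];
for (var i = 0; i < 100000; i++) sumX(pts);   // warm up and optimize

// Adding a method gives Point.prototype a new hidden class; the
// map check baked into the optimized sumX fails, so it deopts,
// yet the call still returns the correct sum.
Point.prototype.getY = function () { return 0; };
var afterChange = sumX(pts);
```
]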

> Is a function's hidden class just based on whether
> you've made changes, or do, say, .bind() functions have a different
> hidden class from native ones like console.log?

Yeah, they actually have different ones due to the way we bootstrap
the built-ins. Built-in functions are actually slightly different from
normal functions because they do not have a .prototype property by
default. But even if you add one manually, they will not transition to
the same hidden class as a normal function with .prototype; they are
just not linked together with a transition and are completely
separated. I am not exactly sure why.

Functions coming from different contexts (iframes) will have different
hidden classes.

Strict functions ("use strict";) will have different hidden classes
from non-strict ones.

--
Vyacheslav Egorov