Compiling Scala.js to Strong Mode - the good, the bad, and the ugly

Sébastien Doeraene

May 2, 2015, 3:07:41 PM5/2/15
to streng...@googlegroups.com
Hello all,

As you might have read earlier on this list, I have been working on a strong mode backend for the Scala.js [1] toolchain. I am basically done, with a PR awaiting review [2]. I believe it is time to give a small(-ish) report of what I found good, bad, and ugly about strong mode while doing so, so here it is.

tl;dr Skip to Conclusion

The good (not necessarily super interesting -- skip ahead to "The bad" unless you're interested)

As soon as I had learned about strong mode, I read the strawman, and thought it would be a very good fit for Scala.js. There are quite a number of fundamental properties of Scala.js that follow strong mode's philosophy.

G1. Strictness.
Scala.js was always defined to follow and target strict mode.

G2. let/const, declare-before-use (intra method)
The rules on var/val usage in Scala are such that their naive translation to let/const directly follows the restrictions in strong mode.

G3. Classes are frozen, instances are sealed.
Coming from a statically typed language with classes, it is natural to emit only classes that follow this restriction. Classes in a statically typed language *are* frozen by nature, and their objects sealed as well.

G4. Arrays.
Scala.js had already taken the side of favoring non-sparse arrays. The API exposed for js.Array (the Scala.js type of a JS array) assumes that arrays do not contain holes.

G5. The `arguments` magic variable
In Scala.js, the `arguments` magic variable is not exposed to the developer. There is no way to access it. Some very isolated compiler-generated code manipulates `arguments` to export vararg methods to JS, but these usages are only rest-parameter-like, so we can almost trivially switch them to ...rest parameters instead.

G6. Constructors
As harsh as the restrictions of strong mode on constructors are, it turns out Scala.js emits just that kind of constructors (except for one thing, see below). The reason is that in Scala, constructors can be overloaded. So just like all other methods, we give different mangled names to the different overloads of constructors, and turn them into `init__xyz` methods. So instantiating a Scala.js object produces JS code that looks like `new TheClass().init___I(5)`. The JS constructor, being unique, only initializes all fields to the zero of their type, according to JVM semantics (0 for numbers, false for booleans, null for pretty much everything else).
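As a hedged sketch of this scheme (with simplified names; the real name mangling is more involved):

```javascript
// Sketch of the constructor scheme described above (simplified names).
// The unique JS constructor only zero-initializes fields; each Scala
// constructor overload becomes an init__ method that returns `this`,
// so instantiation can be a single expression.
class TheClass {
  constructor() {
    this.x = 0;      // Int field: zero of its type
    this.s = null;   // reference field: null, per JVM semantics
  }
  init___I(x) {      // mangled name of the (Int) constructor overload
    this.x = x;
    return this;
  }
}

const obj = new TheClass().init___I(5);
```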

G7. undefined
undefined is basically considered as a keyword by the codegen, so it's not possible to change its meaning. Moreover, the value undefined is always read as `void 0` instead of `undefined`.

G8. eval
Direct eval is not even allowed in Scala.js. Only indirect eval is possible.

G9. switch
switch statements are only emitted for `match` expressions in Scala, which naturally follow the restrictions on switch in strong mode.

G10. == and !=
These do not exist in Scala.js.

G11. for..in
As such, it does not exist in Scala.js. There is, however, a method in the standard library, called propertiesOf, that returns a js.Array[String] with the keys that a direct for..in loop would yield. This function is implemented once as:
function propertiesOf(x) {
  const r = [];
  for (const p in x)
    r.push(p);
  return r;
}

It is enough to declare this function in strict mode surrounding the strong mode block, and use it from strong code.

G12. Implicit conversions
Since Scala is statically typed, operators are checked at compile time anyway. Currently we still emit `string + nonString`, though (we have to fix that by calling `.toString()` explicitly on `nonString`).
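As a sketch of that fix (`render` is a hypothetical helper, not actual Scala.js codegen):

```javascript
// Sketch of the concatenation fix described above: call .toString()
// explicitly instead of relying on the implicit coercion performed by
// `string + nonString`. `render` is a hypothetical helper.
function render(prefix, value) {
  return prefix + value.toString();
}
```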

The bad

Now coming to the bad. There are a few things that didn't work out quite as well as I hoped.

B1. `null | 0` and `+null`

These two would return 0 in strict mode, but throw in strong mode. It turns out that the conversion of `null` to 0 with | and + is *very* useful to the codegen of Scala.js. Due to a pretty obscure aspect of Scala's specification, the pervasive "unboxing" operations `x.asInstanceOf[Int]` and `x.asInstanceOf[Double]` (which are inserted in *many* places by the compiler) must return 0 if x === null, or x itself if it is an Int/Double (in Scala.js any other value is undefined behavior), which translates extremely well to `x | 0` and `+x` respectively. Also, the `null` case in fact happens extremely rarely. So it's actually a very good thing for Scala.js code that the VM will optimistically optimize this for the number case. I don't really care that there's a deoptimization if x turns out to be null, because in practice it rarely happens (for most codebases, it never happens, actually, but detecting that at compile time is undecidable).
So now I have to emit `(x || 0) | 0` for the Int case instead, and, worse, `uD(x)` where uD is defined as
function uD(x) {
  return x === null ? 0 : +x;
}

(I cannot use `+(x || 0)` because that turns NaN into 0 as well, and I don't want that).
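For concreteness, a sketch of the two unboxing forms just described (`uI` is a hypothetical name; the Int case is actually inlined as `(x || 0) | 0`):

```javascript
// Unboxing helpers sketched from the description above. `uI` is a
// hypothetical name for the inlined Int case.
function uI(x) { return (x || 0) | 0; }          // null -> 0, Int passes through
function uD(x) { return x === null ? 0 : +x; }   // null -> 0, but NaN is preserved

// Why `+(x || 0)` would be wrong for the Double case: NaN is falsy,
// so `+(NaN || 0)` yields 0 instead of NaN.
```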
All in all, I'm pretty disappointed about this conversion being dropped. I'd very much like to see strong mode still allowing this. But of course, from a pure JS perspective, why would anyone think that `null | 0 === +null === 0` is a useful behavior worth preserving? So I am my own devil's advocate to this change, much to my dismay :-(

B2. No way to define a class-wide *field*

In non-strong mode, Scala.js uses a class-wide field, added on each class' prototype, to record RTTI about the class:
SomeClass.prototype.$classData =
  new TypeData(<some metadata about SomeClass>);

(in the future, this will be a constant Int instead, with each class having its own value, but the issue is the same)
These are mostly used for instance-of tests, such as `x.isInstanceOf[Foo]`. Note that Foo can be an *interface*, which doesn't exist at runtime except at the meta level, so using the JS `instanceof` operator is not an option.
In strong mode, I can define a field neither on the class's prototype nor on the class function itself, since there is no ES6 construct that allows defining one, and both are frozen. So I basically have to duplicate this $classData field in *every* instance of every Scala.js class. Even though it might end up having faster access when it's needed, it's mostly a huge memory penalty (a penalty which might eventually be compensated for by the headers of strong objects being smaller). Currently, the initialization of this field is done in the generated constructor(), and so it is overwritten by constructors in subclasses, which currently violates one of the restrictions on constructors. If this violation is confirmed (see also the mailing list thread [3]), I'll find some other way (overwrite in the init__ methods), but for now I have avoided solving it because it appears to be non-trivial.
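A minimal sketch contrasting the two schemes (names simplified; TypeData stands in for the real RTTI metadata class):

```javascript
// Sketch of the two schemes described above (simplified names).
function TypeData(name) { this.name = name; }

// Non-strong mode: one shared field on the prototype.
function SomeClass() {}
SomeClass.prototype.$classData = new TypeData("SomeClass");

// Strong-mode workaround: the class and its prototype are frozen, so
// every instance carries its own copy, assigned in the constructor.
const strongClassData = new TypeData("SomeStrongClass");
class SomeStrongClass {
  constructor() { this.$classData = strongClassData; }
}
```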

The ugly

U1. Interoperability

The really ugly thing is *interoperability* with JS code, which we expect not to be strong. Strong mode argues it can interoperate with weak mode, but there are things that don't really work.
I am mostly talking about `x.p = v` when `p` is not a property of `x`, as well as `delete x.p` (and, less importantly, about selecting `x.p` when `p` does not exist).
The problem is that, when you manipulate objects received from a weak library, or meant to be passed to a weak library, or both, you really want those things to behave like in weak mode. The most obvious example is: you're manipulating an object which is used and meant to be used as a map with string keys. It doesn't matter that Map is a better replacement: the weak library does not use Map, it uses an Object. But now, how do you add an element to that Object? You need an incredible amount of ceremony with defineProperty to replicate the semantics of `x.p = v`. Much worse: how do you *remove* an element from that dictionary? You *cannot*.
This will, IMO, be a problem for strong mode in general, not just as a compilation target of Scala.js.

In Scala.js, we work around this issue by declaring the three following functions in strict mode:
function weakSelect(x, p) { return x[p]; }
function weakAssign(x, p, v) { x[p] = v; }
function weakDelete(x, p) { delete x[p]; }
Then we reroute all selects, assigns and deletes to these functions instead of using syntax. This is possible in Scala.js because the codegen knows whether we're selecting something on a Scala.js object (for which we can use syntax) or manipulating JS objects external to Scala.js. We use the weak functions only in the latter case, which is hopefully not too common.

But this is really Ugly. My proposal to fix this would be to behave like weak code when manipulating weak objects, at least for `x.p = v` and `delete x.p` (selection is less problematic). I would argue that strong mode should still syntactically allow `delete x.p` and `delete x[p]`, but have them throw if `x` is a strong object. Similarly, `x.p = v` and `x[p] = v` would be able to create a field if `x` is weak. I believe this would dramatically improve the promise of strong mode to be interoperable with weak code.

Conclusion

There are many things that the semantics of Scala.js and strong mode have in common, and that makes strong mode a really good target for Scala.js. There are a few bad and ugly things, but we do have workarounds for them, even if some of them can negatively impact performance or memory consumption.

1. I very much regret that `null | 0` and `+null` are not allowed anymore.
2. There is a need for class-wide fields, which is currently hard to come by.
3. Interoperability with weak mode is not as good as it appears.

I would gladly receive your opinions on these matters!

Cheers,
Sébastien

PS: early benchmarks (not good ^^)

Since we're a compiler, we can easily take an existing codebase and compile it with different output modes to compare benchmarks. I did that for the Richards, Tracer and DeltaBlue benchmarks of V8's benchmark suite, which we've been using to track Scala.js' performance for quite some time. The current state, using the iojs nightly of 4/28 with `--harmony-rest-parameters --strong-mode` (uses V8 4.2.77.18), can be found at [4], and yielded the following results. Results are in microseconds, so lower is better. The different versions are:

fastopt: ES 5.1 output
fullopt: ES 5.1 output, piped through the Google Closure Compiler
es6: ES 6 output, mostly leveraging classes, rest parameters (also uses let and const)
strongmode: ES 6 output compliant with strong mode (modulo the exceptions I mentioned above)
js: reference implementation written in JavaScript (directly taken from V8's benchmarks)

For a fair comparison, compare es6 and strongmode to fastopt.

richards [fastopt]     : 106.88899577788467 us
richards [fullopt]     : 129.30755802676666 us
richards [es6]         : 379.21880925293897 us
richards [strongmode]  : 377.5009437523594 us
richards [js]          : 116.72016340822877 us

tracer [fastopt]       : 1297.6653696498054 us
tracer [fullopt]       : 1076.4262648008612 us
tracer [es6]           : 14000 us
tracer [strongmode]    : 14678.83211678832 us
tracer [js]            : 1166.8611435239206 us

deltablue [fastopt]    : 275.1031636863824 us
deltablue [fullopt]    : 330.57851239669424 us
deltablue [es6]        : 2560.819462227913 us
deltablue [strongmode] : 2992.526158445441 us
deltablue [js]         : 131.27666557269444 us


So far, these benchmarks are not looking good for ES 6, and even less so for Strong Mode (they're basically 4x to 10x slower than fastopt). This is expected of course. Classes have barely made it out of staging in V8, so we cannot expect them to be optimized yet. I expect classes and strong mode to get faster with time.
[2] https://github.com/scala-js/scala-js/pull/1620
[3] https://groups.google.com/forum/#!topic/strengthen-js/qSTV7pBOkyE
[4] https://github.com/sjrd/scalajs-benchmarks/tree/es6-and-strongmode

Andreas Rossberg

May 5, 2015, 1:20:38 PM5/5/15
to Sébastien Doeraene, streng...@googlegroups.com
Hi Sébastien,

thanks a lot for this thoughtful write-up; it is really useful! Follow-ups inline.


The good (not necessarily super interesting -- skip ahead to "The bad" unless you're interested)
[...] 
G12. Implicit conversions
Since Scala is statically typed, operators are checked at compile time anyway. Currently we still emit `string + nonString`, though (we have to fix that by calling `.toString()` explicitly on `nonString`).

A more uniform pattern for explicit conversions would perhaps be to call the respective constructors, e.g. String(..), Number(..) etc.


The bad

Now coming to the bad. There are a few things that didn't work out quite as well as I hoped.

B1. `null | 0` and `+null`

These two would return 0 in strict mode, but throw in strong mode. It turns out that the conversion of `null` to 0 with | and + is *very* useful to the codegen of Scala.js. Due to a pretty obscure aspect of Scala's specification, the pervasive "unboxing" operations `x.asInstanceOf[Int]` and `x.asInstanceOf[Double]` (which are inserted in *many* places by the compiler) must return 0 if x === null, or x itself if it is an Int/Double (in Scala.js any other value is undefined behavior), which translates extremely well to `x | 0` and `+x` respectively. Also, the `null` case in fact happens extremely rarely. So it's actually a very good thing for Scala.js code that the VM will optimistically optimize this for the number case. I don't really care that there's a deoptimization if x turns out to be null, because in practice it rarely happens (for most codebases, it never happens, actually, but detecting that at compile time is undecidable).
So now I have to emit `(x || 0) | 0` for the Int case instead, and, worse, `uD(x)` where uD is defined as
function uD(x) {
  return x === null ? 0 : +x;
}

(I cannot use `+(x || 0)` because that turns NaN into 0 as well, and I don't want that).
All in all, I'm pretty disappointed about this conversion being dropped. I'd very much like to see strong mode still allowing this. But of course, from a pure JS perspective, why would anyone think that `null | 0 === +null === 0` is a useful behavior worth preserving? So I am my own devil's advocate to this change, much to my dismay :-(

:)  Yes, x|0 and +x are common patterns for int/double conversion in JS code generators (see also asm.js). I am kind of torn whether it is worth making an exception for them. The "clean" thing in human-written code would be to use Number(x) and Number(x)|0 for conversions, but I realise that's more verbose.

For a code generator, would you consider the need for helpers like the above an actual problem, or is it just an inconvenience?

 
B2. No way to define a class-wide *field*

In non-strong mode, Scala.js uses a class-wide field, added on each class' prototype, to record RTTI about the class:
SomeClass.prototype.$classData =
  new TypeData(<some metadata about SomeClass>);

(in the future, this will be a constant Int instead, with each class having its own value, but the issue is the same)
These are mostly used for instance-of tests, such as `x.isInstanceOf[Foo]`. Note that Foo can be an *interface*, which doesn't exist at runtime except at the meta level, so using the JS `instanceof` operator is not an option.
In strong mode, I can define a field neither on the class's prototype nor on the class function itself, since there is no ES6 construct that allows defining one, and both are frozen. So I basically have to duplicate this $classData field in *every* instance of every Scala.js class. Even though it might end up having faster access when it's needed, it's mostly a huge memory penalty (a penalty which might eventually be compensated for by the headers of strong objects being smaller). Currently, the initialization of this field is done in the generated constructor(), and so it is overwritten by constructors in subclasses, which currently violates one of the restrictions on constructors. If this violation is confirmed (see also the mailing list thread [3]), I'll find some other way (overwrite in the init__ methods), but for now I have avoided solving it because it appears to be non-trivial.

It's true that ES6 classes don't allow static fields. However, they do allow static getters/setters. Would that help your use case? For each class you could have a local variable storing the classData, and a getter for it on the class. That should actually be optimised quite well these days.
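One reading of that suggestion, as a sketch (Foo and its metadata are hypothetical stand-ins):

```javascript
// Sketch of the suggestion above: ES6 classes allow accessors where they
// forbid static fields, so the class data can live in a local const and be
// exposed through getters. Foo and fooData are hypothetical.
const fooData = { name: "Foo" };   // stand-in for the real class metadata

class Foo {
  static get $classData() { return fooData; }  // class-level: Foo.$classData
  get $classData() { return fooData; }         // instance-level: new Foo().$classData
}
```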


The ugly

U1. Interoperability

The really ugly thing is *interoperability* with JS code, which we expect not to be strong. Strong mode argues it can interoperate with weak mode, but there are things that don't really work.
I am mostly talking about `x.p = v` when `p` is not a property of `x`, as well as `delete x.p` (and, less importantly, about selecting `x.p` when `p` does not exist).
The problem is that, when you manipulate objects received from a weak library, or meant to be passed to a weak library, or both, you really want those things to behave like in weak mode. The most obvious example is: you're manipulating an object which is used and meant to be used as a map with string keys. It doesn't matter that Map is a better replacement: the weak library does not use Map, it uses an Object. But now, how do you add an element to that Object? You need an incredible amount of ceremony with defineProperty to replicate the semantics of `x.p = v`. Much worse: how do you *remove* an element from that dictionary? You *cannot*.
This will, IMO, be a problem for strong mode in general, not just as a compilation target of Scala.js.

In Scala.js, we work around this issue by declaring the three following functions in strict mode:
function weakSelect(x, p) { return x[p]; }
function weakAssign(x, p, v) { x[p] = v; }
function weakDelete(x, p) { delete x[p]; }
Then we reroute all selects, assigns and deletes to these functions instead of using syntax. This is possible in Scala.js because the codegen knows whether we're selecting something on a Scala.js object (for which we can use syntax) or manipulating JS objects external to Scala.js. We use the weak functions only in the latter case, which is hopefully not too common.

But this is really Ugly. My proposal to fix this would be to behave like weak code when manipulating weak objects, at least for `x.p = v` and `delete x.p` (selection is less problematic). I would argue that strong mode should still syntactically allow `delete x.p` and `delete x[p]`, but have them throw if `x` is a strong object. Similarly, `x.p = v` and `x[p] = v` would be able to create a field if `x` is weak. I believe this would dramatically improve the promise of strong mode to be interoperable with weak code.

Indeed, this is the most delicate change that strong mode imposes. We have actually considered the alternative you describe.

There actually is no strong reason to disallow delete on weak objects. Assignments are probably less critical as well. So I can imagine allowing both of those on weak objects in strong mode.

Unchecked reads would be a pity, though, given what a huge source of errors and debugging pain silent 'undefined' is in JS. The downside of the semantics you describe is that even in strong mode, you wouldn't get any checking for uses of a weak library (including all the JS base library). I expect such uses to be quite dominant, even in strong mode. On the other hand, I'd expect the number of objects that actually behave like dictionaries to be a small percentage only, although that's hard to estimate. Do you have pointers to concrete examples of APIs that require manipulating dictionary objects on the client side?


PS: early benchmarks (not good ^^)

Since we're a compiler, we can easily take an existing codebase and compile it with different output modes to compare benchmarks. I did that for the Richards, Tracer and DeltaBlue benchmarks of V8's benchmark suite, which we've been using to track Scala.js' performance for quite some time. The current state, using the iojs nightly of 4/28 with `--harmony-rest-parameters --strong-mode` (uses V8 4.2.77.18), can be found at [4], and yielded the following results. Results are in microseconds, so lower is better. The different versions are:

fastopt: ES 5.1 output
fullopt: ES 5.1 output, piped through the Google Closure Compiler
es6: ES 6 output, mostly leveraging classes, rest parameters (also uses let and const)
strongmode: ES 6 output compliant with strong mode (modulo the exceptions I mentioned above)
js: reference implementation written in JavaScript (directly taken from V8's benchmarks)

For a fair comparison, compare es6 and strongmode to fastopt.

richards [fastopt]     : 106.88899577788467 us
richards [fullopt]     : 129.30755802676666 us
richards [es6]         : 379.21880925293897 us
richards [strongmode]  : 377.5009437523594 us
richards [js]          : 116.72016340822877 us

tracer [fastopt]       : 1297.6653696498054 us
tracer [fullopt]       : 1076.4262648008612 us
tracer [es6]           : 14000 us
tracer [strongmode]    : 14678.83211678832 us
tracer [js]            : 1166.8611435239206 us

deltablue [fastopt]    : 275.1031636863824 us
deltablue [fullopt]    : 330.57851239669424 us
deltablue [es6]        : 2560.819462227913 us
deltablue [strongmode] : 2992.526158445441 us
deltablue [js]         : 131.27666557269444 us


So far, these benchmarks are not looking good for ES 6, and even less so for Strong Mode (they're basically 4x to 10x slower than fastopt). This is expected of course. Classes have barely made it out of staging in V8, so we cannot expect them to be optimized yet. I expect classes and strong mode to get faster with time.

Oh dear, I knew that our ES6 performance still sucks (we don't optimise it much yet), but that's awful. :(

However, have you tried a more recent version of V8 (latest 4.4)? I suspect that a large part of the overhead you see is due to block scoped declarations not being stack-allocated until recently, and thus easily being 20x slower than 'var'. I would be very interested in knowing how much the recent change there improves those numbers.

Thanks again for your analysis!

/Andreas



Sébastien Doeraene

May 6, 2015, 5:37:49 AM5/6/15
to Andreas Rossberg, streng...@googlegroups.com
Hi,

Thanks for your answer. Replies inline.

A more uniform pattern for explicit conversions would perhaps be to call the respective constructors, e.g. String(..), Number(..) etc.

True, I had not considered using String() and Number().

But as a compiler writer, I'm not interested in the most uniform pattern. I'm interested in a) the most efficient thing, and b) the shortest thing. Usually, most efficient means avoid using stuff from the global scope, or at least snapshot them in local constants, because lookups in the global scope are costly. If I can use operators (whose meaning cannot be overridden in any way), I expect the VM to (eventually) make it as fast as it can be.

So far, my best way to reliably call the internal ToString(x) is to emit the template string `${x}`, which is what I do now in strong mode to convert something to a string. Snapshotting String would also work, I guess.

Question: are snapshots really as fast as they can be? For example, if I start the mega function surrounding my entire script with
const ToString = global.String;
can I then use ToString(x) and expect the VM to compile calls to ToString(x) to dispatch-free code? It seems to me that this should be possible in theory, since a) ToString is first evaluated, and at that point the VM knows it is the %String% builtin, then b) references to ToString can be resolved statically, because it's a local const that I close over so c) the compiler should know how to compile it away completely. But this process is a priori harder than optimizing code using only operators.

Currently, I use snapshots only for Math.imul and Math.fround, because for these there is no equivalent using only operators.
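The snapshots just mentioned, plus the template-string conversion, look like this as a sketch:

```javascript
// Sketch of the patterns described above: snapshot globals once in local
// consts so later calls resolve statically, and use a template string to
// invoke the internal ToString(x) without touching the global scope.
const imul = Math.imul;
const fround = Math.fround;

function toStr(x) { return `${x}`; }  // equivalent to internal ToString(x)
```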
 
:)  Yes, x|0 and +x are common patterns for int/double conversion in JS code generators (see also asm.js). I am kind of torn whether it is worth making an exception for them. The "clean" thing in human-written code would be to use Number(x) and Number(x)|0 for conversions, but I realise that's more verbose.

For a code generator, would you consider the need for helpers like the above an actual problem, or is it just an inconvenience?

Again, it depends on whether a snapshot for Number would make Number(x) as fast as +x and Number(x)|0 as fast as x|0. If it is as fast, then I would say it is just an inconvenience, and a minor one. I would snapshot
const N = Number;
and use N(x) instead of +x and N(x)|0 instead of x|0. No big deal.
It's a few more characters, but if it is as fast (in most modern VMs, not just V8), then it's fine.

If it is slower, then, as a compiler writer, I lose performance in strong mode that I have in strict mode. Since one of the two main goals of strong mode is to be fast, this is awkward.
 
It's true that ES6 classes don't allow static fields. However, they do allow static getters/setters. Would that help your use case? For each class you could have a local variable storing the classData, and a getter for it on the class. That should actually be optimised quite well these days.

That would be optimized well if I were selecting $classData in monomorphic contexts.
The problem is that x.$classData is used most of the time in highly megamorphic contexts, on an x that, at runtime, assumes instances of many different classes, even at any given point in the program.
If I use a getter (either a static method with empty parens or a real getter), calling the getter is a megamorphic dispatch, versus selecting a field on the direct prototype of the object. I have not benchmarked the two approaches at this point, but my general understanding of compiler technology and VMs suggests that the megamorphic dispatch would be significantly slower.

Am I right? Or am I completely overlooking something?
 
Indeed, this is the most delicate change that strong mode imposes. We have actually considered the alternative you describe.

There actually is no strong reason to disallow delete on weak objects. Assignments are probably less critical as well. So I can imagine allowing both of those on weak objects in strong mode.

Unchecked reads would be a pity, though, given what a huge source of errors and debugging pain silent 'undefined' is in JS. The downside of the semantics you describe is that even in strong mode, you wouldn't get any checking for uses of a weak library (including all the JS base library). I expect such uses to be quite dominant, even in strong mode.

I quite agree. I had the same reasoning.
 
On the other hand, I'd expect the number of objects that actually behave like dictionaries to be a small percentage only, although that's hard to estimate. Do you have pointers to concrete examples of APIs that require manipulating dictionary objects on the client side?

Besides JSON manipulation, I don't have any concrete examples, no. I also believe it should be a small percentage. So selection is probably best kept as throwing for weak objects as well.
 
Oh dear, I knew that our ES6 performance still sucks (we don't optimise it much yet), but that's awful. :(

However, have you tried a more recent version of V8 (latest 4.4)? I suspect that a large part of the overhead you see is due to block scoped declarations not being stack-allocated until recently, and thus easily being 20x slower than 'var'. I would be very interested in knowing how much the recent change there improves those numbers.

I did not realize the improvements to let/const came later. I have just rerun the benchmarks with the latest master of V8 (freshly built an hour ago). There is some improvement, but it's still *way* slower than ES 5.1. It's still awful (still up to 10x for tracer, and > 6x for deltablue -- richards is getting very close):

richards [fastopt] d8     : 107.86904697696995 us
richards [fastopt] iojs   : 108.11395210551922 us
richards [fullopt] d8     : 134.925453686838 us
richards [fullopt] iojs   : 128.60909266285125 us
richards [es6] d8         : 117.23329425556858 us
richards [es6] iojs       : 385.7280617164899 us
richards [strongmode] d8  : 121.59533073929961 us
richards [strongmode] iojs : 383.6562440053712 us
richards [js] d8          : 102.35938379650955 us
richards [js] iojs        : 116.61127631041921 us

tracer [fastopt] d8       : 1073.5373054213635 us
tracer [fastopt] iojs     : 1305.4830287206266 us
tracer [fullopt] d8       : 950.1187648456057 us
tracer [fullopt] iojs     : 1096.4912280701753 us
tracer [es6] d8           : 10810.81081081081 us
tracer [es6] iojs         : 13137.254901960785 us
tracer [strongmode] d8    : 12616.352201257861 us
tracer [strongmode] iojs  : 14808.823529411764 us
tracer [js] d8            : 961.0764055742432 us
tracer [js] iojs          : 1151.4104778353483 us

deltablue [fastopt] d8    : 267.5227394328518 us
deltablue [fastopt] iojs  : 277.7392028884877 us
deltablue [fullopt] d8    : 421.1412929037692 us
deltablue [fullopt] iojs  : 340.3675970047652 us
deltablue [es6] d8        : 1875.3514526710403 us
deltablue [es6] iojs      : 2578.6082474226805 us
deltablue [strongmode] d8 : 2176.278563656148 us
deltablue [strongmode] iojs : 3004.5045045045044 us
deltablue [js] d8         : 126.57426745142712 us
deltablue [js] iojs       : 131.65690211309328 us


 
Thanks again for your analysis!

My pleasure. I very much believe in strong mode, and I'm happy to help make it as good as possible.

Cheers,
Sébastien

Andreas Rossberg

May 6, 2015, 7:23:02 AM5/6/15
to Sébastien Doeraene, streng...@googlegroups.com
On 6 May 2015 at 11:37, Sébastien Doeraene <sjrdo...@gmail.com> wrote:
A more uniform pattern for explicit conversions would perhaps be to call the respective constructors, e.g. String(..), Number(..) etc.

True, I had not considered using String() and Number().

But as a compiler writer, I'm not interested in the most uniform pattern. I'm interested in a) the most efficient thing, and b) the shortest thing. Usually, most efficient means avoid using stuff from the global scope, or at least snapshot them in local constants, because lookups in the global scope are costly. If I can use operators (whose meaning cannot be overridden in any way), I expect the VM to (eventually) make it as fast as it can be.

Yes, I understand. For a code generator I'd recommend doing so as well.

Question: are snapshots really as fast as they can be? For example, if I start the mega function surrounding my entire script with
const ToString = global.String;
can I then use ToString(x) and expect the VM to compile calls to ToString(x) to dispatch-free code? It seems to me that this should be possible in theory, since a) ToString is first evaluated, and at that point the VM knows it is the %String% builtin, then b) references to ToString can be resolved statically, because it's a local const that I close over so c) the compiler should know how to compile it away completely. But this process is a priori harder than optimizing code using only operators.

There actually is no %String% or %Number% built-in; they're implemented in JavaScript. Once code gets optimised, inlining and type specialisation should produce decent code for calls to them, though. But because their implementation is not completely trivial, there is no guarantee that the optimiser does a good job in all cases. So for a code generator, I'd still use other patterns.


:)  Yes, x|0 and +x are common patterns for int/double conversion in JS code generators (see also asm.js). I am kind of torn whether it is worth making an exception for them. The "clean" thing in human-written code would be to use Number(x) and Number(x)|0 for conversions, but I realise that's more verbose.

For a code generator, would you consider the need for helpers like the above an actual problem, or is it just an inconvenience?

Again, it depends on whether a snapshot for Number would make Number(x) as fast as +x and Number(x)|0 as fast as x|0. If it is as fast, then I would say it is just an inconvenience, and a minor one. I would snapshot
const N = Number;
and use N(x) instead of +x and N(x)|0 instead of x|0. No big deal.
It's a few more characters, but if it is as fast (in most modern VMs, not just V8), then it's fine.

If it is slower, then, as a compiler writer, I lose performance in strong mode that I have in strict mode. Since one of the two main goals of strong mode is to be fast, this is awkward.
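The snapshot pattern in question, sketched concretely (this is our illustration, not Scala.js's actual emitted code; `N`, `toDouble`, and `toInt32` are arbitrary names):

```javascript
// Snapshot the Number built-in once, in a local const at module setup,
// so later calls don't go through the global scope.
const N = Number;

function toDouble(x) {
  return N(x); // operator-free spelling of +x
}

function toInt32(x) {
  return N(x) | 0; // operator-assisted spelling of x|0 via the snapshot
}

console.log(toDouble("3.5"));      // 3.5
console.log(toInt32("3.5"));       // 3
console.log(toInt32(4294967303));  // 7 (wraps modulo 2^32)
```

Whether the VM compiles `N(x)` down to the same code as `+x` is exactly the open question here.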

With your current wrappers I'm pretty confident that you shouldn't see any significant slowdown, because they should always get inlined in optimised code. You could try to estimate the cost by experimentally introducing similar wrappers in the es5 mode of your compiler and measure if that costs anything. 


It's true that ES6 classes don't allow static fields. However, they do allow static getters/setters. Would that help your use case? For each class you could have a local variable storing the classData, and a getter for it on the class. That should actually be optimised quite well these days.

That would be optimized well if I were selecting $classData in monomorphic contexts.
The problem is that x.$classData is used most of the time in highly megamorphic contexts, on an x that, at runtime, assumes instances of many different classes, even at any given point in the program.
If I use a getter (either a static method with empty parens or a real getter), calling the getter is a megamorphic dispatch, versus selecting a field on the direct prototype of the object. I have not benchmarked the two approaches at this point, but my general understanding of compiler technology and VMs suggests that the megamorphic dispatch would be significantly slower.

Am I right? Or am I completely overlooking something?

The getter would be on the prototype as well, so the cost (and degree of polymorphism) of lookup should be exactly the same as for a data property. Inlining (of the getter) should usually take care of the remaining overhead in optimised code. Overall, I wouldn't be surprised if this is actually cheaper than putting a property on every instance. But it's probably worth measuring.
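For illustration, a minimal sketch of the getter approach (the names `$classData`, `ClassData`, and `Foo` are ours, not actual Scala.js output):

```javascript
// Per-class metadata: one object per class, created once at class setup.
const fooData = { name: "Foo" };

class Foo {
  // The getter lives on Foo.prototype, so looking it up has the same
  // shape as reading a data property from the prototype.
  get $classData() { return fooData; }
}

const x = new Foo();
console.log(x.$classData.name); // "Foo"
```

The question in the thread is whether this getter call stays cheap when `x.$classData` is reached from many different classes at the same call site.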
I see, thanks! I suppose Richards is not making use of classes or rest. How much do the others do?

Are the compiled versions of these benchmarks accessible somewhere? It would be worth analysing their execution on our end.

Thanks!
/Andreas

Andreas Rossberg

May 8, 2015, 8:38:18 AM
to Sébastien Doeraene, streng...@googlegroups.com
On 6 May 2015 at 11:37, Sébastien Doeraene <sjrdo...@gmail.com> wrote:

Indeed, this is the most delicate change that strong mode imposes. We have actually considered the alternative you describe.

There actually is no strong reason to disallow delete on weak objects. Assignments are probably less critical as well. So I can imagine allowing both of those on weak objects in strong mode.

Unchecked reads would be a pity, though, given what a huge source of errors and debugging pain silent 'undefined' is in JS. The downside of the semantics you describe is that even in strong mode, you wouldn't get any checking for uses of a weak library (including all the JS base library). I expect such uses to be quite dominant, even in strong mode.

I quite agree. I had the same reasoning.

Here is an idea that might be a workable compromise: we could make missing property access (on weak objects in strong mode) an error only for dot notation, but keep returning undefined for bracket notation. Likewise, deletion would actually be allowed with brackets.
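Sketched in plain JavaScript (the proposal is hypothetical, so the dot-notation errors appear only as comments; `weakObj` is our name for an ordinary non-strong object used from strong code):

```javascript
// An ordinary ("weak") object accessed from strong-mode code.
const weakObj = { a: 1 };

console.log(weakObj.a);      // 1 — present property: fine with dot or brackets
console.log(weakObj["b"]);   // undefined — missing property via brackets: still allowed
// weakObj.b;                // missing property via dot: would throw under the proposal
delete weakObj["a"];         // deletion via brackets: allowed
// delete weakObj.a;         // deletion via dot: would be rejected
console.log("a" in weakObj); // false — the deletion took effect
```

This keeps dynamic, dictionary-style uses working through brackets while dot access gets the stricter checking.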

/Andreas

Sébastien Doeraene

May 8, 2015, 9:08:03 AM
to Andreas Rossberg, streng...@googlegroups.com
Hi,

Here is an idea that might be a workable compromise: we could make missing property access (on weak objects in strong mode) an error only for dot notation, but keep returning undefined for bracket notation. Likewise, deletion would actually be allowed with brackets.

I actually had precisely that idea, but didn't think it would be even considered acceptable! From my point of view, this is perfect.

With your current wrappers I'm pretty confident that you shouldn't see any significant slowdown, because they should always get inlined in optimised code. You could try to estimate the cost by experimentally introducing similar wrappers in the es5 mode of your compiler and measure if that costs anything.

The getter would be on the prototype as well, so the cost (and degree of polymorphism) of lookup should be exactly the same as for a data property. Inlining (of the getter) should usually take care of the remaining overhead in optimised code. Overall, I wouldn't be surprised if this is actually cheaper than putting a property on every instance. But it's probably worth measuring.

I plan to do that kind of experiments next week.


Are the compiled versions of these benchmarks accessible somewhere? It would be worth analysing their execution on our end.

No, they aren't. I'll push them somewhere and give you a link.

Cheers,
Sébastien

Sébastien Doeraene

May 8, 2015, 9:45:34 AM
to Andreas Rossberg, streng...@googlegroups.com
Here is a gist with all variants of {richards,tracer,deltablue} x {es5,es6,strongmode}:
https://gist.github.com/sjrd/c846d716a0d01b8902a7

Each file can be run simply by passing it to d8:
$ d8 --harmony-rest-parameters --strong-mode <file.js>

Cheers,
Sébastien

Youness Belfkih

Mar 26, 2016, 3:29:30 AM
to Strengthen JS, sjrdo...@gmail.com
Yes please, a window for code generators.