Hello all,
As you might have read earlier on this list, I have been working on a strong mode backend for the Scala.js [1] toolchain. I am basically done, with a PR awaiting review [2]. I believe it is time to give a small(-ish) report of what I found good, bad, and ugly about strong mode while doing so, so here it is.
tl;dr Skip to Conclusion
The good (not necessarily super interesting -- skip ahead to "The bad" unless you're interested)
As soon as I had learned about strong mode, I read the strawman, and thought it would be a very good fit for Scala.js. There are quite a number of fundamental properties of Scala.js that follow strong mode's philosophy.
G1. Strictness.
Scala.js was always defined to follow and target strict mode.
G2. let/const, declare-before-use (intra method)
The rules on var/val usage in Scala are such that their naive translation to let/const directly follows the restrictions in strong mode.
G3. Classes are frozen, instances are sealed.
Coming from a statically typed language with classes, it is obvious that you emit only classes that follow this restriction. Classes in a statically typed language *are* frozen by nature, and their objects sealed as well.
G4. Arrays.
Scala.js had already taken the side of favoring non-sparse arrays. The API exposed for js.Array (the Scala.js type of a JS array) assumes that arrays do not contain holes.
G5. The `arguments` magic variable
In Scala.js, the `arguments` magic variable is not exposed to the developer. There is no way to access it. Some very isolated compiler-generated code manipulates `arguments` to export vararg methods to JS, but these usages are only rest-parameter-like, so we could almost trivially turn these into using ...rest params instead.
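To illustrate the kind of rewrite involved (the function names here are hypothetical, not the actual compiler-generated ones), an `arguments`-based vararg exporter is trivially expressible with a rest parameter instead:

```javascript
// arguments-based version (not allowed in strong mode):
function exportedVarargOld() {
  // converts the arguments object into a real array
  return Array.prototype.slice.call(arguments);
}

// rest-parameter version (strong-mode friendly):
function exportedVararg(...rest) {
  return rest; // already a real array
}

console.log(exportedVararg(1, 2, 3)); // → [ 1, 2, 3 ]
```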
G6. Constructors
As harsh as the restrictions of strong mode on constructors are, it turns out Scala.js emits just that kind of constructors (except for one thing, see below). The reason is that in Scala, constructors can be overloaded. So just like all other methods, we give different mangled names to the different overloads of constructors, and turn them into `init__xyz` methods. So instantiating a Scala.js object produces JS code that looks like `new TheClass().init___I(5)`. The JS constructor, being unique, only initializes all fields to the zero of their type, according to JVM semantics (0 for numbers, false for booleans, null for pretty much everything else).
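A minimal sketch of that pattern (the second overload and the field names are hypothetical; only `init___I` appears in the actual example above): the JS constructor zero-initializes every field, and each Scala constructor overload becomes a mangled init method returning `this`:

```javascript
class TheClass {
  constructor() {
    // zero of each field's type, per JVM semantics
    this.x = 0;        // Int field
    this.flag = false; // Boolean field
    this.name = null;  // reference field
  }
  init___I(x) {        // Scala constructor overload taking an Int
    this.x = x;
    return this;
  }
  init___T(name) {     // hypothetical overload taking a String
    this.name = name;
    return this;
  }
}

const obj = new TheClass().init___I(5);
console.log(obj.x); // → 5
```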
G7. undefined
undefined is basically considered as a keyword by the codegen, so it's not possible to change its meaning. Moreover, the value undefined is always read as `void 0` instead of `undefined`.
G8. eval
Direct eval is not even allowed in Scala.js. Only indirect eval is possible.
G9. switch
switch statements are only emitted for `match` expressions in Scala, which naturally follow the restrictions on switch in strong mode.
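As a hypothetical example of what a Scala `match` compiles to: every case ends in a jump (here a return), so there is no fall-through, which is what strong mode requires of switch statements:

```javascript
// Sketch: a switch as emitted for a Scala match expression.
function describe(x) {
  switch (x) {
    case 0:
      return "zero";
    case 1:
      return "one";
    default:
      return "many";
  }
}

console.log(describe(1)); // → one
```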
G10. == and !=
These do not exist in Scala.js.
G11. for..in
As such, it does not exist in Scala.js. There is, however, a method in the standard library, called propertiesOf, that returns a js.Array[String] with the keys enumerated by a for..in loop. This function is implemented once as:
function propertiesOf(x) {
  const r = [];
  for (const p in x)
    r.push(p);
  return r;
}
It is enough to declare this function in strict mode surrounding the strong mode block, and use it from strong code.
G12. Implicit conversions
Being a statically typed language, operators are checked at compile time anyway. Currently we still emit `string + nonString`, though (we have to fix that by calling `.toString()` explicitly on `nonString`).
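The fix mentioned above could look like the following sketch (a hypothetical helper, not the actual codegen; it assumes the operand is statically known to be non-null, as the compiler can arrange):

```javascript
// Explicit conversion instead of relying on the implicit one in
// `string + nonString`:
function concat(s, nonString) {
  return s + nonString.toString();
}

console.log(concat("value: ", 42)); // → value: 42
```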
The bad
Now coming to the bad. There are a few things that didn't work out quite as well as I hoped.
B1. `null | 0` and `+null`
These two expressions return 0 in strict mode, but throw in strong mode. It turns out that the conversion of `null` to 0 with | and + is *very* useful to the codegen of Scala.js. Due to a pretty obscure aspect of Scala's specification, the pervasive "unboxing" operations `x.asInstanceOf[Int]` and `x.asInstanceOf[Double]` (which are inserted in *many* places by the compiler) must return 0 if x === null, or x itself if it is an Int/Double (in Scala.js, any other value is undefined behavior). These translate extremely well to `x | 0` and `+x`, respectively. Also, the `null` case is in fact extremely rare, so it is actually a very good thing for Scala.js code that the VM will optimistically optimize for the number case. I don't really care that there is a deoptimization if x turns out to be null, because in practice it rarely happens (for most codebases it never happens, actually, but detecting that at compile time is undecidable).
So now I have to emit `(x || 0) | 0` for the Int case instead, and, worse, `uD(x)` where uD is defined as
function uD(x) {
  return x === null ? 0 : +x;
}
(I cannot use `+(x || 0)` because that turns NaN into 0 as well, and I don't want that).
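Concretely, the difference between the two candidates shows up with NaN (repeating the uD definition from above so this snippet stands alone):

```javascript
function uD(x) {
  return x === null ? 0 : +x;
}

console.log(uD(null));              // → 0 (the null case)
console.log(Number.isNaN(uD(NaN))); // → true: NaN is preserved
console.log(+(NaN || 0));           // → 0: NaN is falsy, so it is lost
```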
All in all, I'm pretty disappointed about this conversion being dropped. I'd very much like to see strong mode still allow it. But of course, from a pure JS perspective, why would anyone think that `null | 0 === +null === 0` is a useful behavior worth preserving? So I am my own devil's advocate on this change, much to my dismay :-(
B2. No way to define a class-wide *field*
In non-strong mode, Scala.js uses a class-wide field, added on each class' prototype, to record RTTI about the class:
SomeClass.prototype.$classData =
new TypeData(<some metadata about SomeClass>);
(in the future, this will be a constant Int instead, with each class having its own value, but the issue is the same)
These are mostly used for instance-of tests, such as `x.isInstanceOf[Foo]`. Note that Foo can be an *interface*, which doesn't exist at runtime except at the meta level, so using the JS `instanceof` operator is not an option.
In strong mode, I cannot define a field on either the class' prototype or the class function itself, since there is no ES6 construct that allows defining it, and both are frozen. So, I basically have to duplicate this $classData field in *every* instance of every Scala.js class. Even though it might end up having faster access when it's needed, it's mostly a huge memory penalty (a penalty which might eventually be compensated for by the smaller headers of strong objects). Currently, the initialization of this field is done in the generated constructor(), and so it is overwritten by constructors in subclasses, which currently violates one of the restrictions on constructors. If this violation is confirmed (see also the mailing list thread [3]), I'll find some other way (overwrite in the init__ methods), but for now I have avoided solving it because it appears to be non-trivial.
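A minimal sketch of the workaround (with hypothetical names; TypeData here is a stand-in for the real metadata class): since the prototype is frozen, $classData becomes a per-instance field, and instance-of tests read the metadata rather than using `instanceof`, which also covers interfaces that have no runtime class:

```javascript
// Non-strong version: one field per class, shared via the prototype:
//   SomeClass.prototype.$classData = new TypeData(...);

// Strong-mode version: one field per *instance*.
class TypeData {
  constructor(name) { this.name = name; }
}
const SomeClassData = new TypeData("SomeClass");

class SomeClass {
  constructor() {
    this.$classData = SomeClassData; // memory cost paid per instance
  }
}

// An isInstanceOf-style test checks the metadata instead of the
// prototype chain:
function isSomeClass(x) {
  return x !== null && x.$classData === SomeClassData;
}

console.log(isSomeClass(new SomeClass())); // → true
```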
The ugly
U1. Interoperability
The really ugly thing is *interoperability* with JS code, which we expect not to be strong. Strong mode argues it can interoperate with weak mode, but there are things that don't really work.
I am mostly talking about `x.p = v` when `p` is not a property of `x`, as well as `delete x.p` (and, less importantly, about selecting `x.p` when `p` does not exist).
The problem is that, when you manipulate objects received from a weak library, or meant to be passed to a weak library, or both, you really want those things to behave like in weak mode. The most obvious example is: you're manipulating an object which is used and meant to be used as a map with string keys. It doesn't matter that Map is a better replacement: the weak library does not use Map, it uses an Object. But now, how do you add an element to that Object? You need an incredible amount of ceremony with defineProperty to replicate the semantics of `x.p = v`. Much worse: how do you *remove* an element from that dictionary? You *cannot*.
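The ceremony alluded to above looks like this (a sketch; the property descriptor must explicitly restore everything a plain assignment would have given):

```javascript
// An Object used as a string-keyed map by a weak library:
const dict = {};

// Replicating `dict.key = "hello"` on a weak object:
Object.defineProperty(dict, "key", {
  value: "hello",
  writable: true,
  enumerable: true,
  configurable: true,
});

console.log(dict.key); // → hello
// And, as noted above, there is no strong-mode counterpart for
// `delete dict.key` at all.
```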
This will, IMO, be a problem for strong mode in general, not just as a compilation target of Scala.js.
In Scala.js, we work around this issue by declaring the three following functions in strict mode:
function weakSelect(x, p) { return x[p]; }
function weakAssign(x, p, v) { x[p] = v; }
function weakDelete(x, p) { delete x[p]; }
Then we reroute all selects, assigns and deletes to these functions instead of using syntax. This is possible in Scala.js because the codegen knows whether we're selecting a member of a Scala.js object (for which we can use syntax) or manipulating JS objects external to Scala.js. We use the weak functions only in the latter case, which is hopefully not too common.
But this is really Ugly. My proposal to fix this would be to behave like weak code when manipulating weak objects, at least for `x.p = v` and `delete x.p` (selection is less problematic). I would argue that strong mode should still allow syntactically `delete x.p` and `delete x[p]`, but that they throw if `x` is a strong object. Similarly, `x.p = v` and `x[p] = v` would be able to create a field if `x` is weak. I believe this would dramatically improve the promise of strong mode to be interoperable with weak code.
Conclusion
There are many things that the semantics of Scala.js and strong mode have in common, and that makes strong mode a really good target for Scala.js. There are a few bad and ugly things, but we do have workarounds for them, even if some of those workarounds can negatively impact performance or memory consumption.
1. I very much regret that `null | 0` and `+null` are not allowed anymore.
2. There is a need for class-wide fields, which is currently hard to get by.
3. Interoperability with weak mode is not as good as it appears.
I would gladly receive your opinions on these matters!
Cheers,
Sébastien
PS: early benchmarks (not good ^^)
Since we're a compiler, we can easily take an existing codebase and compile it with different output modes for comparing benchmarks. I did that for the benchmarks Richards, Tracer and DeltaBlue of V8's benchmark suite, which we've been using to track Scala.js' performance for quite some time. The current state, using iojs nightly of 4/28 with `--harmony-rest-parameters --strong-mode` (uses V8 4.2.77.18), can be found at [4], and yielded the following results. Results are in microseconds, so lower is better. The different versions are:
fastopt: ES 5.1 output
fullopt: ES 5.1 output, piped through the Google Closure Compiler
es6: ES 6 output, mostly leveraging classes, rest parameters (also uses let and const)
strongmode: ES 6 output compliant with strong mode (modulo the exceptions I mentioned above)
js: reference implementation written in JavaScript (directly taken from V8's benchmarks)
For a fair comparison, compare es6 and strongmode to fastopt.
richards [fastopt] : 106.88899577788467 us
richards [fullopt] : 129.30755802676666 us
richards [es6] : 379.21880925293897 us
richards [strongmode] : 377.5009437523594 us
richards [js] : 116.72016340822877 us
tracer [fastopt] : 1297.6653696498054 us
tracer [fullopt] : 1076.4262648008612 us
tracer [es6] : 14000 us
tracer [strongmode] : 14678.83211678832 us
tracer [js] : 1166.8611435239206 us
deltablue [fastopt] : 275.1031636863824 us
deltablue [fullopt] : 330.57851239669424 us
deltablue [es6] : 2560.819462227913 us
deltablue [strongmode] : 2992.526158445441 us
deltablue [js] : 131.27666557269444 us
So far, these benchmarks are not looking good for ES 6, and even less so for Strong Mode (they're basically 4x to 10x slower than fastopt). This is expected of course. Classes have barely made it out of staging in V8, so we cannot expect them to be optimized yet. I expect classes and strong mode to get faster with time.
[2] https://github.com/scala-js/scala-js/pull/1620
[3] https://groups.google.com/forum/#!topic/strengthen-js/qSTV7pBOkyE
[4] https://github.com/sjrd/scalajs-benchmarks/tree/es6-and-strongmode