Native vs Interpreted

Nicholas Nethercote

May 21, 2012, 11:32:52 PM

to JS Internals list

Hi,

We currently have two kinds of scripts: "native" and "interpreted".

"Native" is fine, but "interpreted" is sounding wrong these days. And
lazy bytecode generation
(https://bugzilla.mozilla.org/show_bug.cgi?id=678037) is going to
split the "interpreted" scripts into two groups -- those that have
been compiled, and those that have not. "Interpreted but not yet
compiled" is really confusing!

So I want to change "interpreted" to something else. I'm currently
using "sourced" in my patch. The idea is that the script is "sourced"
because it has source code. I'd be happy to hear other suggestions.

Nick

Brendan Eich

May 21, 2012, 11:46:22 PM

to Nicholas Nethercote, JS Internals list

All scripts have source, that is, come from source.

Would {loaded, lexed, parsed, compiled} or some subset spell out the
differences better?

/be

Boris Zbarsky

May 21, 2012, 11:49:41 PM

On 5/21/12 11:46 PM, Brendan Eich wrote:
> All scripts have source, that is, come from source.

I think Nicholas was actually talking about JSFunctions, for which we
definitely have a distinction between "interpreted" (as in, comes from
some actual JS) and "native" (as in, just constructed from a JSNative
C++ function, and hence not having any "source" for purposes of the JS
engine).

-Boris

Luke Wagner

May 22, 2012, 12:18:44 AM

to Nicholas Nethercote, JS Internals list

Currently, the adjective we use is "scripted". Using this name instead of "interpreted" would give a nice coherence in, e.g.,

    if (fun->isScripted())
        ... = fun->script();

Now, with lazy compilation, this gets a bit more confusing, since fun->isScripted() no longer implies that you can access fun->script(). What you are guaranteed is access to the source code, which I guess is why you chose "sourced":

    if (fun->isSourced()) {
        ... = fun->source();
        if (fun->hasScript())
            ... = fun->script();
    }

I wasn't initially a fan of "sourced", but the above makes me think it is the best option. I would ask that perhaps we could try to restrict use of the "sourced" adjective to cases where we truly have the trifurcation and use the "scripted" adjective when there are only two cases (e.g., with StackIter, where presumably one cannot have an activation with isSourced && !hasScript).
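
The three cases above (native, sourced without a script, sourced with a script) can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `Function`, `Script`, `makeNative`, `makeSourced`, and `ensureScript` are hypothetical names, not SpiderMonkey's actual JSFunction API, and the "compilation" step is a placeholder string operation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled script; not SpiderMonkey's JSScript.
struct Script {
    std::string bytecode;
};

// Hypothetical function object with the three states discussed above.
class Function {
  public:
    // Native: backed by C++, so it has neither source nor script.
    static Function makeNative() { return Function(false, ""); }

    // Sourced: the source is known at creation; the script may come later.
    static Function makeSourced(std::string src) {
        return Function(true, std::move(src));
    }

    // Attribute fixed at creation time; it never changes afterwards.
    bool isSourced() const { return sourced_; }

    // State predicate; flips to true once a script has been generated.
    bool hasScript() const { return script_ != nullptr; }

    const std::string& source() const { return source_; }

    // Lazily generates the script on first use. Only valid for sourced functions.
    Script& ensureScript() {
        assert(sourced_);
        if (!script_) {
            script_ = std::make_unique<Script>();
            script_->bytecode = "compiled:" + source_;  // placeholder, not real compilation
        }
        return *script_;
    }

  private:
    Function(bool sourced, std::string src)
        : sourced_(sourced), source_(std::move(src)) {}

    bool sourced_;
    std::string source_;
    std::unique_ptr<Script> script_;
};
```

In this sketch a caller that only ever sees activations (the StackIter case) could rely on `hasScript()` alone, since a function that is running must already have gone through `ensureScript()`.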

Brendan Eich

May 22, 2012, 1:04:59 AM

to Luke Wagner, Nicholas Nethercote, JS Internals list

Boris is right, I was completely thrown by the use of "native scripts"
-- we have "native functions" which have no scripts.

In this light, it might seem better instead of isSourced to use
hasSource in parallel with hasScript, but the latter is a state
predicate whose result can change, while isSourced is an immutable
attribute. I buy it.

/be

David Anderson

May 22, 2012, 1:45:05 AM

to dev-tech-js-en...@lists.mozilla.org

I like 'sourced'. FWIW I think 'native' is what's broken; I could stick
with 'interpreted', since whether or not the script has been parsed yet,
it's something we could interpret, whereas a native function never is.

One thing to keep in mind, from the JIT's perspective, is that we're
starting to see use cases where it might make sense to split JSFunction
into two things, to extend it without bloating every closure. I.e.
'nargs', 'flags', 'native', 'script', 'env', 'source' etc should all be
in a separate gcthing. So hopefully whatever naming scheme we choose
wouldn't prohibit that. Especially if we're going to keep source *and*
bytecode lying around.

-David

Brian William Hackett

May 22, 2012, 8:38:43 AM

to Luke Wagner, Nicholas Nethercote, JS Internals list

I think it would be cleaner to keep functions the way they are, and just allow scripts to either have bytecode + source or just source. Distinguishing at the function level means that when you actually do generate bytecode for a script you need to fixup all the functions referring to that script, of which there can be many, and also means you'll need to distinguish at the global/eval script level and all the APIs which work directly with scripts.

Brian

Luke Wagner

May 22, 2012, 10:57:42 AM

to Brian William Hackett, Nicholas Nethercote, JS Internals list

> I think it would be cleaner to keep functions the way they are, and
> just allow scripts to either have bytecode + source or just source.

sizeof(JSScript) is itself a large part of script memory (about half, iirc), so to achieve the desired memory savings we'd have to further break up JSScript, which seems unpleasant.

> Distinguishing at the function level means that when you actually
> do generate bytecode for a script you need to fixup all the
> functions referring to that script

That seems solvable in several ways, e.g., by keeping an embedded linked list of functions which all get fixed up when one of them needs a JSScript.

> , of which there can be many, and

possibly, but probably just 1 or 2.

> also means you'll need to distinguish at the global/eval script
> level and all the APIs which work directly with scripts.

Since global/eval scripts are immediately executed, they would never be lazy so I don't understand what you are referring to.
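
The embedded linked list mentioned above could look roughly like the following. It is a minimal sketch under assumptions: `LazyFunction`, `SharedSource`, `nextLazy`, and `compileAndFixUp` are hypothetical names invented for illustration, not existing engine types, and the compile step is a placeholder.

```cpp
#include <string>

// Hypothetical compiled form; stands in for a real script object.
struct Script {
    std::string bytecode;
};

// Hypothetical lazy function. `nextLazy` is the embedded list link that
// chains together all functions still waiting for the same script.
struct LazyFunction {
    Script* script = nullptr;
    LazyFunction* nextLazy = nullptr;
};

// Hypothetical owner of one piece of function source and its list of
// not-yet-fixed-up functions.
struct SharedSource {
    std::string text;
    LazyFunction* lazyHead = nullptr;
    Script compiled;
    bool isCompiled = false;

    // Register a function that refers to this source.
    void addFunction(LazyFunction& fun) {
        if (isCompiled) {
            fun.script = &compiled;  // already compiled: nothing to defer
            return;
        }
        fun.nextLazy = lazyHead;     // push onto the pending list
        lazyHead = &fun;
    }

    // Called when any one function needs its script: compile once, then
    // walk the list and point every pending function at the result.
    Script& compileAndFixUp() {
        if (!isCompiled) {
            compiled.bytecode = "compiled:" + text;  // placeholder compilation
            isCompiled = true;
            LazyFunction* fun = lazyHead;
            while (fun) {
                LazyFunction* next = fun->nextLazy;
                fun->script = &compiled;
                fun->nextLazy = nullptr;
                fun = next;
            }
            lazyHead = nullptr;
        }
        return compiled;
    }
};
```

The cost of the fix-up is linear in the number of functions sharing the source, which is cheap if that number is usually 1 or 2.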

Boris Zbarsky

May 22, 2012, 11:04:02 AM

On 5/22/12 10:57 AM, Luke Wagner wrote:
> Since global/eval scripts are immediately executed

Uh... They don't have to be. And e.g. in XUL they're not.

-Boris

Luke Wagner

May 22, 2012, 11:22:49 AM

to Boris Zbarsky, dev-tech-js-en...@lists.mozilla.org

If you are just talking about the XUL prototype cache, that is a very small fraction of scripts. Furthermore, since the whole point of the XUL prototype cache is to avoid re-parsing, we'd definitely want to compile them eagerly so that CloneScript could just memcpy the bytecode. Are there any other common cases where we JS_CompileScript but don't JS_ExecuteScript?

Boris Zbarsky

May 22, 2012, 11:42:20 AM

On 5/22/12 11:22 AM, Luke Wagner wrote:
> Are there any other common cases where we JS_CompileScript but don't JS_ExecuteScript?

In Firefox, I think not. Though XBL might move in that direction, perhaps.

-Boris

Brian William Hackett

May 22, 2012, 12:02:48 PM

to Luke Wagner, Nicholas Nethercote, JS Internals list

I thought that the idea of being able to execute scripts without compiling bytecode for them was on the table, if not now then in the future at least. If that's the case, then cutting things in the right place now will make things easier later. Is a JSScript "a function or global JS script" or "a function or global JS script which has been compiled to bytecode"?

Memory usage is a wash either way. Most script memory is storing data that only applies to scripts that have been compiled to bytecode and would be trivial to spin off into a separate structure, and the rest won't be affected by whether JSScript must be compiled or not (two principals fields?? can these go away now with CPG?)

Brian
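
One way to picture the script-level split described above is a small always-present part plus a separately allocated part that exists only after compilation. The sketch below is an assumption-laden illustration: `Script`, `CompiledData`, `Function`, and `ensureCompiled` are hypothetical names, not the real JSScript layout, and "compiling" just copies the source bytes.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: data that only applies once bytecode has been generated.
struct CompiledData {
    std::vector<unsigned char> bytecode;
};

// Hypothetical: the part every script keeps, whether compiled or not.
class Script {
  public:
    explicit Script(std::string source) : source_(std::move(source)) {}

    bool hasBytecode() const { return compiled_ != nullptr; }
    const std::string& source() const { return source_; }

    // Allocates the compiled part on first use.
    const CompiledData& ensureCompiled() {
        if (!compiled_) {
            compiled_ = std::make_unique<CompiledData>();
            // Placeholder "compilation": copy the source bytes.
            compiled_->bytecode.assign(source_.begin(), source_.end());
        }
        return *compiled_;
    }

  private:
    std::string source_;
    std::unique_ptr<CompiledData> compiled_;
};

// Functions keep pointing at the same Script, so generating bytecode
// requires no per-function fix-up.
struct Function {
    std::shared_ptr<Script> script;
};
```

Because every function holds the same `Script`, compiling through one of them is immediately visible through all the others.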

Dave Mandelin

May 22, 2012, 12:08:06 PM

to mozilla.dev.tech.j...@googlegroups.com, dev-tech-js-en...@lists.mozilla.org

ECMA-262 uses the terminology 'native' (for vanilla JS objects) and 'host' (for objects provided by the host environment). I like using spec terms but both of those terms are heavily overloaded and may be confusing to some.

'Builtin' is a pretty common usage, and less ambiguous, although it gets weird with self-hosting, and is also a bit weird for host objects that aren't actually built in to the engine. Python uses 'builtin' as a term but seems to use PyMethod as the C type, which I assume is from some history before classes.

'isSourced' sounds very weird to me. Source is not a verb in this context.

Given all that, I like 'host function' and 'isHostFunction' to check for stuff not written in JS. If that's intolerably bureaucratic, my next choice is 'builtin' and 'isBuiltin'. I would think we don't even need a predicate for the normal case given that not not-JS => JS. 'Native' is too overloaded, so for comment terminology I prefer 'scripted'. Having source code attached or not is a separate concept, and I like 'hasSource' for that.

Dave

Bobby Holley

May 22, 2012, 12:10:36 PM

to Brian William Hackett, Luke Wagner, Nicholas Nethercote, JS Internals list

On Tue, May 22, 2012 at 6:02 PM, Brian William Hackett <bhac...@stanford.edu> wrote:

> (two principals fields?? can these go away now with CPG?)
>

Hopefully yes. I'm taking an axe to this stuff as we speak.

Boris Zbarsky

May 22, 2012, 1:03:42 PM

On 5/22/12 12:02 PM, Brian William Hackett wrote:
> (two principals fields?? can these go away now with CPG?)

One might be able to if JSScript will now be tied to a particular
JSCompartment which has a fixed principal.

The other needs to stay: it indicates the originating principal of the
script, which may not have any compartments associated with it at all.

-Boris

Wes Garland

May 22, 2012, 1:37:00 PM

to Luke Wagner, Boris Zbarsky, dev-tech-js-en...@lists.mozilla.org

On 22 May 2012 11:22, Luke Wagner <lu...@mozilla.com> wrote:

> Are there any other common cases where we JS_CompileScript but don't
> JS_ExecuteScript?
>

We (gpsee) do this to generate the compiled script, which we then in turn
write to disk with the XDR routines. We do it during "make install" as the
executing user usually does not have access to write the script cache
file. Our system is very much like the emacs .elc stuff.

How does Firefox generate the "fastload" XDR stuff for extensions?

Wes

--
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Jeff Walden

May 22, 2012, 1:37:58 PM

to Luke Wagner, Nicholas Nethercote, Brian William Hackett, JS Internals list

On 05/22/2012 07:57 AM, Luke Wagner wrote:
>> Distinguishing at the function level means that when you actually
>> do generate bytecode for a script you need to fixup all the
>> functions referring to that script
>
> That seems solvable in several ways, e.g., by keeping an embedded linked list of functions which all get fixed up when one of them needs a JSScript.

Note that if we ever want the ability to edit scripts "live" when debugging -- as I understand it Chrome has this -- we'll need to do something like this. (I suspect it's close to inconceivable that we wouldn't want to have that capability, eventually.)

On the original topic of names, yes, native/interpreted are confusing, no, I have no definitely-better (let alone "good") ideas for better names. Anyone know if the other engines have come up with good names for these things? Great programmers steal. :-)

Jeff

Brendan Eich

May 22, 2012, 3:16:53 PM

to Wes Garland, Luke Wagner, Boris Zbarsky, dev-tech-js-en...@lists.mozilla.org

Wes Garland wrote:
> On 22 May 2012 11:22, Luke Wagner<lu...@mozilla.com> wrote:
>
>> Are there any other common cases where we JS_CompileScript but don't
>> JS_ExecuteScript?
>>
>
> We (gpsee) do this to generate the compiled script, which we then in turn
> write to disk with the XDR routines. We do it during "make install" as the
> executing user usually does not have access to write the script cache
> file. Our system is very much like the emacs .elc stuff.
>
> How does Firefox generate the "fastload" XDR stuff for extensions?

XPCOM components written in JS are fastloaded and use
JS_Compile*/JS_Execute* as you surmise.

The APIs exist, we should not narrowly talk about XUL prototype cache.
We have real uses to avoid recompiling. All of this could be
reengineered at cost of course, but that's not on the table in this thread.

/be

Brendan Eich

May 22, 2012, 4:39:25 PM

to Dave Mandelin, mozilla.dev.tech.j...@googlegroups.com, dev-tech-js-en...@lists.mozilla.org

Dave Mandelin wrote:
> ECMA-262 uses the terminology 'native' (for vanilla JS objects) and 'host' (for objects provided by the host environment). I like using spec terms but both of those terms are heavily overloaded and may be confusing to some.

This is probably going to change in ES6, see

https://mail.mozilla.org/pipermail/es-discuss/2012-January/020157.html

and followups.

"Native" is just badly overloaded in common parlance. "Native ECMAScript
object" is a mouthful, people talk about "native code" too, and from
Java and elsewhere we have "native methods".

"Built-in" is better ("builtin" in code), I agree with Nick.

/be

Nicholas Nethercote

May 22, 2012, 10:01:31 PM

to Brian William Hackett, Luke Wagner, JS Internals list

On Wed, May 23, 2012 at 2:02 AM, Brian William Hackett
<bhac...@stanford.edu> wrote:
>
> Is a JSScript "a function or global JS script" or "a function or global JS script which has been compiled to bytecode" ?

That's a very good question. I've been assuming the latter, and the
"does it have bytecode" question would be determined from the
JSFunction. Splitting JSScript into two parts, as you suggest -- one
part relating to compiled bytecode, and the other part holding
everything else -- will probably make lazy bytecode generation much
easier. I'll start looking into it. Thanks for the suggestion!

Nick