Modules for a future ECMAScript


Kris Kowal

Jan 11, 2009, 2:14:54 AM
to Chiron, ihab...@gmail.com, Mark Miller
Ihab Awad, from the team at Google that makes Caja, and I have had an
extended discussion over the last year about modules in JavaScript.
We've discovered a mutual interest in assuring that, if modules do
become an integral component of the JavaScript language, the
modules satisfy certain requirements. Ihab is a proponent of
security, and I'm a proponent of usability. I've invited him to take
our email thread to this list.

In the old comic, one math researcher begins a proof at the top-left
of a chalk board, gives up and works from the conclusion in the
bottom-right toward his earlier work, and in a fit of frustration,
finally invokes the Theorem of Divine Intervention between the two
armies of equations: "and then a miracle occurs." Our conversation
has been similar. I have very clear ideas about what the code should
ultimately look like and Ihab has some very clear ideas about how it
needs to work. We've closed some of the gap and I've been awed by the
Caja team's combined kindness and intelligence.

I've attached the PDF Ihab provided with an account of our all-day
meeting a few months ago, and here's a reference to Ihab's brainstorm
before that meeting which I understand he's revising presently:

http://google-caja.googlecode.com/svn/trunk/experimental/doc/html/harmonyModules/index.html

There's also my original statement about the state of JavaScript
before I began building Chiron; the third major point is that it lacks
a module system: https://cixar.com/tracs/javascript/wiki/Why

Mark Miller recently notified me that Ihab and I are on the agenda to
present our ideas about modules at the ECMA meeting at Google later
this month, so we're compiling a presentation.

- L is a module loader
- F is a module file
- C is a module constructor
- S is a module scope (the imports provided to the module)
- I is a module instance (the exports provided by the module)
- N is a module name

L(N) -> C
C(S) -> I

This illustrates some of the layers of a module loader system. The
separation is necessary to allow memoization of each layer so that
sandboxes can perform well. One layer can fetch (and memoize) the
text of a file. Another desugars and evaluates the file into a module
constructor function. Another layer calls the constructor function to
provide a module instance. Fully separating these layers permits a
system of modules to construct a sandbox that shares a fetcher,
evaluator, and constructor but has its own memo of module instances.
I propose revising this formula to split the loader into its two
factors, a loader and an evaluator.

- F the text of a module
- N the name of a module
- L loader (gets and memoizes or caches the text of modules)
- E evaluator (transforms the text of a module to a constructor function)
- C constructor (executes a module constructor
with a scope and provides an instance of a module)
- S scope (the module scope with its imports, and some builtins)
- I a module instance (the exports)

L(N) -> F (memoized or cached)
E(F) -> C (memoized or cached)
C(S) -> I (memoized)
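
The three memoized layers above can be sketched in today's JavaScript.
This is only a minimal illustration under stated assumptions, not the
proposed implementation: the in-memory "files" table, the names
fetchText, evaluate and makeSandbox, and keying the constructor memo by
module name rather than by text are all hypothetical. It also uses
"new Function", which does NOT give the sterile scope the proposal
requires. What it shows is that two sandboxes can share the text memo
and the constructor memo while each keeps its own memo of instances.

```javascript
// Hypothetical in-memory store standing in for real files: N -> F.
const files = {
  "greeting": "exports.hello = function (who) { return 'hello, ' + who; };",
  "main": "exports.message = require('greeting').hello('world');"
};

let fetchCount = 0;                          // how often text was fetched
let evalCount = 0;                           // how often text was evaluated
const textMemo = Object.create(null);        // L(N) -> F, shared
const constructorMemo = Object.create(null); // E(F) -> C, shared

// L: fetch and memoize the text of a module.
function fetchText(name) {
  if (!(name in textMemo)) {
    if (!Object.prototype.hasOwnProperty.call(files, name)) {
      throw new Error("no such module: " + name);
    }
    fetchCount += 1;
    textMemo[name] = files[name];
  }
  return textMemo[name];
}

// E: turn module text into a constructor function C(S) -> I.
// Assumption: the memo is keyed by name for brevity.
function evaluate(name) {
  if (!(name in constructorMemo)) {
    evalCount += 1;
    const body = new Function("require", "exports", fetchText(name));
    constructorMemo[name] = function (scope) {
      const exports = {};
      body(scope.require, exports);
      return exports;
    };
  }
  return constructorMemo[name];
}

// A sandbox owns only the instance memo; dependencies are constructed
// on demand the first time they are required.
function makeSandbox() {
  const instanceMemo = Object.create(null);
  function require(name) {
    if (!(name in instanceMemo)) {
      instanceMemo[name] = evaluate(name)({ require: require });
    }
    return instanceMemo[name];
  }
  return require;
}

const sandboxA = makeSandbox();
const sandboxB = makeSandbox();
const a1 = sandboxA("main");
const a2 = sandboxA("main");
const b1 = sandboxB("main");
```

Here a1 and a2 are the same object (a singleton within sandbox A), b1
is a distinct instance, and each module's text is fetched and evaluated
only once across both sandboxes.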

We propose that ECMAScript should implement a strict and secure module system.

Requirements:
* modules, and their transitive imports, can be sandboxed.
* modules should not need to express the transitive dependencies of
the modules they import.
* the order in which a module imports its dependencies should not be
important for correctness. That is, the module constructor should
call module constructor functions of dependencies on demand if they
have not been memoized.
* The module loader should be free to prefetch the text of any
modules it might later need and create their constructor functions. A
module's text and constructor function should be sharable by dependent
sandboxes.
* module instances must be singletons within a sandbox.
* it should be possible for modules to reference other modules either
absolutely (relative to the module root) or relatively (relative to
their own name)
* a module can create a sandbox with all or a subset of its own capabilities
* a module context function must be evaluated in a sterile scope, one
that only contains frozen globals, frozen imports, and local
variables. "Free variables" in the module text would be desugared to
import references, or treated as implicitly equivalent to module
imports: any free variable that fails to find an import or
builtin would throw a ReferenceError (the ECMAScript error for an
unresolvable reference; NameError is the Python analogue).
* an import statement should block JavaScript execution. This MAY
imply desugaring to a continuation passing form.

This system would support an ML- or Python-like import syntax. Each
form would be equivalent to a desugared, "salty" syntax that would also
be acceptable on its own, and that a module could supplant for its own
use or for the use of any sandboxes it constructs.

"import module" -> "var module = require("module")"
"import .module" -> "var module = require(".module")"
"import module as mod" -> "var mod = require("module")"
"from module import *" -> "include("module")"
"from .module import *" -> "include(".module")"
"from module import a, b, c"
-> "var {a, b, c} = require("module")"
-> "include("module", {a, b, c})"
-> "include("module", ['a', 'b', 'c'])"
"from module import a as b" -> "var {a: b} = require("module")"
"import module with sandbox"

"sandbox" would be an alternate "require" function for use within the
sandbox. This alternate "require" function could provide a subset of the
capabilities of the provided "require" function. Summarily, import
statements would be of the forms:

"import" ( (absoluteName | relativeName) [ "as" name ] [ "with" expression ] )+
"from" (absoluteName | relativeName) "import" (name [ "as" name ])+ [ "with" expression ]

-- I could use some help formalizing this into whatever notation the
ECMAScript board usually expects.
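
As a rough illustration of the sugar-to-salt mapping listed earlier,
here is a line-by-line text transform. It is a sketch only: the
function names desugarLine and localName are hypothetical, the regular
expressions handle just the simple one-statement-per-line forms, the
"with" clause is not handled at all, and binding the last segment of a
dotted name is an assumption the proposal does not spell out. Renames
are emitted in standard destructuring form, {a: b}.

```javascript
// Assumption: the local variable is the last dot-delimited segment,
// so ".module" and "pkg.module" both bind "module".
function localName(moduleName) {
  return moduleName.split(".").pop();
}

// Translate one trimmed sugary import line into its salty equivalent.
function desugarLine(line) {
  let m;
  // from module import *   ->   include("module")
  if ((m = /^from\s+([\w.]+)\s+import\s+\*$/.exec(line))) {
    return 'include("' + m[1] + '")';
  }
  // from module import a, b as c   ->   var {a, b: c} = require("module")
  if ((m = /^from\s+([\w.]+)\s+import\s+(.+)$/.exec(line))) {
    const names = m[2].split(/\s*,\s*/).map(function (item) {
      const parts = item.trim().split(/\s+as\s+/);
      return parts.length === 2 ? parts[0] + ": " + parts[1] : parts[0];
    });
    return "var {" + names.join(", ") + '} = require("' + m[1] + '")';
  }
  // import module [as mod]   ->   var mod = require("module")
  if ((m = /^import\s+([\w.]+)(?:\s+as\s+(\w+))?$/.exec(line))) {
    return "var " + (m[2] || localName(m[1])) + ' = require("' + m[1] + '")';
  }
  return line; // not an import statement; leave it untouched
}
```

Because each sugary line maps to exactly one salty line, the same
table could in principle be run in reverse to turn salt back into
sugar.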

To permit sandboxing, particularly the creation of "require" functions
with constrained capabilities, the ECMAScript engine would provide the
internals of its module system as a standard "modules" module.
This would have a "fetch(N) -> F" function, a "fetchMemo" object, a
"build(F) -> C(S) -> I" function that would create a constructor from
a file's text, a "buildMemo", and "require(N) -> I" and "include(N) ->
I" functions with the corresponding "requireMemo". The "require" and
"include" functions would have "bind(S)". The memo objects would map
module names to the memoized output of their respective functions.
This idea would benefit from some consideration about how to use
"this" and how to construct the execution context for a module.

Modules use and are used by modules. The global scope provided by
existing ECMAScript engines is not a module and shouldn't be expected
to function as one, and likewise modules shouldn't be expected to work
in global scope. I recommend that "require" be added to global scope
so that one can execute a module constructor from outside its
environment. As a footnote (not for ECMA), I think that a <modules
root="/javascript" import="name .name name"> tag should be added to
HTML to create a module sandbox and import some initial modules (with
relative names relative to the containing HTML file).
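
To make the idea of a "require" function with constrained capabilities
more concrete, here is a small sketch. Everything in it is an
assumption for illustration: the module texts, the makeRequire helper,
and the use of a plain list of permitted names are hypothetical, and
"new Function" does not provide real isolation. It shows one property
from the requirements: the restriction applies to transitive imports
too, because every module constructed in a sandbox receives that
sandbox's own "require".

```javascript
// Hypothetical module texts, standing in for what fetch(N) -> F returns.
const fetchMemo = {
  "math": "exports.double = function (x) { return 2 * x; };",
  "net": "exports.send = function () { return 'sent'; };",
  "app": "exports.run = function () { return require('math').double(21); };",
  "spy": "exports.run = function () { return require('net').send(); };"
};
const buildMemo = Object.create(null);

// build: text -> constructor, memoized and shared by every sandbox.
function build(name) {
  if (!(name in buildMemo)) {
    buildMemo[name] = new Function("require", "exports", fetchMemo[name]);
  }
  return buildMemo[name];
}

// Create a "require" limited to allowedNames, with its own requireMemo.
function makeRequire(allowedNames) {
  const requireMemo = Object.create(null);
  return function require(name) {
    if (allowedNames.indexOf(name) === -1) {
      throw new Error("module not available in this sandbox: " + name);
    }
    if (!(name in requireMemo)) {
      // Memoize before constructing so the instance is a singleton.
      const exports = (requireMemo[name] = {});
      // The module sees only this sandbox's require, never a wider one.
      build(name)(require, exports);
    }
    return requireMemo[name];
  };
}

const full = makeRequire(["math", "net", "app", "spy"]);
const limited = makeRequire(["math", "app", "spy"]);
```

With these definitions, "app" works in both sandboxes, while calling
limited("spy").run() throws, since "spy" can only reach "net" through
the limited "require" it was given.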

Migration to this secure and strict module system would be a two-step,
opt-in process for ECMAScript developers.

1. A module system (or a standardized module system for each platform)
like modules.js is provided, in that it works client side with current
ECMAScript features like eval and with. It permits ECMAScript
modules to be written in the absence of any syntactic sugar, in a syntax
that simple text transformations could change from salt to sugar.
Module writers have the option of using this system or continuing to
write global scripts.

2. ECMAScript engine vendors provide a module system that supports
the same modules as the previous step, in both salty and sugary syntax.
The engine deprecates the salty syntax. Module writers have the
option of migrating their salty modules to sugary modules when they're
willing to alienate users of older engines.

Both of these steps could be co-released, since developers would always
have the option to migrate from global-script to salty-module, or from
salty-module to sugary-module, at their leisure.

Some definitions:

* salty: not using any syntax not supported by older ECMAScript
engines. Salty isn't strictly the opposite of sugary. We're using
desugared syntax to illustrate the equivalent behavior of various
systems, which would not necessarily be usable by end users but might
exist in intermediate stages of interpretation. That's not what I
mean by salt. A salty syntax would be almost equivalent to sugary
syntax, line by line. For example, "var {a} = require(".a")" could be the
salty version of "from .a import a".

* global-script-system: the existing ECMAScript script loader system
wherein all scripts are serially evaluated in an identical execution
context.

* user-script-provided-module-system: a system like modules.js that
permits script writers to use salty modules.

* script-engine-provided-module-system: a system provided by future
ECMAScript engines that may use either salty or sugary modules, or
both. Where it is available, it may supplant an existing,
low-performance, insecure user-script-provided-module-system; where it
is not, it does not prevent use of the global-script-system or a
user-script-provided-module-system.

* global-script: a script that a global-script-system can run.

* salty-module: a script that either a
user-script-provided-module-system or
script-engine-provided-module-system can use.

* sugary-module: a script that only a
script-engine-provided-module-system could use.

To this end, I can provide a user-script-provided-module-system that
supports the salty import semantics: all of the above
behavior with the exception of "import from with as" statements and
object capability isolation, that is, security.

To reduce the complexity of this proposal, I've implicitly made some
choices that are not strictly necessary for the final resolution.

- the language does not need to support the exact syntax described,
although this syntax is familiar to at least some programmers and is not
a new invention. I am convinced it does need to support all of the terms
suggested.

- the sandbox object does not need to be exactly the "require"
function. There are a couple of alternatives, including an alternate
"modules" module, or something similar.

- the names of the "require" and "include" functions, and any
additional arguments that they MAY or MUST provide to support various
kinds of destructuring and continuation passing, are open to change.

- The Caja team convinced me that ".modulename.modulename" dot
delimited module names would be more usable than URL or URN module
names. I've integrated this idea in this iteration of the proposal
since we achieved consensus on that point.

Some hard work is still ahead. I think we still need to figure out
which parts we MUST specify and which we can leave as an appendix for
vendors. We also need to work out how "bundling" could work for all
levels of provision: user-script and script-engine module-systems.

Kris

modules-2008-10-13.pdf

ihab...@gmail.com

Jan 13, 2009, 1:25:29 AM
to Kris Kowal, Chiron, Mark Miller
Hey Kris & everyone,

On Sat, Jan 10, 2009 at 11:14 PM, Kris Kowal <kris....@cixar.com> wrote:
... here's a reference to Ihab's brainstorm before that meeting which I understand he's revising presently:

http://google-caja.googlecode.com/svn/trunk/experimental/doc/html/harmonyModules/index.html

It is now revised. Good news: I cut out lots of cruft. Since we have not chatted about it yet, it of course represents only my own, sometimes partly-formed thoughts.

I'll re-read and start working through your remarks.

Ihab

--
Ihab A.B. Awad, Palo Alto, CA

ihab...@gmail.com

Jan 13, 2009, 6:38:33 PM
to Kris Kowal, Chiron, Mark Miller

On Mon, Jan 12, 2009 at 10:25 PM, <ihab...@gmail.com> wrote:
It is now revised.

Now moved to a Google doc --

