Alternate Module Format, Run Proposal


James Burke

Oct 28, 2009, 3:02:17 PM
to CommonJS
I put up a proposal for an alternate module format, see here:

http://wiki.commonjs.org/wiki/AlternateModuleFormatRun

It is focused on providing a format that works best in browsers, and
it has an implementation. I am wondering if it fits in with the goals
of CommonJS.

I am new to the community, so I apologize for any cultural
transgressions. Please let me know if I should go about this another
way.

Thanks,
James

Kris Kowal

Oct 28, 2009, 5:11:10 PM
to comm...@googlegroups.com

Thanks for the proposal. This is actually related to a discussion we
had some time ago that amounted to creating a standard module
transport format suitable for script injection, which we've since
implemented in Narwhal and should definitely bring forward for
CommonJS standardization again.

http://groups.google.com/group/commonjs/browse_thread/thread/b40a38b42f248c0c

The schema serves all the same purposes, but looks like:

require.register({
    id: {
        "factory": function (require, exports, module) {
            …
        },
        "depends": ["a", "b", "c", …]
    },
    …
});

This supports bundles as well as single-module transport and can,
with varying degrees of difficulty depending on need, be automatically
compiled from CommonJS modules, where the first ellipsis is the text of
the "id" module and the dependencies are scraped with static analysis.
The module metadata could also be extended in the future.
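
As a rough illustration of that compilation step, here is a minimal
sketch. The names scrapeDependencies and wrapForTransport are
hypothetical (they are not part of Narwhal or any proposal), and the
regular expression is only a crude stand-in for real static analysis:
it will also pick up require calls that appear inside comments or
string literals.

```javascript
// Hypothetical helper: collect the ids passed to require("...") calls.
// A regular expression is an assumption made for brevity; a real tool
// would parse the source instead.
function scrapeDependencies(source) {
    var pattern = /\brequire\s*\(\s*(["'])([^"']+)\1\s*\)/g;
    var found = [];
    var match;
    while ((match = pattern.exec(source)) !== null) {
        // Record each id once, in order of first appearance.
        if (found.indexOf(match[2]) === -1) {
            found.push(match[2]);
        }
    }
    return found;
}

// Hypothetical helper: wrap the text of a CommonJS module in the
// require.register schema shown above and return it as a string.
function wrapForTransport(id, source) {
    return "require.register({" + JSON.stringify(id) + ": {\n" +
        "    \"factory\": function (require, exports, module) {\n" +
        source + "\n" +
        "    },\n" +
        "    \"depends\": " + JSON.stringify(scrapeDependencies(source)) + "\n" +
        "}});";
}
```

The output of wrapForTransport is plain script text, so it could be
served as a static file and loaded with a script tag.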

Kris Kowal

James Burke

Oct 29, 2009, 3:33:40 PM
to comm...@googlegroups.com
On Wed, Oct 28, 2009 at 2:11 PM, Kris Kowal <cowber...@gmail.com> wrote:
> Thanks for the proposal.  This is actually related to a discussion we
> had some time ago that amounted to creating a standard module
> transport format suitable for script injection, which we've since
> implemented in Narwhal and should definitely bring forward for
> CommonJS standardization again.
>
> http://groups.google.com/group/commonjs/browse_thread/thread/b40a38b42f248c0c

Kris,

Thanks for the pointer. In the thread you pointed to, as Jonathan
points out, I am looking for something that works well in the browser,
and does not require a lot of boilerplate so I can author it by hand
for static files loaded by the browser. I also want something that
could run as-is in a server-side JS environment if possible.

As long as those requirements are met, the particulars are less
important. But we've had a good amount of feedback in Dojo that using
synchronous XHR calls + eval() is not ideal.

Some comments while browsing through the links in that other thread:

It seems like some modules, like system and print are given special
status in the function wrapper. That seems to blur the line between a
module loader and a specific module collection.

As for any additional metadata for a module, for the run format
proposal, we could place that after the function(){} wrapper for the
module:

run(
    "a",
    ["b", "c"],
    function(b, c) {

    },
    {
        "author": "John Doe"
    }
);

If version targeting is required, I favor putting the version of the
module in its name, separated by a comma:

run(
    "a,v1",
    ["b", "c"],
    function(b, c) {

    }
);

and if module a only wants b version 3's .colorize() method, using #
to reference it:

run(
    "a,v1",
    ["b,v3#colorize", "c"],
    function(colorize, c) {

    }
);
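
To make the naming convention concrete, here is a minimal sketch of how
a loader might split such an identifier. parseModuleId is a
hypothetical name, and the sketch assumes that neither "," nor "#"
appears in a plain module name.

```javascript
// Hypothetical parser for the "name,version#member" convention.
// Assumes "," and "#" never occur inside the module name itself.
function parseModuleId(id) {
    var member = null;
    var version = null;

    // Everything after "#" names a single property of the module.
    var hash = id.indexOf("#");
    if (hash !== -1) {
        member = id.slice(hash + 1);
        id = id.slice(0, hash);
    }

    // Everything after "," in what remains is the version.
    var comma = id.indexOf(",");
    if (comma !== -1) {
        version = id.slice(comma + 1);
        id = id.slice(0, comma);
    }

    return { name: id, version: version, member: member };
}

parseModuleId("b,v3#colorize");
// → { name: "b", version: "v3", member: "colorize" }
```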

James

ihab...@gmail.com

Oct 29, 2009, 3:53:09 PM
to comm...@googlegroups.com
Hi James,

On Wed, Oct 28, 2009 at 12:02 PM, James Burke <jrb...@gmail.com> wrote:

> I put up a proposal for an alternate module format, see here:

This is similar in spirit to the compiled module format for Caja, for example:

http://code.google.com/p/google-caja/source/browse/trunk/tests/com/google/caja/parser/quasiliteral/testModule.co.js

where, as you can see, we embed compiler artifacts extracted from the
code, like debugging information and the list of modules that are
loaded by the code, into the module format.

We do not, however, use this as the actual text that programmers type in.

Ihab

--
Ihab A.B. Awad, Palo Alto, CA

Kris Kowal

Oct 29, 2009, 7:59:58 PM
to comm...@googlegroups.com
On Thu, Oct 29, 2009 at 12:33 PM, James Burke <jrb...@gmail.com> wrote:
> Thanks for the pointer. In the thread you pointed to, as Jonathan
> points out, I am looking for something that works well in the browser,
> and does not require a lot of boilerplate so I can author it by hand
> for static files loaded by the browser. I also want something that
> could run as-is in a server-side JS environment if possible.

I agree. There will have to be some boilerplate, though, which is
strictly less awesome than no boilerplate at all. So, I would
additionally like this format to be easy to build from CommonJS
modules. These goals *should* be reconcilable.

> As long as those things are met, then the particulars are less
> important. But we've had a good amount of feedback in Dojo that using
> synchronous XHR calls + eval() is not ideal.

Right, which is why *some* boilerplate is necessary. Some would go so
far as to say that XHR+eval is not acceptable in any production
environment. I'll have to draw up a flow graph on the technically
viable solutions at some point, but it suffices to say we're on the
same page.

> It seems like some modules, like system and print are given special
> status in the function wrapper. That seems to blur the line between a
> module loader and a specific module collection.

"system" and "print" are not specified by CommonJS. It has been
decided that "system" as a free variable certainly never will be [1].
"print" has not been proposed, but it's still on the list of things
that it would be nice to inject into module scope. Both "system" and
"print" are available in Narwhal, but we take the stance that any uses
of these names in the standard library are bugs.

Speaking of which, it's becoming more likely in Narwhal that we'll
take an approach to module factory functions more like Ihab, Wes, and
Hannes's projects, where we pass an object to the module factory
functions that contains stuff that ought to be in scope. If that were
the case, the boilerplate for a "Module in Transit" from CommonJS
would probably look more like:

require.register({id: {
    "depends": ["a", "b", "c", …],
    "factory": function (___) {
        var require = ___.require;
        var exports = ___.exports;
        var module = ___.module;
        …
    }
}});

For the hand-written case, this would work:

require.register({id: {
    "depends": ["a"],
    "factory": function (_) {
        var a = _.require("a");
        …
    }
}});

Injecting the modules directly as arguments would obviate the
possibility of compiling CommonJS modules to this transit format.

> As for any additional metadata for a module, for the run format
> proposal, we could place that after the function(){} wrapper for the
> module:
>
> run(
>    "a",
>    ["b", "c"],
>    function(b, c) {
>
>    },
>    {
>        "author": "John Doe"
>    }
> );

This gets at the heart of what you're looking for, I think: really
cutting the fat on the syntax. I think it would be more efficient to
get through the bikeshedding on IRC at #commonjs.

The reason for going with "require" as the name space is to reduce the
global footprint. There would need to be *some* footprint, and you
might want to be able to use "require" in inline scripts somewhere to
load the program module. Additionally, "require" would be masked by
"require" in the module's local scope, so it would not even be a free
variable observable within a module.

The reason for calling it "register" as opposed to "run" is to free
the implication that the module is run in-place at the time of
declaration, which would not be the case if it is declared before its
transitive dependencies have become available (it depends on "a", "b",
and "c", and "c" depends on "d", so you can't execute it until all of
those are registered). Also, the module itself might not be a
transitive dependency of the "main" program module, in which case it
would not be executed and any of its side-effects would not be
desired.
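
A minimal sketch of that behavior, under stated assumptions:
createLoader, isReady, and main are hypothetical names, and the
factories use the (require, exports, module) signature from the
earlier schema. register() only records modules. Nothing executes
until main() is called for a module whose transitive dependencies are
all registered, and a module that the main module never requires is
never run.

```javascript
// Hypothetical loader illustrating "register" semantics: declaring a
// module never runs it.
function createLoader() {
    var registry = {};   // id -> { factory, depends }
    var instances = {};  // id -> module object, created on first require

    // True when id and everything it transitively depends on is registered.
    function isReady(id, seen) {
        seen = seen || {};
        if (seen[id]) { return true; }  // already visited (also breaks cycles)
        seen[id] = true;
        var entry = registry[id];
        if (!entry) { return false; }
        return (entry.depends || []).every(function (dep) {
            return isReady(dep, seen);
        });
    }

    // Runs a module's factory the first time it is required.
    function requireModule(id) {
        if (!instances[id]) {
            if (!registry[id]) {
                throw new Error("Module not registered: " + id);
            }
            var module = { id: id, exports: {} };
            instances[id] = module;  // stored first so cycles see partial exports
            registry[id].factory(requireModule, module.exports, module);
        }
        return instances[id].exports;
    }

    return {
        // Accepts a bundle object: { id: { factory, depends }, ... }
        register: function (bundle) {
            Object.keys(bundle).forEach(function (id) {
                registry[id] = bundle[id];
            });
        },
        isReady: function (id) { return isReady(id); },
        // Executes the program's main module, and only what it requires.
        main: function (id) {
            if (!isReady(id)) {
                throw new Error("Dependencies of " + id + " are not all registered");
            }
            return requireModule(id);
        }
    };
}
```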

The reason for going with an object as the first argument is to permit
bundles of modules to be sent down in a single file. My original
proposal called for register(id, factory), but the need for additional
metadata (the dependencies, and presumably other stuff eventually) and
the need for bundles brought us to the current syntax. With some
sacrifice in complexity, we could presumably support an additional
argument form for the "as-brief-as-possible" handwritten module
transport case.

> If version targeting is required, I favor putting the version of the
> module in its name, separated by a comma:

We do not presently specify anything for versions, but this would work
with the current proposal as a module identifier convention.

> and if module a only wants b version 3's .colorize() method, using #
> to reference it:
>
> run(
>    "a,v1",
>    ["b,v3#colorize", "c"],
>    function(colorize, c) {
>
>    }
> );

An old version of Chiron supported require("id#key"), but I've dropped
it because it costs far more in the complexity of the loader than the
complexity of having to dereference it in JavaScript, and this group
is [wisely] reticent about standardizing [superfluous] conveniences.

Kris Kowal

[1] http://wiki.commonjs.org/wiki/System/ArchivedShowOfHands

James Burke

Nov 11, 2009, 12:41:01 AM
to comm...@googlegroups.com
I apologize for the late reply, but I am still interested in pursuing
this more. Additional comments inline.

On Thu, Oct 29, 2009 at 3:59 PM, Kris Kowal <cowber...@gmail.com> wrote:
> I agree.  There will have to be some boilerplate, though, which is
> strictly less awesome than no boilerplate at all.  So, I would
> additionally like this format to be easy to build from CommonJS
> modules.  These goals *should* be reconcilable.

I agree, that is ideal, being able to translate between the two
formats, the normal serverside version and a browser-optimized
version.

> Speaking of which, it's becoming more likely in Narwhal that we'll
> take an approach to module factory functions more like Ihab, Wes, and
> Hannes's projects, where we pass an object to the module factory
> functions that contains stuff that ought to be in scope.  If that were
> the case, the boilerplate for a "Module in Transit" from CommonJS
> would probably look more like:
>
> require.register({id: {
>    "depends": ["a", "b", "c", …],
>    "factory": function (___) {
>        var require = ___.require;
>        var exports = ___.exports;
>        var module = ___.module;
>        …
>    }
> }});
>
> For the hand-written case, this would work:
>
> require.register({id: {
>    "depends": ["a"],
>    "factory": function (_) {
>        var a = _.require("a");
>        …
>    }
> }});
>
> Injecting the modules directly as arguments would obviate the
> possibility of compiling CommonJS modules to this transit format.

I prefer to pass anything the module might need as arguments, as shown
below; hopefully this would allow the CommonJS modules to work in the
transit format. Additionally, exports can be defined as a part of the
boilerplate insertion, inside the function wrapping the module
definition. I would replace calls in the CommonJS module that do
require("a") with some _$ syntax to avoid variable conflicts (the
_$ prefix could be varied if that token is already in the module).

So something that does:

var a = require("a");
//rest of module code here

might get transformed like this (just using run as a placeholder name):

run(
    "module_id",
    ["require", "module", "a"],
    function (require, module, _$a) {
        var exports = {};
        var a = _$a;
        //rest of module code here
        return exports;
    }
);

>> As for any additional metadata for a module, for the run format
>> proposal, we could place that after the function(){} wrapper for the
>> module:
>>
>> run(
>>    "a",
>>    ["b", "c"],
>>    function(b, c) {
>>
>>    },
>>    {
>>        "author": "John Doe"
>>    }
>> );
>
> This gets at the heart of what you're looking for, I think: really
> cutting the fat on the syntax.  I think it would be more efficient to
> get through the bikeshedding on IRC at #commonjs.

Sounds good. If there is a specific meeting time that works best,
please let me know. I might check in every so often in IRC. My nick is
jrburke. I'm on the west coast of North America, Vancouver, BC.

> The reason for going with "require" as the name space is to reduce the
> global footprint.  There would need to be *some* footprint, and you
> might want to be able to use "require" in inline scripts somewhere to
> load the program module.  Additionally, "require" would be masked by
> "require" in the module's local scope, so it would not even be a free
> variable observable within a module.
>
> The reason for calling it "register" as opposed to "run" is to free
> the implication that the module is run in-place at the time of
> declaration, which would not be the case if it is declared before its
> transitive dependencies have become available (it depends on "a", "b",
> and "c", and "c" depends on "d", so you can't execute it until all of
> those are registered).  Also, the module itself might not be a
> transitive dependency of the "main" program module, in which case it
> would not be executed and any of its side-effects would not be
> desired.

I chose run because it is short and run( is 4 characters, the typical
indent size, so it looked pretty if module names are indented on the
next line. I was also fine with considering the human language
definition of run as a conditional: "run this code to define the
module given these conditions". If the conditions do not match, then
the code would not be run.

But I am fine with another name. Is it possible just to use require()
and have it be smart about looking at the args? require() used in this
context could be interpreted as a requirement is being stated. :) It
is a bit of a stretch on the meaning, but I prefer that to requiring
another property lookup via require.register. Less typing is strongly
desired since these modules will be hand-coded.
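
A minimal sketch of what "being smart about the args" could mean,
assuming only two call shapes: the existing CommonJS lookup
require("a"), and a definition with a name, a dependency array, and a
function. classifyRequireCall is a hypothetical helper, not part of
any existing loader.

```javascript
// Hypothetical dispatcher: decide what a require() call means by
// inspecting its arguments (passed here as an array).
function classifyRequireCall(args) {
    // require("a"): a plain module lookup, as in CommonJS today.
    if (args.length === 1 && typeof args[0] === "string") {
        return "lookup";
    }
    // require("a", ["b"], function (b) {}): a module definition, with an
    // optional trailing metadata object that this sketch ignores.
    if (typeof args[0] === "string" &&
            Array.isArray(args[1]) &&
            typeof args[2] === "function") {
        return "definition";
    }
    throw new TypeError("Unrecognized require() call");
}
```

The cost of this approach is that one function name carries two
meanings, so a mistyped definition can only be reported at call time.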

> The reason for going with an object as the first argument is to permit
> bundles of modules to be sent down in a single file.  My original
> proposal called for register(id, factory), but the need for additional
> metadata (the dependencies, and presumably other stuff eventually) and
> the need for bundles brought us to the current syntax.  With some
> sacrifice in complexity, we could presumably support an additional
> argument form for the "as-brief-as-possible" handwritten module
> transport case.

I was considering a bundle of modules to be just a concatenation of
files, so there is a separate run/require() call for each module. A
pause() function holds off tracing dependencies until the last call,
and a resume() then traces all dependencies. Something like this (this
example uses require() instead of run() as the entry point):

require.pause();

require(
    "a",
    ["b"],
    function(b) {
    }
);

require(
    "b",
    ["c"],
    function(c){
    }
);

require.resume();

As for the need for other arguments in the future, I would rather keep
the most common things (module name, dependencies, and function
definition) as positional arguments and reserve an object after the
module defining function to hold any other data. I see that fourth
object parameter being very rarely used (at least in the browser
case), and it keeps the most common things cheap to write.
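
A minimal sketch of how pause() and resume() could interact with
dependency tracing, under stated assumptions: createRunner and
fetchModule are hypothetical names, fetchModule stands in for whatever
the loader does to request a missing script, and each module's value
is whatever its function returns. While paused, declarations are only
queued, so a dependency that appears later in the same file is not
requested separately.

```javascript
// Hypothetical loader for the positional format run(id, deps, factory).
// fetchModule(id) is an assumed callback that requests a missing module.
function createRunner(fetchModule) {
    var waiting = [];    // declared modules that have not executed yet
    var defined = {};    // id -> value returned by the module function
    var requested = {};  // ids already passed to fetchModule
    var paused = false;

    function has(map, key) {
        return Object.prototype.hasOwnProperty.call(map, key);
    }

    function trace() {
        // Execute every module whose dependencies are all defined, and
        // repeat until a full pass makes no progress.
        var progressed = true;
        while (progressed) {
            progressed = false;
            waiting = waiting.filter(function (entry) {
                var ready = entry.deps.every(function (dep) {
                    return has(defined, dep);
                });
                if (ready) {
                    var args = entry.deps.map(function (dep) {
                        return defined[dep];
                    });
                    defined[entry.id] = entry.factory.apply(null, args);
                    progressed = true;
                }
                return !ready;
            });
        }
        // Request dependencies that are neither defined nor declared.
        var declared = {};
        waiting.forEach(function (entry) { declared[entry.id] = true; });
        waiting.forEach(function (entry) {
            entry.deps.forEach(function (dep) {
                if (!has(defined, dep) && !has(declared, dep) && !has(requested, dep)) {
                    requested[dep] = true;
                    fetchModule(dep);
                }
            });
        });
    }

    return {
        run: function (id, deps, factory) {
            waiting.push({ id: id, deps: deps, factory: factory });
            if (!paused) { trace(); }
        },
        pause: function () { paused = true; },
        resume: function () { paused = false; trace(); },
        get: function (id) { return defined[id]; }
    };
}
```

Without the pause() call, declaring "a" before "b" in a concatenated
file would make this loader request "b" even though its definition is
on the next lines.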

James