Real-world loader interoperability challenges


C Snover

Sep 20, 2013, 7:23:14 PM
to amd-im...@googlegroups.com
Hi all,

First, apologies if this is a duplicate post; I posted this morning but then checked this evening and couldn’t find my post on the list.

I am in the process of trying to update [Intern](http://theintern.io) to allow for people to swap in alternative AMD loaders, but I’m running into some challenges in a few key areas:

1. The global loader object (`require` or `requirejs` or `curl` or something else) is the only place where several loaders place their configuration mechanisms (though consistently at `loader.config`). Local `require` does not expose the configuration mechanism in most loaders, and the AMD spec says that local `require` is not intended to support passing a configuration object as the first argument. This means that without knowing in advance what the global loader might be it is not possible to reconfigure at runtime from within an AMD module, which Intern needs to be able to do. Since most loaders support the necessary Common Config options, this is an annoying omission that really hinders interoperability.
2. Code executed through `vm.runInThisContext` in Node.js has no reference to a Node.js `require` method, which makes it impossible to load Node.js modules from within the AMD environment. Curl works around this by not using `vm` at all and executing scripts with `eval` (so code has access to the `require` for the `curl` module); RequireJS and Dojo put the require method at `require.nodeRequire` (where `require` is the AMD loader object). In all cases the `require` method is the one from the loader script. Dojo exposes this through the local require; RequireJS does not. RequireJS will attempt to use Node.js `require` automatically to load a module if an AMD load fails; Dojo will not. Being able to reliably access the underlying module system in an identifiable way is important for interoperability on SSJS platforms.
3. Not all loaders that support Node.js return their loader function when loaded into Node.js. Specifically, curl.js exposes itself on `global.curl` but does not set `module.exports = curl`. (I would probably consider this an implementation bug, but it was something I ran into that seemed worth mentioning.)
4. There seems to maybe be some lack of clarity with regards to how `packages` mapping is supposed to work; it looks like maybe RequireJS does not apply packages’ `location` if a “module ID” ends in .js, whereas the Dojo loader does? Not 100% sure on this one yet. Still testing.

Potentially all of these issues can be hacked around by putting extra stuff in the global scope, looking there for known AMD loader locations, and testing to see if an object at those global identifiers matches expectations for being the AMD loader global, but of course this ties implementations to only the AMD loaders that the implementer knew existed at the time or iterating and trying to infer what global might contain an AMD loader. It also violates some of the major tenets of authoring AMD modules, like staying out of the global object. As a result, I would really like to see some more definition around these areas so such hacks would not be needed.
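
To make it concrete, the kind of hack I am talking about looks roughly like this (and the hard-coded list of globals is exactly the problem, since it only covers the loaders I already know about):

```js
// Rough sketch only: probe well-known globals and hope that whatever is found
// is an AMD loader that exposes its configuration API at `loader.config`.
(function (global) {
	var names = [ 'require', 'requirejs', 'curl' ];
	var loader;

	for (var i = 0; i < names.length; i++) {
		var candidate = global[names[i]];
		if (typeof candidate === 'function' && typeof candidate.config === 'function') {
			loader = candidate;
			break;
		}
	}

	if (loader) {
		loader.config({ /* Common Config options */ });
	}
})(typeof window !== 'undefined' ? window : global);
```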

Thanks,

James Burke

Sep 23, 2013, 8:10:54 PM
to amd-im...@googlegroups.com
On Fri, Sep 20, 2013 at 4:23 PM, C Snover <goo...@zetafleet.com> wrote:
> 1. The global loader object (`require` or `requirejs` or `curl` or something
> else) is the only place where several loaders place their configuration
> mechanisms (though consistently at `loader.config`). Local `require` does
> not expose the configuration mechanism in most loaders, and the AMD spec
> says that local `require` is not intended to support passing a configuration
> object as the first argument. This means that without knowing in advance
> what the global loader might be it is not possible to reconfigure at runtime
> from within an AMD module, which Intern needs to be able to do. Since most
> loaders support the necessary Common Config options, this is an annoying
> omission that really hinders interoperability.

The best we came to on this is summarized here:

https://github.com/amdjs/amdjs-api/wiki/require#local-vs-global-require-

"There is often an implementation-dependent API that will kick off
module loading; if interoperability with several loaders is needed,
the global require() should be used to load the top level modules
instead."

It is up to individual loaders to opt in for that though.
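
In code form, the intent is roughly this (module names are just placeholders); only the top-level kickoff touches the global require:

```js
// Application bootstrap: the global require() kicks off the top-level module.
require(['app/main'], function (main) {
	main.start();
});

// Inside a module: only use the local require, and only with module IDs
// (no config object as the first argument).
define(['require'], function (require) {
	return {
		loadHelp: function (callback) {
			require(['./help'], callback);
		}
	};
});
```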

> 2. Code executed through `vm.runInThisContext` in Node.js has no reference
> to a Node.js `require` method, which makes it impossible to load Node.js
> modules from within the AMD environment. Curl works around this by not using
> `vm` at all and executing scripts with `eval` (so code has access to the
> `require` for the `curl` module); RequireJS and Dojo put the require method
> at `require.nodeRequire` (where `require` is the AMD loader object). In all
> cases the `require` method is the one from the loader script. Dojo exposes
> this through the local require; RequireJS does not. RequireJS will attempt
> to use Node.js `require` automatically to load a module if an AMD load
> fails; Dojo will not. Being able to reliably access the underlying module
> system in an identifiable way is important for interoperability on SSJS
> platforms.

I am obviously biased and prefer the route requirejs took. I don't
think local require should have knowledge of node's require, but
rather local require should just express the need for a module given a
module ID, and then it is up to the loader to do any bridging with
Node.

Also, when using an AMD loader, I feel hiding the node require is
appropriate since it is inferior for network-based front end work, and
not what the module would see if running in the browser.

Probably good to have other implementers speak to this one though.
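
To sketch what I mean by bridging (this is the shape of the idea, not requirejs's actual internals; `resolveAmd` and `hostRequire` are stand-in names):

```js
// The module only expresses a dependency by ID. The loader decides how to
// satisfy it, falling back to the host module system when AMD resolution
// fails, so modules never need to know node's require exists.
function loadDependency(id, resolveAmd, hostRequire, onLoad, onError) {
	resolveAmd(id, onLoad, function (amdError) {
		if (typeof hostRequire === 'function') {
			try {
				onLoad(hostRequire(id));
				return;
			}
			catch (hostError) {
				// fall through and report the original AMD failure
			}
		}
		onError(amdError);
	});
}
```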

> 3. Not all loaders that support Node.js return their loader function when
> loaded into Node.js. Specifically, curl.js exposes itself on `global.curl`
> but does not set `module.exports = curl`. (I would probably consider this an
> implementation bug, but it was something I ran into that seemed worth
> mentioning.)

It seems like if require('loader-name') is done using Node's module
system, it would make sense to return the loader from that call.
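
Something along these lines at the tail of a loader file would cover it (names are placeholders for whatever the loader actually calls its API):

```js
// Expose the loader API on the global for browser use, and also through
// module.exports so require('loader-name') under node hands it back.
var loaderApi = function () { /* the loader's public entry point */ };

if (typeof window !== 'undefined') {
	window.curl = loaderApi;
}
if (typeof module !== 'undefined' && module.exports) {
	module.exports = loaderApi;
}
```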

> 4. There seems to maybe be some lack of clarity with regards to how
> `packages` mapping is supposed to work; it looks like maybe RequireJS does
> not apply packages’ `location` if a “module ID” ends in .js, whereas the
> Dojo loader does? Not 100% sure on this one yet. Still testing.

Right, module IDs should be IDs, not something that looks like an URL
or a file name with an extension. Examples of failure cases here where
the '.js' is useful to put in the module ID would be good to know.

> Potentially all of these issues can be hacked around by putting extra stuff
> in the global scope, looking there for known AMD loader locations, and
> testing to see if an object at those global identifiers matches expectations
> for being the AMD loader global, but of course this ties implementations to
> only the AMD loaders that the implementer knew existed at the time or
> iterating and trying to infer what global might contain an AMD loader. It
> also violates some of the major tenets of authoring AMD modules, like
> staying out of the global object. As a result, I would really like to see
> some more definition around these areas so such hacks would not be needed.

I'm open to making changes in requirejs, but I'm not sure what those would
be right now, given the above. These are trickier issues, though; they have
not had much discussion, given that they are outside the basic "define
a module" use case.

James

John Hann

Sep 23, 2013, 9:15:57 PM
to amd-im...@googlegroups.com
On Mon, Sep 23, 2013 at 5:10 PM, James Burke <jrb...@gmail.com> wrote:
> On Fri, Sep 20, 2013 at 4:23 PM, C Snover <goo...@zetafleet.com> wrote:
>> 1. The global loader object (`require` or `requirejs` or `curl` or something
>> else) is the only place where several loaders place their configuration
>> mechanisms (though consistently at `loader.config`). [snip]
>
> The best we came to on this is summarized here:
>
> https://github.com/amdjs/amdjs-api/wiki/require#local-vs-global-require-
>
> "There is often an implementation-dependent API that will kick off
> module loading; if interoperability with several loaders is needed,
> the global require() should be used to load the top level modules
> instead."
>
> It is up to individual loaders to opt in for that though.

Most loaders, curl.js included, have a mechanism to remap their global APIs. lodash.js takes a different approach: https://github.com/lodash/lodash/blob/master/test/test-ui.js
 

>> 2. Code executed through `vm.runInThisContext` in Node.js has no reference
>> to a Node.js `require` method, which makes it impossible to load Node.js
>> modules from within the AMD environment. Curl works around this by not using
>> `vm` at all and executing scripts with `eval` (so code has access to the
>> `require` for the `curl` module); RequireJS and Dojo put the require method
>> at `require.nodeRequire` (where `require` is the AMD loader object). In all
>> cases the `require` method is the one from the loader script. Dojo exposes
>> this through the local require; RequireJS does not. RequireJS will attempt
>> to use Node.js `require` automatically to load a module if an AMD load
>> fails; Dojo will not. Being able to reliably access the underlying module
>> system in an identifiable way is important for interoperability on SSJS
>> platforms.
>
> I am obviously biased and prefer the route requirejs took. I don't
> think local require should have knowledge of node's require, but
> rather local require should just express the need for a module given a
> module ID, and then it is up to the loader to do any bridging with
> Node.

I also agree with this: "I don't think local require should have knowledge of node's require".  Wouldn't it be very confusing to mix AMD and node_modules lookup rules?  

 

> Also, when using an AMD loader, I feel hiding the node require is
> appropriate since it is inferior for network-based front end work, and
> not what the module would see if running in the browser.
>
> Probably good to have other implementers speak to this one though.

Whenever we need to use both `require` functions, we explicitly capture both using special boilerplate.  (The `eval` code you mention is only used when you explicitly configure curl to use the curl/loader/cjsm11 module to load code.)
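
The boilerplate looks roughly like this (a sketch, not our exact code; the `nodeRequire` hooks are the spots Colin listed above, so other loaders may need a different lookup):

```js
define(['require'], function (require) {
	// Keep the AMD-local require for module IDs, and reach node's require
	// only through an explicitly captured hook. Per Colin's list, Dojo hangs
	// `nodeRequire` off the local require and RequireJS off the global
	// loader object, so check both.
	var loaderGlobal = typeof requirejs !== 'undefined' ? requirejs : require;
	var nodeRequire = require.nodeRequire || loaderGlobal.nodeRequire || null;

	return {
		readFile: function (path, callback) {
			if (!nodeRequire) {
				throw new Error('not running under node');
			}
			nodeRequire('fs').readFile(path, 'utf8', callback);
		}
	};
});
```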

 

>> 3. Not all loaders that support Node.js return their loader function when
>> loaded into Node.js. Specifically, curl.js exposes itself on `global.curl`
>> but does not set `module.exports = curl`. (I would probably consider this an
>> implementation bug, but it was something I ran into that seemed worth
>> mentioning.)
>
> It seems like if require('loader-name') is done using Node's module
> system, it would make sense to return the loader from that call.

This has been fixed in curl.js recently.

 

>> 4. There seems to maybe be some lack of clarity with regards to how
>> `packages` mapping is supposed to work; it looks like maybe RequireJS does
>> not apply packages’ `location` if a “module ID” ends in .js, whereas the
>> Dojo loader does? Not 100% sure on this one yet. Still testing.

Are you saying the config looks something like this?

`{ name: 'package.js', location: 'foo/bar/package' }`

This looks like user error to me.  If curl.js is forgiving of this, it's by accident. :)
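
For reference, here is the mapping I would expect from a well-formed packages entry (names made up; any loader implementing Common Config packages should behave the same way):

```js
// Made-up names, loader-agnostic: the package `name` is a module ID prefix,
// and `location` is a path under baseUrl that replaces that prefix.
var config = {
	baseUrl: '.',
	packages: [
		{ name: 'foo', location: 'bar' }
	]
};

// After applying `config`, I would expect:
//   'foo/tests/thing'  ->  <baseUrl>/bar/tests/thing.js
//   'foo'              ->  <baseUrl>/bar/main.js  (assuming the default main module)
```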

 

>> Potentially all of these issues can be hacked around by putting extra stuff
>> in the global scope, looking there for known AMD loader locations, and
>> testing to see if an object at those global identifiers matches expectations
>> for being the AMD loader global, but of course this ties implementations to
>> only the AMD loaders that the implementer knew existed at the time or
>> iterating and trying to infer what global might contain an AMD loader. It
>> also violates some of the major tenets of authoring AMD modules, like
>> staying out of the global object. As a result, I would really like to see
>> some more definition around these areas so such hacks would not be needed.

I'm having trouble envisioning what hacks would be in users' modules.  Seems like the inconsistencies could be resolved in the app's bootstrap code, no?

Just pointing again to the lodash repo as a way to handle these things in the bootstrap code: https://github.com/lodash/lodash/blob/master/test/test-ui.js

Regards,

-- John

Colin Snover

Sep 24, 2013, 12:17:59 AM
to amd-im...@googlegroups.com
On 2013-09-23 19:10, James Burke wrote:
> On Fri, Sep 20, 2013 at 4:23 PM, C Snover <goo...@zetafleet.com> wrote:
>> 1. The global loader object (`require` or `requirejs` or `curl` or something
>> else) is the only place where several loaders place their configuration
>> mechanisms (though consistently at `loader.config`). Local `require` does
>> not expose the configuration mechanism in most loaders, and the AMD spec
>> says that local `require` is not intended to support passing a configuration
>> object as the first argument. This means that without knowing in advance
>> what the global loader might be it is not possible to reconfigure at runtime
>> from within an AMD module, which Intern needs to be able to do. Since most
>> loaders support the necessary Common Config options, this is an annoying
>> omission that really hinders interoperability.
> The best we came to on this is summarized here:
>
> https://github.com/amdjs/amdjs-api/wiki/require#local-vs-global-require-
>
> "There is often an implementation-dependent API that will kick off
> module loading; if interoperability with several loaders is needed,
> the global require() should be used to load the top level modules
> instead."
>
> It is up to individual loaders to opt in for that though.

So, AMD loaders that do not expose themselves at `require` can’t be
kicked off or configured without hard-coding an application to know
where else to look for the loader to expose itself.

…well, they can be kicked off in Node.js, because Node.js has a module
system to retrieve the loader module (except for the current release
version of curl).

I don’t know. Having to hard-code this part is just frustrating. It is
normally OK for application authors since they decide which loader they
are using and the entry point is a minor part of an application. Maybe
this is just something too specific to the sort of generic approach that
a testing system like Intern has to take.

John Hann wrote:
> lodash.js takes a different
> approach: https://github.com/lodash/lodash/blob/master/test/test-ui.js

Yes. Having to provide support for every AMD loader explicitly like this
is what I am hoping to avoid… but maybe it is impossible.

>> 2. Code executed through `vm.runInThisContext` in Node.js has no reference
>> to a Node.js `require` method, which makes it impossible to load Node.js
>> modules from within the AMD environment. Curl works around this by not using
>> `vm` at all and executing scripts with `eval` (so code has access to the
>> `require` for the `curl` module); RequireJS and Dojo put the require method
>> at `require.nodeRequire` (where `require` is the AMD loader object). In all
>> cases the `require` method is the one from the loader script. Dojo exposes
>> this through the local require; RequireJS does not. RequireJS will attempt
>> to use Node.js `require` automatically to load a module if an AMD load
>> fails; Dojo will not. Being able to reliably access the underlying module
>> system in an identifiable way is important for interoperability on SSJS
>> platforms.
> I am obviously biased and prefer the route requirejs took. I don't
> think local require should have knowledge of node's require, but
> rather local require should just express the need for a module given a
> module ID, and then it is up to the loader to do any bridging with
> Node.
>
> Also, when using an AMD loader, I feel hiding the node require is
> appropriate since it is inferior for network-based front end work, and
> not what the module would see if running in the browser.
>
> Probably good to have other implementers speak to this one though.

The Node.js module loader has behavioural differences versus how AMD
loaders must work by necessity, so trying to transparently call Node.js
through the loader feels wrong to me and I think leads to end-user
confusion. Most pertinently:

1. Node.js walks the filesystem; AMD does not
2. AMD can remap modules; Node.js does not
3. Dependencies in a Node.js module will not resolve the same as
dependencies in an AMD module because of #2, except UMD can sometimes
interfere and then things get extra crazy unless you remove `define`
from scope before trying to use Node’s `require` to load a module
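
Concretely, that last case ends up needing a dance like this (a sketch, Node.js only):

```js
// Sketch: temporarily hide the global `define` so a UMD-wrapped dependency
// takes its CommonJS branch when loaded through Node's require.
function requireWithoutAmd(nodeRequire, id) {
	var hadDefine = 'define' in global;
	var originalDefine = global.define;

	delete global.define;
	try {
		return nodeRequire(id);
	}
	finally {
		if (hadDefine) {
			global.define = originalDefine;
		}
	}
}
```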

But anyway, the point is that a loader that tries to transparently
bridge the underlying module loader doesn’t actually follow the rules of
AMD part of the time, which makes the mental model of AMD more complex
than it should be.

(Also, because an AMD loader cannot be effectively modularized without
gross build hackery, and file size is a major consideration, I
personally prefer to make sure that the loader can farm out to loader
plugins as much functionality as possible.)
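
For what it’s worth, the plugin contract keeps that cheap; a plugin needs nothing more than this skeleton:

```js
// Minimal loader plugin skeleton: everything interesting lives in load(),
// which receives the parent module's local require, so the feature stays
// out of the core loader entirely.
define({
	load: function (resourceId, parentRequire, onLoad, config) {
		parentRequire([ resourceId ], function (module) {
			// ...whatever the plugin exists to do happens here...
			onLoad(module);
		});
	}
});
```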

>> 4. There seems to maybe be some lack of clarity with regards to how
>> `packages` mapping is supposed to work; it looks like maybe RequireJS does
>> not apply packages’ `location` if a “module ID” ends in .js, whereas the
>> Dojo loader does? Not 100% sure on this one yet. Still testing.
> Right, module IDs should be IDs, not something that looks like an URL
> or a file name with an extension. Examples of failure cases here where
> the '.js' is useful to put in the module ID would be good to know.

The use case for this is to provide visual differentiation between
loading non-AMD *scripts* and AMD modules using the AMD loader.

Specifically, Intern has an `order` module that does what it sounds like
(makes modules load serially in order). The use signature is
`intern/order!../non-amd-module.js`. Supposing the caller module is at
`foo/tests/non-amd-module`, `foo` is a package defined as `{ name:
'foo', location: 'bar' }`, RequireJS will try to load
`foo/non-amd-module.js` instead of `baseUrl/bar/non-amd-module.js`. Dojo
will load the latter.

This could again probably be worked around with a custom normalize
implementation but it would be nice for normalization to be specified to
work consistently.
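
The workaround would be something like this in the plugin, forcing the ID through ordinary module ID resolution despite the extension; it works, but every plugin author should not have to do it:

```js
// Sketch: treat the resource as a module ID for resolution purposes even
// though it carries a '.js' suffix, so every loader resolves it identically.
define({
	normalize: function (resourceId, normalize) {
		var hasExtension = /\.js$/.test(resourceId);
		var id = hasExtension ? resourceId.slice(0, -3) : resourceId;

		// normalize() resolves '.'/'..' segments against the referencing module
		id = normalize(id);

		return hasExtension ? id + '.js' : id;
	}
	// load() elided; the real plugin handles the serialized loading
});
```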

Regards,

--
Colin Snover
http://zetafleet.com

James Burke

Oct 13, 2013, 12:25:22 AM
to amd-im...@googlegroups.com
> […] unless you remove `define` from scope before trying to use Node’s `require` to load a module
>
> But anyway, the point is that a loader that tries to transparently
> bridge the underlying module loader doesn’t actually follow the rules of
> AMD part of the time, which makes the mental model of AMD more complex
> than it should be.
>
> (Also, because an AMD loader cannot be effectively modularized without
> gross build hackery, and file size is a major consideration, I
> personally prefer to make sure that the loader can farm out to loader
> plugins as much functionality as possible.)


I am not sure what the reasonable path is here. For one, an exterior loader cannot reliably recreate the per-module node `require` that node would otherwise create for each module loaded by the AMD loader -- that node require is constructed internally by node's module system.

So if a node require is exposed to all those modules, it will be, as you have found, the top level require for the module that started module loading. If this is used for all modules, the path lookup logic in node will be wrong for those other modules.

Even if a node require specific to the module could be created, I am not sure how that all works out if the AMD loader wants to actually get the export.

In short, it sucks having two module systems, and it sucks that node is so inwardly focused as to not care about having a robust enough module system that also works in the browser. Let's hope ES modules give us something better.

 

>>> 4. There seems to maybe be some lack of clarity with regards to how
>>> `packages` mapping is supposed to work; it looks like maybe RequireJS does
>>> not apply packages’ `location` if a “module ID” ends in .js, whereas the
>>> Dojo loader does? Not 100% sure on this one yet. Still testing.
>> Right, module IDs should be IDs, not something that looks like an URL
>> or a file name with an extension. Examples of failure cases here where
>> the '.js' is useful to put in the module ID would be good to know.
>
> The use case for this is to provide visual differentiation between
> loading non-AMD *scripts* and AMD modules using the AMD loader.
>
> Specifically, Intern has an `order` module that does what it sounds like
> (makes modules load serially in order). The use signature is
> `intern/order!../non-amd-module.js`. Supposing the caller module is at
> `foo/tests/non-amd-module`, `foo` is a package defined as `{ name:
> 'foo', location: 'bar' }`, RequireJS will try to load
> `foo/non-amd-module.js` instead of `baseUrl/bar/non-amd-module.js`. Dojo
> will load the latter.
>
> This could again probably be worked around with a custom normalize
> implementation but it would be nice for normalization to be specified to
> work consistently.


Ah, so the '.js' handling in requirejs was also reserved to indicate "not a module ID, but just an URL path". So Intern may be using it to indicate "contains a define()'d module", but for requirejs, it means "just an URL" that does not get all the module ID resolution logic.

I would just stick to using module ID names though. In an alternate universe, I probably would have rethought that ".js means URL, not module ID" choice in requirejs, and just always treated it as a full module ID (which means also adding a '.js' to the end of it for the final path -- this seems likely to be the default in ES modules too). Unfortunately it is not easy for me to change it now, as some people rely on it. If I did though, it means the way you use it now would not work anyway. So this may not be the answer you are looking for, but it would be a straightforward answer that I can see other loaders also supporting.

BTW, the ".js is an URL"  is useful for some remote, one-off cases, like third party ad CDN resources that need to be referenced just once. Perhaps for a requirejs 3, I would remove that and just ask people to insert paths configs for those one-off cases.
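
For example (URL made up), the paths route for a one-off resource would be:

```js
// Give the one-off remote script a real module ID via paths config instead
// of using a '.js' URL inline. The loader appends '.js' to the resolved path.
require.config({
	paths: {
		'partner-ads': 'https://cdn.example.com/partner/ads.min'
	}
});

require(['partner-ads'], function (ads) {
	// if the script is not an AMD module, a shim config is needed to get an export
});
```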

James
