Feature: Moving ClojureScript Compiler to JSModules

Thomas Heller

Jun 15, 2013, 7:05:10 AM
to cloju...@googlegroups.com
Hey,

I'd like to propose a new feature for the ClojureScript compiler. It requires a little restructuring, possibly some new compiler options, and might deprecate some current ones. Work started here but I'll try to outline the intent.

Purpose:

Support splitting the optimized CLJS JavaScript output into multiple "modules" (separate .js files) instead of one big monolithic JavaScript file.

Motivation:

This is aimed at browser JavaScript (not node.js), where it's generally desirable to keep file sizes small. In a "big" client-side app it's usually not necessary to serve every bit of JavaScript upfront, since the user may never visit certain areas of the site and so never needs those scripts. Since Advanced Compilation requires everything at compile time, it's very unlikely you could split the output manually without the support of the Closure Compiler.

Feature Description:

Support a :modules vector in the compiler options, for example:

{:optimizations :advanced
 :output-dir "target/cljs"
 :module-dir "public/assets/js"
 :modules [{:id :sub1 :include ["myapp.sub1"]}
           {:id :admin :include ["myapp.admin"]}
           {:id :sub2 :include ["myapp.sub2"
                                "myapp.something-else"]}
           {:id :sub3 :include ["myapp.sub3"]}]}

At cljs.closure/optimize time all sources are "moved" into modules by walking the :modules vector in reverse order and prefix-matching each file's :provides entries against the :include strings. If a match is found, the file is moved into that module; if no match is found, it is moved into the "default" module. The dependency graph is also analyzed, and files that are only used by one module can be moved there as well. (For example, if clojure.set is only used in myapp.admin.something, it can be moved into the :admin module.) Modules can also depend on other modules, which should be analyzed and handled accordingly. (For example, :sub3 may depend on :sub2 and :admin may depend on :sub1.) All modules always depend on the :default module, which will probably always include cljs.core.
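
A minimal sketch of the prefix-matching step described above, not the code from the proof-of-concept commit. The function names module-for and assign-modules are made up for illustration, and plain string prefixes are assumed (so "myapp.sub1" would also match a namespace like "myapp.sub10"; a real implementation might want to match on a trailing dot).

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: pick the module for one goog.provide name.
;; Walks :modules in reverse order and takes the first module whose
;; :include contains a prefix of the name; falls back to :default.
(defn module-for [modules provide]
  (or (some (fn [{:keys [id include]}]
              (when (some #(str/starts-with? provide %) include)
                id))
            (reverse modules))
      :default))

;; Hypothetical helper: group a list of provided namespaces by module id.
(defn assign-modules [modules provides]
  (group-by #(module-for modules %) provides))

(def example-modules
  [{:id :sub1  :include ["myapp.sub1"]}
   {:id :admin :include ["myapp.admin"]}
   {:id :sub2  :include ["myapp.sub2" "myapp.something-else"]}
   {:id :sub3  :include ["myapp.sub3"]}])

(assign-modules example-modules
                ["cljs.core" "myapp.admin.users" "myapp.sub2.views"])
;; => {:default ["cljs.core"], :admin ["myapp.admin.users"], :sub2 ["myapp.sub2.views"]}
```

This covers only the explicit :include matching; moving files that are used by a single module requires the dependency graph and is not shown here.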

If no :modules are defined, everything ends up in the :default module, which basically mirrors the current "one file" build. So it's fully backwards compatible, although it would probably be advisable to change some of the compiler options: :output-to targets one file only and has no meaning once :modules is defined (since at that point at least 2 files will be generated). I'd also add a :module-dir option since it's not the same as :output-dir. Output files should be named <module-dir>/<module-name>.js

So in my example we'd end up with:

public/assets/js/default.js
public/assets/js/sub1.js
public/assets/js/admin.js
...

Overriding the :default name should be possible I guess.
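
The naming rule can be sketched as a small function. This is only an illustration: module-file is a made-up name, and :default-module-name is a hypothetical option standing in for "overriding the :default name", which the proposal does not specify.

```clojure
;; Hypothetical helper: compute the output path for one module.
;; :module-dir comes from the proposal; :default-module-name is an
;; assumed option name used only for this sketch.
(defn module-file [{:keys [module-dir default-module-name]} module-id]
  (let [file-name (if (and (= module-id :default) default-module-name)
                    default-module-name
                    (name module-id))]
    (str module-dir "/" file-name ".js")))

(module-file {:module-dir "public/assets/js"} :admin)
;; => "public/assets/js/admin.js"

(module-file {:module-dir "public/assets/js"
              :default-module-name "core"} :default)
;; => "public/assets/js/core.js"
```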

Problems:

I have a "proof-of-concept" commit which splits the output into modules but does not yet properly analyze the dependencies: every module has to name the modules it depends on, and only the explicitly specified includes are moved.

Source maps are most likely also broken. I have not yet looked into how they work, but the Closure Compiler only provides ONE source map, regardless of how many JSModules are defined. So I don't know how that works exactly.

:output-wrapper is also targeted at one file only, so I don't think it would work with modules.

Generally, node.js-targeted builds probably wouldn't understand Closure modules. But since there is no need to split files in a node.js environment, it's probably safe to either disable modules when {:target :nodejs} is set or warn accordingly.
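
One way the "warn accordingly" idea could look, as a sketch only. The function name module-option-warnings and the message texts are assumptions; the option keys are the ones discussed in this post (:modules, :target, :output-to, :output-wrapper).

```clojure
;; Hypothetical helper: collect warnings for options that assume a
;; single output file. Returns nil when :modules is not used, otherwise
;; a (possibly empty) vector of warning strings.
(defn module-option-warnings
  [{:keys [modules target output-to output-wrapper]}]
  (when (seq modules)
    (cond-> []
      (= target :nodejs) (conj ":modules has no effect with :target :nodejs")
      output-to          (conj ":output-to is ignored when :modules is set")
      output-wrapper     (conj ":output-wrapper assumes a single output file"))))

(module-option-warnings {:modules [{:id :admin :include ["myapp.admin"]}]
                         :target :nodejs
                         :output-to "out/main.js"})
;; => [":modules has no effect with :target :nodejs"
;;     ":output-to is ignored when :modules is set"]
```

Returning data instead of printing leaves it to the build tool to decide whether a given warning should be fatal.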

Issues:

Herwig Hochleitner suggested using symbols instead of strings for :include. I opted for strings since the underlying data refers to goog.provide namespaces, not Clojure namespaces. But symbols would work too, I guess.

Work to be done:

I'd be happy to clean up my proof-of-concept commit and fix the issues (use the dependency graph). I could use some help with the source maps since I have not used them yet and don't know the internals.


If there are no objections to this I'd be happy to start work and open a proper Jira Issue.

Regards,
/thomas

David Nolen

Jun 15, 2013, 1:19:19 PM
to cloju...@googlegroups.com
You should probably make a design page for this. One word of advice - closure.clj is now very complicated - at least as long as analyzer.clj or compiler.clj. Given that, we should do as much as possible to avoid adding any complexity. The likelihood of us adopting any changes to support Closure modules will depend on the size and number of changes required - the fewer, the better. If anything, all we should do is expose knobs so that external build tools like lein-cljsbuild can implement this functionality *themselves*.


--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure-dev...@googlegroups.com.
To post to this group, send email to cloju...@googlegroups.com.
Visit this group at http://groups.google.com/group/clojure-dev.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Thomas Heller

Jun 16, 2013, 4:54:11 AM
to cloju...@googlegroups.com
Hey,

I wanted to start this discussion here since I want to get this done ASAP, even if I have to maintain a fork for a while. I understand that a proper design process is required for mainline integration. It solves a lot of issues I have right now with a rather large cljs build, and I expect it will affect everyone trying to build compact cljs apps. I'm very concerned with page speed, and I can more than halve the initial load time using modules.

The changes required are minimal since (almost) only cljs.closure/optimize is affected. The changes to cljs.closure/build are straightforward and just need to make it aware that an optimized build may contain more than one output file, which requires about 5 lines of code.

The "hardest" part is analyzing the dependency graph and sorting files into modules which then may depend on other modules. I have a hacky version which works for my test-case but breaks pretty easily so I need to tune that a bit. You could probably solve this in a few lines of core.logic. ;)

I tried using lein-cljsbuild for this but it's pretty much impossible, since you need to change the Closure Compiler call from .compile to .compileModules, which takes different inputs.

I'll try and move as much functionality as I can into ns cljs.closure.modules so cljs.closure doesn't get any more complicated. The dependency analysis is currently supported by https://github.com/stuartsierra/dependency but I'm not sure if it's OK to introduce another dependency, so that might need to be inlined when it's done. The analysis can also be done with pure functions since all the information required is already available through cljs.closure/-compile and cljs.closure/add-dependencies. Once I have a fully working prototype I'll provide a patch so we can discuss this with code at hand.

I assume introducing leiningen to the project is out of the question? :P I started writing some tests for the functionality I'm working on and my workflow involves "lein test-loop" so I created a basic project.clj, but that doesn't have to be in the final commit.

Regards,
/thomas

Chris Granger

Jun 16, 2013, 9:25:32 AM
to cloju...@googlegroups.com
FWIW, something like this is definitely needed for Light Table too. For plugins there needs to be a way to produce much more granular output that won't require trying to excise common dependencies or rebuilding the root file. I haven't started looking into it yet, so I'm excited to see what you've got.

Cheers,
Chris.

Thomas Heller

Jun 16, 2013, 3:16:08 PM
to cloju...@googlegroups.com
Hey,

so I finally got it to work, but it got way more complicated than I wanted it to. I'm pretty sure it can be dramatically simplified and I'm just too stupid to see it. There were a few gotchas I had to work around but my codebase compiles without hiccups now (well no source-maps, but those never worked in my project anyways) and I get even better results than expected.


It's unfinished since some of the backwards compatibility is broken (:output-to is gone, :module-dir is required, output to stdout won't work), but I'm happy to help if anyone else wants to give this a try. I'll try and clean it up over the next week so that all projects without :modules compile as usual.


@chris: I'm not sure if this is applicable to Light Table. This is for the optimization phase of the compiler, which requires that all source files are present, and that won't be the case for LT plugins. I guess you'd need to create an extern.js which describes the plugin API so others can link against it without having the source present.

Regards,
/thomas

Thomas Heller

Jun 18, 2013, 11:24:54 AM
to cloju...@googlegroups.com
Hey,

small Update: I decided to write a custom build tool instead of trying to integrate this into clojurescript itself. Keeping everything backwards compatible would hurt the design and I also need some finer grained access than cljs.closure/build currently provides.

While I still believe every browser-targeted CLJS project should use modules at some point (as an optimization), not everyone will require the other features I have in mind. 

Sorry for the noise ;)

Cheers,
/thomas

PS: the current solution is way too slow (25sec build time on my project) since the algorithm used to find dependencies uses recursion and doesn't remember already visited nodes.
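
For reference, the usual fix for this kind of slowdown is to track visited nodes so each namespace is expanded only once. The following is a generic sketch under that assumption, not the code from my build: transitive-deps is a made-up name, and the graph is assumed to be a plain map from namespace name to the set of namespaces it requires.

```clojure
;; Hypothetical helper: all namespaces reachable from `start`.
;; Uses an explicit work stack plus a visited set, so every node is
;; expanded at most once even when many paths lead to it.
(defn transitive-deps [graph start]
  (loop [pending (vec (get graph start))
         visited #{}]
    (if-let [ns-name (peek pending)]
      (if (contains? visited ns-name)
        (recur (pop pending) visited)
        (recur (into (pop pending) (get graph ns-name))
               (conj visited ns-name)))
      visited)))

(def example-graph
  {"myapp.admin" #{"clojure.set" "cljs.core"}
   "clojure.set" #{"cljs.core"}
   "cljs.core"   #{}})

(transitive-deps example-graph "myapp.admin")
;; => #{"clojure.set" "cljs.core"}
```

Because visited nodes are skipped, the loop also terminates on graphs that contain cycles.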

Chris Granger

Jun 18, 2013, 11:48:32 AM
to cloju...@googlegroups.com
Is that work by chance on github somewhere? I'm curious to see what you did.

Cheers,
Chris.

Baishampayan Ghose

Jun 18, 2013, 12:00:21 PM
to cloju...@googlegroups.com

Thomas Heller

Jun 18, 2013, 12:36:41 PM
to cloju...@googlegroups.com
Hey,

It's at https://github.com/thheller/clojurescript/commit/0209ef0b4f333fc2bba1ee821d313724baf5682f

Basically, I construct a dependency graph of all goog.provide/goog.require namespaces, extract a subgraph for every module, and build a dependency graph for the modules. Each module is then written to a separate .js file and a manifest.json is created. This file includes basic metadata (which namespace is included in which file), which I intended to use for a "smart" resource loader.
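
A rough sketch of the last two steps, simplified and not the code from the commit linked above. It assumes the namespace graph is a map from namespace to the set of namespaces it requires, and that every namespace has already been assigned a module id. The names module-deps and manifest-data are made up, and the manifest shape shown (file name to list of namespaces) is only a guess at something useful; the real manifest.json layout is not described here.

```clojure
;; Hypothetical helper: derive module -> set of modules it depends on
;; from the namespace-level graph and the namespace -> module mapping.
(defn module-deps [ns-graph ns->module]
  (reduce-kv
    (fn [acc ns-name requires]
      (let [from (ns->module ns-name)
            tos  (->> requires (map ns->module) (remove #{from}))]
        (update acc from (fnil into #{}) tos)))
    {}
    ns-graph))

;; Hypothetical helper: data for a manifest, file name -> namespaces.
(defn manifest-data [ns->module]
  (reduce-kv
    (fn [acc ns-name module-id]
      (update acc (str (name module-id) ".js") (fnil conj []) ns-name))
    {}
    ns->module))

(def example-ns-graph
  {"cljs.core"        #{}
   "myapp.sub2.views" #{"cljs.core"}
   "myapp.sub3.page"  #{"myapp.sub2.views" "cljs.core"}})

(def example-ns->module
  {"cljs.core"        :default
   "myapp.sub2.views" :sub2
   "myapp.sub3.page"  :sub3})

(module-deps example-ns-graph example-ns->module)
;; => {:default #{}, :sub2 #{:default}, :sub3 #{:sub2 :default}}
```

Serializing the manifest data to JSON is left out so the sketch stays free of extra dependencies.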
 
I will open source everything once I have optimized the dependency resolution and worked out the kinks.

Cheers,
/thomas


