RFC: sharded linking

Lex Spoon

Feb 9, 2010, 5:31:16 PM
to GWTcontrib, Ray Cromwell
This is a design doc about speeding up the link phase of GWT.  If you don't maintain a linker, and if you don't have a multi-machine GWT build, then none of this should matter to you.  If you do maintain a linker, let's make sure your linker can be updated with the proposed changes.  If you do have a multi-machine build, or if you have some ideas about them, then perhaps you can help us get the best speed benefit possible out of this.

I want to speed up linking for multi-machine builds in two ways:

1. Allow more parts of linking to run in parallel.  In particular, anything that happens once per permutation and does not need information from other permutations can run in parallel.  As an example, the iframe linker chunks the JavaScript of each permutation into multiple <script> tags.  That work can happen in parallel once the linker API supports it.

2. Link does a lot of Java serialization for its artifacts, but the majority of the artifacts in a compile are emitted artifacts that have no structure.  They are just a named bag of bits, from the compiler's perspective.  It would help if such artifacts did not need a round of Java serialization on the Link node and could instead be bulk copied.
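As a rough illustration of point 1, per-permutation work that needs nothing from other permutations can simply be farmed out to a pool. The sketch below is not the iframe linker's code: the class name, the fixed-size character chunking, and the pool size are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: chunk each permutation's JavaScript independently and in parallel. */
final class ParallelChunkingSketch {
  /** Splits one permutation's JavaScript into pieces of at most chunkSize characters. */
  static List<String> chunk(String js, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    List<String> chunks = new ArrayList<>();
    for (int start = 0; start < js.length(); start += chunkSize) {
      chunks.add(js.substring(start, Math.min(js.length(), start + chunkSize)));
    }
    return chunks;
  }

  /** Chunks every permutation on a thread pool; results come back in permutation order. */
  static List<List<String>> chunkAll(List<String> permutationJs, int chunkSize) {
    ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary here
    try {
      List<Callable<List<String>>> tasks = new ArrayList<>();
      for (String js : permutationJs) {
        tasks.add(() -> chunk(js, chunkSize));
      }
      List<List<String>> results = new ArrayList<>();
      // invokeAll returns futures in the same order as the submitted tasks.
      for (Future<List<String>> future : pool.invokeAll(tasks)) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("chunking failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

In a multi-machine build each task would run on a different CompilePerms node rather than a local thread, but the independence between permutations is the same.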


=== Transition ===

The compiler will support two compilation modes: maximal sharding and simulated sharding.  Maximal sharding is used when all linkers support it and the Precompile/CompilePerms/Link entry points are used.  Simulated sharding is used either when some linker can't shard or when the Compiler entry point is used.

Linkers individually indicate whether they implement the sharding or non-sharding API. This allows linkers to be updated one by one and to leave the non-sharding API behind once they do. It does not cause trouble with other linkers, because in practice linkers are highly independent.  I've looked at as many linkers as I could find to verify this.  Occasionally one linker depends on another; in such a case they'll have to be updated in tandem, but the need for that should be rare.

By default, a linker is assumed to want the legacy non-sharding API. For such linkers, it isn't safe to assume that its generators or its associated artifacts can be serialized and then deserialized on a different computer.

The non-sharding API will be deprecated.  After the sharding API has been out for one GWT release cycle, support for non-shardable linkers will be dropped.
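The rule for choosing between the two modes in this section can be sketched as a small function. This is only an illustration of the rule as stated above; LinkerInfo, ShardingMode, and ShardingModeChooser are hypothetical names, not compiler classes.

```java
import java.util.List;

/** Hypothetical stand-in for a linker: just a name and whether it opted in to sharding. */
final class LinkerInfo {
  final String name;
  final boolean shardable;

  LinkerInfo(String name, boolean shardable) {
    this.name = name;
    this.shardable = shardable;
  }
}

enum ShardingMode { MAXIMAL, SIMULATED }

final class ShardingModeChooser {
  /**
   * Maximal sharding needs every linker to be shardable and the split
   * Precompile/CompilePerms/Link entry points; anything else is simulated.
   */
  static ShardingMode choose(List<LinkerInfo> linkers, boolean splitEntryPoints) {
    boolean allShardable = true;
    for (LinkerInfo linker : linkers) {
      allShardable &= linker.shardable;
    }
    return (allShardable && splitEntryPoints) ? ShardingMode.MAXIMAL : ShardingMode.SIMULATED;
  }
}
```

A single legacy linker anywhere in the module is enough to push the whole compile into simulated sharding.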


=== Maximal sharding ===

Currently, Precompile parses Java into ASTs and runs generators. CompilePerms then runs one copy for each permutation, in parallel. Each instance optimizes the AST for one permutation and then converts it into JavaScript plus some additional artifacts. Finally, Link takes the JavaScript and all the produced artifacts, runs the individual linkers, and produces the final output. In summary, the three stages are:

current Precompile:

  • parse Java and run generators
  • output: number of permutations, AST, generated artifacts

current CompilePerms:

  • input: permutation id, AST
  • compile one permutation to JavaScript
  • output: JavaScript, generated artifacts

current Link:

  • input: JavaScript from all permutations, generated artifacts
  • run linkers on all artifacts
  • emit EmittedArtifacts into the final output

With maximal sharding, Precompile does no work except to count the number of permutations. Each CompilePerms instance parses Java into ASTs, runs generators, and optimizes for a specific permutation. Each CompilePerms instance also runs the shardable part of linkers on the results for that permutation. It then "thins" the artifacts (see below) and emits them. Finally, Link takes these results from the CompilePerms instances, runs the final, non-shardable part of each linker, and emits all the artifacts designated as emitted artifacts.  In summary, the maximal-sharding staging looks like this:

new Precompile:

  • output: number of permutations

new CompilePerms:

  • input: permutation id
  • compile one permutation to JavaScript, including running generators
  • run the on-shard part of linkers
  • thin down the resulting artifacts, as defined below
  • output: JavaScript and the thinned down set of artifacts

new Link:

  • input: JavaScript and transferable artifacts from each permutation
  • run the final part of linkers, which can add more files to the final output
  • output: resulting emitted artifacts


=== Simulated Sharding ===

Simulated sharding uses the in-trunk compiler staging, but runs the linkers as much as possible as if they were using the maximal sharding staging. The sequence is the same whether the Compiler entry point is used or the Precompile/CompilePerms/Link trio of entry points is used. Under simulated sharding, the Precompile and CompilePerms steps run exactly as in trunk. The Link stage, however, runs the linkers in a careful order so as to use the sharded API for those linkers that have been updated:

  • For each compiled permutation, run the on-shard part of all shardable linkers. For each permutation, start with a fresh set of artifacts so that the linkers don't see each other's output.
  • Combine all of the resulting artifacts.
  • Run the non-shardable linkers on those artifacts.
  • Thin the artifacts, as defined below
  • Run the final part of all shardable linkers.
  • Emit the "output" and "extra" files.
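The ordering in the list above can be written down as a small planner that only records which step would run when. It is a sketch under simplifying assumptions: linkers are reduced to a name and a flag, no artifacts are actually processed, and PlannedLinker and SimulatedLinkPlan are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a linker; a real linker would transform artifacts. */
final class PlannedLinker {
  final String name;
  final boolean shardable;

  PlannedLinker(String name, boolean shardable) {
    this.name = name;
    this.shardable = shardable;
  }
}

final class SimulatedLinkPlan {
  /** Returns, in order, the steps the Link stage would perform under simulated sharding. */
  static List<String> plan(List<PlannedLinker> linkers, int permutationCount) {
    List<String> steps = new ArrayList<>();
    // On-shard part of the shardable linkers, once per permutation, on a fresh artifact set.
    for (int perm = 0; perm < permutationCount; perm++) {
      steps.add("fresh artifact set for permutation " + perm);
      for (PlannedLinker linker : linkers) {
        if (linker.shardable) {
          steps.add("on-shard " + linker.name + " for permutation " + perm);
        }
      }
    }
    steps.add("combine artifacts");
    // Legacy linkers see the combined artifacts before thinning.
    for (PlannedLinker linker : linkers) {
      if (!linker.shardable) {
        steps.add("legacy link " + linker.name);
      }
    }
    steps.add("thin artifacts");
    for (PlannedLinker linker : linkers) {
      if (linker.shardable) {
        steps.add("final link " + linker.name);
      }
    }
    steps.add("emit output and extra files");
    return steps;
  }
}
```

The plan makes one consequence visible: a legacy linker runs only after the on-shard part of every shardable linker has already run.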


=== Development mode ===

Development mode does not generate any compiled permutations. Thus, it does not run the per-permutation part of linkers. It does, however, need to run the final-link part of linkers. It should do this just after the places it calls link() or relink().


=== Detailed API changes ===

  • Linkers that are updated to be shardable are annotated with a new annotation @Shardable
  • The Linker.link() method has a new boolean parameter, indicating whether it is running on a shard or on the final node.
  • BinaryEmittedArtifact is added as a final subclass of EmittedArtifact, indicating an artifact with no internal structure.  The compiler can bulk copy such artifacts rather than using Java serialization.
  • There is a new annotation @Transferable that can be added to artifacts.  Artifacts without this annotation are subject to thinning, described below.
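To make the shape of these changes concrete, here is a self-contained sketch. The annotation names come from the list above; everything else (the ProposedLinker base class, the list-of-strings artifact model, the onShard parameter name, the example linkers) is a simplified assumption and not the real Linker API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the proposed marker on linkers that implement the sharding API. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Shardable {}

/** Stand-in for the proposed marker on artifacts that survive thinning. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Transferable {}

/** Simplified linker base class; artifacts are modeled as plain strings. */
abstract class ProposedLinker {
  /** onShard is true for the per-permutation run and false on the final Link node. */
  abstract List<String> link(List<String> artifacts, boolean onShard);

  /** The compiler could detect updated linkers by looking for the annotation. */
  static boolean isShardable(Class<?> linkerClass) {
    return linkerClass.isAnnotationPresent(Shardable.class);
  }
}

@Shardable
final class ChunkingLinker extends ProposedLinker {
  @Override
  List<String> link(List<String> artifacts, boolean onShard) {
    List<String> result = new ArrayList<>(artifacts);
    // Per-permutation work happens on the shard; cross-permutation work on the final node.
    result.add(onShard ? "chunked-script" : "selection-script");
    return result;
  }
}

/** A linker that has not been updated: no annotation, so it is treated as legacy. */
final class LegacyLinker {}
```

A single method with a boolean keeps one linker class responsible for both phases, at the cost of a branch inside link().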


=== Thinning of an artifact set ===

After the sharded part of a linker runs, the resulting artifact set is thinned down, so as to minimize the amount sent back to the Link node and to minimize the amount of deserialization that Link has to do. Thinning an artifact set does two things:

  • All EmittedArtifacts are replaced by a BinaryEmittedArtifact, thus discarding any fields that the EmittedArtifact might have had.
  • All other artifacts are discarded, except ones annotated with @Transferable
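The two thinning rules above could look roughly like the following. All class names here are hypothetical stand-ins invented for the sketch (they are not the compiler's artifact classes), and an artifact is modeled as a path plus raw bytes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the proposed @Transferable annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface TransferableSketch {}

class SketchArtifact {}

/** An artifact destined for the output directory: a path plus its bytes. */
class SketchEmittedArtifact extends SketchArtifact {
  final String path;
  final byte[] contents;

  SketchEmittedArtifact(String path, byte[] contents) {
    this.path = path;
    this.contents = contents;
  }
}

/** No structure beyond path and bytes, so it can be bulk copied. */
final class SketchBinaryEmittedArtifact extends SketchEmittedArtifact {
  SketchBinaryEmittedArtifact(String path, byte[] contents) {
    super(path, contents);
  }
}

/** An emitted artifact carrying an extra field that thinning will discard. */
class RichEmittedArtifact extends SketchEmittedArtifact {
  final Object extraState;

  RichEmittedArtifact(String path, byte[] contents, Object extraState) {
    super(path, contents);
    this.extraState = extraState;
  }
}

@TransferableSketch
class PermutationSummary extends SketchArtifact {}

/** Not annotated, so thinning drops it. */
class ScratchData extends SketchArtifact {}

final class Thinner {
  static List<SketchArtifact> thin(List<SketchArtifact> artifacts) {
    List<SketchArtifact> thinned = new ArrayList<>();
    for (SketchArtifact artifact : artifacts) {
      if (artifact instanceof SketchBinaryEmittedArtifact) {
        thinned.add(artifact); // already a plain bag of bits
      } else if (artifact instanceof SketchEmittedArtifact) {
        SketchEmittedArtifact emitted = (SketchEmittedArtifact) artifact;
        // Rule 1: keep only path and bytes, dropping any other fields.
        thinned.add(new SketchBinaryEmittedArtifact(emitted.path, emitted.contents));
      } else if (artifact.getClass().isAnnotationPresent(TransferableSketch.class)) {
        thinned.add(artifact); // Rule 2: annotated artifacts survive
      }
      // Everything else is discarded.
    }
    return thinned;
  }
}
```

After thinning, everything sent to the Link node is either a bulk-copyable emitted artifact or something the linker author explicitly marked as needed there.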


=== Order of linkers ===

Whenever the compiler runs a number of linkers, it runs them in the order implied by the PRE, PRIMARY, and POST annotations.  This is true both on the shards and on the final Link node, and for both the shardable and non-shardable link() methods.
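A minimal sketch of that ordering rule, assuming each linker is reduced to a name plus one of the three order values. SketchOrder, OrderedLinker, and LinkerOrdering are hypothetical names, and keeping declaration order among linkers of the same kind is an assumption of this sketch rather than something the proposal specifies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Declared in running order, so the enum's natural ordering is PRE < PRIMARY < POST. */
enum SketchOrder { PRE, PRIMARY, POST }

final class OrderedLinker {
  final String name;
  final SketchOrder order;

  OrderedLinker(String name, SketchOrder order) {
    this.name = name;
    this.order = order;
  }
}

final class LinkerOrdering {
  /** Returns the linkers in the order they would run; the input list is not modified. */
  static List<OrderedLinker> sorted(List<OrderedLinker> linkers) {
    List<OrderedLinker> result = new ArrayList<>(linkers);
    // List.sort is stable, so linkers with the same order value keep their relative order.
    result.sort(Comparator.comparing((OrderedLinker linker) -> linker.order));
    return result;
  }
}
```

The same sorted list would be used for the on-shard pass, the final pass, and the legacy link() pass.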

Alex Moffat

Feb 10, 2010, 8:55:23 AM
to Google Web Toolkit Contributors
I've replied before but don't see it here; if it turns up, ignore this
dupe.

I don't maintain any linkers but I have experimented with multi-
machine builds. The current Precompile, CompilePerms, and Link
implementation has the nice feature that the CompilePerms step does
not require access to the source code being compiled. This makes it
very, very much easier to deploy additional CompilePerms workers as
they don't need to check out source code etc. I like the plan for
being able to perform some linking in parallel but I wouldn't like to
lose the ability to deploy a useful CompilePerms worker that does not
need source code access. If performing Java parsing, creating AST and
generating artifacts is something that may need to be parallelized for
some builds then I'd like it if that was done in an additional step so
that people could choose whether or not to run that on multiple
machines while still being able to run the CompilePerms steps on
multiple machines.

Alex Moffat

Feb 9, 2010, 10:29:53 PM
to Google Web Toolkit Contributors
I may have misunderstood the proposal but I've experimented a little
with multi machine builds so I'll comment based on that.

One very nice feature of the current system is that the CompilePerms
step does not need access to the source code being compiled. This is a
significant benefit as it makes it very easy to set up a new machine to
perform CompilePerms work. Without this each CompilePerms machine
would have to check out the source to compile, a significant amount of
work and potentially difficult to configure. My experiments showed
most of the time being spent in the current Precompile step, but that
is because I was not generating a large number of permutations. I
imagine the use case for multi machine builds is that you're doing a
build for QA or release that needs to include all languages etc,
certainly 10s of permutations. One big machine with access to the
source to run Precompile in parallel on multiple pages and then being
able to simply make available lots of dumb CompilePerms workers that
need just GWT installed would be a big advantage here. Or, Precompile
different pages on different machines (using some out of band
distribution system) and then use a farm of dumb workers to
CompilePerms. Individual developers probably use dev mode or just
build a single language.

Making it possible to run portions of the linkers as part of
CompilePerms would certainly be a benefit and I'm all for the "reduced
serialization" plan.

Lex Spoon

Feb 10, 2010, 10:45:04 AM
to google-web-tool...@googlegroups.com
What you describe, Alex, is available via the "Compiler" entry point, though it hasn't been particularly well documented.  There is a PermutationWorkerFactory that can create CompilePerms workers.  The default worker factory spawns Java VMs on the same machine, but it is possible to write a replacement worker that uses ssh or whatnot to do the work on a separate machine.  The way to plug in a replacement worker factory is to set the Java property gwt.jjs.permutationWorkerFactory.


That said, I thought the reason Precompile, CompilePerms, and Link exist is to get the best build time, at the expense of needing extra configuration.  We are finding that by spending a few seconds copying source code over, we save 10+ minutes in Precompile and 10+ minutes in Link.

Is copying source code so inconvenient that it would be worth having a slower build?  I would have thought any of the following would work to move source code from one machine to another:

1. rsync
2. jar + scp
3. "svn up" on the slave machines

Do any of those seem practical for your situation, Alex?

Overall, it's easy to provide an extra build staging as an option, but we support a number of build stagings already....

Lex


John Tamplin

Feb 10, 2010, 10:58:33 AM
to google-web-tool...@googlegroups.com
What does make it difficult is that you can't have a pool of worker machines that can build any project that is asked of them without copying the sources to the worker for each request.  For a large project, this can get problematic, especially when you have to send the transitive dependencies.

Besides, what is gained by having the user arrange this copying themselves rather than the current method of sending it as part of the compile process?  For example, distributed C/C++ compilers send the preprocessed source to the worker nodes, so the workers don't need to have the source or the same include files.  Similarly, we currently send the AST, which is a representation of the source.

--
John A. Tamplin
Software Engineer (GWT), Google

James Northrup

Feb 10, 2010, 11:01:32 AM
to google-web-tool...@googlegroups.com
there's a fairly large repository based elephant in the room named maven.

John Tamplin

Feb 10, 2010, 11:12:08 AM
to google-web-tool...@googlegroups.com
On Wed, Feb 10, 2010 at 11:01 AM, James Northrup <northru...@gmail.com> wrote:
> there's a fairly large repository based elephant in the room named maven.

I'm not sure what that has to do with sharding a compile of a GWT application across a build farm.
 

James Northrup

Feb 10, 2010, 11:25:49 AM
to google-web-tool...@googlegroups.com
i'm only chiming in on the 3 letters RFC in the topic.

the usecases being described as a point of deliberation, defining dependencies, repository access, and bundling automation, are well solved items in the maven stable.  how hard can it be to define a multiproject descriptor, assign "channels" of build-stage progression, and have a top-level project build coordinated by one maven instance publish artifacts to successive build-channels served elsewhere by daemons which trigger maven sub-builds?

even if a GWT build is not in itself a maven project, there's very few reasons why a synthetic maven pom cannot be fashioned for a build-node graph to unify conventions of scm, artifact, versioning, and build hooks to prior documented art and tools.




Lex Spoon

Feb 11, 2010, 4:33:41 PM
to google-web-tool...@googlegroups.com
I've posted a patch for review here:


Let's make it work well for everyone it impacts.

-Lex


Lex Spoon

Feb 11, 2010, 5:33:53 PM
to google-web-tool...@googlegroups.com
On Wed, Feb 10, 2010 at 10:58 AM, John Tamplin <j...@google.com> wrote:
> On Wed, Feb 10, 2010 at 10:45 AM, Lex Spoon <sp...@google.com> wrote:
>> Is copying source code so inconvenient that it would be worth having a slower build?  I would have thought any of the following would work to move source code from one machine to another:
>>
>> 1. rsync
>> 2. jar + scp
>> 3. "svn up" on the slave machines
>>
>> Do any of those seem practical for your situation, Alex?
>>
>> Overall, it's easy to provide an extra build staging as an option, but we support a number of build stagings already....
>
> What does make it difficult is that you can't have a pool of worker machines that can build any project that is asked of them without copying the sources to the worker for each request.  For a large project, this can get problematic, especially when you have to send the transitive dependencies.

You assume the answer here, John.  The question is, just why is copying source code problematic to begin with?  Can anyone put their finger on it?

One concern is that the copying might take too long.  However, is there any project where it would take more than a few seconds?  A few seconds seems like not a big deal for any build large enough to bother with parallel building.

Another possible concern is the need to do some extra build configuration.  It doesn't take much *build time* to copy the dependencies, but it takes *developer time* to set it up.  Here I agree that it is some amount of extra work.  However, it doesn't seem like much.  You have to know what your dependencies are, and you have already worked out how to copy precompilation.ser, so how much more work is it to also send over the source code?

Overall, I see that it worries people to send source code to the CompilePerms nodes.  Yet, it seems entirely normal to me.  When you do a distributed build, all the remote workers must have their inputs copied over to them over the network.


> Besides, what is gained by having the user arrange this copying themselves rather than the current method of sending it as part of the compile process?  For example, distributed C/C++ compilers send the preprocessed source to the worker nodes, so the workers don't need to have the source or the same include files.  Similarly, we currently send the AST, which is a representation of the source.

Compared to the status quo, we gain much faster builds.

Compared to automatically copying, we have a fully specced out proposal.  :)  If we try to automatically copy dependencies, how would we know exactly what to copy?

Lex

John Tamplin

Feb 11, 2010, 5:42:14 PM
to google-web-tool...@googlegroups.com
On Thu, Feb 11, 2010 at 5:33 PM, Lex Spoon <sp...@google.com> wrote:

>> Besides, what is gained by having the user arrange this copying themselves rather than the current method of sending it as part of the compile process?  For example, distributed C/C++ compilers send the preprocessed source to the worker nodes, so the workers don't need to have the source or the same include files.  Similarly, we currently send the AST, which is a representation of the source.
>
> Compared to the status quo, we gain much faster builds.
>
> Compared to automatically copying, we have a fully specced out proposal.  :)  If we try to automatically copy dependencies, how would we know exactly what to copy?

That is exactly my point -- the C++ example sends the preprocessed source to the worker nodes, so they don't have to have the dependencies or the right include path or whatever.  The analogy here would be for GWT to send all of the collected source, either in its native form or as is currently done in a parsed AST form, to the worker nodes.

Lex Spoon

Feb 11, 2010, 5:43:01 PM
to google-web-tool...@googlegroups.com
On Wed, Feb 10, 2010 at 11:25 AM, James Northrup <northru...@gmail.com> wrote:
> the usecases being described as a point of deliberation, defining dependencies, repository access, and bundling automation, are well solved items in the maven stable.  how hard can it be to define a multiproject descriptor, assign "channels" of build-stage progression, and have a top-level project build coordinated by one maven instance publish artifacts to successive build-channels served elsewhere by daemons which trigger maven sub-builds?

That's a nice idea.  Has anyone heard of a project using Maven to support distributed builds?

What the little bit of web searching I did turned up did not look good.  People were saying it would be logical to build that way, but that Maven has a fundamental showstopper: the local repositories are not thread safe.  Perhaps that has changed by now?

Maven aside, there are other options.  Hudson and Pulse should work well.

Lex

Scott Blum

Feb 11, 2010, 7:43:48 PM
to google-web-tool...@googlegroups.com, Ray Cromwell
I have a few comments, but first I wanted to raise the point that I'm not sure why we're having this argument about maximally sharded Precompiles at all.  For one thing, it's already implemented, and optional, via "-XshardPrecompile".  I can't think of any reason to muck with this, or why it would have any relevance to sharded linking.  Can we just table that part for now, or is there something I'm missing?


Okay, so now on to sharded linking itself.  Here's what I love:

- Love the overall goals: do more work in parallel and eliminate serialization overhead.
- Love the idea of simulated sharding because it enforces consistency.
- Love that the linkers all run in the same order.

Here's what I don't love:

- I'm not sure why development mode wouldn't run a sharded link first.  Wouldn't it make sense if development mode works just like production compile, it just runs a single "development mode" permutation shard link before running the final link?

- I dislike the whole transition period followed by having to forcibly update all linkers, unless there's a really compelling reason to do so.  Maybe I'm missing some use cases, but I don't see what problems result from having some linkers run early and others run late.  As Lex noted, all the linkers are largely independent of each other and mostly won't step on each other's toes.

- It seems unnecessary to have to annotate Artifacts to say which ones are transferable, because I thought we already mandated that all Artifacts have to be transferable.

I have in mind a different proposal that I believe addresses the same goals, but in a less-disruptive fashion.  Please feel free to poke holes in it:

1) Linker was made an abstract class specifically so that it could be extended later.  I propose simply adding a new method "linkSharded()" with the same semantics as "link()".  Linkers that don't override this method would simply do nothing on the shards and possibly lose out on the opportunity to shard work.  Linkers that can effectively do some work on shards would override this method to do so.  (We might also have a "relinkSharded()" for development mode.)

2) Instead of trying to do automatic thinning, we just let the linkers themselves do the thinning.  For example, one of the most serialization-expensive things we do is serialize/deserialize symbolMaps.  To avoid this, we update SymbolMapsLinker to do most of its work during sharding, and update IFrameLinker (et al) to remove the CompilationResult during the sharded link so it never gets sent across to the final link.
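A rough sketch of what points 1) and 2) might look like together. The method name linkSharded() is the one proposed above; the OptInLinker base class, the string-based artifact model, and both example linkers are simplifying assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified abstract linker; artifacts are modeled as "Kind:permutation" strings. */
abstract class OptInLinker {
  /** Final link, run once on the Link node. */
  abstract List<String> link(List<String> artifacts);

  /** Sharded link; the default does nothing, so existing linkers keep working unchanged. */
  List<String> linkSharded(List<String> artifacts) {
    return artifacts;
  }
}

/** A linker that was never updated: it inherits the no-op linkSharded(). */
final class PassiveLinker extends OptInLinker {
  @Override
  List<String> link(List<String> artifacts) {
    List<String> result = new ArrayList<>(artifacts);
    result.add("SelectionScript");
    return result;
  }
}

/** A linker that opts in and does its own thinning on the shard. */
final class SelfThinningLinker extends OptInLinker {
  @Override
  List<String> linkSharded(List<String> artifacts) {
    List<String> result = new ArrayList<>();
    for (String artifact : artifacts) {
      if (artifact.startsWith("CompilationResult:")) {
        // Emit the script now and drop the heavyweight result so it is never transferred.
        result.add("EmittedScript:" + artifact.substring("CompilationResult:".length()));
      } else {
        result.add(artifact);
      }
    }
    return result;
  }

  @Override
  List<String> link(List<String> artifacts) {
    List<String> result = new ArrayList<>(artifacts);
    result.add("SelectionScript");
    return result;
  }
}
```

With this shape nothing is thinned automatically, so an artifact that no linker removes on the shard is still serialized and sent to the final link.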

The pros to this idea are (I think) that you don't break anyone... instead you opt-in to the optimization.  If you don't do anything, it should still work, but maybe slower than it could.

The cons are... well maybe it's too simplistic and I'm missing some of the corner cases, or ways this could break down.

Thoughts?
Scott

Brendan Kenny

Feb 11, 2010, 8:58:15 PM
to Google Web Toolkit Contributors

If this is indeed the direction to go in (and I'm a big fan of the
goals as well), it's probably also worth making a more formal
definition for "won't step on each other's toes". As a use case, I'm
working on a PRE linker that (currently) removes CompilationResults,
alters them based on information collected from across all
permutations, and then emits new ones. Obviously this isn't ideal--it's
expensive and CompilationResults were written to be (mostly)
immutable--but it's also perfectly acceptable within the current
design of the artifactSet/linker chain. The primary linker only cares
about the set of compilation results it receives, and if an earlier
linker altered them, it need never know.

It seems (and I could definitely be misinterpreting here) that in both
the simulated sharding procedure and Scott's alternate proposal, there
will be sections of primary and post linkers running before a non-
shardable pre linker. If that's true, then neither will be able to
fully honor the ordering of linkers when shardable and non-shardable
linkers are mixed. But, then again, when I started on this one I think
I could find only one other PRE linker in existence, so now would be
the time to change.

Continuing to think out loud, it seems that the way to alter my linker
is probably either to statically derive what all permutations will
need in every shard (as opposed to just having each triggered
generator emit an artifact and collecting them at the end), or to keep
that the same and create a custom primary linker, which I was hoping
not to do as it would tend to limit adoption. If that's the largest
price to pay, though, the trade off would seem worth it.

Alex Moffat

Feb 12, 2010, 9:50:38 AM
to Google Web Toolkit Contributors
Where can I read a description of what -XshardPrecompile does, or see the
code for it? It sounds very useful to me personally. It's not in 2.0.0
as far as I can see. My concerns about the sharded linking proposal
came from what I understood the original flow to be from my looking at
it and from the original sharded linking proposal.

current Precompile:
- parse Java and run generators
- output: number of permutations, AST, generated artifacts

current CompilePerms:
- input: permutation id, AST
- compile one permutation to JavaScript
- output: JavaScript, generated artifacts

current Link:
- input: JavaScript from all permutations, generated artifacts
- run linkers on all artifacts
- emit EmittedArtifacts into the final output

If this isn't what the current flow is, then what is the current
flow and how does sharded linking fit into that?
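
For readers who prefer code, the three stages listed above can be modelled roughly as follows. This is only a sketch of the data flow: `PrecompileResult`, `ThreeStageFlowSketch` and the string payloads are hypothetical stand-ins, not the real GWT entry points or their types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for what Precompile hands to the later stages.
final class PrecompileResult {
    final int permutationCount;
    final String ast;
    final List<String> generatedArtifacts;

    PrecompileResult(int permutationCount, String ast, List<String> generatedArtifacts) {
        this.permutationCount = permutationCount;
        this.ast = ast;
        this.generatedArtifacts = generatedArtifacts;
    }
}

final class ThreeStageFlowSketch {
    // Stage 1: parse Java and run generators (both faked here).
    static PrecompileResult precompile(String source, int permutationCount) {
        List<String> generated = new ArrayList<String>();
        generated.add("generated-from-" + source);
        return new PrecompileResult(permutationCount, "AST(" + source + ")", generated);
    }

    // Stage 2: compile exactly one permutation. Each call is independent of
    // the others, so calls for different ids could run on different machines.
    static String compilePerm(int permutationId, String ast) {
        return "js[" + permutationId + "] from " + ast;
    }

    // Stage 3: sees the JavaScript of all permutations plus generated artifacts.
    static List<String> link(List<String> allJs, List<String> generatedArtifacts) {
        List<String> emitted = new ArrayList<String>(allJs);
        emitted.addAll(generatedArtifacts);
        return emitted;
    }

    // Runs the three stages sequentially in one process.
    static List<String> runAll(String source, int permutationCount) {
        PrecompileResult pre = precompile(source, permutationCount);
        List<String> js = new ArrayList<String>();
        for (int id = 0; id < pre.permutationCount; id++) {
            js.add(compilePerm(id, pre.ast));
        }
        return link(js, pre.generatedArtifacts);
    }
}
```

In this model only the middle stage is parallel; the link stage is a single sequential step over everything, which is the part the sharded linking proposal is concerned with.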

On Feb 11, 6:43 pm, Scott Blum <sco...@google.com> wrote:

Ray Cromwell

Feb 12, 2010, 3:15:01 PM
to google-web-tool...@googlegroups.com, Ray Cromwell
On Thu, Feb 11, 2010 at 4:43 PM, Scott Blum <sco...@google.com> wrote:
> - I dislike the whole transition period followed by having to forcibly
> update all linkers, unless there's a really compelling reason to do so.

In general, I'd agree, but the number of linkers in the wild appears
to be small; this may be a case of trying to preserve an API that only
5 or 10 people in the world are using.

>  Maybe I'm missing some use cases, but I don't see what problems result from
> having some linkers run early and others run late.  As Lex noted, all the
> linkers are largely independent of each other and mostly won't step on each
> other's toes.

In theory, you could have a non-sharded pre-linker whose job it is to
pre-filter the results before all other linkers are supposed to see
them. This could be, for example, substituting text into compiled
artifacts that a later linker might depend on, although admittedly,
this would only cause you a problem if you had written a
sharded linker that cooperates with something a non-sharded pre-linker
is supposed to do. I can't really think of any practical cases.

> - It seems unnecessary to have to annotate Artifacts to say which ones are
> transferable, because I thought we already mandated that all Artifacts have
> to be transferable.

Should all artifacts have to be transferable? The linker could be
generating temporary artifacts that run within a shard that don't need
to be sent back for the final link, right?


> 2) Instead of trying to do automatic thinning, we just let the linkers
> themselves do the thinning.  For example, one of the most
> serialization-expensive things we do is serialize/deserialize symbolMaps.  To
> avoid this, we update SymbolMapsLinker to do most of its work during
> sharding, and update IFrameLinker (et al) to remove the CompilationResult
> during the sharded link so it never gets sent across to the final link.

It sounds to me like almost every linker will want to do thinning,
so if thinning is going to be used 100% of the time, won't requiring
everyone to reimplement thinning themselves result in potential bugs?

I thought Lex's design was essentially to make things network
efficient by doing the right thing in the common case (automatic
thinning, white-list things you want transferred). I'm not saying the
manual/opt-out approach wouldn't result in similar savings, but it
seems like Lex's design would make it harder for people to write
linkers that blow up on sharded compiles, especially when most third
parties/external contributors aren't using the shard feature yet, so
don't have much of a way to detect they've done something bad.

-Ray

James Northrup

Feb 12, 2010, 4:00:42 PM
to google-web-tool...@googlegroups.com
in this comment I also mentioned use of 'synthetic' maven poms.  herein lie the scope and garbage collection features, whereby, by semaphore or lack of semaphore, almost any single process can decorate a repo with artifacts and precursor artifacts using the synthesized repo for the project or build-session.

by garbage collection, keeping it simple, I'm saying perhaps

synthetic-pom/
synthetic-pom/shard1
synthetic-pom/shard2
synthetic-pom/shard2/target/
synthetic-pom/shard2/target/synthetic-sub-pom/shard2-1/
synthetic-pom/shard2/target/synthetic-sub-pom/shard2-1/target

..and so on

creating a temporary or session-specific repo with shard1-1,shard1-2, etc. etc. 

these artifacts, with a small bit of digest cleverness can become somewhat permanent cache assets as well.

so for supposition sake, as I won't pretend to be familiar with gwt multi-node builds, maven provides out-of-the-box:

  • descriptors for access protocols and plugins 
  • descriptors for scm and plugins
  • descriptors for final publishing 
  • descriptors for build hooks
  • descriptors for actual projects
  • occasionally borrowed ant-tasks
  • well understood repositories
  • transitive dep unified namespace
  • plugins to deploy to test, production, to fire up servlets, etc.

maven would need some extra mojo for:
  • build reactor traffic cop
    • spawn rules
  • descriptors for build mesh/cloud
  • intermediate-representation plugins and tasks along the lines of javacc and antlr examples

having said all this, I looked over the code review and I liked what has been submitted in terms of code clarity for the objectives. At the end of the day I am just a casual RFC observer, however.

Jim


Matt Mastracci

Feb 12, 2010, 4:41:20 PM
to google-web-tool...@googlegroups.com, Ray Cromwell
On 2010-02-12, at 1:15 PM, Ray Cromwell wrote:

> On Thu, Feb 11, 2010 at 4:43 PM, Scott Blum <sco...@google.com> wrote:

>> - I dislike the whole transition period followed by having to forcibly
>> update all linkers, unless there's a really compelling reason to do so.
>
> In general, I'd agree, but the number of linkers in the wild appears
> to be small; this may be a case of trying to preserve an API that only
> 5 or 10 people in the world are using.

+1. I've written a handful of custom linkers (including one in the public gwt-firefox-extension project), but I'm used to updating them between GWT releases to work around subtle changes in the linker contract (i.e., the evolution of hosted mode, various global variable changes, etc.).

I'd rather have a clean linker system that changes from version to version than an awkward one with a lot of legacy interfaces.

Matt.

Lex Spoon

Feb 12, 2010, 7:00:30 PM
to google-web-tool...@googlegroups.com
On Fri, Feb 12, 2010 at 9:50 AM, Alex Moffat <alex....@gmail.com> wrote:
Where can I read a description of what -XshardPrecompile does, or see the
code for it? It sounds very useful to me personally.

-XshardPrecompile is an experiment that everyone wants to change, so it seems unlikely to be released in its current form.  We can talk about it if it helps, but I would propose that we focus more on what we want to do for real.

 
It's not in 2.0.0
as far as I can see. My concerns about the sharded linking proposal
came from what I understood the original flow to be from my looking at
it and from the original sharded linking proposal.

Your understanding is correct as far as I can tell.

Lex


Lex Spoon

Feb 12, 2010, 7:31:25 PM
to google-web-tool...@googlegroups.com
On Thu, Feb 11, 2010 at 8:58 PM, Brendan Kenny <bck...@gmail.com> wrote:
If this is indeed the direction to go in (and I'm a big fan of the
goals as well), it's probably also worth making a more formal
definition for "won't step on each other's toes". As a use case, I'm
working on a PRE linker that (currently) removes CompilationResults,
alters them based on information collected from across all
permutations, and then emits new ones. Obviously this isn't ideal--it's
expensive and CompilationResults were written to be (mostly)
immutable--but it's also perfectly acceptable within the current
design of the artifactSet/linker chain. The primary linker only cares
about the set of compilation results it receives, and if an earlier
linker altered them, it need never know.

Hey, Brendan, it sounds like you are already pressing the limits of what is doable with linkers.    Can you describe in more detail what this linker accomplishes?

For this linker to be used in distributed builds, I believe you'd really want to come up with a way to do the JS rewrites on the sharded part.  Otherwise, the final link node is going to have to do the JS rewrites for the whole build sequentially.  What exact information is used as input to the rewrites?



It seems (and I could definitely be misinterpreting here) that in both
the simulated sharding procedure and Scott's alternate proposal, there
will be sections of primary and post linkers running before a non-
shardable pre linker. If that's true, then neither will be able to
fully honor the ordering of linkers when shardable and non-shardable
linkers are mixed.

That's a large part of why I suggested that we phase out non-sharded linkers.  In mixed mode, there isn't a perfect ordering to choose.  With all sharded linkers, the order is simple and predictable.  All sharded parts run before all final parts, and within either of those groups, PRE/PRIMARY/POST are respected.
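
To illustrate the all-sharded ordering described here, a minimal sketch follows. `LinkerPart`, `Order` and `ShardedOrderSketch` are hypothetical names made up for this example; the real patch may represent the two phases quite differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

enum Order { PRE, PRIMARY, POST }

// One schedulable piece of a linker: either its on-shard part or its final part.
final class LinkerPart {
    final String linkerName;
    final boolean onShard;
    final Order order;

    LinkerPart(String linkerName, boolean onShard, Order order) {
        this.linkerName = linkerName;
        this.onShard = onShard;
        this.order = order;
    }

    @Override
    public String toString() {
        return linkerName + (onShard ? "@shard" : "@final");
    }
}

final class ShardedOrderSketch {
    // All on-shard parts come before all final parts; within each group
    // PRE < PRIMARY < POST. The sort is stable, so ties keep input order.
    static List<LinkerPart> schedule(List<LinkerPart> parts) {
        List<LinkerPart> sorted = new ArrayList<LinkerPart>(parts);
        Collections.sort(sorted, new Comparator<LinkerPart>() {
            @Override
            public int compare(LinkerPart a, LinkerPart b) {
                if (a.onShard != b.onShard) {
                    return a.onShard ? -1 : 1;
                }
                return a.order.compareTo(b.order);
            }
        });
        return sorted;
    }
}
```

With a non-shardable PRE linker in the mix there is no comparator like this that keeps both rules, which is the mixed-mode problem being discussed.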



Continuing to think out loud, it seems that the way to alter my linker
is probably either to statically derive what all permutations will
need in every shard (as opposed to just having each triggered
generator emit an artifact and collecting them at the end), or keeping
that the same and creating a custom primary linker, which I was hoping
not to do as it would tend to limit adoption.
 

It might help to know that both generators and linkers have access to the full set of *possible* values of a deferred binding, not just the values for the current permutation.  As an example, the LocaleListLinker reads off all possible values of "locale" and generates a file containing them:
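
A minimal sketch of that idea, assuming hypothetical stand-in types: `BindingProperty` below is not the real GWT linker API, and this is not the LocaleListLinker source. It only shows a link step reading every possible value of one property, independent of the permutation being linked.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of a deferred-binding property as seen at link time.
final class BindingProperty {
    final String name;
    final SortedSet<String> possibleValues;

    BindingProperty(String name, String... values) {
        this.name = name;
        this.possibleValues = new TreeSet<String>(Arrays.asList(values));
    }
}

final class PossibleValuesSketch {
    // Builds the contents of a file listing every possible value of one
    // property, one per line, in sorted order. Returns an empty string if
    // the property is not defined.
    static String listValues(List<BindingProperty> properties, String propertyName) {
        for (BindingProperty p : properties) {
            if (p.name.equals(propertyName)) {
                StringBuilder sb = new StringBuilder();
                for (String value : p.possibleValues) {
                    sb.append(value).append('\n');
                }
                return sb.toString();
            }
        }
        return "";
    }
}
```

Because the full value set is known up front, a step like this needs nothing collected from other permutations and could run on any shard.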



Lex



Lex Spoon

Feb 12, 2010, 8:27:45 PM
to google-web-tool...@googlegroups.com, Ray Cromwell
On Thu, Feb 11, 2010 at 7:43 PM, Scott Blum <sco...@google.com> wrote:
I have a few comments, but first I wanted to raise the point that I'm not sure why we're having this argument about maximally sharded Precompiles at all.  For one thing, it's already implemented, and optional, via "-XshardPrecompile".  I can't think of any reason to muck with this, or why it would have any relevance to sharded linking.  Can we just table that part for now, or is there something I'm missing?

There are still two modes, but there's no more need for an explicit argument.  For Compiler, precompile is never sharded.  For the three-stage entry points, full sharding happens iff all linkers are shardable.
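
Spelled out as code, the rule is roughly the following. This is a sketch only; `EntryPoint` and `ShardingModeSketch` are hypothetical names, and the per-linker flags stand for whatever the compiler uses to decide that a linker is shardable.

```java
import java.util.List;

enum EntryPoint { COMPILER, THREE_STAGE }

final class ShardingModeSketch {
    // Returns true when full sharding applies, false when the compiler
    // falls back to simulated sharding.
    static boolean fullySharded(EntryPoint entryPoint, List<Boolean> linkerIsShardable) {
        if (entryPoint == EntryPoint.COMPILER) {
            return false; // the single-process entry point never shards
        }
        for (boolean shardable : linkerIsShardable) {
            if (!shardable) {
                return false; // one non-shardable linker disables full sharding
            }
        }
        return true;
    }
}
```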

 
- I'm not sure why development mode wouldn't run a sharded link first.  Wouldn't it make sense if development mode works just like production compile, it just runs a single "development mode" permutation shard link before running the final link?

Sure, we can do that. Note, though, that they will be running against an empty ArtifactSet, because there aren't any compiles for them to look at.  Thus, they won't typically do anything.



2) Instead of trying to do automatic thinning, we just let the linkers themselves do the thinning.  For example, one of the most serialization-expensive things we do is serialize/deserialize symbolMaps.  To avoid this, we update SymbolMapsLinker to do most of its work during sharding, and update IFrameLinker (et al) to remove the CompilationResult during the sharded link so it never gets sent across to the final link.

In addition to the other issues pointed out, note that this adds ordering constraints among the linkers.  Any linker that deletes something must run after every linker that wants to look at it.  Your example wouldn't work as is, because it would mean no POST linker can look at CompilationResults.  It also wouldn't work to put the deletion in a POST linker, for the same reason.  We'd have to work out a way for the deletions to happen last, after all the normal linkage activity.

Suppose, continuing that idea, we add a POSTPOST order that is used only for deletion.  If it's really only for deletion, then the usual link() API is overly general, because it lets linkers both add and remove artifacts during POSTPOST, which is not desired.  So, we want a POSTPOST API that is only for deletion.  Linkers somehow or another mark artifacts for deletion, but not anything else.  At this point, though, isn't it pretty much the same as the automated thinning in the initial proposal?
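
To make the comparison concrete, here is a minimal sketch of thinning as a single pass that runs after every on-shard linker has finished. The `SketchArtifact` type and its `transferable` flag are hypothetical; the patch may mark artifacts differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical artifact with a flag saying whether the final link needs it.
final class SketchArtifact {
    final String name;
    final boolean transferable;

    SketchArtifact(String name, boolean transferable) {
        this.name = name;
        this.transferable = transferable;
    }
}

final class ThinningSketch {
    // Runs once, after the PRE, PRIMARY and POST on-shard linkers are done,
    // so every linker could still read every artifact before anything is
    // dropped. Only artifacts marked transferable go to the final link.
    static List<SketchArtifact> thin(List<SketchArtifact> afterShardLink) {
        List<SketchArtifact> kept = new ArrayList<SketchArtifact>();
        for (SketchArtifact artifact : afterShardLink) {
            if (artifact.transferable) {
                kept.add(artifact);
            }
        }
        return kept;
    }
}
```

Since the pass sits outside the linker chain, no linker has to be ordered after the others just to delete things.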


> The pros to this idea are (I think) that you don't break anyone... instead you
> opt-in to the optimization.  If you don't do anything, it should still work, but
> maybe slower than it could.

The proposal that started this thread also does not break anyone.

Lex


Lex Spoon

Feb 12, 2010, 8:45:38 PM
to google-web-tool...@googlegroups.com
On Thu, Feb 11, 2010 at 5:42 PM, John Tamplin <j...@google.com> wrote:
That is exactly my point -- the C++ example sends the preprocessed source to the worker nodes, so they don't have to have the dependencies or the right include path or whatever.  The analogy here would be for GWT to send all of the collected source, either in its native form or as is currently done in a parsed AST form, to the worker nodes.

For GWT it's less clear what to send.  We also need to make sure that the remote node has the generator and linker implementations and any resources they need.

In general, we can do some of these transfers for people if it helps.  However, perhaps we could start by leaving it up to the surrounding build tool, and then add support to GWT as more specific needs are identified?

Lex

Scott Blum

Feb 16, 2010, 3:32:14 PM
to google-web-tool...@googlegroups.com
On Fri, Feb 12, 2010 at 7:00 PM, Lex Spoon <sp...@google.com> wrote:
On Fri, Feb 12, 2010 at 9:50 AM, Alex Moffat <alex....@gmail.com> wrote:
Where can I read a description of what -XshardPrecompile does, or see the
code for it? It sounds very useful to me personally.

-XshardPrecompile is an experiment that everyone wants to change, so it seems unlikely to be released in its current form.  We can talk about it if it helps, but I would propose that we focus more on what we want to do for real.

It seemed relevant because it sounded like you propose to essentially make -XshardPrecompile the default (only?) behavior for Precompile?  Or did I misread?  The reason that makes me cautious has to do with a desire for a future change to the Generator API to support things like minimal rebuild.  I imagine a world where the work each Generator does could be sharded out in a way that's independent of the number of permutations.

 
- I'm not sure why development mode wouldn't run a sharded link first.  Wouldn't it make sense if development mode works just like production compile, it just runs a single "development mode" permutation shard link before running the final link?

Sure, we can do that. Note, though, that they will be running against an empty ArtifactSet, because there aren't any compiles for them to look at.  Thus, they won't typically do anything.

Do public resources and generated resources show up during the sharded phase?
 


2) Instead of trying to do automatic thinning, we just let the linkers themselves do the thinning.  For example, one of the most serialization-expensive things we do is serialize/deserialize symbolMaps.  To avoid this, we update SymbolMapsLinker to do most of its work during sharding, and update IFrameLinker (et al) to remove the CompilationResult during the sharded link so it never gets sent across to the final link.

In addition to the other issues pointed out, note that this adds ordering constraints among the linkers.  Any linker that deletes something must run after every linker that wants to look at it.  Your example wouldn't work as is, because it would mean no POST linker can look at CompilationResults.  It also wouldn't work to put the deletion in a POST linker, for the same reason.  We'd have to work out a way for the deletions to happen last, after all the normal linkage activity.

Suppose, continuing that idea, we add a POSTPOST order that is used only for deletion.  If it's really only for deletion, then the usual link() API is overly general, because it lets linkers both add and remove artifacts during POSTPOST, which is not desired.  So, we want a POSTPOST API that is only for deletion.  Linkers somehow or another mark artifacts for deletion, but not anything else.  At this point, though, isn't it pretty much the same as the automated thinning in the initial proposal?

It does start to sound like a big mess when you put it that way. And Ray makes a good point about forcing people to write linkers that won't blow up when you shard them.

It sounds like, on a high level, there's a broad agreement that changing the linker API isn't really a problem.  That we should basically just redesign it to do the exact best thing and it's okay, since not many linkers exist.  To be honest, I hadn't really approached the problem from that point of view.

Now that I am thinking along those lines, it almost begs the question.  If we are willing to break the world, is this the best possible way to model the new link process?  In other words, it seems worth re-examining the design without regard to the existing API and asking ourselves if it's the thing we'd have designed from scratch.  Maybe you guys all already did that and I'm the only one late to the party.

For example, if we're going from scratch, then we could avoid the transition entirely and just mandate what the new rules are.  We wouldn't need a @Shardable annotation since all linkers would need to be sharding aware.  We might rather have two separate methods for sharded vs. non-sharded link than a boolean parameter.  We might revisit the whole PRE, PRIMARY, POST thing with regards to sharding and decide the right answer is SHARD, PRE, PRIMARY, POST.  Or something.  I don't know what the right answers are.  All I'm saying is, breaking things is awesome when you're doing something revolutionary and the end result is awesome.  I just want to be sure, if we're going to break things, that we believe we'll end up somewhere revolutionary and awesome as opposed to evolutionary and incremental, but less than awesome.

--Scott

Lex Spoon

Feb 17, 2010, 6:17:09 PM
to google-web-tool...@googlegroups.com
On Tue, Feb 16, 2010 at 3:32 PM, Scott Blum <sco...@google.com> wrote:
On Fri, Feb 12, 2010 at 7:00 PM, Lex Spoon <sp...@google.com> wrote:
On Fri, Feb 12, 2010 at 9:50 AM, Alex Moffat <alex....@gmail.com> wrote:
Where can I read a description of what -XshardPrecompile does, or see the
code for it? It sounds very useful to me personally.

-XshardPrecompile is an experiment that everyone wants to change, so it seems unlikely to be released in its current form.  We can talk about it if it helps, but I would propose that we focus more on what we want to do for real.

It seemed relevant because it sounded like you propose to essentially make -XshardPrecompile the default (only?) behavior for Precompile?  Or did I misread?  

No, that's the idea.


 
The reason that makes me cautious has to do with a desire for a future change to the Generator API to support things like minimal rebuild.  I imagine a world where the work each Generator does could be sharded out in a way that's independent of the number of permutations.

Are you saying that you want to not have to shard, with future developments?  I don't think that should be a problem with this patch.  As a case in point, the Compiler entry point *could* shard out generating and linking, but it chooses not to.  We have the flexibility to play around with these choices over time.


- I'm not sure why development mode wouldn't run a sharded link first.  Wouldn't it make sense if development mode works just like production compile, it just runs a single "development mode" permutation shard link before running the final link?

Sure, we can do that. Note, though, that they will be running against an empty ArtifactSet, because there aren't any compiles for them to look at.  Thus, they won't typically do anything.

Do public resources and generated resources show up during the sharded phase?

Everyone is happy, I think, with having dev mode run a single on-shard linking step.  So, these are just details.  FWIW, here is how it is in the patch:

1. Resources are available via ResourceOracle.
2. Public artifacts are there.  They are identical on all permutations, so they aren't added to the artifact set until the final link step.
3. Generated artifacts are there for compilation, but not for development mode.  With development mode, all linking is done before the generators run, and generators run on demand.


----------- you write (gmail just messed up my reply quotes): ----

 Now that I am thinking along those lines, it almost begs the question.  If we are willing to break the world, is this the best possible way to model the new link process?  In other words, it seems worth re-examining the design without regard to the existing API and asking ourselves if it's the thing we'd have designed from scratch.  Maybe you guys all already did that and I'm the only one late to the party.

For example, if we're going from scratch, then we could avoid the transition entirely and just mandate what the new rules are.  We wouldn't need a @Shardable annotation since all linkers would need to be sharding aware.  We might rather have two separate methods for sharded vs. non-sharded link than a boolean parameter.  We might revisit the whole PRE, PRIMARY, POST thing with regards to sharding and decide the right answer is SHARD, PRE, PRIMARY, POST.  Or something.  I don't know what the right answers are.  All I'm saying is, breaking things is awesome when you're doing something revolutionary and the end result is awesome.  I just want to be sure, if we're going to break things, that we believe we'll end up somewhere revolutionary and awesome as opposed to evolutionary and incremental, but less than awesome.
--------------------------------------------------------------------------------


I initially proposed simply breaking the world.  However, at your encouragement, this patch has developed to be backwards compatible.  As things stand, this patch both gets a large improvement and is evolutionary.

On those specific changes:

1. @Shardable can certainly be dropped after a deprecation period.  Is there any urgency to drop it immediately?

2. Two separate methods versus one with a boolean looks fine to me.  It's changed back and forth as the patch developed.

3. PRE/PRIMARY/POST still appear to be useful.  All linkers care whether they are primary or not, because there is one primary linker and it must deal with generating a selection script.  Additionally, a few linkers care whether they go before or after the primary linker.

4. SHARD as a separate linker order is very tempting but turns out to have some problems. First, many linkers have both an on-shard and on-final part, and if SHARD was a separate order then those linkers would have to be subdivided into two linkers.  Instead of IframeLinker, we'd have to have IframeShardLinker and IframeFinalLinker.  Second, the SHARD part also has PRE/PRIMARY/POST, so you really have six linker orders, not four.  It's tidier to represent the six as two times three.
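
As a sketch of "two times three": one linker keeps a single PRE/PRIMARY/POST position and contributes a method to each phase, rather than being split into two linker classes. The names `Phase`, `Position`, `TwoPartLinker` and `SixSlotsSketch` are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

enum Phase { SHARD, FINAL }
enum Position { PRE, PRIMARY, POST }

// A single linker declares one position and supplies both parts, so there
// is no need for a separate shard-only linker paired with a final-only one.
interface TwoPartLinker {
    Position position();

    // Runs once per permutation, on the shard.
    List<String> linkOnShard(List<String> artifacts);

    // Runs once, on the final link node.
    List<String> linkFinal(List<String> artifacts);
}

final class SixSlotsSketch {
    // Enumerates the six (phase, position) slots in execution order:
    // every SHARD slot first, then every FINAL slot.
    static List<String> slots() {
        List<String> result = new ArrayList<String>();
        for (Phase phase : Phase.values()) {
            for (Position position : Position.values()) {
                result.add(position + "-" + phase);
            }
        }
        return result;
    }
}
```

The six slots fall out of two small enums, so nothing new has to be added to the linker order itself.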

Lex

Scott Blum

Feb 18, 2010, 7:10:16 PM
to google-web-tool...@googlegroups.com
On Wed, Feb 17, 2010 at 6:17 PM, Lex Spoon <sp...@google.com> wrote:
On Tue, Feb 16, 2010 at 3:32 PM, Scott Blum <sco...@google.com> wrote:
On Fri, Feb 12, 2010 at 7:00 PM, Lex Spoon <sp...@google.com> wrote:
On Fri, Feb 12, 2010 at 9:50 AM, Alex Moffat <alex....@gmail.com> wrote:
Where can I read a description of what -XshardPrecompile does, or see the
code for it? It sounds very useful to me personally.

-XshardPrecompile is an experiment that everyone wants to change, so it seems unlikely to be released in its current form.  We can talk about it if it helps, but I would propose that we focus more on what we want to do for real.

It seemed relevant because it sounded like you propose to essentially make -XshardPrecompile the default (only?) behavior for Precompile?  Or did I misread?  

No, that's the idea.

Okay.  I think making precompiles sharded is maybe what makes some folks nervous, because there's no way to avoid sending lots of "stuff" over the wire, as opposed to the current configuration which only needs to send over the AST for optimizations.  Also, while sharded precompiles take less wall time to get the whole thing done (provided you have plenty of machine resources), it takes more total CPU-hours, and could actually take longer if machine resources are scarce, which could be a concern for some.

I'm not (at this time) making a case for or against making precompiles always sharded.  Rather, I'm asking whether that argument can, or cannot, be separated from sharded linking.  I'm not clear on whether sharded linking inherently requires sharded precompiles.
 
 
The reason that makes me cautious has to do with a desire for a future change to the Generator API to support things like minimal rebuild.  I imagine a world where the work each Generator does could be sharded out in a way that's independent of the number of permutations.

Are you saying that you want to not have to shard, with future developments?  I don't think that should be a problem with this patch.  As a case in point, the Compiler entry point *could* shard out generating and linking, but it chooses not to.  We have the flexibility to play around with these choices over time.

Ok, good point.
 
Everyone is happy, I think, with having dev mode run a single on-shard linking step.  So, these are just details.  FWIW, here is how it is in the patch:

1. Resources are available via ResourceOracle.
2. Public artifacts are there.  They are identical on all permutations, so they aren't added to the artifact set until the final link step.
3. Generated artifacts are there for compilation, but not for development mode.  With development mode, all linking is done before the generators run, and generators run on demand.

In the current design, linkers get run again (via relink()), whenever additional generated resources are created.  That gives any linkers which have connections to generators a chance to run again any time new generated resources become available.  Are you saying it will no longer work this way?
 
----------- you write (gmail just messed up my reply quotes): ----

 Now that I am thinking along those lines, it almost begs the question.  If we are willing to break the world, is this the best possible way to model the new link process?  In other words, it seems worth re-examining the design without regard to the existing API and asking ourselves if it's the thing we'd have designed from scratch.  Maybe you guys all already did that and I'm the only one late to the party.

For example, if we're going from scratch, then we could avoid the transition entirely and just mandate what the new rules are.  We wouldn't need a @Shardable annotation since all linkers would need to be sharding aware.  We might rather have two separate methods for sharded vs. non-sharded link than a boolean parameter.  We might revisit the whole PRE, PRIMARY, POST thing with regards to sharding and decide the right answer is SHARD, PRE, PRIMARY, POST.  Or something.  I don't know what the right answers are.  All I'm saying is, breaking things is awesome when you're doing something revolutionary and the end result is awesome.  I just want to be sure, if we're going to break things, that we believe we'll end up somewhere revolutionary and awesome as opposed to evolutionary and incremental, but less than awesome.
--------------------------------------------------------------------------------


I initially proposed simply breaking the world.  However, at your encouragement, this patch has developed to be backwards compatible.  As things stand, this patch both gets a large improvement and is evolutionary.

Okay, I am convinced now that this change is more evolutionary than it sounded like from the high level description.  (For example, it sounded like link() was actually getting an extra parameter-- a breaking change-- but when I actually looked at the patch, I saw it was a new overload.)
 
On those specific changes:

1. @Shardable can certainly be dropped after a deprecation period.  Is there any urgency to drop it immediately?

2. Two separate methods versus one with a boolean looks fine to me.  It's changed back and forth as the patch developed.

Ok, that all sounds good.
 
3. PRE/PRIMARY/POST still appear to be useful.  All linkers care whether they are primary or not, because there is one primary linker and it must deal with generating a selection script.  Additionally, a few linkers care whether they go before or after the primary linker.

4. SHARD as a separate linker order is very tempting but turns out to have some problems. First, many linkers have both an on-shard and on-final part, and if SHARD was a separate order then those linkers would have to be subdivided into two linkers.  Instead of IframeLinker, we'd have to have IframeShardLinker and IframeFinalLinker.  Second, the SHARD part also has PRE/PRIMARY/POST, so you really have six linker orders, not four.  It's tidier to represent the six as two times three.
 
Yeah, the two times three thing... I totally get it, and it makes sense at one level, definitely.  Certainly having to have "paired" linkers that run in different phases seems plain bad.  At the same time, thinking about it in terms of PRE-shard, PRIMARY-shard, POST-shard, PRE-final, PRIMARY-final, POST-final seems kind of... unpleasant.  It might be totally the right way to go, I just wish it were less ugly. :)

But I can't think of anything less ugly that doesn't run into some problems, either.  The best idea I had was that maybe @LinkerOrder could optionally take an array of stages in which to run the same linker.  So the IFrameLinker could have @LinkerOrder(SHARD, PRIMARY) whereas something like "gzip static resources" could be @LinkerOrder(SHARD, PRE).  I still kind of like this idea in general, but I can't figure out a good solution to the question: how do we determine the appropriate order for linkers *within* a pass (particularly the SHARD pass) that were originally ordered via PRE, PRIMARY, POST?  And this approach also particularly gets in the way of the original intent of using lexical order within the GWT XML to determine what order to run the linkers -- and the order is actually defined to have stack-like behavior, a la "earliest definitions run closest to PRIMARY", rather than "earliest first" or "earliest last".
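
One reading of the stack-like rule, as a sketch: the earliest PRE definition runs last among the PRE linkers, and the earliest POST definition runs first among the POST linkers, so both sit next to PRIMARY. `StackOrderSketch` is a hypothetical helper for illustration, not GWT code, and it assumes exactly that interpretation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class StackOrderSketch {
    // Inputs are linker names in the lexical order they appear in the
    // module XML. Returns the order in which they would run.
    static List<String> runOrder(List<String> preInXmlOrder,
                                 String primary,
                                 List<String> postInXmlOrder) {
        List<String> order = new ArrayList<String>(preInXmlOrder);
        Collections.reverse(order);    // earliest PRE ends up right before PRIMARY
        order.add(primary);
        order.addAll(postInXmlOrder);  // earliest POST runs right after PRIMARY
        return order;
    }
}
```

For example, PRE linkers declared as a then b, with POST linkers x then y, would run as b, a, primary, x, y under this reading. Adding a SHARD stage to the annotation leaves open how this rule should apply within the shard pass.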

I dunno, if that question could be answered, I think that approach appeals to me more than 3 times 2.  But I have to admit that 3 times 2 seems like the best approach we've come up with so far, even if it's a little ugly.

Scott

