There are a lot of interesting things on the horizon for the Compiler that we have been talking about lately, more than we can probably do in Q1, in fact. I'm going to talk briefly about each of the major areas, the implications where it's non-obvious, and finally propose an order in which I think they should be tackled.
1) Java 5.0 Language support
I won't belabour this one too much because it's fairly self-explanatory. As we've talked about before, this would essentially switch all GWT client code to compiling under 5.0 source settings. There would be no option to compile using 1.4 source settings, which in practice would not mean a whole lot since it's backwards-compatible.
Subparts of this effort include:
- Generics
- Autoboxing
- Enums
- For-each
- Rewrite JRE with generics
- Annotations (may be zero code changes)
- Covariant return types (may be zero code changes)
- Partial template optimizations (not required)
I'm thinking this would be about 4-5 weeks of effort (real time).
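For anyone less familiar with the 5.0 features in the list above, here is a rough sketch of the kind of client code that would become legal. All class and method names are made up for illustration and are not part of GWT.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side code; none of these names come from GWT itself.
enum Priority { LOW, HIGH }                       // enums

class Shape {
  Shape copy() { return new Shape(); }
}

class Circle extends Shape {
  @Override                                       // annotations
  Circle copy() { return new Circle(); }          // covariant return type
}

class Java5Sample {
  static int sum(List<Integer> values) {          // generics
    int total = 0;
    for (int v : values) {                        // for-each plus auto-unboxing
      total += v;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Integer> values = new ArrayList<Integer>();
    values.add(1);                                // autoboxing
    values.add(2);
    System.out.println(sum(values) + " " + Priority.HIGH);
  }
}
```

None of this compiles under 1.4 source settings, which is why the switch is all-or-nothing for client code.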
2) Compiler Infrastructure
There are a number of things that I'm bundling together here under this category, which all relate to improving the compiler's reliability, maintainability, and performance.
Subparts might include:
- Add file and line information to the AST to track it throughout compilation
- Keep track of what nodes are being visited so that if an Internal Compiler Error occurs, the user can be told what file, line, and node was being examined.
- Figure out if there's a lightweight way to implement stack traces in JS; add a compiler flag to turn on stack traces (especially for JUnit support).
- Compile in assertions with a compiler option (perhaps -ea)
- Refactor how we perform AST modifications. The ChangeList method was a good idea, but in practice has gotten in the way more than it's helped, obscured the actual algorithms, and broken certain kinds of encapsulation. I think what we want to do here is design a type of Visitor that can perform modifications and localize the actual updates within individual AST nodes. We'd also want to add better infrastructure for logging what the compiler does when it modifies the tree.
- Look for low-hanging fruit in terms of improving the compiler's speed and memory usage.
- Document & advertise the technique of using a derived module to force only one permutation to be compiled.
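To make the "modifying Visitor" item above a bit more concrete, here is one possible shape for it. This is only a sketch under my own assumptions: the mini-AST, ModVisitor, and ConstantFolder are hypothetical names, not the actual compiler classes. The point is that each node updates its own child slots during traversal, and the visitor logs every replacement in one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-AST; not the actual GWT compiler classes.
abstract class Node {
  // Each node knows how to visit (and update) its own children.
  abstract void traverse(ModVisitor v);
}

class Literal extends Node {
  final int value;
  Literal(int value) { this.value = value; }
  void traverse(ModVisitor v) { }
}

class Add extends Node {
  Node left, right;
  Add(Node left, Node right) { this.left = left; this.right = right; }
  void traverse(ModVisitor v) {
    left = v.accept(left);    // the update stays inside the owning node
    right = v.accept(right);
  }
}

abstract class ModVisitor {
  final List<String> log = new ArrayList<String>();

  // Visits children first, then lets the subclass keep or replace the node.
  Node accept(Node n) {
    n.traverse(this);
    Node result = endVisit(n);
    if (result != n) {
      log.add("replaced " + n.getClass().getSimpleName()
          + " with " + result.getClass().getSimpleName());
    }
    return result;
  }

  // Return the same node to keep it, or a different node to replace it.
  abstract Node endVisit(Node n);
}

class ConstantFolder extends ModVisitor {
  Node endVisit(Node n) {
    if (n instanceof Add) {
      Add add = (Add) n;
      if (add.left instanceof Literal && add.right instanceof Literal) {
        return new Literal(((Literal) add.left).value + ((Literal) add.right).value);
      }
    }
    return n;
  }

  public static void main(String[] args) {
    ConstantFolder folder = new ConstantFolder();
    Node out = folder.accept(new Add(new Literal(1), new Add(new Literal(2), new Literal(3))));
    System.out.println(((Literal) out).value + " " + folder.log);
  }
}
```

With this shape the optimization itself (ConstantFolder) contains only the algorithm; there is no deferred change list to apply, and no bookkeeping about who the parent is.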
3) Some sweet optimizations
Basically, there's a whole lot more we can be doing to bring down code size.
Some subparts:
- Easy optimizations that have already been proven / patches provided.
- Ability to inline handwritten JavaScript. This would require JS AST subtrees to be clonable, I think. It also raises the question of whether we try to do the inlining early into a mixed Java/JS tree, or we do it late in the pure JS AST. The latter might require more metadata to be in the JS AST so that we could distinguish between compiler-generated vs. handwritten, which would tell us what assumptions we can make.
- Allow MakeCallsStatic to work on JSNI functions
- Ability to prune unused parameters (especially unused $this in staticified instance methods)
- Multi-type type flow instead of single-type-at-a-time "tightening". Instead of being a single type, any given identifier would be a set of possible types.
- Merge string literals into a string table
- Compiler flags to control optimizations (maybe)
- Static Single Assignment (far out)
Most things in this list require doing item #2 first, because the limitations in the changelist infrastructure would be barriers to getting these done.
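As an illustration of the "string table" item in the list above, a minimal sketch of the bookkeeping involved. The StringTable class and the $s0, $s1, ... naming scheme are assumptions of mine, and the sketch does not escape quotes inside literals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the naming scheme ($s0, $s1, ...) is an assumption.
class StringTable {
  private final Map<String, String> names = new LinkedHashMap<String, String>();

  // Returns the variable standing in for the literal, allocating one on first use.
  String intern(String literal) {
    String name = names.get(literal);
    if (name == null) {
      name = "$s" + names.size();
      names.put(literal, name);
    }
    return name;
  }

  // Emits one declaration per distinct literal, in first-use order.
  // Note: quotes inside literals are not escaped in this sketch.
  String declarations() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : names.entrySet()) {
      sb.append("var ").append(e.getValue()).append("='").append(e.getKey()).append("';");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    StringTable table = new StringTable();
    table.intern("px");
    table.intern("div");
    table.intern("px");
    System.out.println(table.declarations());
  }
}
```

Whether this actually shrinks the output depends on how often each literal repeats and how long it is, so a real pass would probably only table literals that pay for their own declaration.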
4) Metrics on generated code
- Generate profiling code into an application
- Ability to perform some analysis on the compiler's output (not fully fleshed out)
5) Support dynamic module loading
This means the ability to break an app down into dynamically loaded parts that are not part of the initial download. More code would be downloaded to the client as needed. As far as we can tell, there isn't a good automatic way to do this; the developer would need to be actively involved in determining what the segments should be.
6) Compiler support for JavaScript API
This is just whatever is needed for the JS API proposals. We're not sure exactly what that is.
So.. that was quite a bit of stuff... I think that pretty much covers the space of things we could do in the compiler next. My personal opinion is that we should focus on items #2 and then #1 this quarter (in that order), and perhaps also do item #6 as it's needed. I think #1 is a gotta-have, but I'd rather not dive into it without doing the necessary infrastructure work first, or I'd just be adding to the eventual reworkload. #6 is also pretty important since it would be blocking another effort. Finally, I'd like to get a couple of the low-hanging, isolated optimizations in, such as the patches Sandy has submitted.
> Thoughts? > Scott
-- "There are only 10 types of people in the world: Those who understand binary, and those who don't"
> On 1/18/07, Scott Blum <sco...@google.com> wrote:
> > Hi all,
Much of what the GWTCompiler is doing is still opaque to me but I have some comments.
ChangeLists: When working on some of the patches I've submitted, the first implementation I wrote simply modified the AST, and it seemed to work. Then I converted the modifications to use ChangeLists, and it was unclear to me when the accumulated changes get applied. That made me worry that accumulated changes could conflict. I think my patches are safe, and I wrote the later ones to be defensive against conflicting accumulated changes, but I'm not 100% sure.
JavaScript AST: One feature of the AST I would have appreciated is a getParent() for each node much like the XML DOM has. In the dead code removal I ended up keeping stacks of what nodes in the AST have already been visited so I could find the parent node of a child and remove that child from the parent.
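Roughly what I have in mind, as a sketch only. TreeNode and its methods are made up for illustration; the idea is that the parent pointer is maintained wherever children are attached, so a removal pass needs no visitor-side stack.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type, made up for illustration.
class TreeNode {
  final String label;
  private TreeNode parent;
  private final List<TreeNode> children = new ArrayList<TreeNode>();

  TreeNode(String label) { this.label = label; }

  TreeNode getParent() { return parent; }
  List<TreeNode> getChildren() { return children; }

  // Attaching a child is the only place the parent pointer is set.
  TreeNode add(TreeNode child) {
    child.parent = this;
    children.add(child);
    return this;
  }

  // Dead code removal can detach a node without tracking visited ancestors.
  void removeFromParent() {
    if (parent != null) {
      parent.children.remove(this);
      parent = null;
    }
  }

  public static void main(String[] args) {
    TreeNode block = new TreeNode("block");
    TreeNode dead = new TreeNode("dead");
    block.add(dead).add(new TreeNode("live"));
    dead.removeFromParent();
    System.out.println(block.getChildren().size());
  }
}
```

The cost is that every tree mutation has to keep the pointer consistent, which is easier if all mutations go through a small number of methods like add() above.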
JavaScript Stack Traces: Mozilla is the only JS engine that has a caller variable that can be used inside of a function to find out what function called it.
This guy wrote some code to add stack trace support to JavaScript. What he does is basically intercept a call to a function, push a stack frame onto a global list object, and then call the original function. Then he can inspect that list to create a stack trace. http://kallewoof.com/index.php/2006/03/15/precompiling-javascript-fun...
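The push/call/inspect idea is not specific to JavaScript, so here is a small Java stand-in for it rather than a port of the linked code. ShadowStack and traced() are hypothetical names, and the sketch is single-threaded. The finally block is there so that the frame is popped even when the wrapped call throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Java stand-in for the JavaScript technique; all names here are made up.
// Not thread-safe: the frame list is a single global, as in the description.
class ShadowStack {
  private static final List<String> frames = new ArrayList<String>();

  // Pushes a frame, runs the wrapped call, and always pops afterwards.
  static <T> T traced(String name, Callable<T> body) {
    frames.add(name);
    try {
      return body.call();
    } catch (RuntimeException e) {
      throw e;
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      frames.remove(frames.size() - 1);
    }
  }

  // Snapshot of the current call chain, outermost first.
  static List<String> snapshot() {
    return new ArrayList<String>(frames);
  }

  public static void main(String[] args) {
    traced("outer", new Callable<Void>() {
      public Void call() {
        return traced("inner", new Callable<Void>() {
          public Void call() {
            System.out.println(snapshot());
            return null;
          }
        });
      }
    });
  }
}
```

In generated JavaScript the equivalent wrapper would have to be emitted around every function, which is where the cost question comes in.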
Dynamic Module Loading: This isn't that important to me, but I would like staged loading, which may be more easily achievable and would improve perceived performance. For example, if GWT could get a login form or the welcome page up before all of the code is loaded, the user would be happier.
I've wondered whether compilePermutations could run Runtime.getRuntime().availableProcessors() threads in parallel.
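Something along these lines, as a sketch only. ParallelPermutations and compileOne() are placeholders of mine for whatever compiles a single permutation, and this assumes permutations can be compiled independently without shared mutable state, which I have not verified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical driver; compileOne stands in for compiling one permutation.
class ParallelPermutations {
  static String compileOne(int permutation) {
    return "permutation-" + permutation + ".js";
  }

  static List<String> compileAll(int count) {
    // One worker thread per available processor.
    int threads = Runtime.getRuntime().availableProcessors();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<String>> pending = new ArrayList<Future<String>>();
      for (int i = 0; i < count; i++) {
        final int permutation = i;
        pending.add(pool.submit(new Callable<String>() {
          public String call() { return compileOne(permutation); }
        }));
      }
      List<String> outputs = new ArrayList<String>();
      for (Future<String> f : pending) {
        outputs.add(f.get());   // results are collected in submission order
      }
      return outputs;
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(compileAll(4));
  }
}
```

Memory use would scale with the number of permutations in flight, so the thread count might need to be capped below the processor count.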
Also, is it worth making the JS AST serializable, so you could send a copy of it to another machine and have it run optimizations there, the way distcc distributes work?
On 1/18/07, Sandy McArthur <sandy...@gmail.com> wrote:
> Dynamic Module Loading: This isn't that important to me but I would > like a staged loading which may be more easily achievable and improve > perceived performance.
"Staged loading" is what Scott's talking about, and yours is a more accurate term. The compiler would in essence still compile monolithically, but it would then divide the monolithic script into separately loadable chunks that get pulled down based on how the developer architects the Java code.
On 1/18/07, Sandy McArthur <sandy...@gmail.com> wrote:
> ChangeLists: When working on some of the patches I've submitted the > first implementation I wrote simply modified the AST and it seemed to > work. Then I converted the modifications to use ChangeLists and it was > unclear to me when the accumulated changes get applied. This created a > concern for me that accumulated changes could conflict. I think my > patches are safe, and the later ones I think I wrote in a way that I > think are defensive against accumulated conflicting changes but I'm not > 100% sure.
By the way, I just got through reviewing your patch for issue #518, and I think it totally underscores the need for refactoring. Your patch is terrific, but it's also horrific in a way. It's way more complex than it needs to be, and it's clear that you were forced into the complexity by the way we are doing things in the compiler. Example: you shouldn't have to keep a stack of blocks and if statements just to track your parent.
I want to use your patch as a forcing function for doing the refactoring that's needed. Basically, I want to build the infrastructure needed to rewrite your patch the "right" way, and then in time rewrite the other AST modifying visitors to use the new, better infrastructure.
Just to be crystal clear, I'm only criticizing the current compiler infrastructure, not your patch. The patch is great. :)
Regarding stack trace support in JavaScript: I implemented it a while back for my rocket-gwt library. It supports IE and FF (I don't know about Safari) but not Opera; you can run the included tests to check it out. It doesn't work in Opera because Opera does not provide something like callee to let the code figure out the call stack.
While not 100% accurate, it's good enough. There are a number of issues with calling callee:
* One must keep recursively calling callee on each function to walk the stack.
* One must be careful not to get stuck in an infinite loop such as X > Y > Z > Y > Z.
In such cases I was forced to lose the bottom part of the stack (X > Y > Z) where Z is found a second time.
If GWT wants to use my code in the real GWT, go for it. The only gotcha is that one must use a 1.5 JDK, because my code needs the new StackTraceElement(String, String, String, int) constructor.
Manually keeping a stack trace is nasty. What happens if one forgets to pop?
hth mP
On 1/19/07, Sandy McArthur <sandy...@gmail.com> wrote:
> Much of what the GWTCompiler is doing is still opaque to me but I have > some comments.
> ChangeLists: When working on some of the patches I've submitted the > first implementation I wrote simply modified the AST and it seemed to > work. Then I converted the modifications to use ChangeLists and it was > unclear to me when the accumulated changes get applied. This created a > concern for me that accumulated changes could conflict. I think my > patches are safe, and the later ones I think I wrote in a way that I > think are defensive against accumulated conflicting changes but I'm not > 100% sure.
> JavaScript AST: One feature of the AST I would have appreciated is a > getParent() for each node much like the XML DOM has. In the dead code > removal I ended up keeping stacks of what nodes in the AST have already > been visited so I could find the parent node of a child and remove that > child from the parent.
> JavaScript Stack Traces: Mozilla is the only JS engine that has caller > variable that can be used inside of a function to find out what > function called it.
> This guy wrote some code to add stack trace support to JavaScript. what > he does is basically intercept a call to a function, push a stack frame > to a global list object and then call the original function. Then he > can inspect that list to create a stack trace.
> Dynamic Module Loading: This isn't that important to me but I would > like a staged loading which may be more easily achievable and improve > perceived performance. For example if GWT could get a login form or the > welcome page up before the entire code is loaded then the user would > think they were happier.
> I've wondered if at compilePermutations it would be possible to use > Runtime.availableProcessors() of Threads running in parallel.
> Also, is it worth making the JS AST serializable so you could send a > copy of it to another machine and have it run optimizations like distcc > can distribute work.
I was thinking in the same direction, but there are three open questions with such an approach:
1) Where to put the handling for uncaught exceptions
2) What to do if an exception happens but is caught in user code, so that the traced stack has to be modified
3) How expensive it is to add a try/catch to each call
Please have a look at the following code, which addresses all three problems. Maybe you will find it useful. It was tested on IE, FF and Opera 9.
/* StackTrace is a global class which provides a simple infrastructure for finding bugs in JavaScript code.
Because of the high risk of breaking the parent-child relationship between calls, it is strongly recommended never to write StackTrace-aware code manually, but to use it together with an automatic code generator or transformer.
Here is a skeleton for a StackTrace-aware function:
function hasStackTraceSample () {
  // Standard prologue
  var stackTraceEntry = StackTrace.enter(this,arguments,anyUserData);
  if (stackTraceEntry.hasResult) return stackTraceEntry.resultValue;
  StackTrace.setStatus ( { file: 'source.java', line: 12 } );
  // ANY JAVASCRIPT CODE HERE
  // SEE RULES FOR USAGE OF return AND catch STATEMENTS BELOW
  StackTrace.setStatus ( { file: 'source.java', line: 13 } );
  // ANY JAVASCRIPT CODE HERE
  // SEE RULES FOR USAGE OF return AND catch STATEMENTS BELOW
  // Standard epilogue
  var returnValue = <CALCULATE_RETURN_VALUE_EXPRESSION>
  StackTrace.leave(stackTraceEntry);
  return returnValue;
}
It is required that
- the expression part of a non-empty return statement is either a literal or a variable (local or parameter). AT LEAST NOTHING THAT CAN RAISE AN EXCEPTION!
- StackTrace.leave(stackTraceEntry) is called immediately before each return, or as the last statement in a function without a return.
If a StackTrace-aware function contains a try/catch block, the first statement in the catch part should be StackTrace.catch$() with the corresponding stackTraceEntry.
So, a compiler generating StackTrace-aware code should perform the following simple steps for any function that should be StackTrace-aware (except empty ones):
1) Add all necessary StackTrace.setStatus statements.
   It is completely up to the compiler where to add these calls and what userData parameter to generate. For instance, a Java-to-JavaScript compiler can keep references to the original Java source code in the JavaScript AST and use this data during generation in debug mode.
2) Insert the standard prologue (see above) as the first statement of the function.
   The first two parameters should always be the same: this and arguments. As in step 1, it is completely up to the compiler what userData parameter to generate.
3) For each return statement except ones returning either a literal or a variable:
   - introduce a local variable
   - assign the original expression to that variable
   - modify the return expression to return the variable
4) Insert StackTrace.leave(stackTraceEntry) immediately before each return (including the implicit one at the end of the function).
5) Insert StackTrace.catch$(stackTraceEntry) as the first statement of each catch block.
The compiler should also generate a function that will be assigned to StackTrace.onUncaughtException. For instance, a unit test runner can log uncaught exceptions.
If necessary compiler may also generate function, which will be assigned to StackTrace.onCaughtException */ var StackTrace = { /* Call stack */ st