There are a lot of interesting things on the horizon for the Compiler
that we have been talking about lately, more than we can probably do
in Q1, in fact. I'm going to talk briefly about each of the major
areas, the implications where it's non-obvious, and finally propose an
order in which I think they should be tackled.
1) Java 5.0 Language support
I won't belabour this one too much because it's fairly
self-explanatory. As we've talked about before, this would
essentially switch all GWT client code to compiling under 5.0 source
settings. There would be no option to compile using 1.4 source
settings, which in practice would not mean a whole lot since it's
backwards-compatible.
Subparts of this effort include:
- Generics
- Autoboxing
- Enums
- For-each
- Rewrite JRE with generics
- Annotations (may be zero code changes)
- Covariant return types (may be zero code changes)
- Partial template optimizations (not required)
I'm thinking this would be about 4-5 weeks of effort (real time).
2) Compiler Infrastructure
There are a number of things that I'm bundling together here under
this category, which all relate to improving the compiler's
reliability, maintainability, and performance.
Subparts might include:
- Add file and line information to the AST to track it throughout compilation
- Keep track of what nodes are being visited so that if an Internal
Compiler Error occurs, the user can be told what file, line, and node
was being examined.
- Figure out if there's a lightweight way to implement stack traces in
JS; add a compiler flag to turn on stack traces (especially for JUnit
support).
- Compile in assertions with a compiler option (perhaps -ea)
- Refactor how we perform AST modifications. The ChangeList method
was a good idea, but in practice has gotten in the way more than it's
helped, obscured the actual algorithms, and broken certain kinds of
encapsulation. I think what we want to do here is design a type of
Visitor that can perform modifications and localize the actual updates
within individual AST nodes. We'd also want to add better
infrastructure for logging what the compiler does when it modifies the
tree.
- Look for low-hanging fruit in terms of improving the compiler's
speed and memory usage.
- Document & advertise the technique of using a derived module to
force only one permutation to be compiled.
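As a rough illustration of the "Visitor that can perform modifications" idea in the list above, here is a minimal sketch in JavaScript rather than in the compiler's own Java. The node shape ({ kind, children }), the context methods replaceMe/removeMe, and the log array are all assumptions made for this example, not existing compiler APIs. The point is only that the traversal applies and logs each update, so the visitor never touches the parent directly.

```javascript
// Hypothetical AST shape: { kind: string, children: Node[] }.
// The visitor receives a context it can use to replace or remove the node
// being visited; the traversal applies the update and records it in `log`.
// Note: the root node itself is not offered to the visitor in this sketch.
function traverse(node, visit, log) {
  const kept = [];
  for (const child of node.children || []) {
    let replacement = child;
    const ctx = {
      replaceMe(newNode) { replacement = newNode; },
      removeMe() { replacement = null; }
    };
    traverse(child, visit, log); // descendants first (post-order)
    visit(child, ctx);
    if (replacement !== child) {
      log.push((replacement ? "replaced " : "removed ") + child.kind);
    }
    if (replacement) kept.push(replacement);
  }
  if (node.children) node.children = kept;
  return node;
}

// Usage: drop a statement the visitor considers dead.
const tree = { kind: "block", children: [
  { kind: "if-false", children: [{ kind: "call", children: [] }] },
  { kind: "call", children: [] }
] };
const log = [];
traverse(tree, (n, ctx) => { if (n.kind === "if-false") ctx.removeMe(); }, log);
```

Because every modification goes through the context, the same place that applies a change can also log it, which is the kind of tracing the bullet asks for.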
3) Some sweet optimizations
Basically, there's a whole lot more we can be doing to bring down code
size more.
Some subparts:
- Easy optimizations that have already been proven / patches provided.
- Ability to inline handwritten JavaScript. This would require JS AST
subtrees to be clonable, I think. It also raises the question of
whether we try to do the inlining early into a mixed Java/JS tree, or
we do it late in the pure JS AST. The latter might require more
metadata to be in the JS AST so that we could distinguish between
compiler-generated vs. handwritten, which would tell us what
assumptions we can make.
- Allow MakeCallsStatic to work on JSNI functions
- Ability to prune unused parameters (especially unused $this in
staticified instance methods)
- Multi-type type flow instead of single-type-at-a-time "tightening".
Instead of being a single type, any given identifier would be a set of
possible types.
- Merge string literals into a string table
- Compiler flags to control optimizations (maybe)
- Static Single Assignment (far out)
Most things in this list require doing item #2 first, because the
limitations in the changelist infrastructure would be barriers to
getting these done.
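To make the multi-type flow bullet above a little more concrete, here is a deliberately simplified sketch. It assumes a flat list of (identifier, concrete type) assignments and ignores control flow and copies between variables, so it only illustrates the "set of possible types" representation, not a real analysis. The function names are invented for this example.

```javascript
// Each identifier maps to the set of concrete types that may reach it.
function flowTypes(assignments) {
  const types = new Map();
  for (const [name, type] of assignments) {
    if (!types.has(name)) types.set(name, new Set());
    types.get(name).add(type);
  }
  return types;
}

// A call through an identifier can be bound statically only when exactly
// one concrete type reaches it.
function isMonomorphic(types, name) {
  return types.has(name) && types.get(name).size === 1;
}

const types = flowTypes([
  ["shape", "Circle"],
  ["shape", "Square"],
  ["list", "ArrayList"]
]);
```

Here `list` could be tightened to a single type, while `shape` keeps the set {Circle, Square} instead of being widened to a common supertype.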
4) Metrics on generated code
- Generate profiling code into an application
- Ability to perform some analysis on the compiler's output (not fully
fleshed out)
5) Support dynamic module loading
This means the ability to break an app down into dynamically loaded
parts that are not part of the initial download. More code would be
downloaded to the client as needed. As far as we can tell, there
isn't a good automatic way to do this; the developer would need to be
actively involved in determining what the segments should be.
6) Compiler support for JavaScript API
This is just whatever is needed for the JS API proposals. We're not
sure exactly what that is.
So... that was quite a bit of stuff. I think that pretty much covers
the space of things we could do in the compiler next. My personal
opinion is that we should focus on items #2 and then #1 this quarter
(in that order), and perhaps also do item #6 as it's needed. I think
#1 is a gotta-have, but I'd rather not dive into it without doing the
necessary infrastructure work first, or I'd just be adding to the
eventual reworkload. #6 is also pretty important since it would be
blocking another effort. Finally, I'd like to get a couple of the
low-hanging, isolated optimizations in, such as the patches Sandy has
submitted.
Thoughts?
Scott
ChangeLists: When working on some of the patches I've submitted, the
first implementation I wrote simply modified the AST directly, and it
seemed to work. Then I converted the modifications to use ChangeLists,
and it was unclear to me when the accumulated changes get applied. That
made me worry that accumulated changes could conflict. I think my
patches are safe, and I wrote the later ones to be defensive against
conflicting accumulated changes, but I'm not 100% sure.
JavaScript AST: One feature of the AST I would have appreciated is a
getParent() for each node much like the XML DOM has. In the dead code
removal I ended up keeping stacks of what nodes in the AST have already
been visited so I could find the parent node of a child and remove that
child from the parent.
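A minimal sketch of what a DOM-like parent link could look like, assuming a hypothetical AstNode class (the real compiler nodes are Java classes and look different). With the link maintained on insertion, a dead-code pass can detach a node without keeping its own stack of visited nodes.

```javascript
// Hypothetical node type with a parent link, similar in spirit to DOM nodes.
class AstNode {
  constructor(kind) {
    this.kind = kind;
    this.parent = null;
    this.children = [];
  }
  // Adding a child is the only place the parent link is set.
  add(child) {
    child.parent = this;
    this.children.push(child);
    return child;
  }
  getParent() {
    return this.parent;
  }
  // Detach this node from its parent; no externally kept stack is needed.
  removeFromParent() {
    const p = this.parent;
    if (!p) return;
    p.children.splice(p.children.indexOf(this), 1);
    this.parent = null;
  }
}

const block = new AstNode("block");
const dead = block.add(new AstNode("dead-statement"));
const live = block.add(new AstNode("call"));
dead.removeFromParent();
```

The trade-off is that every tree mutation must keep the parent links consistent, which is why this sketch funnels insertion and removal through two methods.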
JavaScript Stack Traces: Mozilla is the only JS engine that has a caller
property that can be used inside a function to find out which
function called it.
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:Function:caller
This guy wrote some code to add stack trace support to JavaScript. What
he does is basically intercept each call to a function, push a stack
frame onto a global list object, and then call the original function. He
can then inspect that list to create a stack trace.
http://kallewoof.com/index.php/2006/03/15/precompiling-javascript-functions/
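The interception idea described above can be sketched roughly as follows. This is not the code from the linked article; the names callStack and traced are made up for the example, and the try/finally is an addition so that a throwing function cannot leave a stale frame behind.

```javascript
// Global list of active frames; the innermost call is last.
const callStack = [];

// Wrap fn so that each call records a frame before delegating to the original.
function traced(name, fn) {
  return function (...args) {
    callStack.push(name);
    try {
      return fn.apply(this, args);
    } finally {
      callStack.pop(); // runs on normal return and on exceptions alike
    }
  };
}

// Usage: the innermost function takes a snapshot of the current stack.
const inner = traced("inner", () => callStack.slice());
const outer = traced("outer", () => inner());
const snapshot = outer(); // frames from outermost to innermost
```

Every traced call pays for a push and a pop even when nothing goes wrong, which is the performance concern raised elsewhere in this thread.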
Dynamic Module Loading: This isn't that important to me, but I would
like staged loading, which may be more easily achievable and would
improve perceived performance. For example, if GWT could get a login
form or the welcome page up before all of the code is loaded, the app
would feel faster to the user.
I've wondered whether compilePermutations could run
Runtime.getRuntime().availableProcessors() threads in parallel.
Also, is it worth making the JS AST serializable, so you could send a
copy of it to another machine and have it run optimizations, the way
distcc distributes work?
By the way, I just got through reviewing your patch for issue #518,
and I think it totally underscores the need for refactoring. Your
patch is terrific, but it's also horrific in a way. It's way more
complex than it needs to be, and it's clear that you were forced into
the complexity by the way we are doing things in the compiler.
Example: you shouldn't have to keep a stack of blocks and if
statements just to track your parent.
I want to use your patch as a forcing function for doing the
refactoring that's needed. Basically, I want to build the
infrastructure needed to rewrite your patch the "right" way, and then
in time rewrite the other AST modifying visitors to use the new,
better infrastructure.
Just to be crystal clear, I'm only criticizing the current compiler
infrastructure, not your patch. The patch is great. :)
Scott
Manually keeping a stacktrace is nasty. What happens if one forgets to pop?
Please have a look at the following code, which addresses all three problems.
Maybe you will find it useful. It was tested on IE, FF and Opera 9.
The code is also available at
http://groups-beta.google.com/group/Google-Web-Toolkit-Contributors/web/stacktrace.js
/*
  StackTrace is a global class which provides simple infrastructure for
  finding bugs in JavaScript code.

  Because of the high risk of breaking the nesting of calls, it is strongly
  recommended never to write StackTrace-aware code manually, but to use it
  together with some automatic code generator or transformer.

  Here is a skeleton for a StackTrace-aware function:

    function hasStackTraceSample ()
    {
      // Standard prologue
      var stackTraceEntry = StackTrace.enter(this, arguments, anyUserData);
      if (stackTraceEntry.hasResult)
        return stackTraceEntry.resultValue;

      StackTrace.setStatus ( { file: 'source.java', line: 12 } );
      // ANY JAVASCRIPT CODE HERE
      // SEE RULES FOR USAGE OF return AND catch STATEMENTS BELOW

      StackTrace.setStatus ( { file: 'source.java', line: 13 } );
      // ANY JAVASCRIPT CODE HERE
      // SEE RULES FOR USAGE OF return AND catch STATEMENTS BELOW

      // Standard epilogue
      var returnValue = <CALCULATE_RETURN_VALUE_EXPRESSION>;
      StackTrace.leave(stackTraceEntry);
      return returnValue;
    }

  It is required that
  - the expression part of a non-empty return statement is either a literal
    or a variable (local or parameter).
    AT LEAST NOTHING THAT CAN RAISE AN EXCEPTION!
  - StackTrace.leave(stackTraceEntry) is called immediately before each
    return, or as the last statement in a function without a return.

  If a StackTrace-aware function contains a try/catch block, the first
  statement in the catch part should be StackTrace.catch$(stackTraceEntry, e)
  with the corresponding stackTraceEntry and the caught exception e.

  So a compiler generating StackTrace-aware code should perform the
  following simple steps for any function which should be StackTrace-aware
  (except empty ones):

  1) Add all necessary StackTrace.setStatus statements.
     It is completely up to the compiler where to add these calls and what
     status parameter to generate.
     For instance, a java-to-javascript compiler can keep references to the
     original java source code in the javascript AST and use this data
     during generation in debug mode.
  2) Insert the standard prologue (see above) as the first statement of the
     function.
     The first two parameters should always be the same: this and arguments.
     It is completely up to the compiler what userData parameter to generate.
     For instance, a java-to-javascript compiler can keep references to the
     original java source code in the javascript AST and use this data
     during generation in debug mode.
  3) For each return statement, except ones returning either a literal or a
     variable:
       introduce a local variable,
       assign the original expression to that variable,
       modify the return statement to return the variable.
  4) Insert StackTrace.leave(stackTraceEntry) immediately before each
     return (including the implicit one at the end of the function).
  5) Insert StackTrace.catch$(stackTraceEntry, e) as the first statement of
     each catch block.

  The compiler should also generate a function to be assigned to
  StackTrace.onUncaughtException.
  For instance, a unit test runner can log the uncaught exception.
  If necessary, the compiler may also generate a function to be assigned to
  StackTrace.onCaughtException.
*/
var StackTrace =
{
  /*
    Call stack
  */
  stack : [],

  /*
    Start execution of a StackTrace-aware function.
    The following code should be the first two statements in any
    StackTrace-aware method.
    The first two parameters should always be the same: this and arguments.
    userData - any object to be interpreted by the application.
    For instance, it can be an object representing the original source
    file and a position in this file.

      // Standard prologue
      var stackTraceEntry = StackTrace.enter(this, arguments, anyUserData);
      if (stackTraceEntry.hasResult)
        return stackTraceEntry.resultValue;
  */
  enter : function (callerThis, callerArguments, userData)
  {
    var len = StackTrace.stack.length;
    if (len == 0)
    {
      /*
        Complicated case:
        This is the first stack-traced call and we are unprotected from
        uncaught exceptions.
        Instead of returning to the caller we
        - push call data onto the stack
        - call the caller function once again with the same parameters,
          but inside a try/catch block
        - return to the caller with the information that it has to stop its
          execution and immediately return the provided result
        - our standard prologue guarantees that this happens
      */
      try
      {
        // put an artificial element to prevent infinite recursion
        StackTrace.stack.push(null);
        var params = [];
        for (var param = 0; param < callerArguments.length; param++)
          params.push(callerArguments[param]);
        var result = callerArguments.callee.apply(callerThis, params);
        // The re-entrant call already popped its own entry in leave();
        // clearing the stack drops the artificial element as well.
        StackTrace.stack.length = 0;
        return { hasResult : true, resultValue : result };
      }
      catch (e)
      {
        // An exception uncaught by stack-traced code happened.
        // Call the user-defined callback.
        StackTrace.onUncaughtException (e);
        // Drop the collected stack traces and rethrow the exception.
        // Can we do something better?
        StackTrace.stack.length = 0;
        throw e;
      }
    }
    else
    {
      /*
        Simple case:
        This is not the first stack-traced call and we know that it is
        already guarded by the try/catch created in the previous case.
        So we just push data onto the stack and return to normal
        execution of the caller function.
      */
      StackTrace.stack.push(
      {
        callerThis : callerThis,
        callerArguments : callerArguments,
        userData : userData,
        status : null
      });
      return { stackLength : len, hasResult : false };
    }
  },

  /*
    Complete execution of a StackTrace-aware function.
    StackTrace.leave(entry) should be called immediately before each
    return, or as the last statement in a function without a return.
    entry - value returned by the corresponding call to StackTrace.enter
  */
  leave : function (entry)
  {
    StackTrace.stack.pop ();
    if (entry && StackTrace.stack.length != entry.stackLength)
      throw new Error('StackTrace stack is corrupted');
  },

  /*
    Call the user-defined callback onCaughtException and then drop all
    failed calls from the stack.
    If a StackTrace-aware function contains a try/catch block, the first
    statement in the catch part should be
    StackTrace.catch$(stackTraceEntry, exception).
  */
  catch$ : function (entry, e)
  {
    // call the user-provided callback
    StackTrace.onCaughtException (e);
    // drop failed calls from the stack, keeping the catching function's entry
    while (StackTrace.stack.length > 0 &&
           entry.stackLength < StackTrace.stack.length - 1)
      StackTrace.stack.pop ();
  },

  /*
    Save the current status of a StackTrace-aware function.
    status - any object to be interpreted by the application.
    For instance, it can be an object representing the original source
    file and the position of the current execution point
    in the original javascript or java file.
  */
  setStatus : function (status)
  {
    if (StackTrace.stack.length < 2)
      throw new Error('StackTrace stack is corrupted');
    // store into the status field that enter() initialized to null
    StackTrace.stack[StackTrace.stack.length - 1].status = status;
  },

  /*
    User-defined callback for logging etc.
    Called from StackTrace.enter if an uncaught exception
    happened inside the guarding try/catch block.
    StackTrace.stack points to the function where the exception happened.
    It is strongly recommended to keep this code very simple and
    exception free :)
  */
  onUncaughtException : function (e)
  {},

  /*
    User-defined callback for logging etc.
    Called from StackTrace.catch$.
    StackTrace.stack points to the function where the exception happened.
    It is strongly recommended to keep this code very simple and
    exception free :)
  */
  onCaughtException : function (e)
  {}
};
On Jan 20, 12:23 pm, "Scott Blum" <sco...@google.com> wrote:
> On 1/20/07, Miroslav Pokorny <miroslav.poko...@gmail.com> wrote:
> > Manually keeping a stacktrace is nasty. What happens if one forgets to pop?
>
> My thought was to enclose each method in a try/finally...
No. But I have more ideas as always. :-)
I think it would be nice if no matter what compiler settings are used
the current stack frame is added to Throwables when they are created.
The hardest bugs are the ones that only happen for your users and that
is unfortunately when you've turned on all optimizations to get the
code size down. Even if it's just a class and method name, that can
save hours trying to get close to the source of a problem.
Another method, between manually keeping a stack of stack frames and
the current-frame-only approach above, would be to fill in the stack
frames only as an exception bubbles up. You'd lose line numbers and it
wouldn't be a complete stack, but it should have minimal performance
impact when no exception is thrown.
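A minimal sketch of filling in frames only while an exception bubbles up. The helper withFrame and the frames property are invented for this example; a compiler would presumably emit the try/catch directly into each method body instead of wrapping functions at runtime.

```javascript
// Stand-in for what a compiler might emit around each method body.
function withFrame(name, body) {
  return function (...args) {
    try {
      return body.apply(this, args);
    } catch (e) {
      // Work happens only on the exceptional path: append this frame, rethrow.
      if (e && typeof e === "object") {
        (e.frames = e.frames || []).push(name);
      }
      throw e;
    }
  };
}

// Usage: A.a calls B.b calls C.c, which throws.
const c = withFrame("C.c", () => { throw new Error("boom"); });
const b = withFrame("B.b", () => c());
const a = withFrame("A.a", () => b());

let caught = null;
try {
  a();
} catch (e) {
  caught = e; // caught.frames lists frames from innermost to outermost
}
```

If application code catches the exception partway up, the frames above that point are never recorded, which matches the "wouldn't be a complete stack" caveat.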
If you didn't want to embed package, class, and method literals in the
JavaScript everywhere you should be able to use arguments.callee in all
browsers plus a little RegEx magic to extract a method name.
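The "RegEx magic" might look something like the sketch below. It takes the function reference as a parameter instead of reading arguments.callee, because arguments.callee is not available in strict-mode code; inside a non-strict function one could pass arguments.callee to it. The helper name functionName is an assumption for this example.

```javascript
// Extract the declared name from a function's source text.
// Returns "" for anonymous functions and for arrow functions.
function functionName(fn) {
  const source = Function.prototype.toString.call(fn);
  const match = /^\s*function\s*\*?\s*([A-Za-z_$][\w$]*)?\s*\(/.exec(source);
  return match && match[1] ? match[1] : "";
}

// Usage with an obfuscated-looking name.
const name = functionName(function Fj() {});
```

This only recovers the emitted (possibly obfuscated) name, so mapping it back to a class and method still needs some table produced at compile time.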
Finally, some obfuscators produce a table to convert the obfuscated
stack frames into real stack frames. Having a translation table next to
the nocache.html would be nice so you could convert "Fj" into
"com.google.gwt.user.client.Event" or whatever it may be.
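For illustration, translating obfuscated frames with such a table could be as simple as the sketch below. The table format is an assumption (a plain object keyed by obfuscated name), and the second entry is entirely made up; only the "Fj" mapping comes from the example above.

```javascript
// Hypothetical table that a compiler could emit next to its output.
const symbolTable = {
  Fj: "com.google.gwt.user.client.Event",
  a$: "com.example.client.Main.onModuleLoad" // invented entry for the sketch
};

// Replace every known obfuscated name; leave unknown names untouched.
function deobfuscate(frames, table) {
  return frames.map((f) =>
    Object.prototype.hasOwnProperty.call(table, f) ? table[f] : f
  );
}

const readable = deobfuscate(["Fj", "zz"], symbolTable);
```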
If we drop the performance reason, which, truly speaking, nobody has
checked, then I agree that your approach is a bit more straightforward.
For some reason I had decided that try/catch/finally would bring us a lot
of problems with user-defined catches, but now I see that is not the case.
Anyway, it was a useful exercise for me, and I think this hack with return
statements can be useful in some other cases.
Alex
I haven't thought through all the use cases (it's a Saturday
afternoon and I'm supposed to be working on the book :-), but for
handling normal "Java" exceptions, is it possible to just use a
synthesized parameter and avoid any sort of try/finally? For example...
==== Generated JavaScript ====
var FILES = ["<init>", "A.java", "B.java", "C.java"];

// Compiler-emitted bootstrap
function init() {
  try {
    a([null, FILES[0], 0]);
  } catch (e) {
    alert(e);
  }
}

// From A.java
function a(trace) {
  // ...stuff...
  b([trace, FILES[1], 12]);
  // ...stuff...
  b([trace, FILES[1], 16]);
  // ...stuff...
}

// From B.java
function b(trace) {
  // ...stuff...
  c([trace, FILES[2], 322]);
  // ...stuff...
}

// From C.java
function c(trace) {
  // ...stuff...
  throwException("An exception message", [trace, FILES[3], 412]);
}

// The real thing would throw a structured exception object
function throwException(msg, trace) {
  var s = msg;
  // Walk the linked list of [callerTrace, file, line] triples outward
  while (trace != null) {
    s += "\n " + trace[1] + "(" + trace[2] + ")";
    trace = trace[0];
  }
  throw s;
}
I think so. If you didn't catch the exception then it'd basically have
the full stack when the uncaught exception handler got it. (Is there an
uncaught exception handler?)
If the programmer catches the exception, then there is an implicit
expectation by the programmer that there could be an exception in the
preceding try block. If the programmer isn't prepared to deal with the
exception at that point, they should either rethrow it or throw
another exception with the caught exception as the cause.
> > If you didn't want to embed package, class, and method literals in the
> > JavaScript everywhere you should be able to use arguments.callee in all
> > browsers plus a little RegEx magic to extract a method name.
>
> If I understand you correctly, that'd only work in DETAILED mode, right?
Umm, it wouldn't work for anonymous functions but it seems to me that
every generated JavaScript function has a unique name. You should be
able to look that up to find the full class+method name. Right?
> > Finally, some obfuscators produce a table to convert the obfuscated
> > stack frames into real stack frames. Having a translation table next to
> > the nocache.html would be nice so you could convert "Fj" into
> > "com.google.gwt.user.client.Event" or whatever it may be.
>
> That also seems like a good idea in general.
Cool. If implemented, I'd like to request that the translation table be
in a format that is loadable after the client app has been running for a
while, so that someone working on enhanced logging or debugging support
could lazily load that data once needed.
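One way to picture the lazy-loading request is the sketch below. It assumes the table can be fetched by some function that returns a Promise; both lazyTable and the fetchTable parameter are hypothetical names, and how the table is actually transferred is left open.

```javascript
// fetchTable is a hypothetical function that returns a Promise of the table.
// The returned getter triggers at most one load, on first use.
function lazyTable(fetchTable) {
  let pending = null;
  return function get() {
    if (pending === null) {
      pending = fetchTable(); // first caller starts the single load
    }
    return pending; // later callers share the same Promise
  };
}

// Usage with a stand-in loader that counts how often it is invoked.
let loads = 0;
const getTable = lazyTable(() => {
  loads++;
  return Promise.resolve({ Fj: "com.google.gwt.user.client.Event" });
});
const first = getTable();
const second = getTable();
```

An app that never hits an error never pays for the download, and logging code that needs the table simply awaits the getter.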