First stage of replacing Scope with a more flexible API

2 views

Skip to first unread message

mikes...@gmail.com

unread,

Nov 4, 2009, 3:04:42 PM11/4/09

to ihab...@gmail.com, google-ca...@googlegroups.com

Reviewers: ihab.awad,

Description:
This is part 5 of
https://groups.google.com/group/google-caja-discuss/browse_thread/thread/50db6111ae71c2c9

We currently have a Scope class which was defined for the Rewriter
scheme. For each optimization pass I've done, I've ended up writing
my own scope scheme, often on top of Scope. E.g. the AlphaRenaming
uses an identity map to associate extra info with scopes, and the
array index optimization builds it's own scope tree, so that it can
easily walk from a scope to its children. The linter uses a scoping
scheme that abstracts away the decision of where declarations are
introduced into scopes and which scopes they are hoisted to, so that
it can make sure that scoping is consistent between JScript and other
interpreters.

Further changelists will unify these scoping schemes and support the
following use cases:
# In a rewriter to scope lazily
# On an entire tree to attach a scope to each node
# To generate masking errors about violations of Cajita specific
invariants (such as forbidding declaration of the name "cajita")
# To identify the kind of symbol -- function declaration, local
function name, exception, data, globals
# To collect information about declarations
# To resolve references to declaration points or to scopes
# To detect redelcaration
# To emulate ES5 scoping rules
# To emulate broken IE scoping
# To collect uses of a variable
# To determine whether/where a symbol is used as a
LeftHandSideExpression
# To collect uses in narrower scopes.
# To collect the set of symbols from wider scopes.

There are a few separable concerns here:
# Applying scoping rules of a given variant of JS -- exception
hoisting, whether function constructors' names intrude on the
containing scope to introduce scopes, and collect declarations
# Associating scopes with parse tree nodes Collecting usage
information

Please see ScopeListener for a general interface to collecting and
collating Scope information. See ScopeAnalyzer for a general
implementation that implements a scoping scheme, and its concrete
implementations: ES5ScopeAnalyzer, JScriptScopeAnalyzer,
WorstCaseScopeAnalyzer.

The ScopeAnalyzerTests give an idea of how ScopeAnalyzer works.

In later CLS, Scope has been rewritten to use ES5ScopeAnalyzer under
the hood, and Rewriter will change to create the scopes so that it is
no longer the responsibility of Rules to create and pass around scopes.

Please review this at http://codereview.appspot.com/148045

Affected files:
M build.xml
M src/com/google/caja/ancillary/opt/LocalVarRenamer.java
M src/com/google/caja/parser/js/Operation.java
A src/com/google/caja/parser/js/scope/AbstractScope.java
A src/com/google/caja/parser/js/scope/ES5ScopeAnalyzer.java
A src/com/google/caja/parser/js/scope/JScriptScopeAnalyzer.java
A src/com/google/caja/parser/js/scope/ScopeAnalyzer.java
A src/com/google/caja/parser/js/scope/ScopeListener.java
A src/com/google/caja/parser/js/scope/ScopeType.java
A src/com/google/caja/parser/js/scope/WorstCaseScopeAnalyzer.java
M src/com/google/caja/parser/quasiliteral/Scope.java
M src/com/google/caja/parser/quasiliteral/SyntheticRuleSet.java
A tests/com/google/caja/parser/js/scope/ScopeAnalyzerTest.java

mikes...@gmail.com

unread,

Nov 6, 2009, 5:54:06 PM11/6/09

to ihab...@gmail.com, google-ca...@googlegroups.com

On 2009/11/04 20:04:42, MikeSamuel wrote:

ping

http://codereview.appspot.com/148045

ihab...@gmail.com

unread,

Nov 9, 2009, 7:39:07 PM11/9/09

to mikes...@gmail.com, google-ca...@googlegroups.com

Some detailed comments inline, but general comments below. I have not
finished going through ScopeAnalyzer.java yet. The broad questions are:

1. ScopeAnalyzer is parameterized by its scope implementation type, <S
extends AbstractScope>. Why should it constrain S to extend
AbstractScope given that it otherwise makes no demands on S? By firing
enterScope and exitScope on S, it requires the listener's type S to be
an AbstractScope, which the listener may not want.

2. That said, it's probably perfectly reasonable that most listeners
will construct *some* S that is an AbstractScope. So then there's the
opposite question. The ScopeAnalyzer is creating all this nifty richly
connected graph with the Symbol and ScopeTree objects. But, alack! --
the listener is forbidden from seeing any of this. It gets created and
thrown away. Is there benefit to making this an output of the
ScopeAnalyzer?

I guess it just seems to me that there are 2 easily understandable
choices -- complete separation or complete connection between Analyzer
and its listeners. The current code seems to be a bit of a mixture.

http://codereview.appspot.com/148045/diff/1/11
File src/com/google/caja/parser/js/scope/AbstractScope.java (right):

http://codereview.appspot.com/148045/diff/1/11#newcode22
src/com/google/caja/parser/js/scope/AbstractScope.java:22: public
interface AbstractScope {
If we *are* going to define such a beast, should it not also include a
slot for a pointer to the ParseTreeNode from which it is generated?

http://codereview.appspot.com/148045/diff/1/11#newcode23
src/com/google/caja/parser/js/scope/AbstractScope.java:23: /**
Blank line before "/*" - all in this file.

http://codereview.appspot.com/148045/diff/1/6
File src/com/google/caja/parser/js/scope/ScopeAnalyzer.java (right):

http://codereview.appspot.com/148045/diff/1/6#newcode93
src/com/google/caja/parser/js/scope/ScopeAnalyzer.java:93: protected
abstract boolean fnCtorsDeclareInBody();
Should this simply be a boolean parameter to the ScopeAnalyzer ctor?

http://codereview.appspot.com/148045/diff/1/6#newcode98
src/com/google/caja/parser/js/scope/ScopeAnalyzer.java:98: protected
abstract boolean fnCtorsDeclareInContaining();
Same.

http://codereview.appspot.com/148045/diff/1/7
File src/com/google/caja/parser/js/scope/ScopeListener.java (right):

http://codereview.appspot.com/148045/diff/1/7#newcode20
src/com/google/caja/parser/js/scope/ScopeListener.java:20: public
interface ScopeListener<S extends AbstractScope> {
This interface reads a little confusingly to me. Is type S the type of
scopes being constructed by the listener? If not, then it must be the
scopes constructed by the scope analyzer, right? If so, then the type of
*these* scopes is fixed and it need not be a type parameter.

To clear this up and separate concerns more crisply, I think you could
get away with not introducing a Scope in the signature of ScopeListener
at all if your callbacks refer, not to a Scope, but to the ParseTreeNode
corresponding to the Scope.

This change would not cause more objects to be created or more code to
be run. The objects of type S being given to this listener are *still*
being created by the scope analyzer during its tree walk, and the
listener is *still* creating its own custom scopes if it wants to.

http://codereview.appspot.com/148045/diff/1/7#newcode29
src/com/google/caja/parser/js/scope/ScopeListener.java:29: void
declaration(AncestorChain<Identifier> id, S scope);
For example:

declaration(..., ParseTreeNode node)

where the node would be (say) a FunctionConstructor, a CatchStmt, an
UncajoledModule, .... In other words, some node that *introduces* a new
scope, but not the scope itself (building that is the job of the
listener, if it wants to).

http://codereview.appspot.com/148045/diff/1/9
File src/com/google/caja/parser/js/scope/ScopeType.java (right):

http://codereview.appspot.com/148045/diff/1/9#newcode34
src/com/google/caja/parser/js/scope/ScopeType.java:34: /**
More spaces before "/*"

http://codereview.appspot.com/148045

Mike Samuel

unread,

Nov 9, 2009, 9:50:28 PM11/9/09

to mikes...@gmail.com, ihab...@gmail.com, google-ca...@googlegroups.com

2009/11/9 <ihab...@gmail.com>:

> Some detailed comments inline, but general comments below. I have not
> finished going through ScopeAnalyzer.java yet. The broad questions are:
>
> 1. ScopeAnalyzer is parameterized by its scope implementation type, <S
> extends AbstractScope>. Why should it constrain S to extend
> AbstractScope given that it otherwise makes no demands on S? By firing
> enterScope and exitScope on S, it requires the listener's type S to be
> an AbstractScope, which the listener may not want.

I don't want the listener to have to typecast the scope objects it
creates every time it handles an event.
Is there another typesafe way to achieve that?

> 2. That said, it's probably perfectly reasonable that most listeners
> will construct *some* S that is an AbstractScope. So then there's the
> opposite question. The ScopeAnalyzer is creating all this nifty richly
> connected graph with the Symbol and ScopeTree objects. But, alack! --
> the listener is forbidden from seeing any of this. It gets created and
> thrown away. Is there benefit to making this an output of the
> ScopeAnalyzer?

Not that I can see.

> I guess it just seems to me that there are 2 easily understandable
> choices -- complete separation or complete connection between Analyzer
> and its listeners. The current code seems to be a bit of a mixture.

Where is the connection? The analyzer merely routes scopes from the
listener back to itself.
It never invokes their methods.

> http://codereview.appspot.com/148045/diff/1/11
> File src/com/google/caja/parser/js/scope/AbstractScope.java (right):
>
> http://codereview.appspot.com/148045/diff/1/11#newcode22
> src/com/google/caja/parser/js/scope/AbstractScope.java:22: public
> interface AbstractScope {
> If we *are* going to define such a beast, should it not also include a
> slot for a pointer to the ParseTreeNode from which it is generated?

It could. I think there is value in at least having a placeholder
type for explanatory purposes, and we can get as rich or as simple as
we like.

> http://codereview.appspot.com/148045/diff/1/11#newcode23
> src/com/google/caja/parser/js/scope/AbstractScope.java:23: /**
> Blank line before "/*" - all in this file.

Done

> http://codereview.appspot.com/148045/diff/1/6
> File src/com/google/caja/parser/js/scope/ScopeAnalyzer.java (right):
>
> http://codereview.appspot.com/148045/diff/1/6#newcode93
> src/com/google/caja/parser/js/scope/ScopeAnalyzer.java:93: protected
> abstract boolean fnCtorsDeclareInBody();
> Should this simply be a boolean parameter to the ScopeAnalyzer ctor?

It could be a parameter or it could be an overridable method.

> http://codereview.appspot.com/148045/diff/1/6#newcode98
> src/com/google/caja/parser/js/scope/ScopeAnalyzer.java:98: protected
> abstract boolean fnCtorsDeclareInContaining();
> Same.
>
> http://codereview.appspot.com/148045/diff/1/7
> File src/com/google/caja/parser/js/scope/ScopeListener.java (right):
>
> http://codereview.appspot.com/148045/diff/1/7#newcode20
> src/com/google/caja/parser/js/scope/ScopeListener.java:20: public
> interface ScopeListener<S extends AbstractScope> {
> This interface reads a little confusingly to me. Is type S the type of
> scopes being constructed by the listener? If not, then it must be the
> scopes constructed by the scope analyzer, right? If so, then the type of
> *these* scopes is fixed and it need not be a type parameter.

The scope analyzer does not construct AbstractScopes.

> To clear this up and separate concerns more crisply, I think you could
> get away with not introducing a Scope in the signature of ScopeListener
> at all if your callbacks refer, not to a Scope, but to the ParseTreeNode
> corresponding to the Scope.

> This change would not cause more objects to be created or more code to
> be run. The objects of type S being given to this listener are *still*
> being created by the scope analyzer during its tree walk, and the
> listener is *still* creating its own custom scopes if it wants to.

And now a whole bunch of listeners have to keep an identity hash map
of ParseTreeNodes to scopes.
But that would not work, because ParseTreeNodes might appear multiple
times in the same AST.

ihab...@gmail.com

unread,

Nov 9, 2009, 9:58:55 PM11/9/09

to mikes...@gmail.com, google-ca...@googlegroups.com

On Mon, Nov 9, 2009 at 6:50 PM, Mike Samuel <mikes...@gmail.com> wrote:

I don't want the listener to have to typecast the scope objects it
creates every time it handles an event.
Is there another typesafe way to achieve that?

No but then ScopeAnalyzer should just be parameterized on a generic class <S> not <S extends AbstractScope>, right?

> 2. That said, it's probably perfectly reasonable that most listeners
> will construct *some* S that is an AbstractScope. So then there's the
> opposite question. The ScopeAnalyzer is creating all this nifty richly
> connected graph with the Symbol and ScopeTree objects. But, alack! --
> the listener is forbidden from seeing any of this. It gets created and
> thrown away. Is there benefit to making this an output of the
> ScopeAnalyzer?

Not that I can see.

I was thinking of what we could do with something that directly connects defining and use occurrences, in particular. That said, with the current way the rewriters are implemented, the tree is walked in a pretty deterministic top-down fashion, so these _ad hoc_ links may be hard to take advantage of without breaking the model.

Ok.

Where is the connection? The analyzer merely routes scopes from the
listener back to itself. It never invokes their methods.

The remaining connection would be in <S extends AbstractScope>.

> http://codereview.appspot.com/148045/diff/1/11
> File src/com/google/caja/parser/js/scope/AbstractScope.java (right):
>
> http://codereview.appspot.com/148045/diff/1/11#newcode22
> src/com/google/caja/parser/js/scope/AbstractScope.java:22: public
> interface AbstractScope {
> If we *are* going to define such a beast, should it not also include a
> slot for a pointer to the ParseTreeNode from which it is generated?

It could. I think there is value in at least having a placeholder
type for explanatory purposes, and we can get as rich or as simple as
we like.

Ok.

> http://codereview.appspot.com/148045/diff/1/6#newcode93
> src/com/google/caja/parser/js/scope/ScopeAnalyzer.java:93: protected
> abstract boolean fnCtorsDeclareInBody();
> Should this simply be a boolean parameter to the ScopeAnalyzer ctor?

It could be a parameter or it could be an overridable method.

Yeah. Making it a method allows you to give each analyzer class a name. Making it a boolean simplifies the machinery a little. Oh well. Ok as-is.

And now a whole bunch of listeners have to keep an identity hash map
of ParseTreeNodes to scopes.
But that would not work, because ParseTreeNodes might appear multiple
times in the same AST.