I've added some variable related features to the treequery prototype. Those are:
- predicate treequery.is_binder(...)
- predicate treequery.is_bound_by(binder)
- function treequery.binder(occurrence)
Telling the difference between binders and occurrences
is_binder(...) can be used to determine whether a node is binder, as opposed to an occurrence (or a non-`Id{ } node). It works by looking at the node's parent.
Example
M> Q = require 'metalua.treequery'
M> ast = +{block:
no1(a);
local a;
yes1(a);
do
yes2(a);
local a = yes3(a);
no2(a);
end;
yes4(a);
local a;
no3(a) }
M> =ast -- AST view of the code block above
{ `Call{ `Id "no1", `Id "a" },
`Local{ { `Id "a" }, { } },
`Call{ `Id "yes1", `Id "a" },
`Do{ `Call{ `Id "yes2", `Id "a" },
`Local{ { `Id "a" }, `Call{ `Id "yes3", `Id "a" } } },
`Call{ `Id "no2", `Id "a" } },
`Call{ `Id "yes4", `Id "a" }, `Local{ { `Id "a" }, { } },
`Call{ `Id "no3", `Id "a" } }
M> Q(ast) :filter(Q.is_binder) :foreach(|_,parent| table.print(parent))
`Local{ { `Id "a" }, { } }
`Local{ { `Id "a" }, { `Call{ `Id "yes3", `Id "a" } } }
`Local{ { `Id "a" }, { } }
As expected, the nodes creating local vars are found, but those merely using them, here in function calls, are ignored. I chose to print parents rather than the nodes themselves: just getting a list of `Id "a" wouldn't have been very exciting.
You might notice that occurrences of the binder in the first "local a" statement are in functions called yesX(), whereas globals and occurrences of other locals are in functions called noX(). This will help checking that scoping tests below work correctly.
Filtering the occurrences of a given binder
is_bound_by(binder) returns a predicate, which in turns returns true if its argument is an occurrence of the binder `Id{ } passed as argument.
Example
M> b = Q(ast) :filter(Q.is_binder) :list() [1]
M> =Q(ast) :filter(Q.is_bound_by(b)) :foreach(|_,parent| table.print(parent))
`Call{ `Id "yes1", `Id "a" }
`Call{ `Id "yes2", `Id "a" }
`Call{ `Id "yes3", `Id "a" }
`Call{ `Id "yes4", `Id "a" }
M>
From now on, the first binder is called b. It will be reused in the next examples.
Retrieving the binder of a given occurrence
binder(occ) returns the binder associated with a given occurrence, or nil in other case (if the argument isn't a local variable)).
This function is tricky, because it is non-local and makes sense out of a query. It relies on a weak table, which keeps track of discovered occurrence/binder relationships until one of the nodes becomes collectible. It means that it won't work on a node which has never been through a query.
An alternative would be binder(occurrence, root_node), which would walk through root_node in o(n). It's conceptually more sound, but asymptotically longer. Also, it would allow to check for variable openness in a given, limited context (useful to move code around in a sound way, e.g. to implement "extract method" refactorings in a sound way). Finally, there's no risk to return garbage if ASTs are modified between queries. To sum up, if I follow my own advice given to David:
I wrote and self-quoted:
Finding is up to treequery, caching is up to the user.
This function is flimsy and untrustworthy because it relies on caching, and can't guarantee cache consistency. The more I think of it, the more I'm convinced I should implement the proper, slower version. It seems easy for users to cache stuff or treat them in parallel whenever needed.
Example
M> require 'metalua.ast_to_string'
M> Q(ast) :filter('Id') :filter(|_,id| id[1]=='a') :foreach(function(node,parent)
printf("parent: %s; binder: %s; is first? %s",
ast_to_string(parent),
table.tostring(Q.binder(node)),
Q.binder(node)==b and "yes" or "no")
end)
parent: no1 (a); binder: nil; is first? no
parent: local a; binder: nil; is first? no
parent: yes1 (a); binder: `Id "a"; is first? yes
parent: yes2 (a); binder: `Id "a"; is first? yes
parent: yes3 (a); binder: `Id "a"; is first? yes
parent: local a = yes3 (a); binder: nil; is first? no
parent: no2 (a); binder: `Id "a"; is first? no
parent: yes4 (a); binder: `Id "a"; is first? yes
parent: local a; binder: nil; is first? no
parent: no3 (a); binder: `Id "a"; is first? no
The 6th result might look wrong at first sight: in "local a=yes3(a)", the a parameter is an instance of the first "a" binder, so we would expect "is first? yes". But that's what the 5th result is about. the 6th result is about the left-hand-side a, the binder. Since it's a binder, it's obviously not bound by any other binder.
This exposes a peculiarity of the walker's traversal order: when walking through "local a=b", b will be visited before a. That's because this order is compliant both with sane scoping rules, and with the execution order: first b is computed, then it is stored into a, which is added to the scope.
Next steps
- rewriting treequery.binder() in a sound way.
- checking that the sound way is an acceptable substitute to the current one.
- implement a conditional :map() functor (more about this in a later post).
- testing those scope-aware services more thoroughly, this is something easy to get subtly wrong.