Determining when an Identifier is a variable reference

Francisco Tolmasky

Jan 19, 2015, 12:36:57 AM
to esp...@googlegroups.com
So due to the current class/interface structure of the parser API, it is kind of difficult (I think?) to determine what is a variable reference. For example, let's say we are trying to write a function that determines all the globals in a program. To do this we need to note all the references to variables, then remove the ones declared. For the first theoretically trivial task of just noting all variable references, I can't do something like:

enter: function(aNode) { if (aNode.type === "Identifier") { vars.push(aNode.name) } }

because Identifiers show up in other places, such as labels and (non-computed) property names. This is because Identifiers seem to have "multiple" inheritance paths: sometimes it's Identifier > Node, sometimes Identifier > Expression. As such, it is unclear when an Identifier is behaving like an Expression (in which case I believe it for sure counts as a variable reference), or whether it's behaving as something else (either a label or a property name such as the b in a.b). What's left is to comb the entire parser API and special-case every node: CallExpression's callee field expects an Expression, so treat Identifiers in the callee field as if they were Identifier > Expression; in MemberExpression, if computed === true, treat Identifiers in the property position as if they were Identifier > Expression, BUT if computed === false, treat Identifiers in the property position as if they were Identifier > Node; etc. etc. etc.
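A minimal sketch of the special-casing described above, assuming plain SpiderMonkey-style node objects. The names isReferenceCandidate, collectCandidates, and ident are made up for illustration, the AST is built by hand rather than produced by a parser, and only the label and property-name cases mentioned here are handled. Declarations (var names, function names, parameters) are still counted and would have to be removed in the separate "remove the ones declared" step.

```javascript
// Hypothetical helper: decide from the parent node (and the field the
// Identifier sits in) whether it can be a variable reference.
// Only labels and property names are excluded; this is not exhaustive.
function isReferenceCandidate(node, parent) {
  if (node.type !== "Identifier") return false;
  if (!parent) return true;
  switch (parent.type) {
    case "MemberExpression":
      // a.b: `b` is a property name. a[b]: `b` is a reference.
      return parent.object === node || parent.computed === true;
    case "Property":
      // { key: value }: a non-computed key is just a name.
      return parent.value === node || parent.computed === true;
    case "LabeledStatement":
    case "BreakStatement":
    case "ContinueStatement":
      return false; // label positions
    default:
      return true;
  }
}

// Generic walk over every child that looks like a node (has a string `type`).
function collectCandidates(node, parent, out) {
  if (isReferenceCandidate(node, parent)) out.push(node.name);
  for (const key of Object.keys(node)) {
    const value = node[key];
    for (const child of Array.isArray(value) ? value : [value]) {
      if (child && typeof child.type === "string") {
        collectCandidates(child, node, out);
      }
    }
  }
  return out;
}

const ident = (name) => ({ type: "Identifier", name });

// Hand-built AST for:  foo: a.b(c[d]);
const ast = {
  type: "LabeledStatement",
  label: ident("foo"),
  body: {
    type: "ExpressionStatement",
    expression: {
      type: "CallExpression",
      callee: { type: "MemberExpression", object: ident("a"), property: ident("b"), computed: false },
      arguments: [
        { type: "MemberExpression", object: ident("c"), property: ident("d"), computed: true },
      ],
    },
  },
};

const refs = collectCandidates(ast, null, []);
console.log(refs); // [ 'a', 'c', 'd' ]
```

The label foo and the property name b are skipped, but every new node type with a non-expression Identifier slot needs another case in the switch, which is exactly the maintenance problem.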

Perhaps (hopefully :) ) I am missing some obvious way to handle this that isn't as error-prone as noting what type(s) each field can contain, but I haven't thought of any specific way yet. This becomes more difficult, of course, with new syntax additions where there isn't a clear place to find these subtle definitions. Just hypothetically, if you had something like Identifier > Node, and IdentifierExpression > Expression { identifier: Identifier }, then regardless of what new syntax was introduced you'd know that IdentifierExpressions encountered anywhere are variable references (for example, in MemberExpression you could drop the computed bool, which seems to only exist because you can't tell if an Identifier property is an Expression or not. With the system I just outlined you'd either have an Identifier or an IdentifierExpression, no ambiguity).
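To make the hypothetical concrete, here is a rough sketch of what the same kind of traversal could look like with that split. The node shapes follow the Identifier / IdentifierExpression { identifier } idea above and are not taken from any existing parser; makeId, makeRef, and collectIdentifierExpressions are illustrative names, and MemberExpression is shown without a computed flag on the assumption that the property's node type carries that information.

```javascript
// Hypothetical node constructors for the proposed split.
const makeId = (name) => ({ type: "Identifier", name });
const makeRef = (name) => ({ type: "IdentifierExpression", identifier: makeId(name) });

// Hand-built AST for:  a.b(c[d])
// `b` is a bare Identifier (property name); `a`, `c`, `d` are references.
const hypotheticalAst = {
  type: "CallExpression",
  callee: { type: "MemberExpression", object: makeRef("a"), property: makeId("b") },
  arguments: [
    { type: "MemberExpression", object: makeRef("c"), property: makeRef("d") },
  ],
};

// No parent or field inspection needed: the node type alone is enough.
function collectIdentifierExpressions(node, out = []) {
  if (node.type === "IdentifierExpression") {
    out.push(node.identifier.name);
    return out;
  }
  for (const value of Object.values(node)) {
    for (const child of Array.isArray(value) ? value : [value]) {
      if (child && typeof child.type === "string") {
        collectIdentifierExpressions(child, out);
      }
    }
  }
  return out;
}

const references = collectIdentifierExpressions(hypotheticalAst);
console.log(references); // [ 'a', 'c', 'd' ]
```

New syntax that introduces more name-only positions would not require touching this function, since those positions would hold plain Identifiers.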


mfic...@shapesecurity.com

Jan 20, 2015, 3:12:42 PM
to esp...@googlegroups.com
Hey Francisco,

Many of us have run into these same problems on multiple occasions, and I have come to the conclusion that the SpiderMonkey AST format is not well suited for much besides interpretation (and not even great for that). Really, the only thing it has going for it is massive adoption. If you can live without the SpiderMonkey ecosystem for your project (and even if not, we have a converter), I recommend you try using the Shift AST and its associated tools. My team at Shape Security designed the Shift AST, optimising it for type-safe program transformation and reducing the number of invalid programs that may be represented. In fact, we did exactly what you suggest with IdentifierExpression. We've written up an incredibly detailed blog post that describes the differences between the SpiderMonkey and Shift formats, as well as the motivation behind each one. If, after all this, you'd still rather stick with the SpiderMonkey AST format, it seems like what you are doing may be able to make use of estools/escope. Hope that helps.

On a related topic, Shape is having an open house in our office in Mountain View on 29 January, and anyone who wants to come hang out and talk with us about SpiderMonkey or Shift tools is welcome to come. Just RSVP using that link and ask for me when you get here.

Michael Ficarra

Francisco Tolmasky

Jan 20, 2015, 9:29:18 PM
to esp...@googlegroups.com
Hi Michael,

This is great, I think I remember seeing this a few months back but could not remember where it lived. I'm actually using the AST for full ES6 stuff (even await, so I suppose even a little ES7 stuff), hence not being able to use escope (I was using it when we were only dealing with ES5, but escope doesn't support ES6, and due to the nature of the current AST API, as you know, adding these new features is *a task* vs. largely automatic with things like IdentifierExpression). From my limited perusal it seems Shift isn't 100% ES6-ready; do you guys have a target date for when it might be? Either way, I would love to get involved and help make that happen.

Thanks,

Francisco

Michael Ficarra

Jan 22, 2015, 6:09:42 PM
to Francisco Tolmasky
Moving esprima list to BCC, as this is getting off topic.

Progress on ES6 support is just starting. We currently have a feature-complete spec (still subject to change) as well as matching AST constructors. You can track ES6 support in the parser in #8, the ES6 tracking issue.

Michael
