ClojureScript analyzer overview document

342 views
Skip to first unread message

Jonas Enlund

unread,
Apr 26, 2012, 4:19:35 PM4/26/12
to cloju...@googlegroups.com
Hi all!

I've been studying the ClojureScript analyzer the last couple of days and I think I've found a few small issues with it (especially with what is put into the :children vector). I've created some bug reports on JIRA (CLJS 195, 205 and 206) and there might be more coming if people think I'm on the right track. To aid my thinking I've created an analyzer overview document[1] (with google docs, as I can't create new pages on dev.clojure.org). The document is still a work in progress and I would appreciate comments and additions to the document. Hopefully it won't be spammed to death!

Jonas



David Nolen

unread,
Apr 26, 2012, 4:20:37 PM4/26/12
to cloju...@googlegroups.com
This is excellent! Thanks for doing this. Will take a closer look soon.



--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To view this discussion on the web visit https://groups.google.com/d/msg/clojure-dev/-/gU2FEfz1y20J.
To post to this group, send email to cloju...@googlegroups.com.
To unsubscribe from this group, send email to clojure-dev...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/clojure-dev?hl=en.

Aaron Cohen

unread,
Apr 26, 2012, 4:25:34 PM4/26/12
to cloju...@googlegroups.com
I want to mention that :children is pretty inconvenient if you intend to do any AST transformations.

Your choice is either to ignore :children and let your transformed child elements get out of sync with the elements in :children or to try to keep them in sync and do a lot of duplicate work.

In CinC, I chose to remove :children from the AST entirely, and instead provide a children multi-method.

--Aaron

PS. CinC is still coming along, I haven't abandoned it. letfn is my last remaining special form to implement, and its something of a bear.

David Nolen

unread,
Apr 27, 2012, 11:32:47 AM4/27/12
to cloju...@googlegroups.com
On Thu, Apr 26, 2012 at 4:25 PM, Aaron Cohen <aa...@assonance.org> wrote:
I want to mention that :children is pretty inconvenient if you intend to do any AST transformations.

What kinds of transformations are you finding useful? 

Aaron Cohen

unread,
Apr 27, 2012, 2:34:30 PM4/27/12
to cloju...@googlegroups.com
In CinC, I chose to do all the logic that the java clojure compiler currently intertwines with emitting code in a separate pass over the AST.

I don't do any structural transformations (yet... I have some grandiose ideas about some optimization opportunities that would involve structural transforms), but I add quite a few annotations onto the tree; currently:
- type information
- referenced vars (so that I can calculate what's closed over and create fields for them in the generated java classes)
- constants
- whether or not boxing is wanted/possible for the current expression 
- transforms symbols that refer to classes into Class constants
- track protocol callsites, again so that I know that I need to emit fields in the enclosing class

There actually is a version of CinC in my git history that just duplicated all these calculations in both the actual child nodes and in the :children nodes. I eventually decided I hated that enough to rip it out.

--Aaron

Chris Granger

unread,
Apr 27, 2012, 2:37:16 PM4/27/12
to cloju...@googlegroups.com
The children thing *is* convenient, but I'm with Aaron, I've found it kind of annoying. If we standardize on a standalone analyzer, the idea of a multi-method for children seems reasonable.

Cheers,
Chris.

--

Ambrose Bonnaire-Sergeant

unread,
Apr 27, 2012, 10:13:56 PM4/27/12
to cloju...@googlegroups.com
On Sat, Apr 28, 2012 at 2:34 AM, Aaron Cohen <aa...@assonance.org> wrote:
On Fri, Apr 27, 2012 at 11:32 AM, David Nolen <dnolen...@gmail.com> wrote:
On Thu, Apr 26, 2012 at 4:25 PM, Aaron Cohen <aa...@assonance.org> wrote:
I want to mention that :children is pretty inconvenient if you intend to do any AST transformations.

What kinds of transformations are you finding useful? 

In CinC, I chose to do all the logic that the java clojure compiler currently intertwines with emitting code in a separate pass over the AST.

Have you been successful running clojure.core through the (Java) analyzer? I think the intertwining of emitting code causes class redefinition errors.

Maybe you can shed more light on whether it is possible.

Thanks,
Ambrose

Jonas Enlund

unread,
Apr 28, 2012, 12:52:37 PM4/28/12
to cloju...@googlegroups.com
The docstring for analyze[1] states that

    "Given an environment, [...], and form, returns an expression
    object (a map containing at least :form, :op and :env keys)"

I've noticed that the :form key is missing for several expression objects. In particular :fn, :do, :recur, :new, :set, :ns, :deftype*, :defrecord*, :dot, :js and :invoke. Should I create an JIRA issue for this (with an accompanying patch)?

David Nolen

unread,
Apr 28, 2012, 1:12:01 PM4/28/12
to cloju...@googlegroups.com
Yes.

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To view this discussion on the web visit https://groups.google.com/d/msg/clojure-dev/-/mtNZTAtet6kJ.

Chris Granger

unread,
Apr 28, 2012, 1:54:40 PM4/28/12
to cloju...@googlegroups.com
Is this a good time to start talking about pulling the analyzer out of the compiler into its own lib? With the proliferation of Clojure implementations, I'm worried we're going to end up wasting a lot of effort on one-offs.

Cheers,
Chris.

David Nolen

unread,
Apr 28, 2012, 2:19:34 PM4/28/12
to cloju...@googlegroups.com
The ClojureScript jar is 90k. I don't see any reason to break it out into its own lib just yet :)

David

David Nolen

unread,
Apr 28, 2012, 2:44:39 PM4/28/12
to cloju...@googlegroups.com
I personally see no problem with replacing the explicit :children entry with a children multimethod. Seems like a simplification all around especially for people want to do transformations.

Other opinions?

David

On Thu, Apr 26, 2012 at 4:25 PM, Aaron Cohen <aa...@assonance.org> wrote:

Rich Hickey

unread,
Apr 30, 2012, 6:49:36 AM4/30/12
to cloju...@googlegroups.com
I'd like to see a better explanation of the problem.

This would be a move away from data and towards API, one that should be taken with great reluctance.

Rich

David Nolen

unread,
Apr 30, 2012, 9:08:43 AM4/30/12
to cloju...@googlegroups.com
Have no opinions about this myself since the ClojureScript compiler doesn't consume this information at all at the moment.

Since others consuming the analyzer output seem to find it troublesome please speak up.

David

Jonas Enlund

unread,
Apr 30, 2012, 9:17:57 AM4/30/12
to cloju...@googlegroups.com


On Monday, April 30, 2012 1:49:36 PM UTC+3, Rich Hickey wrote:
I'd like to see a better explanation of the problem.


If I understand correctly, only exprssion objects (i.e., maps containing :op, :env, :form etc) should be in the :children vector. Is this correct? At least the analyze docstring suggests it:

     "... If expr has any (immediately) nested exprs, must have :children [exprs...] entry. ..."

If the above assumption is correct then what was in the :children vectors was somewhat inconsistent[1]:

* contents of :env is one of the "children" for the :dot expression object #CLJS-205
* :fn has no :children key #CLJS-205
* The :children for :let has an extra set of brackets #CLJS-195
* Children for :try* contains a {:name mname} entry. In addition the children contains the blocks of :try :catch :finally, and not the expressions themselves
* :recur has no :children

The reason for the multimethod approach is laid out in the second paragraph of CLJS-214[2]. If that is not the correct approach then we should revert those changes. I'm more interested in getting the points above corrected.

Jonas



Aaron Cohen

unread,
Apr 30, 2012, 10:15:47 AM4/30/12
to cloju...@googlegroups.com
The explanation on ticket CLJS-214 describes my problem well.

A somewhat simple example would be in setting the boxing state of the if statement, combined with gathering the constants. For something like:
(if true 1 2)

1) Assume that the if statement needs to be boxed.

2) The set-box function needs to use the knowledge that the "if" is boxed and propagate that to its children. The result is that :then and :else need to be boxed, but :test does not. There is no way to do this using :children, you don't know which values are which.

As a result you get {:op :if :box true :test {:box false ...} :then {:box true...} :else {:box true ...} :children [old-test-without-box old-then-without-box old-else-without-box]}

3) Let's now try to compute the constants that are necessary to emit this if statement
The actual child nodes have a boxed long, so they will calculate that I need 2 constants: {:type java.lang.Long :value 1} and {:type java.lang.Long :value 2}
However, in the :children element, we still have the old values without :box, so there are 2 more constants {:type long :value 1} and {:type long :value 2}

-----------------

An alternative to the multimethod approach that might retain the "data" aspect instead of creating an API would be to change :children to only contain the names of the keys of the child elements, rather than a copy of the values. This avoids the "going out of sync problem", since that means :children is an indirect reference to the actual child nodes. I actually think this is more work for the consumer than the multimethod approach, but it'd be possible to write a general "children" function in terms of the :children element.

For example:

{:op :if :test test-expr :then then-expr :else else-expr :children [:test :then :else]}

--Aaron

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To view this discussion on the web visit https://groups.google.com/d/msg/clojure-dev/-/7EOsvZiWeM0J.

Aaron Cohen

unread,
Apr 30, 2012, 10:18:40 AM4/30/12
to cloju...@googlegroups.com
Hi Ambrose,

   I haven't actually tried just using the analyzer on core yet, because I haven't finished implementing all the special forms.

   I did run into one somewhat hairy ordering issue in that I can't compute the type of forms that create a class (such as deftype) until they've actually been emitted. I can compute their _name_, however, and that's been enough so far.

David Nolen

unread,
Apr 30, 2012, 12:43:21 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 10:15 AM, Aaron Cohen <aa...@assonance.org> wrote:
An alternative to the multimethod approach that might retain the "data" aspect instead of creating an API would be to change :children to only contain the names of the keys of the child elements, rather than a copy of the values. This avoids the "going out of sync problem", since that means :children is an indirect reference to the actual child nodes. I actually think this is more work for the consumer than the multimethod approach, but it'd be possible to write a general "children" function in terms of the :children element.

For example:

{:op :if :test test-expr :then then-expr :else else-expr :children [:test :then :else]}

--Aaron

Is ignoring :children for your purposes a problem?

David  

David Nolen

unread,
Apr 30, 2012, 12:47:05 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 9:17 AM, Jonas Enlund <jonas....@gmail.com> wrote:


On Monday, April 30, 2012 1:49:36 PM UTC+3, Rich Hickey wrote:
I'd like to see a better explanation of the problem.


If I understand correctly, only exprssion objects (i.e., maps containing :op, :env, :form etc) should be in the :children vector. Is this correct? At least the analyze docstring suggests it:

     "... If expr has any (immediately) nested exprs, must have :children [exprs...] entry. ..."

If the above assumption is correct then what was in the :children vectors was somewhat inconsistent[1]:

* contents of :env is one of the "children" for the :dot expression object #CLJS-205
* :fn has no :children key #CLJS-205
* The :children for :let has an extra set of brackets #CLJS-195
* Children for :try* contains a {:name mname} entry. In addition the children contains the blocks of :try :catch :finally, and not the expressions themselves
* :recur has no :children

The reason for the multimethod approach is laid out in the second paragraph of CLJS-214[2]. If that is not the correct approach then we should revert those changes. I'm more interested in getting the points above corrected.

Jonas

Your changes seem sound to me and I don't actually see the problem with ignoring :children for people who are doing more sophisticated transformations.

Lets bring the :children back and drop the the multimethod.

David 

Aaron Cohen

unread,
Apr 30, 2012, 12:51:32 PM4/30/12
to cloju...@googlegroups.com
What's the point of keeping an element that the primary project doesn't use, and the library users can't use? 

David Nolen

unread,
Apr 30, 2012, 12:52:42 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 12:51 PM, Aaron Cohen <aa...@assonance.org> wrote:
What's the point of keeping an element that the primary project doesn't use, and the library users can't use? 

Just because you're ignoring it doesn't mean other tools will.

David 

Chris Granger

unread,
Apr 30, 2012, 12:58:08 PM4/30/12
to cloju...@googlegroups.com
I'm not sure I understand either. It duplicates parts of the tree and in my experiences working with it, it just confuses things. I would want to work with the tree much more in the way that the compiler itself does and that's through knowledge of how each node is constructed. It seems like the multi-method is a great way of capturing that.

Cheers,
Chris.

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.

David Nolen

unread,
Apr 30, 2012, 1:14:02 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 12:58 PM, Chris Granger <ibd...@gmail.com> wrote:
I'm not sure I understand either. It duplicates parts of the tree and in my experiences working with it, it just confuses things. I would want to work with the tree much more in the way that the compiler itself does and that's through knowledge of how each node is constructed. It seems like the multi-method is a great way of capturing that.

Cheers,
Chris.

I can imagine a tool using core.unify or core.logic or core.match. children as multimethod doesn't really simplify anything for that case.

David 

Aaron Cohen

unread,
Apr 30, 2012, 1:15:20 PM4/30/12
to cloju...@googlegroups.com
I think it's possible to come up with a solution that works for everyone.

What about my idea about putting the child keys in :children rather than duplicating the values?

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.

David Nolen

unread,
Apr 30, 2012, 1:19:03 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 1:15 PM, Aaron Cohen <aa...@assonance.org> wrote:
I think it's possible to come up with a solution that works for everyone.

What about my idea about putting the child keys in :children rather than duplicating the values?

Seems like that just forces everyone else to do more processing, right? If you ignore :children that's no more or less work for you.

David 

Cosmin Stejerean

unread,
Apr 30, 2012, 1:25:30 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 9:52 AM, David Nolen <dnolen...@gmail.com> wrote:
> Just because you're ignoring it doesn't mean other tools will.

Is anyone currently using :children? Keeping it around just in case
some might, or guessing about how other tools might want to use that
information doesn't seem right.

--
Cosmin Stejerean
http://offbytwo.com

Aaron Cohen

unread,
Apr 30, 2012, 1:26:45 PM4/30/12
to cloju...@googlegroups.com
I'm sorry, I disagree. You're fighting to defend an ugly misfeature in my opinion.

How does it make sense to have the data that all users are going to need, but present it in a form that some subset of users will be: 1) forced to ignore, and then 2) duplicate that same information in a form that works for them.

Whatever floats your boat though I guess.

--Aaron

David Nolen

unread,
Apr 30, 2012, 1:33:44 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 1:25 PM, Cosmin Stejerean <cos...@offbytwo.com> wrote:
On Mon, Apr 30, 2012 at 9:52 AM, David Nolen <dnolen...@gmail.com> wrote:
> Just because you're ignoring it doesn't mean other tools will.

Is anyone currently using :children?  Keeping it around just in case
some might, or guessing about how other tools might want to use that
information doesn't seem right.

Rich included it probably after considerable amounts of thought. He piped up when he saw we were converting it to a multimethod ignoring whatever thought he'd put into :)

David

David Nolen

unread,
Apr 30, 2012, 1:39:14 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 1:26 PM, Aaron Cohen <aa...@assonance.org> wrote:
I'm sorry, I disagree. You're fighting to defend an ugly misfeature in my opinion.

How does it make sense to have the data that all users are going to need, but present it in a form that some subset of users will be: 1) forced to ignore, and then 2) duplicate that same information in a form that works for them.

Whatever floats your boat though I guess.

--Aaron

It complicates destructuring, pattern matching, and unification. Those all seem like not only pretty useful, but tried and true ways of dealing with an AST. Do you have a reasoned argument against that?

David 

Chris Granger

unread,
Apr 30, 2012, 1:50:06 PM4/30/12
to cloju...@googlegroups.com
It makes me sad to suggest this, but what about a flag for including or not?

Cheers,
Chris.

--

Chris Granger

unread,
Apr 30, 2012, 1:51:14 PM4/30/12
to cloju...@googlegroups.com
Rich, can you shed some light on those insights?

Cheers,
Chris.

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.

Aaron Cohen

unread,
Apr 30, 2012, 2:08:05 PM4/30/12
to cloju...@googlegroups.com
What is "It" here, my proposed :children [keys] idea?

Here is what the current scheme buys you. You can't go any further than "children" using destructuring because the order is undefined.

(let [{:keys [env children]} node)]
   ;; ... do stuff with children
)

--

Doing the same with a multimethod:

(let [{:keys [env]} node
       children (children node)]
   ... do stuff with children
)

An extra line, but it works for all users.

Rich suggests that it's a problem or at least something to think about that this replaces "data" into an API. To be honest, this is a very esoteric argument to me. Is it a worthy goal to preserve "data" that only some users are going to be able to use? I honestly think that utility trumps a philosophical argument here.

I think it'd be a shame to preserve a purity of ideal by keeping "data" here, and ignore the very real cost that anyone who does anything non-readonly to that data is going to find it unusable.

Is wanting to "walk" the AST and change nodes really that uncommon of a use case that we want to ignore it? We provide a namespace in clojure.walk for walking random data structures, why is an AST treated any differently?

--

Doing the same with :children that contains keys instead of copying values.

(defn children [node]
   (vals (select-keys node (:children node))))

(let [{:keys [env]} node
       children (children node)]
   ... do stuff with children
)

If we provide children, this is equivalent to the multimethod, but it preserves some of the idea of "data", I think.

--

I don't know what you mean by using pattern matching or unification without an example. I don't think either are core features of the clojure language, so I don't know why we're catering to them. However, if we can make them work I'd be happy to think about how.

--Aaron

Aaron Cohen

unread,
Apr 30, 2012, 2:09:31 PM4/30/12
to cloju...@googlegroups.com
Whether I ignore it or not isn't a problem to me.

It stinks that if "ignoring :children" is the accepted solution, I have to maintain my own duplicated knowledge about the children of every type of node outside of the canonical analyzer though.

--Aaron

David Nolen

unread,
Apr 30, 2012, 3:13:10 PM4/30/12
to cloju...@googlegroups.com

David Nolen

unread,
Apr 30, 2012, 3:20:19 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 2:08 PM, Aaron Cohen <aa...@assonance.org> wrote:
I don't know what you mean by using pattern matching or unification without an example. I don't think either are core features of the clojure language, so I don't know why we're catering to them. However, if we can make them work I'd be happy to think about how.

--Aaron

Whether pattern matching or unification are core features of Clojure doesn't have much to do with their utility when writing analyzers of the AST. For example you can easily write declarative queries with core.logic against the AST - the order of children doesn't even matter, you're just searching for :invoke expressions with a matching name.

Analysis was a big part of Rich's talk at the last Conj. Your desire for simple AST transformation is just one (important) use case.

David

Aaron Cohen

unread,
Apr 30, 2012, 3:26:44 PM4/30/12
to cloju...@googlegroups.com
Is it impossible for core.logic to say (children node) rather than (:children node)?

--Aaron 

Jonas Enlund

unread,
Apr 30, 2012, 3:41:16 PM4/30/12
to cloju...@googlegroups.com
The patch http://dev.clojure.org/jira/browse/CLJS-218 removes the multimethod and re-adds the children as a key on the AST. The patch includes fixes for the points I raised in my previous message.

Jonas

David Nolen

unread,
Apr 30, 2012, 3:42:34 PM4/30/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 3:26 PM, Aaron Cohen <aa...@assonance.org> wrote:
Is it impossible for core.logic to say (children node) rather than (:children node)?

--Aaron 

Certainly not. But it's not particularly declarative. I just see no reason to *prioritize* functional transformation of the AST over declarative queries over the AST. 

David 

Aaron Cohen

unread,
Apr 30, 2012, 4:11:03 PM4/30/12
to cloju...@googlegroups.com
If one solution works for both, and the other solution only works for 1, I really don't see what the argument is.

--Aaron

David Nolen

unread,
Apr 30, 2012, 4:17:32 PM4/30/12
to cloju...@googlegroups.com
Having :children works for both. You're not querying you're transforming so that information is not relevant for your use case. For people who want to query all the necessary information is already present and no more work needs to be done.
--

Aaron Cohen

unread,
Apr 30, 2012, 4:19:52 PM4/30/12
to cloju...@googlegroups.com
No, with :children I have to duplicate the information about which attributes are children is if I want to walk the tree. That falls out of date, can be incorrect. It's a clearly inferior solution.

I'm pretty sure I've expressed myself as clearly as possible here, and it seems pretty clear you've made up your mind, so I'm done with this.

Talk you next time, hopefully more fruitfully,

--Aaron

David Nolen

unread,
Apr 30, 2012, 4:29:51 PM4/30/12
to cloju...@googlegroups.com
Oh right. What about :children-keys?

Aaron Cohen

unread,
Apr 30, 2012, 4:47:39 PM4/30/12
to cloju...@googlegroups.com
You mean have both :children and :children-keys?

I suppose that would work. The duplication still makes me want to poke my eyes out, but I will work on ignoring :children both in the code and mentally.

--Aaron

David Nolen

unread,
Apr 30, 2012, 6:00:17 PM4/30/12
to cloju...@googlegroups.com
Hmm how about this compromise- an add-children multimethod?

(add-children ast)

People who don't plan on transforming the AST can just interact with data - and those who plan on transforming the AST have a standard way to keep nodes up to date.

David

Rich Hickey

unread,
May 2, 2012, 7:08:47 AM5/2/12
to cloju...@googlegroups.com
:children is admittedly half-baked, but remember all things that are baked were once half-baked :)

Here are the objectives:

1) The AST as data. Full stop. That is part of the CLJS experiment. We've seen enough code- and API-based expression tree libraries.

2) Supporting generic code walkers. Code walking has traditionally been a difficult and brittle thing, as everyone encodes the primitives. Thus, adding a new primitive becomes impossible. Plus, that encoded knowledge itself is quite complex and duplicated between code walkers. Given that the AST data is a mix of things that would need to be walked and other details, there needs to be a mechanism to distinguish them. While a multimethod is one way to do make it an open system, it means that to manipulate your new AST node I don't just need the AST, I need your code. What about code walkers and analyzers etc as a service?

So, the problems I see with what we currently have vs these objectives are:

a) Inconsistency in the contents and use of :children.
b) Duplication

I wonder somewhat about (b). Given they are just values, in memory at least, they are shared. But yes, when printed or otherwise reified it's a bear.

:children as a vector of keys seems promising. Next steps in that direction would be to propose a standard way to do that, determine what that would imply for each construct we currently have, and assess the sufficiency for generic code walking. Along the way, consider what other semantic clues the AST might need to provide as data. Only then, implement for what we have.

p.s. to All

Just because you have a question doesn't mean the world needs to stop to answer it right away. Nor, should a discussion not produce an immediate result, was it a failure. Make your points and let it rest. Sleep on it. Please be patient with each other and the design process.

Rich

Aaron Cohen

unread,
May 2, 2012, 12:52:25 PM5/2/12
to cloju...@googlegroups.com
On Mon, Apr 30, 2012 at 6:00 PM, David Nolen <dnolen...@gmail.com> wrote:
Hmm how about this compromise- an add-children multimethod?

(add-children ast)

People who don't plan on transforming the AST can just interact with data - and those who plan on transforming the AST have a standard way to keep nodes up to date.

Let me see if I understand:

Here's a simple example that adds :box true to a child element.

(defn transform [ast]
   (doto ast
        (update-in [:test] (assoc :box true))))

You're saying it would change to

(defn transform [ast]
    (let [result (doto ast
                         (update-in [:test] (assoc :box true)))]
         (add-children result)))

I'd like something like this to be possible ((comp add-children transform1 transform2 transform3) ast), but thinking about it, I'm not sure that it is. I think each intermediate transformation has to call add-children on its own. The interactions get a little complex to think about, but if I do a prewalk into the :children, and have modifications in a postwalk step that alter the actual children nodes, calling add-children at that point will lose some intermediate alterations, I think. I'm not honestly sure if that's a problem, I'm just thinking out loud.

If I understand correctly, I think this could probably be made workable. Is it better than :child-keys though?

Doesn't this also tread close to being an API rather than manipulating data?

--Aaron


David Nolen

unread,
May 2, 2012, 1:11:02 PM5/2/12
to cloju...@googlegroups.com
Going on Rich's feedback I think pursuing the :children as keys approach is best. Clearly just having the keys there isn't sufficient. But I think this approach addresses problems a) *and* b), we get a more human-readable AST as a result.

So what should :children look if it becomes an index of sorts?

David

Aaron Cohen

unread,
May 2, 2012, 2:10:06 PM5/2/12
to cloju...@googlegroups.com
On Wed, May 2, 2012 at 1:11 PM, David Nolen <dnolen...@gmail.com> wrote:
Going on Rich's feedback I think pursuing the :children as keys approach is best. Clearly just having the keys there isn't sufficient. But I think this approach addresses problems a) *and* b), we get a more human-readable AST as a result.

So what should :children look if it becomes an index of sorts?

Yet another possibility: we could just segregate the child nodes into :children as a map.

For instance: {:op :if :env {env} :children {:test test-expr :then then-expr :else else-expr}} 

Advantages:
  No duplication
  Ancillary data is cleanly separated from the AST structure
  Walking is straightforward

Disadvantages:
  ???

--Aaron

Ambrose Bonnaire-Sergeant

unread,
May 2, 2012, 2:17:02 PM5/2/12
to cloju...@googlegroups.com
Does the order of :children matter?

Ambrose

Aaron Cohen

unread,
May 2, 2012, 2:18:10 PM5/2/12
to cloju...@googlegroups.com
On Wed, May 2, 2012 at 2:10 PM, Aaron Cohen <aa...@assonance.org> wrote:
On Wed, May 2, 2012 at 1:11 PM, David Nolen <dnolen...@gmail.com> wrote:

Disadvantages:
  ???
 
Destructuring becomes more complicated:
   (emit :if [{:keys [env test then else]}]) becomes
   (emit :if [{{:keys [test then else]} :children env :env}]). 

At that point I'd probably write it as a let in the function:

   (emit :if [{:keys [env children]}] (let [{:keys [test then else]} children] ...)


David Nolen

unread,
May 2, 2012, 2:20:20 PM5/2/12
to cloju...@googlegroups.com
What about forms like let where :children is a bit complex?

:children (into (vec (map :init bes))
                     (conj (vec statements) ret)) 

David

Aaron Cohen

unread,
May 2, 2012, 2:29:38 PM5/2/12
to cloju...@googlegroups.com
I think that should become: {:op :let :children {:init [inits] :statements [statements] :ret ret}

We might want to discuss whether the inits are truly a child here, at the moment I think they are though.

--Aaron

David Nolen

unread,
May 2, 2012, 2:39:27 PM5/2/12
to cloju...@googlegroups.com
On Wed, May 2, 2012 at 2:29 PM, Aaron Cohen <aa...@assonance.org> wrote:
On Wed, May 2, 2012 at 2:20 PM, David Nolen <dnolen...@gmail.com> wrote:
On Wed, May 2, 2012 at 2:10 PM, Aaron Cohen <aa...@assonance.org> wrote:
On Wed, May 2, 2012 at 1:11 PM, David Nolen <dnolen...@gmail.com> wrote:

What about forms like let where :children is a bit complex?

:children (into (vec (map :init bes))
                     (conj (vec statements) ret)) 


I think that should become: {:op :let :children {:init [inits] :statements [statements] :ret ret}

So if a :children value entry is a vector it's something you need to iterate over? What about :ret? So you'll always need to look at :children if you want to know :ret?

David

Aaron Cohen

unread,
May 2, 2012, 2:58:27 PM5/2/12
to cloju...@googlegroups.com
On Wed, May 2, 2012 at 2:39 PM, David Nolen <dnolen...@gmail.com> wrote:
On Wed, May 2, 2012 at 2:29 PM, Aaron Cohen <aa...@assonance.org> wrote:
On Wed, May 2, 2012 at 2:20 PM, David Nolen <dnolen...@gmail.com> wrote:
On Wed, May 2, 2012 at 2:10 PM, Aaron Cohen <aa...@assonance.org> wrote:
On Wed, May 2, 2012 at 1:11 PM, David Nolen <dnolen...@gmail.com> wrote:

What about forms like let where :children is a bit complex?

:children (into (vec (map :init bes))
                     (conj (vec statements) ret)) 


I think that should become: {:op :let :children {:init [inits] :statements [statements] :ret ret}

So if a :children value entry is a vector it's something you need to iterate over?
 
Yes, if :children entry is a seq, it must be a seq of nodes. For :let, I forgot that this is something else I changed in CinC. I made bindings an :op.

So it should actually look like:
{:op :let :children {:inits [{:op :binding :name x :init init-expr} {:op :binding :name y :init init-expr}] :statements [statements] :ret ret}}

What about :ret? So you'll always need to look at :children if you want to know :ret?

Yep, that's right. Tree traversal would _always_ go through :children, rather than having this separation of "syntax-aware walkers go through the nodes that they know the names of" and "generic walkers go through :children"

--Aaron

David Nolen

unread,
May 2, 2012, 3:06:41 PM5/2/12
to cloju...@googlegroups.com
On Wed, May 2, 2012 at 2:58 PM, Aaron Cohen <aa...@assonance.org> wrote:

So it should actually look like:
{:op :let :children {:inits [{:op :binding :name x :init init-expr} {:op :binding :name y :init init-expr}] :statements [statements] :ret ret}}

Which would become: 
 
{:op :let 
 :children {:inits [{:op :binding :name x :children {:init init-expr}} 
                    ...] 
            :statements [{...} ...] 
            :ret ret}}
 
Right? Perhaps we should start a Confluence page outlining this direction and its implications for the CLJS compiler and other consumers of the AST.

David

Aaron Cohen

unread,
May 2, 2012, 3:25:08 PM5/2/12
to cloju...@googlegroups.com
Yep, you caught me. :)

 I won't have a chance to create a Confluence page this afternoon, but probably not a bad idea to sleep on it anyway unless you get to it first.

--Aaron

Brandon Bloom

unread,
May 2, 2012, 3:40:23 PM5/2/12
to cloju...@googlegroups.com
2) Supporting generic code walkers. Code walking has traditionally been a difficult and brittle thing, as everyone encodes the primitives. Thus, adding a new primitive becomes impossible. Plus, that encoded knowledge itself is quite complex and duplicated between code walkers. Given that the AST data is a mix of things that would need to be walked and other details, there needs to be a mechanism to distinguish them.

Wouldn't any new primitive potentially invalidate any assumptions made by any code walker? New primitives could subtly violate some invariant that a code walker expected. It's not as if having a generic way to walk the AST would suddenly insulate any code walker that naively cared about a subset of primitives.

I'm trying to imagine some use cases, so that the design of this :children thing can be better informed.

I could, for example, imagine a code walker which only cared about tracking down method invokations. It could produce a call graph by walking :children and looking for (= :op :invoke). I'm still not sure if such a walker would be at all useful if some new primitive went unhandled, but maybe that's because I don't have any new primitives in mind.

Aaron Cohen

unread,
May 2, 2012, 3:50:17 PM5/2/12
to cloju...@googlegroups.com
On Wed, May 2, 2012 at 3:40 PM, Brandon Bloom <snpr...@gmail.com> wrote:
2) Supporting generic code walkers. Code walking has traditionally been a difficult and brittle thing, as everyone encodes the primitives. Thus, adding a new primitive becomes impossible. Plus, that encoded knowledge itself is quite complex and duplicated between code walkers. Given that the AST data is a mix of things that would need to be walked and other details, there needs to be a mechanism to distinguish them.

Wouldn't any new primitive potentially invalidate any assumptions made by any code walker? New primitives could subtly violate some invariant that a code walker expected. It's not as if having a generic way to walk the AST would suddenly insulate any code walker that naively cared about a subset of primitives.


There are definitely code walkers that will break with new primitives, but I think there are just as likely to be ones that are fine in the presence of primitives they don't know how to process, as long as they can still traverse them. A clojure lint-er comes to mind, particularly. There's plenty of value in still being able to do code cleanliness checks even if there are some structures the lint-er doesn't understand.

Brandon Bloom

unread,
May 31, 2012, 6:04:13 PM5/31/12
to cloju...@googlegroups.com
:children as a vector of keys seems promising.

I gave a little bit of thought to this and rooted around in the code to see what issues may arise. 

First, let's define:

(defn children [ast]
  (mapcat #(if (coll? %) % [%]) ((apply juxt (:children ast)) ast)))

Given that, most forms have an obvious :children as keys representation. For example, :if forms have children [:test :then :else].

Beyond the simple cases, there are two categories of oddities to deal with:
  1. block-children
  2. interleaved forms
block-children follows around analyze-block in all the forms where it is used. Currently, there are also two or three additional places that can be refactored to call block-children. The result can simply be augmented with the constant :children expression [:statements :ret] inside analyze-block. The one remaining caveat is that this creates nodes in the :children tree which do not have :env, :form, or :op. The (parse 'do) method creates :op :do, with a (list 'do ...) form, but analyze-block doesn't do that automatically. At first blush, we could collapse parsing of 'do with analyze-block, such that any block form would produce an ast node with (= op :do). That seems more consistent to me and would change the children definition I gave above to something like: (defn children [{:keys [op ast] :as ast}] (if (= op :do) (:statements ast) ((apply juxt (:children ast)) ast)))

interleaved forms, however, I'm not sure how to deal with. These occur when establishing bindings or when constructing hash tables. analyze-map produces :keys and :vals, but :children interleaves them. Since order of children matter, I'm not sure what the ast node should look like to ensure an easy lookup from the :children index. My only thought is that :keys and :vals could be replaced by :bindings [key0 val0 .. keyN valN], but that seems less useful. One more thing to note here is that ClojureScript is currently bugged in (emit :map). The ordering of bindings is not interleaved. I filed CLJS-288

Brandon Bloom

unread,
May 31, 2012, 6:52:03 PM5/31/12
to cloju...@googlegroups.com
I cooked up a patch which implements this approach: http://dev.clojure.org/jira/browse/CLJS-289
Reply all
Reply to author
Forward
0 new messages