Proposal: Comments in AST

Steve Morin

Sep 16, 2018, 12:59:14 PM

to elixir-l...@googlegroups.com

Wanted to see what people's thoughts are about adding comments to the AST.

One use case is being able to parse and manipulate source files outside of just macros, but in that case you would want to be able to preserve comments.

E.g. read a source file, manipulate it, and write it back to a file.

This would allow people/tooling to use Elixir to parse and manipulate source files.

If this might disturb existing consumers of the AST, maybe it could be added as an option to existing Macro functions so that it could be turned on for people with this use case.

Louis Pilfold

Sep 16, 2018, 1:21:40 PM

to elixir-l...@googlegroups.com

Hi Steve

The formatter extracts comments from Elixir source like so: https://github.com/elixir-lang/elixir/blob/9cf118bd123deb945be71bf3ea2b48cd271088f0/lib/elixir/lib/code/formatter.ex#L199-L225

This may help you.

Cheers,

Louis

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/955237D8-31DE-448C-9777-1104F1175DC7%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

José Valim

Sep 16, 2018, 1:35:50 PM

to elixir-l...@googlegroups.com

This is really, really hard to do because it would overcomplicate the language grammar, as it would have to take comments into account at every step. I am not even sure if it is possible. If someone wants to try it out though. :)

That's why most tools end up handling comments on the side, separately from parsing.

José Valim

www.plataformatec.com.br

Skype: jv.ptec

Founder and Director of R&D


OvermindDL1

Sep 17, 2018, 1:28:11 PM

to elixir-lang-core

Comments are part of the syntax, however, and should be parsed as such. Personally I'd opt for an OCaml-style handling of comments, where the comments get turned into a meta attribute on the related tag. For Elixir this could be as simple as wrapping the related part in something like `{:__block__, [comment: "...", line: ..., ...], [original]}`, for any number of lines' worth of comments. With this the comments get directly 'assigned' to something (in OCaml it's the next following expression), which might have meaning to some tools, say documentation or some tests.
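A minimal sketch of what that wrapping could look like, assuming a `:comment` metadata key (hypothetical; the compiler defines no such key). The wrapped literal still evaluates to the same value, and a tool can read the comment back out of the metadata:

```elixir
# Hypothetical shape: a literal wrapped in a single-child :__block__ node
# whose metadata carries the comment that preceded it in the source.
wrapped = {:__block__, [comment: "the answer", line: 2], [42]}

# The wrapper does not change what the code evaluates to.
{value, _binding} = Code.eval_quoted(wrapped)
IO.inspect(value)

# A tool can recover the comment from the metadata slot.
{:__block__, meta, [_original]} = wrapped
IO.inspect(Keyword.get(meta, :comment))
```

The compiler ignores metadata keys it does not know, which is what makes this shape workable for nodes that already have a metadata slot.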


OvermindDL1

Sep 17, 2018, 1:29:58 PM

to elixir-lang-core

In addition, clang and a few other compilers I've worked with all 'carry' comments in some form (either as their own syntactical node or as metadata on another node), so there is ample precedent.

José Valim

Sep 17, 2018, 2:17:21 PM

to elixir-l...@googlegroups.com

I like the metadata idea a lot, thanks. We still need someone to send a detailed proposal, including what will happen with inline comments, comments inside blocks and comments as the last line of a block with no expression afterwards.

We also need a discussion on what will happen with nodes that do not have a metadata entry. Wrapping those in a block is likely enough (but it will break semantics, for example, in keyword lists). So it would need to be an opt-in feature only.

Once all the details are ironed out, then somebody can go ahead and fully implement it. :)
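For illustration, the three comment positions mentioned above can be pictured in one small function. The module name is made up, and the questions in the comments are the open ones a proposal would have to answer, not answers given in this thread:

```elixir
defmodule CommentCases do
  def example(x) do
    y = x + 1 # inline comment: does it belong to `y = x + 1` or to `1`?

    # comment inside a block: presumably attaches to the next expression
    z = y * 2

    z
    # trailing comment: no expression follows it in this block
  end
end

IO.inspect(CommentCases.example(1))
```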


Louis Pilfold

Sep 17, 2018, 2:37:29 PM

to elixir-l...@googlegroups.com

Hey

I'm a little out of touch with this area, but isn't this what we previously did? I remember comments being stored in meta during the creation of the formatter. Or did I make that up?

Cheers,

Louis


José Valim

Sep 17, 2018, 3:10:42 PM

to elixir-l...@googlegroups.com

I don't think we ever stored them in meta. But even if we did, we would probably have done it in the formatter, and we should likely move it somewhere public.


Steve Morin

Sep 17, 2018, 3:24:30 PM

to elixir-l...@googlegroups.com

Yes, being part of the metadata would be great. I need to look around and research what other people have done. Just not losing the data would be great.


Steve Morin

Sep 19, 2018, 1:27:46 PM

to elixir-l...@googlegroups.com

Anyone know which Elixir AST expert to talk to, about getting tips on a starting point?


--

Steve Morin | Hacker, Entrepreneur, Startup Advisor

twitter.com/SteveMorin | stevemorin.com

Live the dream start a startup. Make the world ... a better place.

Steve Morin

Sep 21, 2018, 12:36:04 PM

to elixir-l...@googlegroups.com

Anyone know which Elixir AST expert to talk to, about getting tips on a starting point? Anyone have suggestions on whom to reach out to for quick email or call?

OvermindDL1

Sep 21, 2018, 1:47:25 PM

to elixir-lang-core

IRC would be a great place to ask. :-)

OvermindDL1

Sep 21, 2018, 1:47:57 PM

to elixir-lang-core

#elixir-lang on freenode for note*

Steve Morin

Sep 22, 2018, 7:10:38 PM

to elixir-l...@googlegroups.com

Tried on Slack but I'll try freenode next


Arjan Scherpenisse

Apr 3, 2019, 2:27:18 PM

to elixir-lang-core

Hi Steve, just stumbled on this, do you know if anything ever happened to this?


José Valim

Apr 3, 2019, 3:07:24 PM

to elixir-l...@googlegroups.com

Hi Arjan,

I don't believe there was any progress here. Storing it in the AST would be hard because some nodes have no metadata slot, so it would be tricky in situations like this:

# Let's return just an atom for now
:foo

However, we do have the ability to return comments from the tokenizer, and the formatter uses this feature:

https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/code/formatter.ex#L205-L206

Note, however, that those are private APIs. We will be glad to add a new API, or an option to one of the functions in the Code module, that returns the AST + comments from a string. Feel free to experiment with the above and submit a proposal as you see fit.
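For reference, Elixir 1.13 and later include `Code.string_to_quoted_with_comments/2`, which returns the AST and the comments side by side, much as described here. A minimal sketch, assuming such a release is available:

```elixir
source = """
# Let's return just an atom for now
:foo
"""

# The AST and the comments come back side by side, not merged.
{:ok, ast, comments} = Code.string_to_quoted_with_comments(source)

# The bare atom has no metadata slot, so the comment cannot live on it.
IO.inspect(ast)

# Each comment is a map that records, among other things, its line and text.
for %{line: line, text: text} <- comments do
  IO.puts("line #{line}: #{text}")
end
```

Pairing each comment with a node is then left to the caller, typically by comparing line numbers.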


Arjan Scherpenisse

Apr 3, 2019, 3:55:22 PM

to elixir-lang-core

Thanks José,

This is indeed a topic of interest to me. Actually, part of my talk in Prague this coming Tuesday will be about this. I'll let it simmer for now and maybe we can have a chat during the conference!

cheers, Arjan

Arjan Scherpenisse

Apr 3, 2019, 4:53:47 PM

to elixir-lang-core

One thing that comes to mind for "simple value" cases like this: we could use the existing :__block__ ast node with a single child as nodes for simple values which have metadata attributes. AFAIK __block__ ast nodes now always have > 1 child.

This works:

iex(7)> {:__block__, [], [:a]} |> Macro.to_string

":a"

And this as well:

iex(10)> {:__block__, [], [123]} |> Code.eval_quoted()

{123, []}

José Valim

Apr 3, 2019, 4:57:04 PM

to elixir-l...@googlegroups.com

We could, and that's the approach we use in the formatter, but we can't use it generally because it would break stuff like keyword lists (as the first element is now a `__block__` tuple and no longer an atom).
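A small sketch of that breakage. `KeywordDemo` is a made-up stand-in for any macro or function that pattern matches on a literal keyword key, and the `:comment` metadata key is hypothetical:

```elixir
defmodule KeywordDemo do
  # Stands in for any macro or function that matches on a literal keyword key.
  def body([do: block]), do: {:ok, block}
  def body(_other), do: :no_match
end

plain = quote(do: [do: 1])

# The same list with its key wrapped so that it could carry metadata.
wrapped = [{{:__block__, [comment: "hypothetical"], [:do]}, 1}]

IO.inspect(Keyword.keyword?(plain))
IO.inspect(Keyword.keyword?(wrapped))
IO.inspect(KeywordDemo.body(plain))
IO.inspect(KeywordDemo.body(wrapped))

# Evaluation is unaffected, which is why the breakage is easy to miss.
IO.inspect(Code.eval_quoted(wrapped))
```

The wrapped list still evaluates to `[do: 1]`, but code that inspects the quoted form no longer sees an atom key.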


Arjan Scherpenisse

Apr 4, 2019, 3:26:13 AM

to elixir-lang-core

Yes so maybe the route would be to define an "extended AST" format, an AST which cannot directly be used to compile or evaluate Elixir expressions, but can always be reduced to the "official" AST format?

This extended AST could include everything that the formatter_metadata: true option currently includes, but also the comment(s) for the nodes (trailing comments as well): basically, everything needed to re-create the original source code (given that the original was formatted, we should not add white space to this AST). As a bonus, we could delay the creation of non-existing atoms until the conversion to the "official" AST, so this format could be used for DSLs that rely on new identifiers, but without the atom leakage.
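A minimal sketch of the "reduce to the official AST" step, under the assumption that the extended format wraps literals in single-child `:__block__` nodes. The `ExtendedAST` module and the `:comment` key are hypothetical:

```elixir
defmodule ExtendedAST do
  # Hypothetical helper: collapse single-child :__block__ wrappers around
  # literals back into the bare literals the compiler normally sees.
  def reduce(ast) do
    Macro.postwalk(ast, fn
      {:__block__, _meta, [literal]}
      when is_atom(literal) or is_number(literal) or is_binary(literal) ->
        literal

      other ->
        other
    end)
  end
end

extended = [{{:__block__, [comment: "the key"], [:a]}, {:__block__, [line: 1], [42]}}]
IO.inspect(ExtendedAST.reduce(extended))
```

The reduction discards the metadata on the wrappers, so it only goes one way: extended to official, never back.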

What do you think?

Arjan

José Valim

Apr 4, 2019, 3:31:16 AM

to elixir-l...@googlegroups.com

For what purpose would we introduce another AST representation?

I am really hesitant to introduce another interpretation of our AST unless it is really justified. I am even concerned about exposing the formatter AST, because if there is a formatter bug and we need to change the formatter AST to fix it, then we will suddenly break people's code and be unable to improve the formatter, which is the main reason why the formatter AST exists today in the first place.


Arjan Scherpenisse

Apr 5, 2019, 4:00:38 AM

to elixir-lang-core

I understand your hesitance.. basically, what I think would be really valuable is a way to transform an AST back to code. That would enable a whole class of refactoring tools: renaming a function (including its call sites), inlining variables, etc, etc. Maybe I'm just spoiled with having used the IntelliJ IDEs for a while.. :-)

I think the formatter_metadata: true annotated AST is already a good step in this direction, the only thing missing there is a way to attach comments.
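To make the gap concrete, here is a toy "rename function" refactoring using only the public `Code` and `Macro` functions. It is an illustration, not a proposal for how a refactoring tool should work:

```elixir
source = """
# Adds two numbers.
def add(a, b), do: a + b
"""

ast = Code.string_to_quoted!(source)

# A toy "rename function" refactoring: add/2 becomes sum/2.
renamed =
  Macro.prewalk(ast, fn
    {:add, meta, args} when is_list(args) -> {:sum, meta, args}
    other -> other
  end)

# The rename survives, but the comment was dropped by the parser,
# so it cannot be written back.
output = Macro.to_string(renamed)
IO.puts(output)
```

The exact layout of the printed code depends on the Elixir version, but in every case the comment is gone.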

Arjan

OvermindDL1

Apr 16, 2019, 12:57:54 PM

to elixir-lang-core

On Wednesday, April 3, 2019 at 2:57:04 PM UTC-6, José Valim wrote:

We could and that's the approach we use in the formatter but we can't use it generally because it would break stuff like keyword lists (as the first element is now a tuple __block__ and no longer an atom).

It's already an AST, and it is already very weird that there are some 'loose' non-AST-style things in it, like bare atoms, numbers, and so forth (especially the weird 2-tuple). Such constructs make it a very non-standard AST, and they make it harder to attach and use metadata on all possible parts of the AST, as is traditional in most ASTs.

José Valim

Apr 16, 2019, 1:02:23 PM

to elixir-l...@googlegroups.com

I am sorry but I cannot understand what you are talking about.

If you are saying that having lists and tuples as literals is weird and that they should be regular AST nodes (i.e. three-element tuples), then I recommend you try making those changes in an Elixir fork and see how it impacts the language.


OvermindDL1

Apr 16, 2019, 1:40:09 PM

to elixir-lang-core

Disclaimer: I've worked around all these with various helpers (that I really should package up into a library sometime...), but they are significantly more verbose than if the AST was just regular to begin with, so this is mostly bikeshedding as I can keep working that way, but these are annoying irritations that bug me on a near daily basis.

On Tuesday, April 16, 2019 at 11:02:23 AM UTC-6, José Valim wrote:

I am sorry but I cannot understand what you are talking about.

If you are saying that having lists and tuples being literals are weird and that they should be regular ASTs (i.e. three elem tuples), then I recommend you to try doing those changes in an Elixir fork and see how it impacts the language.

Considering full forms still compile fine:

```elixir
iex(2)> {:{}, [], [{:__block__, [], [:a]}, 42]} |> Macro.to_string
"{:a, 42}"
```

Then the only thing it would really affect is the consumers of the AST. The major effects include, but are not limited to:

1. Matching AST nodes would be simpler, as they are always 3-tuples of the form `{ASTNodeType, ContextMetadata, ASTNodeData}` instead of needing to handle a multitude of different forms, especially the loose ones such as `42`, which you have to match with a guard.

2. Taking keyword lists to macro functions would be more verbose, but a simple helper such as `Macro.keywordify` or so could take constructs of, as a pure example, something like `{:list, [], [{:tuple, [], [{:atom, [], "a"}, {:integer, [], 42}]}]}` and return `[a: 42]`, which makes handling ast-based keyword lists for compile-time data trivial, it could even have the aspect of expanding binding and other things using the `__CALLER__`'s environment.
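A rough sketch of what such a helper might do for the example in point 2. Everything here is hypothetical: the `RegularAST` module, the `keywordify/1` name, and the `:list`, `:tuple`, `:atom`, and `:integer` node types do not exist in Elixir:

```elixir
defmodule RegularAST do
  # Hypothetical converter from the "regular" node shapes used as examples
  # above back to plain Elixir terms. None of these node types exist today.
  def keywordify({:list, _meta, items}), do: Enum.map(items, &keywordify/1)

  def keywordify({:tuple, _meta, items}) do
    items |> Enum.map(&keywordify/1) |> List.to_tuple()
  end

  def keywordify({:atom, _meta, name}) when is_binary(name), do: String.to_atom(name)
  def keywordify({:integer, _meta, value}) when is_integer(value), do: value
end

regular = {:list, [], [{:tuple, [], [{:atom, [], "a"}, {:integer, [], 42}]}]}
IO.inspect(RegularAST.keywordify(regular))
```

Note that the atom is only created at conversion time, since the node stores its name as a string.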

The first is a huge bonus: having the context/metadata information on every single node is immensely useful in a variety of situations. Giving names to nodes such as `{:atom, [], "a"}` and `{:integer, [], 42}` is more regular and standard (both great aspects for an AST), and it is also safer for those wanting to parse out user data (say, a language server parser). It also keeps function calls (which only have a list in the data field) distinct from bindings (which only have contextual atoms in the data field, as far as I've seen so far). I'd argue that bindings should be in a `{:binding, [], [{:atom, [], "a"}, {:atom, [], "AContextAtom"}]}` setup as well, so that function calls can be kept top level like `{{:atom, [], "funcname"}, [], [{:utf8, [], "an argument"}]}`.

All of this would be *such* a huge boon when I'm doing macro work as well, as so much code would be simplified. So much so that I often convert Elixir's non-regular AST format into a regular format: I convert things like `42` into `{:__block__, [], [42]}` so I can attach metadata in the middle field for later processing, all without needing to convert it back, because Elixir still consumes it all just fine.

This is a pretty excessive pattern that I follow only because all the loose 'stuff' is so irritating to handle, and because that loose 'stuff' has lost all contextual information: line number, column information if that exists, etc. I can 'try' to rebuild it when I need to, but it is impossible to fully reproduce the original because the current AST format is so lacking.

Compare this to the clang or OCaml AST's, both of which are entirely regular as the equivalent of the 3-tuple of elixir's AST (although with ADT's instead, which are conceptually tagged tuples anyway, I.E. like an erlang record, though I *love* the simplicity of the 3-tuple with one field being a keywordlist, or better yet a map, of the metadata, and the other two fields being a tag, and the tag-specific data).

If I were designing Elixir's AST from scratch then it would be significantly more regular with a number of changes that would make it a lot easier to work with, but even as it is now it is pretty easy to make regular (if not properly explicit because of hacks like `{:__block__, [], [:a]}`, and that really is a hack) without any of this weird special casing of primitives or 2-tuples or lists or so forth, all of which 'seems' (based on usage in the elixir codebase itself) to be purely to make compile-time keyword lists simple, which could easily be fixed with a single function to do the transformation as-needed (like a `Macro.keywordize/2` function).

Right now macros are pretty irritating to write because of all these inconsistencies, and consuming the AST for other purposes, like language servers or contextual highlighting, is even more irritating.

José Valim

Apr 16, 2019, 2:03:03 PM

to elixir-l...@googlegroups.com

I have to disagree on some points:

1. Traversing the AST is not as complex as it sounds. You literally need four clauses (and the first two can be written as proxy to the third):

def traverse({left, meta, right})
def traverse({one, two})
def traverse([_ | _] = list)
def traverse(other)

2. If we converted everything to three-element tuples, the issue is not only Macro.keywordify, as there are also macros that match on atoms (and on strings too, albeit less commonly). For instance, every Phoenix application does it. So making everything a three-element tuple would hurt that.

3. Note that {:integer, [], 1} and {:atom, [], "foo"} would be their own AST nodes too, as there are now new rules for what the third element actually is. Sure, it is more consistent in terms of metadata, but you are not really reducing the number of nodes. You could make them {:integer, meta, [1]} and similar, but that would mean introducing new special forms.

I definitely agree that having a fixed place for metadata would improve certain cases but the trade-offs are much more nuanced than implied. The current AST was not optimized for reconstruction but for developer ergonomics and I believe the proposed standardisation would make certain features much more bureaucratic.
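The four clauses from point 1 can be filled in as follows. This is one possible reading, not code from the Elixir source: the `Traverse` module and the `fun` argument are assumptions, and each clause is written out in full instead of proxying the first two to the third:

```elixir
defmodule Traverse do
  # Three-element tuples: calls, variables, and special forms.
  def traverse({left, meta, right}, fun) do
    fun.({traverse(left, fun), meta, traverse(right, fun)})
  end

  # Two-element tuples are literals in the AST.
  def traverse({one, two}, fun), do: fun.({traverse(one, fun), traverse(two, fun)})

  # Non-empty lists: argument lists and list literals.
  def traverse([_ | _] = list, fun), do: fun.(Enum.map(list, &traverse(&1, fun)))

  # Everything else: atoms, numbers, binaries, the empty list.
  def traverse(other, fun), do: fun.(other)
end

ast = quote(do: {1, [2, foo(3)]})

doubled =
  Traverse.traverse(ast, fn
    n when is_integer(n) -> n * 2
    other -> other
  end)

IO.inspect(doubled)
```

In practice `Macro.prewalk/2` and `Macro.postwalk/2` already provide this kind of traversal.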

Steve Morin

Apr 16, 2019, 4:10:37 PM

to elixir-l...@googlegroups.com

Jose, re: adding comments to AST metadata. I realized I don't have time to add it, but what are your feelings on me putting up a bounty for it?


José Valim

Apr 16, 2019, 5:00:06 PM

to elixir-l...@googlegroups.com

Hi Steve, as mentioned in this discussion, we can't really store it in the AST as not all nodes have metadata. My suggestion is to keep it on the side, similar to how we do in the formatter. Thanks for offering the bounty though!


OvermindDL1

Apr 16, 2019, 5:09:46 PM

to elixir-lang-core

Disclaimer: Prior one still holds. ^.^

On Tuesday, April 16, 2019 at 12:03:03 PM UTC-6, José Valim wrote:

I have to disagree on some points:

1. Traversing the AST is not as complex as it sounds. You literally need four clauses (and the first two can be written as proxy to the third):

def traverse({left, meta, right})
def traverse({one, two})
def traverse([_ | _] = list)
def traverse(other)

Just simply traversing the ast to read it is not a case I often do by itself in macro's, rather I tend to do steps like this:

1. Traverse AST to gather information.

2. Traverse again to modify and transform nodes based on the previously collected information (this is where I have to wrap primitives and lists in `:__block__`s and unpack 2-tuples into the proper generic tuple form so I can store additional metadata on them).

3. Often step 2 has to be repeated a few times for a few other passes, generate code based on multi-level metadata, move AST nodes around, etc...

4. I don't bother stripping my added metadata, as the engine seems to ignore it, so now I finally just return it.

On Tuesday, April 16, 2019 at 12:03:03 PM UTC-6, José Valim wrote:

2. If we converted everything to 3 element tuples, the issue is not only Macro.keywordify as there are also macros that match on atoms too (and on strings too albeit less common). For instance, every Phoenix application does it . So making everything a three-element tuple would hurt that.

Exactly, those macros can match on the AST form of an atom or on atoms themselves. A macro can even make such matches simple by adding a 'guard' of a form like `AST.is_atom/1` that matches an atom or a supposed `{:atom, [], "a"}`. Or, if it always wants it in one form or the other, it can just transform it before working on it.

On Tuesday, April 16, 2019 at 12:03:03 PM UTC-6, José Valim wrote:

3. Note that {:integer, [], 1} and {:atom, [], "foo"} would be their own AST nodes too, as there are now new rules for what the 3 element actually is. Sure, it is more consistent in terms of metadata, but you are not really reducing the number of nodes. You could make them {:integer, meta, [1]} and similar but that would mean introducing new special forms.

Yep yep, I never meant to imply it would reduce node count; quite likely it would even increase it if bindings were made their own node as well (which would clarify things so much in some cases; right now I don't know whether a binding scope has to be an atom or whether any term is allowed, for example). In general, if it were regular as proposed, then every single node is taken as-is, and only nodes whose last element is a list need to be recursed into (so function calls, `:tuple`, `:list`, etc.). Even the primitives could easily just be a singular `{:prim, [], 42}` node (it would be wonderful to contain the original string representation in the metadata/context as well, so we know if an integer was typed as `42` or as `4_2` or whatever, in addition to comments, line/column information, etc.). In general, if you wanted to fully traverse it, then it would just reduce to:

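```elixir
defmodule AST do
  # Nodes whose data field is a list are recursed into, children first.
  def postwalk({t, md, list}, f) when is_list(list) do
    f.({t, md, Enum.map(list, &postwalk(&1, f))})
  end

  # Every other node is a leaf and is passed to `f` as-is.
  def postwalk(ast, f) do
    f.(ast)
  end
end
```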

On Tuesday, April 16, 2019 at 12:03:03 PM UTC-6, José Valim wrote:

I definitely agree that having a fixed place for metadata would improve certain cases but the trade-offs are much more nuanced than implied. The current AST was not optimized for reconstruction but for developer ergonomics and I believe the proposed standardisation would make certain features much more bureaucratic.

Even for developer ergonomics it is quite a pain to work with though in macro's because of these issues.

Steve Morin

Apr 16, 2019, 6:54:56 PM

to elixir-l...@googlegroups.com

Jose,

I thought you had mentioned that metadata could be added to the nodes that don't have it? Your prior comments made it sound like it could (potentially) be worked around and that you would be supportive of that. Did I misunderstand you?

From your prior reply:

I like the metadata idea a lot, thanks. We still need someone to send a detailed proposal, including what will happen with inline comments, comments inside blocks and comments as the last line of a block with no expression afterwards.

We also need a discussion on what will happen with nodes that do not have a metadata entry. Wrapping those in a block is likely enough (but it will break semantics, for example, in keywords lists). So it would need to be an opt-in feature only.


José Valim

Apr 16, 2019, 7:03:18 PM

to elixir-l...@googlegroups.com

Yeah, I was pointing out it could be done only partially, and therefore it would be incomplete as a solution.


Steve Morin

Apr 16, 2019, 7:11:23 PM

to elixir-l...@googlegroups.com

Jose,

My takeaway from your comment is that it would only be a partial solution because there are some "nodes that do not have a metadata entry". But if they did, would you see any reason why it couldn't be implemented?

José Valim

Apr 16, 2019, 7:13:21 PM4/16/19

to elixir-l...@googlegroups.com

If they did, yes, but they don't and they won't as that would be a pretty big change to Elixir's AST. :)

José Valim

Steve Morin

Apr 16, 2019, 7:18:20 PM4/16/19

to elixir-l...@googlegroups.com

Jose,

Thanks for the clarification. Maybe a partial solution would be a good first step.

José Valim

Apr 16, 2019, 7:25:34 PM4/16/19

to elixir-l...@googlegroups.com

If someone wants to try it out, sure, but at the moment I don't see it being added directly to Elixir.

At the moment I don't have any new information to add to this discussion, so I will bow out and let everyone tinker to their heart's content.

José Valim

Steve Morin

Apr 16, 2019, 11:31:51 PM4/16/19

to elixir-l...@googlegroups.com

Fair

w...@resilia.nl

Apr 23, 2019, 6:18:07 AM4/23/19

to elixir-lang-core

What about "hoisting" the comments into the metadata of the surrounding block for the primitive AST nodes?

This means that we need to be able to specify where in a `:__block__` a comment ought to occur, which shouldn't be a problem: the only case where this information will be out of date is when macro rewriting alters the AST, a situation that would probably remove the comments anyway.

(so e.g. a formatter would need to move comments with the code just as is the case currently, but optimization tools would not care about comments anyway).

This 'where' might take the form of a keyword-style list, with half-line-numbers as keys and strings (containing the actual comments) as values. Half-line-numbers go from 0 up to and including `2*n_lines_in_block`, with the lines of the block counted from 0. Even numbers are comments occurring before (e.g. on the line above) line `div(half-line-number, 2)`. Odd numbers are comments occurring after (at the end of the same line as) line `div(half-line-number, 2)`.

So


quote do
  # one is the loneliest number
  1
  2 # two is the smallest prime
  3
  # I am at the end
end

would compile to

{:__block__, [comments: [{0, "one is the loneliest number"}, {3, "two is the smallest prime"}, {6, "I am at the end"}]], [1, 2, 3]}
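The numbering scheme above can be sketched as follows. `HalfLine` and its functions are hypothetical names, and the comments are assumed to arrive as a list of `{half_line_number, text}` pairs; this is an illustration of the proposal, not something Elixir provides.

```elixir
defmodule HalfLine do
  # Decode a half-line-number into a position relative to a 0-based line.
  def position(n) when is_integer(n) and n >= 0 do
    if rem(n, 2) == 0, do: {:before, div(n, 2)}, else: {:after, div(n, 2)}
  end

  # Interleave the expressions of a block with their comments,
  # returning one string per output line.
  def render(exprs, comments) do
    by_key = Map.new(comments)

    body =
      exprs
      |> Enum.with_index()
      |> Enum.flat_map(fn {expr, i} ->
        code = Macro.to_string(expr)

        # An odd key 2*i + 1 is a comment at the end of line i.
        line =
          case Map.fetch(by_key, 2 * i + 1) do
            {:ok, text} -> code <> " # " <> text
            :error -> code
          end

        # An even key 2*i is a comment on the line above line i.
        leading(by_key, 2 * i) ++ [line]
      end)

    # The last even key holds a comment after the final expression.
    body ++ leading(by_key, 2 * length(exprs))
  end

  defp leading(by_key, key) do
    case Map.fetch(by_key, key) do
      {:ok, text} -> ["# " <> text]
      :error -> []
    end
  end
end

comments = [
  {0, "one is the loneliest number"},
  {3, "two is the smallest prime"},
  {6, "I am at the end"}
]

HalfLine.render([1, 2, 3], comments) |> Enum.each(&IO.puts/1)
```

Run against the example block, this prints the five original lines in order, with each comment back in its place.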

~Marten / Qqwy

Arjan Scherpenisse

Apr 23, 2019, 7:20:36 AM4/23/19

to elixir-lang-core

I actually discussed a solution like this with José in Prague, but this will also become complicated very quickly, especially as these primitive nodes can be nested (e.g. keyword lists and normal lists). Maintaining the correct position in the parent is also problematic when you want to do something with the child nodes. For instance, if you write an AST transform to swap two parameters of a function call, the comments would not get swapped.
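A small sketch of the swapping problem, assuming comments were hoisted into the parent's metadata under a hypothetical `comments:` key, indexed by argument position. `SwapDemo` is an illustrative name; only `Macro.prewalk/2` is a real library call.

```elixir
defmodule SwapDemo do
  # Swap the arguments of every two-argument call to `name`.
  # The metadata is passed through untouched, as a typical transform would do.
  def swap_args(ast, name) do
    Macro.prewalk(ast, fn
      {^name, meta, [a, b]} -> {name, meta, [b, a]}
      other -> other
    end)
  end
end

# copy(src, dst), with one hoisted comment per argument position.
ast =
  {:copy, [comments: [{0, "the source"}, {1, "the destination"}]],
   [{:src, [], nil}, {:dst, [], nil}]}

swapped = SwapDemo.swap_args(ast, :copy)
IO.inspect(swapped)
```

After the swap, position 0 holds `dst` but the comment at position 0 still reads "the source". Every transform would have to know about the comment metadata and update it by hand.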

For the short term I've put this subject to rest. My opinion is that for tooling that depends on code -> AST -> code transforms, we need more specialized tooling. For instance, we cannot assume that the input source code is already properly formatted; ideally a source code modification tool should work with all kinds of source code, and a transformation should only affect the local scope and not, for instance, reformat all of the output. I could even imagine it working with code that is not syntactically correct. That is a job the Elixir parser was just not meant to do, and trying to squeeze comments into the AST would not solve this.
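One way to keep a transformation local is to use the AST only to find positions and then patch the original text. The sketch below is an illustration under simplifying assumptions, not an existing tool: `LocalRename` is a hypothetical name, the source is assumed to be ASCII and syntactically valid, and only unqualified calls and definition heads are handled. It relies on the `columns: true` option of `Code.string_to_quoted/2` and needs Elixir 1.10 or later for `Enum.sort/2` with `:desc`.

```elixir
defmodule LocalRename do
  # Rename unqualified uses of `from` to `to` by editing the source text
  # at the positions the parser reports; everything else is left as is.
  def rename_calls(source, from, to) when is_atom(from) and is_atom(to) do
    {:ok, ast} = Code.string_to_quoted(source, columns: true)

    # Collect {line, column} for each call node named `from`.
    # Variables have `nil` in the args slot, so they are skipped.
    {_ast, positions} =
      Macro.prewalk(ast, [], fn
        {^from, meta, args} = node, acc when is_list(args) ->
          {node, [{meta[:line], meta[:column]} | acc]}

        node, acc ->
          {node, acc}
      end)

    old = Atom.to_string(from)
    new = Atom.to_string(to)

    # Patch from the last position backwards so earlier columns stay valid.
    positions
    |> Enum.sort(:desc)
    |> Enum.reduce(String.split(source, "\n"), fn {line, col}, lines ->
      List.update_at(lines, line - 1, fn text ->
        {head, rest} = String.split_at(text, col - 1)
        head <> new <> String.replace_prefix(rest, old, "")
      end)
    end)
    |> Enum.join("\n")
  end
end

source = "# keep me\nfoo(1)   # and me\nx = foo(foo(2))\n"
IO.write(LocalRename.rename_calls(source, :foo, :bar))
```

Comments and odd spacing survive because they are never parsed and re-printed. The approach still depends on the parser accepting the input, so it does not address syntactically incorrect code.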

I have been looking at the Wrangler source code, which is a set of Erlang refactoring tools, but I have not come very far yet. It is an interesting source of information though.

Arjan

Steve Morin

Apr 23, 2019, 12:29:20 PM4/23/19

to elixir-l...@googlegroups.com

Arjan,

What is your use case?

Arjan Scherpenisse

Apr 23, 2019, 2:48:19 PM4/23/19

to elixir-lang-core

Yes, sorry, a part of my talk at Elixirconf was about using the AST as the source for "intelligent" refactoring tools; renaming functions, inlining variables, extracting functions, etc.

It should be online soon, I think, slides are here.

Arjan

Steve Morin

Apr 23, 2019, 3:01:30 PM4/23/19

to elixir-l...@googlegroups.com

Arjan,

So it sounds like we both ran into the same issue: refactoring and code manipulation lose comments.

-Steve

Arjan Scherpenisse

Apr 23, 2019, 3:19:09 PM4/23/19

to elixir-lang-core

Yep, that's why I dug up this thread... but like I said, I'll let it simmer some more now :-)
