Enhancement request: parse character offsets for AST node

Serge Smetana

Feb 26, 2018, 7:11:42 AM
to elixir-lang-core
Hi,

I would like Elixir AST nodes to include begin and end parse character offsets.
This would make it easier to write refactoring tools that modify parts of
Elixir source files.

Use case:

Our code needs to be able to create and update expected values in tests. For
example, given the following:

assert_value 2 + 2 == 2 + 3

our code will find the location of `2 + 3` and replace it with `4`:

assert_value 2 + 2 == 4

Today, determining the location of `2 + 3` in the source file is difficult. For
now, we use a custom-made parser that processes the code character by character
until it gets a value matching the AST. If we had the parse offsets, this would
be much easier.

Proposed interface:

We probably need two sets of offsets: exclusive and inclusive of children. For
each, it is probably best to store the beginning offset and the length. This
will need reasonable handling for parentheses and other tokens that do not make
it into the AST.
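
To illustrate how a tool could consume such offsets, here is a minimal sketch.
The `offset` and `length` arguments stand in for the proposed metadata, which
does not exist in the AST today; the module name OffsetSplice is hypothetical
and the numbers are counted by hand for the example above.

```elixir
defmodule OffsetSplice do
  # Replace `length` bytes of `source`, starting at byte `offset`,
  # with `replacement`. Offsets are zero-based byte offsets (an assumption;
  # the proposal does not say whether offsets count bytes or characters).
  def replace(source, offset, length, replacement) do
    prefix = binary_part(source, 0, offset)
    suffix_start = offset + length
    suffix = binary_part(source, suffix_start, byte_size(source) - suffix_start)
    prefix <> replacement <> suffix
  end
end

source = "assert_value 2 + 2 == 2 + 3"

# `2 + 3` starts at byte offset 22 and is 5 bytes long (counted by hand here;
# with the proposed metadata these values would come from the AST node).
updated = OffsetSplice.replace(source, 22, 5, "4")
IO.puts(updated)
```

Only the targeted expression changes; the rest of the file keeps whatever
formatting the user gave it.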

Existing implementation:

Elixir AST nodes already carry some useful information. We use the "line:"
metadata, which is very helpful. We don't use "column:"; it did not seem useful
given our implementation. We may be missing something obvious here.

Details:

In Elixir 1.6, the AST of compiled code carries only the function's line number
in its metadata. Even "columns: true" in Code.string_to_quoted gives only the
function's starting column, with no information about the arguments.

Consider the following code:

Code.string_to_quoted!("(41.00 == 42.0000)", columns: true)
#=> {:==, [line: 1, column: 8], [41.0, 42.0]}

From the AST you cannot tell where 41.00 and 42.0000 are located in the code.
The column information does not help. The AST values 41.0 and 42.0 carry no
information about how the original values were formatted.
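
A small sketch of the same point, using only Code.string_to_quoted!/2 and
Macro.to_string/1: the literals come back as plain floats, so rendering the AST
cannot reproduce the original spelling or the parentheses.

```elixir
ast = Code.string_to_quoted!("(41.00 == 42.0000)", columns: true)

# Only the operator node has metadata; the operands are bare floats.
{:==, meta, [left, right]} = ast
IO.inspect(meta, label: "operator metadata")
IO.inspect({left, right}, label: "operands")

# Rendering the AST back to source loses the original formatting.
rendered = Macro.to_string(ast)
IO.puts(rendered)
```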

Thanks,
Serge

Louis Pilfold

Feb 26, 2018, 7:31:01 AM
to elixir-lang-core

Hi there

Given we have the formatter now I would be tempted to avoid editing strings and instead update the AST of the file, render that to a string, format it and write it back to the file. Would be more reliable.
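
A rough sketch of what that approach could look like, assuming the test source
is a single `assert_value left == right` call. The module and function names
are made up for illustration; Macro.prewalk/2, Macro.to_string/1 and
Code.format_string!/1 are the only library calls used.

```elixir
defmodule AstRewrite do
  # Parse `source`, replace the right-hand side of every
  # `assert_value left == right` call with `new_value`, then render the
  # AST back to a string and run it through the formatter.
  def update_expected(source, new_value) do
    source
    |> Code.string_to_quoted!()
    |> Macro.prewalk(fn
      {:assert_value, meta, [{:==, op_meta, [left, _old]}]} ->
        {:assert_value, meta, [{:==, op_meta, [left, new_value]}]}

      other ->
        other
    end)
    |> Macro.to_string()
    |> Code.format_string!()
    |> IO.iodata_to_binary()
  end
end

result = AstRewrite.update_expected("assert_value 2 + 2 == 2 + 3", 4)
IO.puts(result)
```

Note that the result is rendered from the AST, so it may not match how the
original line was written (for example, parentheses may be added around the
call's arguments).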

Cheers,
Louis


Serge Smetana

Feb 27, 2018, 6:32:47 AM
to elixir-lang-core
Hi,

Sounds reasonable, but it may not work as well. It may change a lot of user-formatted code and produce extra diffs in tests. If we could reformat only a part of the AST (one function call in our case), that would be ideal.

Thanks,
Serge.

Martin Svalin

Feb 28, 2018, 3:59:01 PM
to elixir-l...@googlegroups.com
I think you can reformat a part of the AST.

    iex> some_ast_fragment = quote do [h|t] end
    iex> some_ast_fragment |> Macro.to_string() |> Code.format_string!() |> IO.iodata_to_binary()
    "[h | t]"

Not sure if this would work for you. And the round-trip to a string might not be necessary; I haven't had more than a cursory look at Code.Formatter.

- Martin
