[Proposal] Overload capture operator to support tagged variable captures


Christopher Keele

Jun 28, 2023, 7:56:18 PM
to elixir-lang-core
This is a formalization of my concept here, as a first-class proposal for explicit discussion/feedback, since I now have a working prototype.

Goal

The aim of this proposal is to support a commonly-requested feature: short-hand construction and pattern matching of key/value pairs of associative data structures, based on variable names in the current scope.

Context

Similar shorthand syntax sugar exists in many programming languages today, including OCaml, Haskell, and JavaScript, where it is commonly called field punning.

This feature has been in discussion for a decade, on this mailing list (1, 2, 3, 4, 5, 6) and the Elixir forum (1, 2, 3, 4, 5, 6), and has motivated many libraries (1, 2, 3, 4). These narrow margins cannot fit the full history of possibilities, proposals, and problems with this feature, and I will not attempt to summarize them all. For context, I suggest reading this mailing list proposal and this community discussion in particular.

However, in summary, this particular proposal tries to solve a couple of past sticking points:
  1. Atom vs String key support
  2. Visual clarity that atom/string matching is occurring
  3. Limitations of string-based sigil parsing
  4. Easy confusion with tuples
I have a working fork of Elixir here where this proposed syntax can be experimented with. Be warned, it is buggy.

Proposal: Tagged Variable Captures

I propose we overload the unary capture operator (&) to accept compile-time atoms and strings as arguments, for example &:foo and &"bar". This would expand at compile time into a tagged tuple with the atom/string and a variable reference. For now, I am calling this a "tagged-variable capture"  to differentiate it from a function capture.

For the purposes of this proposal, assume:

{foo, bar} = {1, 2}

Additionally,
  • Lines beginning with # ==  indicate what the compiler expands an expression to.
  • Lines beginning with # =>  represent the result of evaluating that expression.
  • Lines beginning with # !>  represent an exception.
Bare Captures

I'm not sure if we should support bare tagged-variable capture, but it is illustrative for this proposal, so I left it in my prototype. It would look like:

&:foo
# == {:foo, foo}
# => {:foo, 1}
&"foo"
# == {"foo", foo}
# => {"foo", 1}

If bare usage is supported, this expansion would work as expected in match and guard contexts as well, since it expands before variable references are resolved:

{:foo, baz} = &:foo
# == {:foo, baz} = {:foo, foo}
# => {:foo, 1}
baz
# => 1
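
For readers who want to try this expansion without building the fork, the bare form can be roughly approximated in today's Elixir with an ordinary macro. This is only a sketch under assumptions: the module and macro names (TaggedSketch, tag/1) are hypothetical and not part of the proposal or the prototype, and a macro call is of course not the proposed operator.

```elixir
defmodule TaggedSketch do
  # Hypothetical stand-in for the proposed `&:foo` / `&"foo"` forms.
  # Expands at compile time into {key, variable}, where the variable
  # refers to (or, in a match, binds) a name in the caller's scope.
  defmacro tag(key) when is_atom(key) do
    {key, Macro.var(key, nil)}
  end

  defmacro tag(key) when is_binary(key) do
    {key, Macro.var(String.to_atom(key), nil)}
  end
end

defmodule TaggedSketchDemo do
  import TaggedSketch

  # Construction: both the atom and the string form read the variable foo.
  def bare do
    foo = 1
    {tag(:foo), tag("foo")}
  end

  # Matching against the expanded tuple, as in the example above.
  def match do
    foo = 1
    {:foo, baz} = tag(:foo)
    baz
  end
end
```

Because the expansion happens before variables are resolved, the same macro can appear on either side of a match, which is the property the proposal relies on.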

List Captures

Since capture expressions are allowed in lists, this can be used to construct Keyword lists from the local variable scope elegantly:

list = [&:foo, &:bar]
# == list = [{:foo, foo}, {:bar, bar}]
# => [foo: 1, bar: 2]

This would work with other list operators like |:

baz = 3
list = [&:baz | list]
# == list = [{:baz, baz} | list]
# => [baz: 3, foo: 1, bar: 2]

And list destructuring:

{foo, bar, baz} = {nil, nil, nil}
[&:baz, &:foo, &:bar] = list
# == [{:baz, baz}, {:foo, foo}, {:bar, bar}] = list
# => [baz: 3, foo: 1, bar: 2]
{foo, bar, baz}
# => {1, 2, 3}
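
The expanded forms shown above are valid Elixir today, so their behaviour can be checked without the fork. The sketch below uses only the expansions (not the proposed syntax) and also illustrates one consequence worth keeping in mind: list patterns are positional, so destructuring only matches when the keys appear in the same order. The variable names are illustrative.

```elixir
foo = 1
bar = 2

# The expanded form of [&:foo, &:bar] is a list of two-element tuples,
# which is exactly what a keyword list is.
list = [{:foo, foo}, {:bar, bar}]
same? = list == [foo: 1, bar: 2]

# Destructuring with the expanded form binds new variables...
[{:foo, a}, {:bar, b}] = list

# ...but list patterns are positional, so a different key order does not match.
reordered? = match?([{:bar, _}, {:foo, _}], list)
```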

Map Captures

With a small change to the parser, we can allow this expression inside map literals. Because each such expression gets expanded into a tagged tuple before the map's association list as a whole is processed, this syntax works in all existing map/struct constructs, like map construction:

map = %{&:foo, &"bar"}
# == %{:foo => foo, "bar" => bar}
# => %{:foo => 1, "bar" => 2}

Map updates:

foo = 3
map = %{map | &:foo}
# == %{map | :foo => foo}
# => %{:foo => 3, "bar" => 2}

And map destructuring:

{foo, bar} = {nil, nil}
%{&:foo, &"bar"} = map
# == %{:foo => foo, "bar" => bar} = map
# => %{:foo => 3, "bar" => 2}
{foo, bar}
# => {3, 2}
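
The map case can also be approximated in current Elixir with a macro that builds the map AST directly. This is a sketch under assumptions, not the prototype's implementation: captured_map/1 and both module names are hypothetical, and the keys must be literal atoms or strings.

```elixir
defmodule MapCaptureSketch do
  # Hypothetical helper: `captured_map([:foo, "bar"])` expands to
  # `%{:foo => foo, "bar" => bar}` using the caller's variables.
  defmacro captured_map(keys) when is_list(keys) do
    pairs =
      for key <- keys do
        name = if is_atom(key), do: key, else: String.to_atom(key)
        {key, Macro.var(name, nil)}
      end

    # A map literal is represented in the AST as {:%{}, meta, pairs}.
    {:%{}, [], pairs}
  end
end

defmodule MapCaptureSketchDemo do
  import MapCaptureSketch

  # Construction from variables in scope.
  def build do
    {foo, bar} = {1, 2}
    captured_map([:foo, "bar"])
  end

  # Destructuring: the expansion binds foo and bar in this function.
  def destructure(map) do
    captured_map([:foo, "bar"]) = map
    {foo, bar}
  end
end
```

Calling MapCaptureSketchDemo.destructure(%{:foo => 3, "bar" => 2}) mirrors the destructuring example above.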

Considerations

Though just based on an errant thought that popped into my head yesterday, I'm unreasonably pleased with how well this works and reads in practice. I will present my thoughts here, though again I encourage you to grab my branch, compile it from source, and play with it yourself!

Pro: solves existing pain points

As mentioned, this solves flaws previous proposals suffer from:
  1. Atom vs String key support
    This supports both.
  2. Visual clarity that atom/string matching is occurring
    This leverages the appropriate literal in question within the syntax sugar.
  3. Limitations of string-based sigil parsing
    This is compiler-expansion-native.
  4. Easy confusion with tuples
    %{&:foo, &"bar"} is very different from {foo, bar}, instead of 1-character different.
Additionally, it solves my main complaint with historical proposals: syntax to combine a variable identifier with a literal must either obscure that we are building an identifier, or obscure the key/string typing of the literal.

I'm proposing overloading the capture operator rather than introducing a new operator because the capture operator already has a semantic association with messing with variable scope, via the nested integer-based positional function argument syntax (ex & &1).

By using the capture operator we indicate that we are messing with an identifier in scope, but via a literal atom/string we want to associate with, to get the best of both worlds.
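
For reference, these are the two forms of the capture operator that exist in Elixir today; the variable names are only illustrative.

```elixir
# Function capture with positional arguments: &1 is the first argument.
add_one = &(&1 + 1)

# Function capture by module, name, and arity.
upcase = &String.upcase/1
```

Both produce anonymous functions, called as add_one.(1) and upcase.("foo").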

Pro: works with existing code

The capture operator today has well-defined compile-time-error semantics if you try to pass it an atom or a string. All Elixir code that compiles today will continue to compile as before.
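
One way to check this claim on a stock Elixir install is to parse and then evaluate the expression from a string. This is a sketch: the exact exception and message may differ between Elixir versions (I would expect a CompileError), so the rescue clause below deliberately catches any exception.

```elixir
# `&:foo` is accepted by the parser today...
{:ok, _ast} = Code.string_to_quoted("&:foo")

# ...but is rejected when the capture is expanded during compilation.
result =
  try do
    Code.eval_string("&:foo")
    :compiled
  rescue
    error -> {:rejected, error.__struct__}
  end
```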

Pro: works with existing tooling

By overloading an existing operator, this approach works seamlessly for me with the syntax highlighters I have tried it with so far, and reasonably well with the formatter.

In my experimentation I've found that the formatter wants to rewrite &:baz to (&:baz) pretty often. That's good, because there are several edge cases in my prototype where not doing so causes it to behave strangely; I'm sure it's resolving ambiguities that would occur in function captures that impact my proposal in ways I have yet to fully anticipate.

Pro: minimizes surface area of the language

By overloading the capture operator instead of introducing a new operator or sigil, we are able to keep the surface area of this feature slim.

Cons: overloads the capture operator

Of course, many of the virtues of this proposal come from overloading the capture operator. But it is already a semantically fraught piece of syntactic sugar that confuses newcomers, and this would place more strain on it.

We would need to augment it with more than the meager error message modification in my prototype, add documentation, and anticipate a new wave of questions from the community upon release.

This inelegance really shows when considering embedding a tagged variable capture inside an anonymous function capture, ex & &1 = &:foo. In my prototype I've chosen to allow this rather than error on "nested captures not allowed" (would probably become: "nested function captures not allowed"), but I'm not sure I found all the edge-cases of mixing them in all possible constructions.

Additionally, since my proposal now allows the capture operator as an associative element inside map literal parsing, the syntax error reported when providing a function capture as an associative element would be generated during expansion rather than during parsing. I am not fluent enough in leex to have updated the parser to preserve the exact old error. Serendipitously, what my prototype reports today is pretty good regardless, though I prefer the old behaviour:

Old:
%{& &1}
# !> ** (SyntaxError) syntax error before '}'
# !> |
# !> 1 | %{& &1}
# !> | ^
New:
%{& &1}
# => error: expected key-value pairs in a map, got: & &1
# => ** (CompileError) cannot compile code (errors have been logged)

Cons: here there be dragons I cannot see

I'm quite sure a full implementation would require a lot more knowledge of the compiler than I am able to provide. For example, &:foo = &:foo raises an exception where (&:foo) = &:foo behaves as expected. I also find the variable/context/binding environment implementation in the erlang part of the compiler during expansion to be impenetrable, and I'm sure my prototype fails on edge cases there.

Open Question: the pin operator

As this feature constructs a variable ref for you, it is not clear if/how we should support attempts to pin the generated variable to avoid new bindings. In my prototype, I have tried to support the pin operator via the &^:atom syntax, though I'm pretty sure it's super buggy on bare out-of-data-structure cases and I only got it far enough to work in function heads for basic function head map pattern matching.
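
To make the question concrete, here is what a pinned capture such as %{&^:status} in a pattern would presumably expand to, written in today's Elixir. The expansion is an assumption on my part (the proposal does not spell it out), and the module and function names are hypothetical.

```elixir
defmodule PinSketch do
  # Assumed expansion of a pattern like `%{&^:status}`:
  # the key is matched against the existing value of `status`
  # instead of rebinding the variable.
  def matches?(map, status) do
    case map do
      %{:status => ^status} -> true
      _ -> false
    end
  end
end
```

Without the pin, the same pattern would rebind status and match any map that has a :status key.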

Open Question: charlists

I did not add support for charlist tagged variable captures in my prototype, as it would be more involved to differentiate a capture of a list meant to become a tagged tuple from a list representing the AST of a function capture. I would not lose a lot of sleep over this.

Open Question: allowed contexts

Would we even want to allow this syntax construct outside of map literals? Or list literals?

I can certainly see people abusing the bare-outside-of-associative-datastructure syntax to make some nigh impenetrable code where it's really unclear where assignment and pattern matching are occurring, and relatedly this is where I see a lot of odd edge-case behaviour in my prototype. I allowed it to speed up the implementation, but it merits more discussion.

On the other hand, this does seem like an... interesting use-case:

error = "rate limit exceeded"
&:error # return error tuple

Thanks for reading! What do you think?

Christopher Keele

Jun 28, 2023, 8:22:29 PM
to elixir-lang-core
An obvious modification to this proposal would be to introduce a new operator for this application, instead of overloading the capture operator. I avoided it because I liked the association with capturing something, I like keeping the language surface area slim, and I'm not competent enough with leex tokenizing or erlang to implement an entirely new construct in the parser, so repurposing & in my prototype got me very far.

However, I would like to make it clear that a new operator is on the table for this proposal. It would allow us to define more precise precedence, binding, and parsing rules without lots of checks in the compiler for what we are capturing.

AFAICT the remaining punctuation characters on the standard English QWERTY keyboard that do not correspond to semantically meaningful tokens in Elixir are the backtick (`), the question mark (?), and the dollar sign ($).
  • Using the backtick is interesting/complicated because of its association with macro unquoting in other languages. I don't like how difficult it is to notice visually compared to what this syntax is doing.
  • Using the question mark is problematic because of its association with optionality and null-chaining-avoidance in other languages. I don't like how it semantically overlaps with the convention of ending predicate function names in Elixir with it.
  • Using the dollar sign is promising. It is associated with all sorts of odd jobs in other languages, such as the rarely used global variable sigil in Ruby, the heavily used local variable sigil in PHP, and the DOM ID accessor in JavaScript browsers. I don't like that it is a very localized character, so it might be difficult to type on international or regional keyboards. However, in my research on keyboards over the years, most of the commonly used keyboards have it, and because of its prevalence in other languages' syntax, most programmers have solutions for typing it regardless.

Paul Schoenfelder

Jun 28, 2023, 8:41:05 PM
to 'Justin Wood' via elixir-lang-core
My thoughts on the proposal itself aside, I’d just like to say that I think you’ve set a great example of what proposals on this list should look like. Well done!

I have an almost visceral reaction to the use of capture syntax for this though, and I don’t believe any of the languages you mentioned that support field punning do so in this fashion. They all use a similar intuitive syntax where the variable matches the field name, and they don’t make any effort to support string keys.

If Elixir is to ever support field punning, I strongly believe it should follow their example. However, there are reasons why Elixir cannot do so due to syntax ambiguities (IIRC). In my mind, that makes any effort to introduce this feature a non-starter, because code should be first and foremost easy to read, and I have yet to see a proposal for this that doesn’t make the code harder to read and understand, including this one.

I’d like to have field punning, but by addressing, if possible, the core issue that is blocking it. If that can’t be done, I just don’t think the cost of overloading unrelated syntax is worth it. I think calling the `&…` syntax “capture syntax” is actually misleading, and only has that name because it can be used to construct closures by “capturing” a function name, but it is more accurate to consider it closure syntax, in my opinion. Overloading it to mean capturing things in a more general sense will be confusing for everyone, and would only work in a few restricted forms, which makes it more difficult to teach and learn.

That’s my two cents anyway, I think you did a great job with the proposal, but I’m very solidly against it as the solution to the problem being solved.

Paul
--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

Christopher Keele

Jun 28, 2023, 8:45:03 PM
to elixir-lang-core
> My thoughts on the proposal itself aside, I’d just like to say that I think you’ve set a great example of what proposals on this list should look like. Well done!

Much appreciated!

> I have an almost visceral reaction to the use of capture syntax for this though.

> I think calling the `&…` syntax “capture syntax” is actually misleading, and only has that name because it can be used to construct closures by “capturing” a function name, but it is more accurate to consider it closure syntax, in my opinion.

This is a very salient point. How do you feel about introducing a new operator for this sugar, such as $:foo?

Zach Daniel

Jun 28, 2023, 8:49:58 PM
to elixir-l...@googlegroups.com
I agree with Paul on the specific operator, however I feel like you've just done a great thing laying out most if not all of the considerations we've had on this conversation to date. I know that these conversations can get long and sometimes not produce fruit, but I feel like we should try to chase this down and come to a conclusion on the matter if at all possible.

My personal suggestion on the proposal would be to use `$` as you've said, and to remove the need for `:` for the case of atoms.

%{$foo, $bar} = %{foo: 10, bar: 10}

%{$"foo", $"bar"} = map

It is a new operator, but it feels expressive to me, and the $ currently has no use in mainstream Elixir syntax (it's used in ETS match specs as a value, not as an operator). That seems like a good solution to me.


To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/2b46232e-04f1-4b21-87e6-9c098741cd36n%40googlegroups.com.

Zach Daniel

Jun 28, 2023, 8:50:59 PM
to elixir-l...@googlegroups.com
Perhaps removing the `:` is a bad idea actually? I see now it removes the consistency on values. Either way, let's not lose the momentum on this! I feel like we can get this over the line (or put the last nails in its coffin for good).




Christopher Keele

Jun 28, 2023, 9:02:01 PM
to elixir-lang-core
> My personal suggestion on the proposal would be to use `$` as you've said

> Either way, let's not lose the momentum on this!

I'm working on a prototype and proposal for exactly that as we speak, but for the sake of momentum on the prototype, I repurposed the capture operator in my initial experimentation. Perhaps I should have framed this as "new feature: tagged-variable captures" first, and mentioned "overloading the capture operator" as a followup extension of the proposal second, instead of the other way around. I got excited by having a working prototype with the existing operator I suppose :D

Christopher Keele

Jun 28, 2023, 9:10:37 PM
to elixir-lang-core
> My personal suggestion on the proposal would be to use `$` as you've said, and to remove the need for `:` for the case of atoms.

> Perhaps removing the `:` is a bad idea actually? I see now it removes the consistency on values.

I chose to preserve it not only because it is consistent with atom literals and works well with syntax highlighting today, but also because leaving the bareword identifier behaviour open would allow extensions to this functionality down the line. I'm thinking specifically of a struct field access syntax like:

uri = URI.parse("postgresql://username:password@host:5432/database_name?schema=public")
[$uri.host, $uri.port]
#=> [host: "host", port: 5432]

However, I wanted to keep the initial proposal simple and focused on the most commonly desired functionality.

Christopher Keele

Jun 28, 2023, 9:28:35 PM
to elixir-lang-core
> My personal suggestion on the proposal would be to use `$` as you've said.

> It is a new operator, but it feels expressive to me, and the $ currently has no use in mainstream elixir syntax (its used in ets match specs as a value, not as an operator).

I actually am selfishly rooting for this, because then I might be able to do tricky things to allow literal $1 and $$ Elixir syntax in my elixir-to-matchspec compiler.

Paul Schoenfelder

Jun 28, 2023, 9:36:15 PM
to 'Justin Wood' via elixir-lang-core
I do think there is value in proposing the "tagged variable captures" idea separately, but at the same time, your solution for field punning is part of the value proposition there. That said, as you've already noted, it is very easy for the conversation to get bogged down when more than one thing is being discussed at a time.

> This is a very salient point. How do you feel about introducing a new operator for this sugar, such as $:foo?

The first thing that sticks out to me is that there are a variety of places where atoms starting with `$` occur in practice (particularly around ETS), so I could see things like `$:$$` appearing in code, which is just...no. Of course, an argument could be made that one should just not do that, but it is something to consider. Obviously, you can't get rid of the `:` for the same reason.
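
For anyone unfamiliar with the ETS usage being referred to, here is a minimal example of `$`-prefixed atoms in a match pattern, using the standard :ets module. The table name is made up for illustration.

```elixir
table = :ets.new(:dollar_atoms_demo, [:set])
true = :ets.insert(table, {:foo, 1})

# :"$1" is an ordinary quoted atom that ETS treats as a pattern variable.
# :ets.match/2 returns one list of bindings per matching object.
matches = :ets.match(table, {:"$1", 1})
```

Note that these atoms need quotes (`:"$1"`), so with a `$` operator and a literal atom the result would look more like `$:"$1"`.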

But the idea of an operator more generally? I guess it would really depend on the specific choice. I don't like it in principle, but I'd want to cast my vote with a specific syntax in question, such as those you've proposed. As I mentioned in my previous reply, I really think the best path for Elixir with regard to field punning is to solve the syntax ambiguities that prevent the "obvious" syntax for it, e.g. `%{foo, bar} = baz`, and only focus on supporting atom keys. That may not be possible without backwards-incompatible changes to the grammar, in which case it's something to throw on the wishlist of things that could go in an eventual Elixir 2.0.

I think it's important to cast the feature in a broader context, because I think everyone would agree that field punning is a nice-to-have. But is the tradeoff in complexity for the language really worth it? The more explicit syntax is (perhaps) more annoying to write, but I think the vast majority would agree that it is simple, clear, and easy to reason about. When we're arguing for field punning, we're really arguing for a significant benefit when writing code, but only in the "obvious" syntax I gave an example of above do I think one can argue that there is any benefit in terms of readability, and even then it is a small benefit. It adds cognitive overhead, particularly for new Elixir developers, as one must desugar the syntax in their head. I don't think that cognitive overhead is significant, but it is only one thing amongst many that one must carry around in their head when working with Elixir code - we should aim to reduce that overhead rather than add to it.

Anyway, I don't think I'm adding anything new to the arguments that have been made in the past, so I don't want to derail your proposal here, or add to the noise, particularly with regard to the "tagged variable captures" portion, which deserves its own consideration. I will leave it up to the community at large to decide, but just want to say thanks again for putting so much effort into summarizing the current state of the discussion and implementing a prototype of your proposal - it certainly gives it a lot more weight to me.

Paul

Christopher Keele

Jun 28, 2023, 10:30:41 PM
to elixir-lang-core
I've figured out the tokenizer enough to prototype this as a new operator; working title the "tagged variable literal" operator (not in love with that name). I'm using a dollar sign ($) to represent it.

It has the same issues as before, as I've ungracefully wedged it between the capture operator and other precedences, but now is logically separated from the capture operator. Weird stuff still happens without wrapping it in parens in certain contexts, for example; but I think it's enough to continue discussion around this proposal if we want to refocus it around a new operator.

I'm happy to refine the branch further and work on a PR, but would need much guidance, and so would rather leave it as is for now without more feedback on the proposal and related blessings, as I would need more core-team support to implement it than I did defguard. Still sounds really fun to do.

The source code for this new fork of Elixir is available here for experimentation. For convenience, here are the examples in this proposal reworked to use a dedicated $ operator for compile-time tagged variable literals. They all work in iex on my fork, although many obvious usages of it do not without more work:

Bare Tagged Variable Literals

$:foo
# == {:foo, foo}
# => {:foo, 1}
$"foo"
# == {"foo", foo}
# => {"foo", 1}

If bare usage is supported, this expansion would work as expected in match and guard contexts as well, since it expands before variable references are resolved:

{:foo, baz} = $:foo
# == {:foo, baz} = {:foo, foo}
# => {:foo, 1}
baz
# => 1

Tagged Variable Literals in Lists

Since tagged variable expressions are allowed in lists, this can be used to construct Keyword lists from the local variable scope elegantly:

list = [$:foo, $:bar]
# == list = [{:foo, foo}, {:bar, bar}]
# => [foo: 1, bar: 2]

This would work with other list operators like |:

baz = 3
list = [$:baz | list]
# == list = [{:baz, baz} | list]
# => [baz: 3, foo: 1, bar: 2]

And list destructuring:

{foo, bar, baz} = {nil, nil, nil}
[$:baz, $:foo, $:bar] = list
# == [{:baz, baz}, {:foo, foo}, {:bar, bar}] = list
# => [baz: 3, foo: 1, bar: 2]
{foo, bar, baz}
# => {1, 2, 3}

Tagged Variable Literals in Maps

With a small change to the parser, we can allow this expression inside map literals. Because each such expression gets expanded into a tagged tuple before the map's association list as a whole is processed, this syntax works in all existing map/struct constructs, like map construction:

map = %{$:foo, $"bar"}
# == %{:foo => foo, "bar" => bar}
# => %{:foo => 1, "bar" => 2}

Map updates:

foo = 3
map = %{map | $:foo}
# == %{map | :foo => foo}
# => %{:foo => 3, "bar" => 2}

And map destructuring:

{foo, bar} = {nil, nil}
%{$:foo, &"bar"} = map
# == %{:foo => foo, "bar" => bar} = map
# => %{:foo => 3, "bar" => 2}
{foo, bar}
# => {3, 2}

Christopher Keele

Jun 28, 2023, 10:42:13 PM
to elixir-lang-core
Again, for the purposes of the code examples above, assume:

{foo, bar} = {1, 2}

Additionally,
  • Lines beginning with # ==  indicate what the compiler expands an expression to.
  • Lines beginning with # =>  represent the result of evaluating that expression.
  • Lines beginning with # !>  represent an exception.


Christopher Keele

Jun 28, 2023, 10:48:27 PM
to elixir-lang-core
In reference to this discussion:

> This is going to be a big deal for Phoenix. In our channels where we're matching against the message, we have lines like this pervasively:
> def handle_in(event, %{"chat" => chat, "question_id" => question_id, "data" => data, "attachment" => attachment, ...}, socket) do...
> def handle_in(event, %{chat, question_id, data, attachment, ...}, socket) do...

This would allow us to write:

def handle_in(event, %{$"chat", $"question_id", $"data", $"attachment", ...}, socket) do...

Austin Ziegler

Jun 28, 2023, 11:16:10 PM
to elixir-l...@googlegroups.com
On Wed, Jun 28, 2023 at 8:41 PM Paul Schoenfelder <paulscho...@fastmail.com> wrote:
I have an almost visceral reaction to the use of capture syntax for this though, and I don’t believe any of the languages you mentioned that support field punning do so in this fashion. They all use a similar intuitive syntax where the variable matches the field name, and they don’t make any effort to support string keys.

JavaScript only supports string keys. Ruby’s pattern matching, which can lead to field punning, only supports symbol keys, but since ~2.2 Ruby can garbage collect symbols, making it somewhat less dangerous to do `JSON.parse(data, symbolize_names: true)` than it was previously.

As far as I know, the BEAM does not do any atom garbage collection, and supporting *only* atoms will lead to a greater chance of atom exhaustion, because a non-flagged mechanism here that only works on atom keys will lead to `Jason.decode(data, keys: :atoms)` (and not `Jason.decode(data, keys: :atoms!)`). I do not think that any destructuring syntax which works on maps with atom keys but not string keys will be acceptable, although if it is constrained to *only* work on structs, then it does not matter (as that is the same restriction that it appears that OCaml and Haskell have).
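
The atom-exhaustion concern can be illustrated with the standard library alone. The sketch below (module and function names are hypothetical) converts an untrusted string key to an atom only when that atom already exists, which is the safe behaviour; String.to_atom/1 would instead create a new, never-collected atom for every distinct input.

```elixir
defmodule AtomSafetySketch do
  # Converts an untrusted string key only if the atom already exists,
  # so arbitrary input cannot grow the atom table.
  def safe_key(key) when is_binary(key) do
    {:ok, String.to_existing_atom(key)}
  rescue
    ArgumentError -> :error
  end
end
```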

I think that either `&:key` / `&"key"` or `$:key` / `$"key"` will work very nicely for this feature, although it would be nice to have `&key:` or `$key:` work the same as the former version. Alternatively, the `$` symbol could be used at the beginning of the data structure to indicate that it is performing capture destructuring (e.g., `$%{key1:, key2:}` or `$%{"key1", "key2"}`), but then it starts feeling a little more line-noisy.

I think that the proposal here — either using `&` or `$` — is entirely workable and IMO extends the concept nicely.

-a

Christopher Keele

Jun 28, 2023, 11:32:20 PM
to elixir-lang-core
> Alternatively, the `$` symbol could be used at the beginning of the data structure to indicate that it is performing capture destructuring (e.g., `$%{key1:, key2:}` or `$%{"key1", "key2"}`, but then it starts feeling a little more line-noisy.

I agree that'd be noisy. Also, it might make mixing tagged variable literals, literal => pairs, and trailing keyword pairs even more confusing.

Consider today that we support:
%{"fizz" => "buzz", foo: :bar}
# => %{:foo => :bar, "fizz" => "buzz"}

But do not support:
%{foo: :bar, "fizz" => "buzz"}
# !> ** (SyntaxError) invalid syntax found on iex:5:12:
# !>     ┌─ error: iex:5:12
# !>     │
# !>   5 │ %{foo: :bar, "fizz" => "buzz"}
# !>     │            ^
# !>     │
# !>     unexpected expression after keyword list. Keyword lists must always come last in lists and maps. Therefore, this is not allowed:
# !>
# !>         [some: :value, :another]
# !>         %{some: :value, another => value}
# !>
# !>     Instead, reorder it to be the last entry:
# !>
# !>         [:another, some: :value]
# !>         %{another => value, some: :value}
# !>
# !>     Syntax error after: ','

Supporting $%{key1:, key2:} or $%{"key1", "key2"} obfuscates this situation even further.

Christopher Keele

Jun 28, 2023, 11:44:44 PM
to elixir-lang-core
Posted that last reply early. continued:

Part of the elegance of making $:foo and $"bar" expand to a valid pair, right before Map expansion handles pairs as {:%{}, [], [...pairs]}, is that it could easily allow us to support mixing tagged variable captures anywhere in the existing syntax constructs. This is not true of my prototype today, though; it would need more work based on how we decide to handle it:

{foo, bar, baz} = {1, 2, 3}

%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz}
# => %{:fizz => :buzz, :foo => 1, "bar" => 2, "fizz" => "buzz"}

%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}
# !> ** (SyntaxError) invalid syntax found on iex:12:47:
# !>     ┌─ error: iex:12:47
# !>     │
# !>  12 │ %{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}
# !>     │                                               ^
# !>     │
# !>     unexpected expression after keyword list. Keyword lists must always come last in lists and maps. Therefore, this is not allowed:
# !>
# !>         [some: :value, :another]
# !>         %{some: :value, another => value}
# !>
# !>     Instead, reorder it to be the last entry:
# !>
# !>         [:another, some: :value]
# !>         %{another => value, some: :value}
# !>
# !>     Syntax error after: ','
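
The AST shape mentioned above can be inspected in any current Elixir with quote. In this small sketch, which uses only literals, both the `=>` pair and the trailing keyword pair end up as plain two-element tuples in a single pairs list, which is why an expansion into a tagged tuple slots in so naturally.

```elixir
# Quote a map literal that mixes a `=>` pair with a trailing keyword pair.
ast = quote do: %{"fizz" => "buzz", foo: :bar}

# A map literal is represented as {:%{}, meta, pairs}.
{:%{}, _meta, pairs} = ast
# pairs is [{"fizz", "buzz"}, {:foo, :bar}]
```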

Christopher Keele

Jun 28, 2023, 11:56:13 PM
to elixir-lang-core
%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}

Personally, my preference would be to disallow this usage, but perhaps with an even more instructive compiler error message.

In fact, I think that we could leverage most existing errors/warnings today, as long as things like the compiler error reporter desugar this feature before reporting, to make it clearer upon error what is actually going on in a variety of circumstances. This would give us something more like:

%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}
# !> ** (SyntaxError) invalid syntax found on iex:12:47:
# !>     ┌─ error: iex:12:47
# !>     │
# !>  12 │ %{:foo => foo, "fizz" => "buzz", "bar" => bar, fizz: :buzz, :baz => baz}
# !>     │                                               ^
# !>     │
# !>     unexpected expression after keyword list. Keyword lists must always come last in lists and maps.
# !>
# !>     Syntax error after: ','

Christopher Keele

Jun 29, 2023, 12:05:00 AM
to elixir-lang-core
An alternative would be to prepend compiler issues with a depiction of how the sugar expands. Something like:

%{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz}
# !> warning: expanding %{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz, $:baz} into a syntactically invalid construct:
# !> %{:foo => foo, "fizz" => "buzz", "bar" => bar, fizz: :buzz, :baz => baz}
# !> iex:12:47
# !> ** (SyntaxError) invalid syntax found on iex:12:47:
# !>     ┌─ error: iex:12:47
# !>  12 │ %{:foo => foo, "fizz" => "buzz", "bar" => bar, fizz: :buzz, :baz => baz}
# !>     │                                               ^
# !>     │
# !>     unexpected expression after keyword list. Keyword lists must always come last in lists and maps.
# !>
# !>     Syntax error after: ','

Christopher Keele

Jun 29, 2023, 12:13:18 AM
to elixir-lang-core
The above is how we dealt with undefined variable references until recently (I think 1.15?): warn about the problematic expansion, error on the expanded syntax.

Christopher Keele

Jun 29, 2023, 12:16:56 AM
to elixir-lang-core
Specifically, we expanded an undefined variable foo to a function call foo(), warned about the expansion, then reported a compile-time error about the missing function, instead.

Justin Wood

Jun 29, 2023, 12:27:48 AM
to elixir-l...@googlegroups.com
This proposal mentions OCaml, Haskell and JS as prior works of art for
this type of feature. I think a key thing to point out is that in those
languages, they did not need to add additional syntax in order to
support this.

In OCaml, the syntax goes from

{ foo = foo; bar = bar }

to

{ foo; bar }

Haskell starts with

C { foo = foo, bar = bar }

and turns into

C { foo, bar }

And lastly, Javascript uses

{ foo: foo, bar: bar }

which can be used as

{ foo, bar }

Note the lack of additional syntax surrounding these features.

> {foo, bar, baz} = {1, 2, 3}
>
> %{$:foo, "fizz" => "buzz", $"bar", fizz: :buzz}
> # => %{:fizz => :buzz, :foo => 1, "bar" => 2, "fizz" => "buzz"}

If I were coming from one of the above languages (or any other language
that supports this feature), I would not look at this syntax and say
"This is field punning". I would have no intuition what is going on.

Speaking as someone that has a decent amount of Elixir experience,
$"bar" looks like it should be closer in functionality to :"bar" than
field punning. Or maybe even similar to using ? to find the codepoint of
a single character. Something to keep in mind, Erlang actually uses $
for the same purpose that Elixir uses ?. I'm not saying Elixir couldn't
use the same token/operator for a different purpose, I just think it is
something that should be considered.

Justin

Christopher Keele

Jun 29, 2023, 12:40:39 AM
to elixir-lang-core
> This proposal mentions OCaml, Haskell and JS as prior works of art for
> this type of feature. I think a key thing to point out is that in those
> languages, they did not need to add additional syntax in order to
> support this.

This is true, and the discomfort extends to Ruby as well.

For reasons explained in Austin's reply, a "barewords" implementation is not viable in Elixir, because of the prevalence of both atom and string key types.

IMO, discussing the nuance of whether a barewords representation should prefer atoms or strings is what has been continually holding this feature up for a decade, and that's what this proposal tries to move past.

Perhaps in an ideal Elixir 2.0 future if we get garbage collection of atoms like Ruby, Phoenix can move over to parsing params with atom-based key pairs, we can drop the operator and atom/string differentiation, and move the entire syntax over to barewords. Worth calling out that this proposal (with a new operator, not the capture operator) could remain backwards-compatible with the proposed syntax if we moved into an atom-oriented Phoenix params parsing Elixir 2.0 future.

As Elixir 2.0 may never get released, famously, this is the only clear path I see forward for our production applications today to get field punning, that skirts issues with prior art.

Christopher Keele

Jun 29, 2023, 1:16:23 AM
to elixir-lang-core
> Perhaps in an ideal Elixir 2.0 future if we get garbage collection of atoms like Ruby, Phoenix can move over to parsing params with atom-based key pairs, we can drop the operator and atom/string differentiation, and move the entire syntax over to barewords.

Indeed, the main argument against a "bare" usage of tagged variable literals (like foo = 1; {:foo, bar} = $:foo; bar #=> 1) is that we could not roll back this syntax without 2.0 breaking changes to only support barewords where applications really want them: within list/map literals only.
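
For reference, here is a minimal sketch of what that bare usage amounts to in Elixir as it exists today. It assumes, per the proposal, that $:foo expands to {:foo, foo}; the $ operator itself is not valid syntax in released Elixir, so the expansion is written out by hand.

```elixir
foo = 1

# Hand-written expansion of the proposed `$:foo` (an assumption taken from
# the proposal; `$` is not an operator in released Elixir).
tagged = {:foo, foo}

# Matching on the tag binds the tagged value to a new variable.
{:foo, bar} = tagged

IO.inspect(tagged)
IO.inspect(bar)
```

Outside of a list or map literal the sugar saves very little over the explicit tuple, which is part of why restricting it to collection literals keeps coming up.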

Christopher Keele

Jun 29, 2023, 1:33:28 AM
to elixir-lang-core
To update the considerations of this proposal if we adopt a new tagged variable literal operator ($) instead:

Pro: solves existing pain points

Pro: works with existing code

The newly proposed operator results in a compile-time-error on older versions of Elixir. All compiling Elixir code today will continue to compile as before.

Con: does not work well with existing tooling

A new operator would require dedicated formatter support and updates to downstream syntax highlighters.

Con: does not minimize surface area of the language

A new operator is more new syntax to learn; although one could readily argue that this is worth not making the capture operator a more complicated syntax to learn.

Pro: does not overload the capture operator

Open Question: charlists

A dedicated operator resolves the AST parsing ambiguity that arose when using the capture operator. I have added charlist tagged variable literal support to my new-operator prototype branch. It would make parsing of non-literal expressions like field access, part of the future nice-to-have feature set, more difficult, but still much more tractable than re-appropriating the capture operator.

Christopher Keele

Jun 29, 2023, 2:44:56 AM
to elixir-lang-core
> In reference to this discussion:
>
>> This is going to be a big deal for Phoenix. In our channels where we're matching against the message, we have lines like this pervasively:
>>
>> def handle_in(event, %{"chat" => chat, "question_id" => question_id, "data" => data, "attachment" => attachment, ...}, socket) do...
>> def handle_in(event, %{chat, question_id, data, attachment, ...}, socket) do...
>
> This would allow us to write:
>
def handle_in(event, %{$"chat", $"question_id", $"data", $"attachment", ...}, socket) do...

As a vanity benchmark, this $ operator syntax is a 31% reduction in line length (from 132 characters to 91), compared to the other "barewords" proposal's nicer 40% reduction (from 132 characters to 79) for this particular use case. However, it paves the way for field punning today, rather than waiting for atom garbage collection, a future Elixir 2.0 release, and Phoenix atom-key params parsing support.

It also addresses two of the original pain points of this feature that future barewords support would not:

2) Visual clarity that atom/string matching is occurring
4) Easy confusion with tuples
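
The line-length figures above can be checked with a few lines of Elixir. This is only a sketch: the three clause heads are copied verbatim from the thread as raw strings, and the LineLength module is a hypothetical helper introduced for illustration.

```elixir
defmodule LineLength do
  # Percentage by which `shorter` reduces the length of `longer`,
  # rounded to one decimal place.
  def reduction(longer, shorter) do
    a = String.length(longer)
    b = String.length(shorter)
    Float.round((a - b) / a * 100, 1)
  end
end

# ~S keeps the text raw: no interpolation and no escape processing.
verbose = ~S|def handle_in(event, %{"chat" => chat, "question_id" => question_id, "data" => data, "attachment" => attachment, ...}, socket) do...|
dollar = ~S|def handle_in(event, %{$"chat", $"question_id", $"data", $"attachment", ...}, socket) do...|
barewords = ~S|def handle_in(event, %{chat, question_id, data, attachment, ...}, socket) do...|

lengths = Enum.map([verbose, dollar, barewords], &String.length/1)
dollar_reduction = LineLength.reduction(verbose, dollar)
barewords_reduction = LineLength.reduction(verbose, barewords)

IO.inspect(lengths)
IO.inspect({dollar_reduction, barewords_reduction})
```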

Paul Schoenfelder

Jun 29, 2023, 2:51:16 AM
to 'Justin Wood' via elixir-lang-core
> For reasons explained in Austin's reply, a "barewords" implementation is not viable in Elixir, because of the prevalence of both atom and string key types.
>
> IMO, discussing the nuance of whether a barewords representation should prefer atoms or strings is what has been continually holding this feature up for a decade, and that's what this proposal tries to move past.

I don't agree that the rationale given by Austin is sufficient to reject a barewords-only implementation of field punning in Elixir. It is not at all clear to me why supporting string keys is critical to the feature, and I especially don't find the argument that people will ignore all of the plentiful advice about avoiding atom table exhaustion just so they can use field punning (e.g. switching to `Jason.decode(.., keys: :atoms)`) compelling, at all. There will always be people who find a way to do dumb things in their code, but languages (thankfully) don't base their designs on the premise that most of their users are idiots, and I don't see why it would be any different here.

I've seen this debate come up over and over since the very first time it was brought up on this list, and there is a good reason why it keeps dying on the vine. The justification for field punning is weak to begin with, largely sugar that benefits the code author rather than the reader, and syntax sugar must carry its own weight in the language, and the only chance of that here is by building on the foundations laid by other languages which have it. Doing so means readers are much more likely to recognize the syntax for what it is, it adds no new sigils/operators, and it is narrowly scoped yet still convenient in many common scenarios. If anything, the desire to make this work for string keys is what keeps killing this feature, not the other way around.

I really don't want this thread to devolve into argument like many of the others on this topic, but making statements like "a barewords implementation is not viable in Elixir" is not doing any favors. It is factually untrue, and the premise of the statement is based entirely on an opinion. If this thread is going to have any hope of making progress, broad assertions of that nature better be backed up with a lot of objective data. Make the case why extra syntax is better than the more limited barewords-only implementation, for example, by enabling support for string keys, by offering a syntax construct that can be used in more places, etc. It isn't necessary for your proposal to torpedo other solutions in order to succeed, and has a better chance of doing so if you don't.

Paul

José Valim

Jun 29, 2023, 3:02:04 AM
to elixir-l...@googlegroups.com
Hi Chris Keele, thank you for the excellent proposal. I just want to add that I agree with Paul that we don't need to support both strings and atoms, but it must be clear that it applies to either strings or atoms (if it supports only one of them) and the reason for that is because otherwise it will add to the string vs atom confusion that already exists in the language. Someone would easily write def show(conn, %{id}) and be surprised why it doesn't match.
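
To illustrate that surprise in today's syntax, here is a small sketch. It assumes a hypothetical barewords %{id} would expand to the atom-key pattern %{id: id}, while Phoenix-style params arrive with string keys; the params map is made up for illustration.

```elixir
# Phoenix-style params use string keys (this map is a made-up example).
params = %{"id" => "42"}

# What a barewords `%{id}` would mean if it expanded to atom keys:
atom_key_match? = match?(%{id: _id}, params)

# What the author of `show/2` almost certainly intended:
string_key_match? = match?(%{"id" => _id}, params)

IO.inspect({atom_key_match?, string_key_match?})
```

The atom-key pattern silently fails to match, so a function clause written that way would simply never be selected.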

A couple additional thoughts to the thread:

* : in JS and = in Haskell/OCaml are operators. : in Elixir is not an operator

* &:foo/$:foo as a shortcut for {:foo, foo} is interesting but note that "foo: foo" already works as a shortcut in select places - so we would introduce more ways of doing something similar

* Elixir and Ruby share a lot syntax-wise, it may be worth revisiting what they do and which points arose in their discussions/implementations
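
As a quick sketch of the existing shortcut mentioned above, these are places where "foo: foo" already stands in for {:foo, foo} in current Elixir:

```elixir
foo = 1

# Inside a list, `foo: foo` is sugar for the tagged tuple {:foo, foo}.
keyword = [foo: foo]
same_as_tuples? = keyword == [{:foo, foo}]

# As the last argument of a call, the surrounding brackets can be dropped.
map = Map.new(foo: foo)

# Map literals accept the same form for atom keys.
literal = %{foo: foo}
same_as_arrow? = literal == %{:foo => foo}

IO.inspect({keyword, map, same_as_tuples?, same_as_arrow?})
```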

Christopher Keele

Jun 29, 2023, 3:29:16 AM
to elixir-lang-core
Honestly, I do not adore the syntax of the proposed solution, in either capture or $ operator incarnation. I would also prefer barewords.

Re: Paul's note:

>  It is not at all clear to me why supporting string keys is critical to the feature

100%, Phoenix params parsing support. This is the major obvious use case for this proposal among full-stack devs today. If garbage collection of atoms is implemented in Erlang, we could deprecate the proposed syntax readily.

Most of my personal Elixir development does not use Phoenix, so I do empathize with the sentiment and prefer atoms/barewords, but have tried to accommodate the outcry for this feature in this proposal, contending with popularity of JS's barewords implementation, concerning fullstack Phoenix development on

> the Elixir forum (1, 2, 3, 4, 5, 6)

Re: José's note:

> I agree with Paul that we don't need to support both strings and atoms, but it must be clear that it applies to either strings or atoms.

I would also prefer only supporting atoms, or even as a compromise with string confusion, only structs. Previous proposals have flighted this before, and have not succeeded.

I would argue that if we want to support only atoms, but make it clear that the syntax only applies to atoms, before an Elixir 2.0, we must leverage atom literals in the feature. The addition of a new operator (or, overloading of the capture operator in previous incarnations of this proposal) is the only way to accomplish this today.

If we really wanted to drive this home, we could only support atom literals in the proposal, and drop the support for strings; however, I don't see a way to resolve this tension today without employing atom literals in the feature's syntax.

Re: Paul's note:

> I really don't want this thread to devolve into argument like many of the others on this topic, but making statements like "a barewords implementation is not viable in Elixir" is not doing any favors. It is factually untrue, and the premise of the statement is based entirely on an opinion. If this thread is going to have any hope of making progress, broad assertions of that nature better be backed up with a lot of objective data.

I wish there were a data-driven way to approach language design. The only tool I know of is flighting proposals with working prototypes.

> Make the case why extra syntax is better than the more limited barewords-only implementation, for example, by enabling support for string keys, by offering a syntax construct that can be used in more places, etc. It isn't necessary for your proposal to torpedo other solutions in order to succeed, and has a better chance of doing so if you don't.

This proposal makes a case for this syntax being better than a more limited barewords-only implementation. Specifically, it enables support for string keys, and offers a syntax construct that can be used in more places (as a specific example, error = "rate limit exceeded"; $:error # return error tuple). Apologies if it feels like I am trying to torpedo other solutions, that is not my intent at all.

José Valim

Jun 29, 2023, 3:33:22 AM
to elixir-l...@googlegroups.com
> I would argue that if we want to support only atoms, but make it clear that the syntax only applies to atoms, before an Elixir 2.0, we must leverage atom literals in the feature. The addition of a new operator (or, overloading of the capture operator in previous incarnations of this proposal) is the only way to accomplish this today.

As a counter point: Ruby has added this feature as {foo:, bar:}, which would have a direct translation to Elixir. Source: https://bugs.ruby-lang.org/issues/14579

> Apologies if it feels like I am trying to torpedo other solutions, that is not my intent at all.

You are doing great. You defend your proposal and ideas. :)

Christopher Keele

Jun 29, 2023, 3:49:28 AM
to elixir-lang-core
> As a counter point: Ruby has added this feature as {foo:, bar:}, which would have a direct translation to Elixir. Source: https://bugs.ruby-lang.org/issues/14579

As a Rubyist who came to Elixir in the early days for personal projects before that Ruby syntax was implemented, and has only been professionally an engineering team manager of Python, JS, and TS applications since: I like the explicitness of Ruby's notation here, but still really hate how it reads and syntax highlights. :`)

That is just a personal opinion though, out of context of the utility of this proposal. However, I believe that incarnation for Elixir has been proposed before, and I am just searching for alternatives that would still enable field punning sooner rather than later.

> You are doing great. You defend your proposal and ideas. :)

Thank you! It is not easy to defend a language syntax proposal I do not personally adore the syntax of; but I imagine that's what many people felt like for Ruby's equivalent, with {foo:, bar:} (as I did at the time). I earnestly believe that this idea could mitigate pain points with Elixir adoption while reasonably contending with ES6 barewords syntax we are not yet able to adopt. However, I would not be heartbroken if we agreed that waiting for Elixir 2.0 and/or atom garbage collection was the right play here.

José Valim

Jun 29, 2023, 3:54:28 AM
to elixir-l...@googlegroups.com
There is another idea here, which is to fix this at the tooling level.

For example, we could write %{foo, bar} and have the formatter automatically expand it to: %{foo: foo, bar: bar}. So you get the concise syntax when writing, the clear syntax when reading. Since most editors format on save nowadays, it can be beneficial. Executing code with the shortcut syntax will print a warning saying you must format the source file before.

Another idea is to improve Elixir LS itself to suggest the variable name itself after ":". So if I type "%{foo:", it immediately suggests " foo". So, once again, easy to write, easy to read.

Christopher Keele

Jun 29, 2023, 3:59:39 AM
to elixir-lang-core
> Another idea is to improve Elixir LS itself to suggest the variable name itself after ":". So if I type "%{foo:", it immediately suggests " foo". So, once again, easy to write, easy to read.

I think this is part of the popularity of the opinion that some such syntax should only work for structs: with Elixir LS today, starting to type a `key:` in a struct/map literal does indeed suggest from the list of known struct keys. I don't see this being impossible in LS tooling today, but also don't know much about what is possible with the language server protocol today. :)

José Valim

Jun 29, 2023, 4:03:28 AM
to elixir-l...@googlegroups.com
It is probably a path at least worth exploring. It may address the issue, without changes to the language, in a way it also feels natural without impacting code readability.

Christopher Keele

Jun 29, 2023, 4:05:09 AM
to elixir-lang-core
> There is another idea here, which is to fix this at the tooling level.
> For example, we could write %{foo, bar} and have the formatter automatically expand it to: %{foo: foo, bar: bar}.

I do like this notion, but am worried about fragmentation at the tooling level.

I see a syntax addition (even if an ephemeral operator, deprecated with atom garbage collection) as an elegant way to traverse this. If we lean into the syntax addition of this proposal, Paul does make a valid point:

> Make the case why extra syntax is better than the more limited barewords-only implementation, for example, by enabling support for string keys, by offering a syntax construct that can be used in more places, etc.

Hence the proposal. However, I think a constructive outcome of this discussion could be proposing exactly that expansion to tooling maintainers. :)

Wojtek Mach

Jun 29, 2023, 4:14:51 AM
to elixir-l...@googlegroups.com
> %{foo, bar}

Just throwing it out there that this notation would make for a very nice MapSet literal. :)

Christopher Keele

Jun 29, 2023, 4:21:41 AM
to elixir-lang-core
> Just throwing it out there that this notation would make for a very nice MapSet literal. :)

:D I am deeply convinced that this is why Python, in its walrus operator attempt to keep up with Ruby features, has not offered a response to Ruby's Hash assignment destructuring of {foo:, bar:}: the {} literal syntax in Python is already overloaded for both its native associative array literal construct (dicts) and set literal construct (sets), which share {} as delimiters and cause endless confusion already.

José Valim

Jun 29, 2023, 5:04:16 AM
to elixir-l...@googlegroups.com
This is a separate convo but I don’t see value in adding a primary syntax for sets given we can’t really use it in pattern matching. A syntax for only adding and removing set elements is, IMO, not worth it.

Christopher Keele

Jun 29, 2023, 5:41:39 AM
to elixir-lang-core
> This is a separate convo but I don’t see value in adding a primary syntax for sets given we can’t really use it in pattern matching. A syntax for only adding and removing set elements is, IMO, not worth it.

I whole-heartedly agree. The indirection of our set implementation and lack of literals served us well during our transition off of HashSet, towards MapSet, and the motivating Collectable protocol creation to support Enum.into/2 and enable for/1 :into macro support. Mind you, if Erlang were to introduce a set literal at any point in the future, I would change my opinion here rapidly.
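
As a brief sketch of that indirection at work, sets are built through the Collectable protocol rather than through a literal:

```elixir
# Enum.into/2 relies on the Collectable protocol, which MapSet implements.
from_into = Enum.into([1, 2, 2, 3], MapSet.new())

# The :into option of `for` comprehensions goes through the same protocol.
from_for = for x <- [1, 2, 2, 3], into: MapSet.new(), do: x * 10

IO.inspect(MapSet.to_list(from_into))
IO.inspect(MapSet.member?(from_for, 20))
```

Duplicates collapse in both cases, and membership is checked with MapSet.member?/2 rather than by pattern matching.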

Austin Ziegler

Jun 29, 2023, 9:57:35 AM
to elixir-l...@googlegroups.com
On Thu, Jun 29, 2023 at 2:51 AM Paul Schoenfelder <paulscho...@fastmail.com> wrote:
>> For reasons explained in Austin's reply, a "barewords" implementation is not viable in Elixir, because of the prevalence of both atom and string key types.
>>
>> IMO, discussing the nuance of whether a barewords representation should prefer atoms or strings is what has been continually holding this feature up for a decade, and that's what this proposal tries to move past.
>
> I don't agree that the rationale given by Austin is sufficient to reject a barewords-only implementation of field punning in Elixir. It is not at all clear to me why supporting string keys is critical to the feature, and I especially don't find the argument that people will ignore all of the plentiful advice about avoiding atom table exhaustion just so they can use field punning (e.g. switching to `Jason.decode(.., keys: :atoms)`) compelling, at all. There will always be people who find a way to do dumb things in their code, but languages (thankfully) don't base their designs on the premise that most of their users are idiots, and I don't see why it would be any different here.

Prior to Symbol garbage collection, people had to be told repeatedly not to use symbol keys for JSON parsing in Ruby so that they would get to use the "cleaner" `foo[:bar]` syntax rather than `foo['bar']` or `foo["bar"]`. This *did* inspire the Ruby core team to figure out how to identify *temporary* symbols (created during runtime) as opposed to *permanent* symbols (created during code parsing) so that temporary symbols could be garbage collected, so languages *can* adapt to many of their users being idiots by reducing some sharp edges.

Atom exhaustion is a particularly sharp edge to Elixir that would require convincing at the BEAM level…and I’m less certain that would pass muster. If there are other benefits that could be obtained by identifying and garbage collecting temporary atoms, then it could be added to the BEAM and eventually Elixir would be able to have bareword map deconstruction.
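
A minimal sketch of that sharp edge and of the usual guard against it follows. The key name is made up for illustration; both String functions used are part of the standard library.

```elixir
# Pretend this key arrived in untrusted input such as a JSON payload.
# The unique suffix makes sure no atom with this name already exists.
untrusted_key = "key_never_seen_before_#{System.unique_integer([:positive])}"

# String.to_existing_atom/1 refuses to grow the atom table: it raises
# ArgumentError when the atom does not already exist.
result =
  try do
    String.to_existing_atom(untrusted_key)
  rescue
    ArgumentError -> :rejected
  end

# Atoms the program already knows about convert without trouble.
known = String.to_existing_atom("ok")

# String.to_atom/1 would have succeeded on the untrusted key, permanently
# adding an entry to the atom table, which is not garbage collected.
IO.inspect({result, known})
```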

As Chris suggested, and I completely agree, it’s the pattern matching at the edges — not just Phoenix. I deal with a lot of JSON and CSV parsing, and we almost always try to push those into maps which eventually get transformed into structs. Bareword map deconstruction working with string keys would increase the readability of some of those transformations (but not enough that I think that this is a make-or-break feature for Elixir). Other people want to use bareword map deconstruction working with atom keys only on maps that they have full control over.

Almost everyone agrees that bareword struct deconstruction would be easily accomplished and understood, but that there are some sharp edges where — because structs *are* maps — confusion would enter again (e.g., `%{struct | key}` would need to be specified as `%Struct{struct | key}` if we disallow deconstruction / field punning on maps).

> I've seen this debate come up over and over since the very first time it was brought up on this list, and there is a good reason why it keeps dying on the vine. The justification for field punning is weak to begin with, largely sugar that benefits the code author rather than the reader, and syntax sugar must carry its own weight in the language, and the only chance of that here is by building on the foundations laid by other languages which have it. Doing so means readers are much more likely to recognize the syntax for what it is, it adds no new sigils/operators, and it is narrowly scoped yet still convenient in many common scenarios. If anything, the desire to make this work for string keys is what keeps killing this feature, not the other way around.

Having used object deconstruction heavily in JavaScript / TypeScript, I disagree that it primarily benefits the code author. It really does help the readability of the code in general, at least for those languages. I’m less convinced that it would be of great benefit in Elixir except for accidental misspellings (e.g., `%{catalog: catelog}` type errors; the field punning here would benefit both writer and reader by reducing accidental errors like this).

I think that your assertion that it is the desire for string keys killing this proposal is incorrect — I have seen complaints about excess similarity to tuple declaration raised in each discussion, too (misreading `%{ok, reason}` as `{ok, reason}`).

I think that if the Elixir community wants bareword field punning, then it should be limited to things which have *fields*: structs (and maybe sugar could be built for Erlang record support). IMO, maps don’t have fields, they have keys with values. Once the rules around struct field punning could be worked out, then this could be introduced and the discussion could be laid to rest. As far as I can tell, neither OCaml nor Haskell support it for maps, and TypeScript will scream about it for string records (e.g., `Record<string, any>` or `type foo = { [key: string]: any }` or `object`) because it can’t do any validation of the value under the key.

Otherwise, I think we’re going to need to accept that such destructuring would need some sort of syntax added.

-a

IMO, it would be too magic to do something like `%{"foo", 'bar', :baz} = %{"foo" => 1, 'bar' => 2, :baz => 3}; {foo, bar, baz} #=> {1, 2, 3}`, but it *would* allow for atom, string, and charlist keys being deconstructed without much excess verbosity.

Austin Ziegler

Jun 29, 2023, 10:07:23 AM
to elixir-l...@googlegroups.com
On Thu, Jun 29, 2023 at 3:02 AM José Valim <jose....@dashbit.co> wrote:
> Hi Chris Keele, thank you for the excellent proposal. I just want to add that I agree with Paul that we don't need to support both strings and atoms, but it must be clear that it applies to either strings or atoms (if it supports only one of them) and the reason for that is because otherwise it will add to the string vs atom confusion that already exists in the language. Someone would easily write def show(conn, %{id}) and be surprised why it doesn't match.
>
> A couple additional thoughts to the thread:
>
> * : in JS and = in Haskell/OCaml are operators. : in Elixir is not an operator

`:` isn’t an operator in JS, but it is part of the `?:` ternary operator. In all other contexts (object construction, object deconstruction / aliasing, case statements in switches), `:` is syntax.

> * Elixir and Ruby share a lot syntax-wise, it may be worth revisiting what they do and which points arose in their discussions/implementations

In many ways it is a shame that Elixir adopted the `%{"foo": bar}` syntax meaning the same as `%{:"foo" => bar}`, because otherwise we *could* adopt the Ruby key deconstruction approach for atoms, strings, and charlists (`%{foo:, "bar":, 'baz':} = %{:foo => 1, "bar" => 2, 'baz' => 3}`) … and the discussion could be over. :D
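
A short sketch of the existing behaviour described above: a quoted key followed by a colon produces an atom key, not a string key. A key containing a space is used so the quotes are actually required.

```elixir
# Quoted keyword syntax: the key is the atom :"foo bar".
quoted = %{"foo bar": 1}

# The equivalent explicit form with a quoted atom literal.
explicit = %{:"foo bar" => 1}

# A genuinely string-keyed map needs the arrow syntax.
string_keyed = %{"foo bar" => 1}

IO.inspect(quoted == explicit)
IO.inspect(quoted == string_keyed)
IO.inspect(Map.keys(quoted))
```

Because the quoted form is already taken by atoms, a Ruby-style `"bar":` cannot be reused to mean a string key without breaking existing code.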

-a
 

José Valim

Jun 29, 2023, 10:24:06 AM
to elixir-l...@googlegroups.com
: is not an operator at the user level for JS but it behaves like one syntactically. You can add or remove spaces on either side and it works. That’s not true for Ruby or Elixir as moving the spaces around is either invalid or has a different meaning.
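
The whitespace sensitivity on the Elixir side can be sketched by parsing a few strings with Code.string_to_quoted/1, which returns {:ok, ast} or an {:error, reason} tuple instead of raising. The input snippets are made up for illustration.

```elixir
# Standard keyword syntax: colon attached to the key, then a space.
{:ok, keyword_ast} = Code.string_to_quoted("[foo: 1]")

# Removing the space after the colon is rejected by the tokenizer.
no_space_after = Code.string_to_quoted("[foo:1]")

# Putting a space before the colon is rejected too.
space_before = Code.string_to_quoted("[foo : 1]")

# Attaching the colon to the right-hand side changes the meaning entirely:
# this is a call to foo with the atom :bar as its argument.
{:ok, call_ast} = Code.string_to_quoted("foo :bar")

IO.inspect(keyword_ast)
IO.inspect(match?({:error, _}, no_space_after))
IO.inspect(match?({:error, _}, space_before))
IO.inspect(call_ast)
```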

Giorgio Torres (Eugico)

Jun 29, 2023, 10:50:39 AM
to elixir-lang-core
I was catching up with the other discussions about this, and I'm having second thoughts about Elixir having this syntax sugar for maps.
It all comes down to what Austin pointed out here.

The bottom line, IMO (and agreeing with Austin), is that solving the atom/string key problem would mean ending up with something like `%{"foo", 'bar', :baz} = %{"foo" => 1, 'bar' => 2, :baz => 3}; {foo, bar, baz} #=> {1, 2, 3}`, and given that, I'd prefer not to have it and to leave Elixir as it is today.
I also don't see any other tool (like a formatter or a language server) doing this kind of job. Why would I write something if that's not the language's syntax? If we set this precedent I'm not sure where we may end up (probably like JS: being rewritten by transpilers).
Plus, the misspelling argument is not appealing enough either, since a well-tested application (which IMO should be our desire and intention) would catch that.
To add a little more: since we should use atoms with caution due to the platform's limitations, we shouldn't make it easier to encourage their usage, and since the %{ foo } = %{ foo: "any value" } syntax sugar would favour it, I am convinced enough not to have this in Elixir, contradicting myself in a previous proposal.

Christopher Keele

Jun 30, 2023, 2:01:45 PM
to elixir-lang-core
Here's a summary of feedback on this proposal so far. 

Concerning the actual proposal itself (ie, &:foo # => {:foo, foo}): 2 against re-using the capture operator.

Concerning the followup of a dedicated operator (ie, $:foo # => {:foo, foo}): roughly 2 for and 1 against (because it was not barewords), no major notes.
  • Can support strings, atoms
  • Allows mixing strings/atoms
  • Key typing explicit
  • Works awkwardly with the pin operator
  • Does not look like tuples
  • Adds new syntax outside of collection constructs, for working with tagged tuples for good/evil
Overall, doesn't seem wildly popular, but I do think it does handle most of the problems of previous discussions well.

It does not cleanly map to anything in other languages, but I view that as a pro because we want something more expressive than what they provide. I feel like that is part of the lack of enthusiasm, though: people want whatever feels most intuitive to them from other languages they've worked with, and as little new syntax to the language as possible. That's a hard needle to thread.


Discussion on this thread about other proposals:
  • "Barewords" (ES6-style): %{foo, bar} People still want this, it is still tricky for the usual reasons, and I don't think much new has been contributed to the discussion in this thread.
    • Can only support strings or atoms
    • No mixing strings/atoms
    • Key typing hidden
    • Confusable with tuples
    • A new thing I've realized in my prototype is that it does work better with the pin operator than others, something I haven't seen discussed.
  • "Unmatched pair" (Ruby style): %{foo:, bar:} This has always been on the table but I do see more support for it than in past discussions. Also, personally, my least favorite.
    • Lacks string support
    • No mixing strings/atoms
    • Key typing visually clear
    • Does not look like tuples
    • Pin operator problems
  • "Pair literals": %{:foo, "bar"} Some people seem to find this too magical, some prefer this over my proposed dedicated operator that would make it more explicit.
    • Can support strings, atoms
    • Allows mixing strings/atoms
    • Typing visually clear
    • Does not look like tuples
    • Pin operator problems

New proposals:
  • Adding support at the tooling level: 2 against (adding my voice here). As was well-said, why would I write something that's not the language's syntax?
  • Austin had an idea with using my operator immediately alongside the map % literal to apply it to the whole expression. There are problems with that, but
  • This in turn gave me an idea for a slightly new syntax we could use to support barewords better. I guess call them qualified barewords or something for now:
    • %{foo, bar}: barewords string keys
    • %:{foo, bar}: barewords atom keys
    • %Struct{foo, bar}: barewords struct keys
    • Initial thoughts:
      • Can support strings and atoms
      • No mixing strings/atoms
      • Key typing explicit, visually clear but hidden for strings
      • Sometimes looks like tuples for strings
      • Plays nice with pin operator
      • Still requires exploration with existing tokenization rules, mixing rules, map update syntax

I'd love to see someone champion any one of these syntaxes, put up some prototypes, get into the details of syntax ambiguities, and see dedicated discussions of pros/cons of explicit proposals in individual threads!

I may continue investigating a prototype for pair literals (%{:foo, "bar"}) or qualified barewords (%:{foo, bar}) as I get bolder with the erl source code.

I welcome further discussion about the tagged-variable literal operator ($:foo # => {:foo, foo}) in this thread, but probably won't contribute too much more to discussion on other proposals, as I want to focus on concrete proposals and tinker with prototypes for the more promising follow-ups! Thanks for the discussion, all!