This is a more formal write-up of the discussion started with José here. Interested in all feedback, potential use-cases, and syntactic edge-cases!
Proposal
Allow `when conditions...` to be placed anywhere within the patterns used in functions, cases, and other such match head constructs—in addition to the suffix-only version we support today.
I believe this can be done by the compiler today as a backwards-compatible enhancement with no change to the parser.
Synopsis
I'm proposing we allow guard clauses anywhere within match patterns, as well as following them, so that the AST for match heads containing guards can be composed easily.
To show rather than tell, assuming you have AST for the following:
X = x when x > 0
Y = y when y > 1
Both X and Y are valid, stand-alone match heads. However, they are not also valid stand-alone match patterns: there is no easy way to combine them to form a new 2-arity match. The naive approach would be:
XY = (x when x > 0, y when y > 1)
While this is not currently a valid match head, one can hoist any internal guards into a valid form like:
XY2 = (x, y) when x > 0 and y > 1
This proposal explores allowing XY as valid Elixir code and rewriting it to XY2 at compile time.
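As a point of reference, here is a minimal sketch of the hoisted form (XY2) written as an anonymous function that runs on Elixir today. The name `xy2`, the return values, and the fallback clause are illustrative assumptions and not part of the proposal.

```elixir
# The hoisted form (XY2) is ordinary Elixir: patterns first, one trailing guard.
xy2 = fn
  x, y when x > 0 and y > 1 -> {:ok, x, y}
  _x, _y -> :error
end

# Under the proposal, the first clause could instead be written as
#   x when x > 0, y when y > 1 -> {:ok, x, y}
# and would be rewritten to the clause above at compile time.
IO.inspect(xy2.(1, 2))
IO.inspect(xy2.(1, 1))
```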
Terminology
To talk about this precisely I'm going to appropriate some ETS terminology to reference Elixir syntax constructs I don't have canonical names for. Let me know if you are aware of the correct terms for these!
- a match specification is defined as a match head -> body pair
- a match head is defined as a match pattern optionally followed by guard clauses
- a match pattern is defined as a comma delimited series of expressions allowed in match contexts
- a guard clause is defined as a when conditions expression formulated of functions allowed in guard contexts
Technically, what I'm proposing is to loosen the restriction that guard clauses must follow match patterns, by allowing match heads themselves to recursively be valid expressions in parameter lists. The compiler can extract all guard clauses found within a match head, leaving only a valid match pattern, and combine the extracted guards with any existing ones to create a new valid match head with equivalent semantics.
Parsing
Currently, both of these (and all other variations I can think of) are already valid syntax to the parser:
fn x when x > 0, y when y > 1 -> #... end
def name(x when x > 0, y when y > 1), do: #...
Only the compiler keeps you from running a program with these internal guards; it can be parsed into AST with the exact precedence we want without a hitch.
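One way to check the parsing claim is to run the parenthesized `def` form through `Code.string_to_quoted/1` and inspect the result. This is a sketch that assumes the parser accepts interior guards as described above; `:ok` stands in for the elided body.

```elixir
# Parse the parenthesized form; no compilation happens here, only parsing.
source = "def name(x when x > 0, y when y > 1), do: :ok"
{:ok, ast} = Code.string_to_quoted(source)

# If interior guards parse as described, each parameter is its own
# two-element `when` node: {:when, meta, [pattern, guard]}.
{:def, _, [{:name, _, params}, [do: :ok]]} = ast

patterns = for {:when, _, [pattern, _guard]} <- params, do: Macro.to_string(pattern)
guards = for {:when, _, [_pattern, guard]} <- params, do: Macro.to_string(guard)

IO.inspect({patterns, guards})
```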
Grammar
The only ambiguity I can think of would be the following:
fn x when x > 0, y when y > 1 when y < 3 -> #... end
It is not clear whether the intent was to create an 'or' multi-guard around the entire parameter list, or to guard both the last parameter and the parameters at large with two separate clauses. The parentheses conventionally used in defs resolve the ambiguity. I am open to ideas on how to handle this situation, though personally I envision a compile-time warning and treating it as a multi-guard, as this is most consistent with the precedence of when. I don't see it coming up too often in generated code.
It is worth observing that while the following technically has the same ambiguity:
fn x when x > 0, y when y > 1 -> #... end
however you decide to treat the guard after the second parameter, the resulting guards post-rewrite will be semantically equivalent.
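For reference, this is how consecutive `when` clauses behave today in trailing position: the clause matches if any one of the guards passes. The module and function names below are hypothetical, chosen only to illustrate the multi-guard reading suggested for the ambiguous case.

```elixir
defmodule MultiGuardDemo do
  # Two trailing guards on one clause act as an "or":
  # the clause matches when y > 1 OR y < -1.
  def check(y) when y > 1 when y < -1, do: :matched
  def check(_y), do: :no_match
end

IO.inspect(MultiGuardDemo.check(2))
IO.inspect(MultiGuardDemo.check(-2))
IO.inspect(MultiGuardDemo.check(0))
```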
Rewriting
Since everything up to this point is already valid, I suspect the rewrite could be done in a single place in the compiler with no further changes to any other code.
The algorithm I have in mind is to simply walk the AST outside-in, removing each set of consecutive guards it finds, expanding every combination of the sets' multi-guards into new trailing clauses, and then and-ing all terms together in each new guard clause. In the most common case that simply entails prefacing the trailing guards with any interior guards, all and-ed together.
This approach is intentionally naive about what variables are referenced in which guards where. Anything that produces a valid guard in the end can fly, even if generated code produces them in odd or unexpected places within the params list.
Simple extraction:
def name(%{foo: bar when is_integer(bar), fizz: buzz when is_integer(buzz)})
when bar + buzz > 100
def name(%{foo: bar, fizz: buzz})
when (is_integer(bar) and is_integer(buzz)) and (bar + buzz > 100)
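The simple extraction above can be sketched with `Macro.prewalk/3`. This is only an illustration under stated assumptions: `GuardHoistSketch.hoist/2` is a hypothetical helper (nothing like it exists in Elixir or its compiler), it requires a trailing guard, and it ignores multi-guards entirely.

```elixir
defmodule GuardHoistSketch do
  # Hypothetical helper for illustration; not an existing Elixir API.
  # Handles only single interior guards (no `when a when b` multi-guards).
  def hoist(params, trailing_guard) when is_list(params) do
    # Walk outside-in, replacing each `pattern when guard` node with the bare
    # pattern and collecting the removed guards along the way.
    {patterns, interior} =
      Macro.prewalk(params, [], fn
        {:when, _meta, [pattern, guard]}, acc -> {pattern, [guard | acc]}
        node, acc -> {node, acc}
      end)

    # Interior guards in source order, then the trailing guard, all and-ed.
    guard =
      (Enum.reverse(interior) ++ [trailing_guard])
      |> Enum.reduce(fn g, acc -> {:and, [], [acc, g]} end)

    {patterns, guard}
  end
end

params = quote(do: [%{foo: bar when is_integer(bar), fizz: buzz when is_integer(buzz)}])
trailing = quote(do: bar + buzz > 100)

{patterns, guard} = GuardHoistSketch.hoist(params, trailing)
IO.puts(Macro.to_string(patterns))
IO.puts(Macro.to_string(guard))
```

Like the proposal, the sketch is deliberately naive about which variables each guard references; it only moves guards and leaves validation to the existing compiler passes.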
In order to handle multi-guards correctly we'd have to get a little n^m, but I doubt even the most ambitious metaprogramming would ever need to generate code like that.
Matrix of multi-guards with oddly referenced variables:
def name(x, y when y > z when z > x, z)
when is_integer(z)
when is_binary(z)
def name(x, y, z)
when y > z and is_integer(z)
when y > z and is_binary(z)
when z > x and is_integer(z)
when z > x and is_binary(z)
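The expansion in this matrix amounts to a Cartesian product over the alternatives in each guard set. Below is a minimal sketch, assuming the guard sets have already been extracted from the head; `MultiGuardSketch` and its functions are hypothetical names, and `is_binary/1` is used because it is the guard Elixir provides for strings.

```elixir
defmodule MultiGuardSketch do
  # Hypothetical helper for illustration; not an existing Elixir API.

  # `a when b when c` parses as right-nested `when` nodes; flatten to [a, b, c].
  def alternatives({:when, _meta, [left, right]}), do: [left | alternatives(right)]
  def alternatives(guard), do: [guard]

  # Pick one alternative from each guard set, in source order, and `and` the
  # picks together. Every combination becomes one new trailing guard.
  def expand(guard_sets) do
    guard_sets
    |> Enum.map(&alternatives/1)
    |> Enum.reduce([[]], fn alts, acc ->
      for combo <- acc, alt <- alts, do: combo ++ [alt]
    end)
    |> Enum.map(fn combo ->
      Enum.reduce(combo, fn g, acc -> {:and, [], [acc, g]} end)
    end)
  end
end

interior = quote(do: y > z when z > x)
trailing = quote(do: is_integer(z) when is_binary(z))

expanded =
  [interior, trailing]
  |> MultiGuardSketch.expand()
  |> Enum.map(&Macro.to_string/1)

Enum.each(expanded, &IO.puts/1)
```

With two sets of two alternatives this yields four guards; in general the count is the product of the sizes of the sets, which is the n^m growth mentioned above.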
Thoughts
The purpose of this is to make function heads more composable in meta-programming. Initially it would be released with zero fanfare except perhaps a footnote in the guard docs. However, I can see this ability catching on with people coming from strongly-specified typed languages, allowing them to qualify type expectations in parameter lists inline with where variables are defined.
I don't think that would be too problematic, since I find the style to be pretty readable and, more importantly, the code still easy to reason about. However, this is definitely enabling an alternate way to write pretty basic syntax, so if we were really against human beings rather than macros employing this, the rewrite process could emit warnings for code not marked as generated.
I have no idea how the formatter should handle this syntax. Perhaps its behaviour already suffices, since it's pretty good at deconstructing long function heads?
I'm interested if there are strong opinions about this one way or the other on the mailing list, as well as if there are any implications I've overlooked in my suggested implementation.
Thanks for reading!
Chris K