defguard with structural matching


OvermindDL1

Sep 27, 2017, 12:09:31 PM
to elixir-lang-core
As requested at https://github.com/elixir-lang/elixir/pull/5857#issuecomment-332563628, I am creating this thread.

Currently there is a `defguard` being created for Elixir 1.6.0 that is nothing more than a macro and adds no functionality beyond one.  As far as my reading shows, it just allows:

```elixir
defguard is_even(value) when is_integer(value) and rem(value, 2) == 0
```

Which can be used like:

```elixir
def steps(n) when n > 0, do: step(n, 0)

defp step(1, step_count), do: step_count
defp step(n, step_count) when is_even(n), do: step(div(n, 2), step_count + 1)
defp step(n, step_count), do: step(3 * n + 1, step_count + 1)
```

Which is really no gain over just using a macro like:

```elixir
defmacro is_even(value), do: is_integer(value) and rem(value, 2) == 0
```
The only difference is that it verifies that only proper guard expressions are used (which could just be another macro that verifies that too).

Instead, a while back, I made a library called `defguard`:  https://github.com/overminddl1/defguard

Disclaimer: it is just an example of something that should, in my opinion, be built into Elixir's `def`/`defp`/`defmacro`/`defmacrop`.  Right now, when it is used, it replaces the built-in `def` calls with its own macro versions that delegate back down after performing expansion.  It does not make a good standalone library, which is why I have very purposefully not finished it to the point of being generically useful (replacing `def` and so forth is not good form in my opinion).

What it lets you do is:

```elixir
defguard is_struct(%{__struct__: struct_name}) when is_atom(struct_name)
defguard is_struct(%{__struct__: struct_name}, struct_name)
defguard is_exception(%{__exception__: true} = exc) when is_struct(exc)
```

Which can then be used like:

```elixir
def blah(any_exc) when is_exception(any_exc), do: any_exc
def blah(specific_struct) when is_struct(specific_struct, Specific), do: specific_struct
def blah(any_struct) when is_struct(any_struct), do: any_struct
```

Note that this is something you *cannot* do with the current `defguard` proposal slated for Elixir 1.6.0: structural matching, i.e. testing the structure of the values and being able to pull out and test the data inside.  Working with maps, structs, and other deep constructs becomes significantly easier.

Creating a `defguard` macro that then creates other macros should still be done, as my example library does (although I'd say it should generate two versions of each, see below).  The difference is that the definitions of `def` and the others need an extra expansion phase.  In essence, they would need to be changed so that any guard call that is not one of the valid guard types is expanded (the macro is called).  However, where a macro call to `is_struct` would normally expand the function named `:"MACRO-is_struct"`, the head should instead expand a function named `:"GUARD-is_struct"` (or something of that style) if it exists.  It then takes the returned information, mixes the structural part into the correct location in the argument, and mixes the guards into the `when` section of the head.  If `is_struct` were called anywhere other than a function head, it would expand via the normal macro call (perhaps even a function? I'd opt for a macro though) that performs the structural test and guard tests and returns true/false as appropriate.
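To make the proposed expansion concrete, here is a hand-written sketch of the heads that the three `blah/1` clauses above might expand to, with the structural part moved into the argument pattern and the remaining tests moved into `when`. This is not output from the library; the module names `ExpandedHeads` and `Specific` are hypothetical, and the tagged return tuples are an assumption added only so the matching clause is visible (the original clauses return the argument unchanged).

```elixir
defmodule ExpandedHeads do
  # Hypothetical struct standing in for `Specific` from the example above.
  defmodule Specific do
    defstruct [:value]
  end

  # is_exception(any_exc): the map pattern lands in the argument,
  # the is_atom/1 test inherited from is_struct/1 lands in `when`.
  def blah(%{__exception__: true, __struct__: name} = any_exc) when is_atom(name),
    do: {:exception, any_exc}

  # is_struct(specific_struct, Specific): the second argument is substituted
  # directly into the structural pattern, so no guard is needed.
  def blah(%{__struct__: Specific} = specific_struct),
    do: {:specific, specific_struct}

  # is_struct(any_struct)
  def blah(%{__struct__: name} = any_struct) when is_atom(name),
    do: {:struct, any_struct}
end
```

Calling `ExpandedHeads.blah/1` with an exception, a `Specific` struct, and any other struct selects the first, second, and third clause respectively.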

Thus, `defguard` would need to generate at least two functions for each guard so that it is useful in every possible location, and `def`/`defp`/`defmacro`/`defmacrop`/`case` and maybe `cond` should have their AST handling updated to perform that expansion into the heads/cases.  This makes it ubiquitous throughout Elixir-the-language and lets people define new guards that are significantly more powerful than what a `defmacro` version of `defguard` would be capable of.  I.e. `defguard` should only be added if it actually adds functionality over an equivalent `defmacro`; otherwise there is no purpose to its existence other than being a `defmacro` that shuffles its guards into its body, which gains extremely little (even the readability gain is minor, see the top example and comparison).

OvermindDL1

Sep 27, 2017, 12:10:50 PM
to elixir-lang-core
Also see the past discussion (or lack thereof, really; the only comment on it was someone saying how they would love such functionality) at:  https://elixirforum.com/t/defguard/4052

José Valim

Sep 27, 2017, 12:44:41 PM
to elixir-l...@googlegroups.com
I would just like to clarify that:

defguard is_even(value) when is_integer(value) and rem(value, 2) == 0

is not equivalent to:


defmacro is_even(value), do: is_integer(value) and rem(value, 2) == 0

First of all, because that macro will return true and false at compile time, which is not what we want. :)

But most importantly, it is because the most obvious implementation has some pitfalls:

defmacro is_even(value) do
  quote do
    is_integer(unquote(value)) and rem(unquote(value), 2) == 0
  end
end

The code above unquotes the value twice, which means that when used outside of a guard, such as in "if is_even(some_expr())", some_expr is evaluated twice. One option would be to use bind_quoted or similar but keep in mind we can't define variables inside guards so we need to account for both cases. The final solution would look like this:

defmacro is_even(value) do
  if __CALLER__.context == :guard do
    quote do
      is_integer(unquote(value)) and rem(unquote(value), 2) == 0
    end
  else
    quote do
      value = unquote(value)
      is_integer(value) and rem(value, 2) == 0
    end
  end
end

Which is quite complex compared to:

defguard is_even(value) when is_integer(value) and rem(value, 2) == 0
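As a rough illustration of the difference described above, the following sketch counts how often the argument expression is evaluated when each form is used outside a guard. It assumes Elixir 1.6 or later (where `defguard` exists); the module names, `naive_is_even/1`, `some_expr/0`, and the message-based counter are hypothetical helpers for the demonstration.

```elixir
defmodule EvenGuard do
  # The one-line form under discussion.
  defguard is_even(value) when is_integer(value) and rem(value, 2) == 0

  # The "obvious" macro: `value` is unquoted twice.
  defmacro naive_is_even(value) do
    quote do
      is_integer(unquote(value)) and rem(unquote(value), 2) == 0
    end
  end
end

defmodule EvenGuardDemo do
  import EvenGuard

  # Inside a guard, the defguard-generated macro works like any other guard.
  def halve(n) when is_even(n), do: div(n, 2)

  # Hypothetical side-effecting expression: records each evaluation
  # by sending a message to the calling process.
  def some_expr do
    send(self(), :evaluated)
    4
  end

  # Number of times some_expr/0 runs for each form, used outside a guard.
  def naive_count, do: count_evaluations(fn -> naive_is_even(some_expr()) end)
  def guard_count, do: count_evaluations(fn -> is_even(some_expr()) end)

  defp count_evaluations(fun) do
    fun.()
    drain(0)
  end

  # Counts the :evaluated messages currently in the mailbox.
  defp drain(count) do
    receive do
      :evaluated -> drain(count + 1)
    after
      0 -> count
    end
  end
end
```

With these definitions, `EvenGuardDemo.naive_count()` returns 2 and `EvenGuardDemo.guard_count()` returns 1, since the `defguard` version binds its argument once when expanded outside a guard.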


José Valim
Founder and Director of R&D

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/eec36778-b463-4b67-830f-7f572fb59732%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

OvermindDL1

Sep 27, 2017, 2:30:45 PM
to elixir-lang-core
On Wednesday, September 27, 2017 at 10:44:41 AM UTC-6, José Valim wrote:
> First of all, because that macro will return true and false at compile time, which is not what we want. :)

Er, yes, I forgot to add the `quote`, which a wrapper macro with a stored variable (since the value is used twice) could just do implicitly (perhaps call this macro `defguard`? ^.^), like the current proposal for 1.6 is doing.  ^.^
 

On Wednesday, September 27, 2017 at 10:44:41 AM UTC-6, José Valim wrote:
> The code above unquotes the value twice, which means that when used outside of a guard, such as in "if is_even(some_expr())", some_expr is evaluated twice. One option would be to use bind_quoted or similar but keep in mind we can't define variables inside guards so we need to account for both cases. The final solution would look like this:

Eyup, it is indeed quite a pain, but a simple macro that creates that macro can handle all that pain for you, which is basically what the current 1.6.0 proposal already does.


On Wednesday, September 27, 2017 at 10:44:41 AM UTC-6, José Valim wrote: 
> Which is quite complex compared to:

Entirely so, which is why it should not be done.  However, my style handles this *and* adds structural matching abilities, rather than being, say, a simple library macro `defguard` that just does the wrapping you described (which already exists somewhere; I ran across it once).

Specifically though, I want to structurally match far, *far* more often than I want to add guards to functions (I often design just for this, as pattern matching executes faster than guards do), and thus `defguard` becomes even more useless.  It still cannot define even something as trivially simple as `is_struct/1`/`is_struct/2`.

Though yes, I do concede that defining these structural matches has a higher up-front, one-time cost (via PR): you have to add loaders to various syntax constructs and potentially duplicate bodies into different heads (heads may need to be duplicated at times, since you cannot match an 'or' of structures).  But that is still only a one-time cost for being able to define new guards that are *significantly* more powerful than what a comparatively simple macro could do.  It would allow things like `is_struct/1`/`is_struct/2`, `is_exception/...`, and others, including guards for very specific internal information that a library may want to expose, which the current 1.6.0 proposal is entirely incapable of.

José Valim

Sep 27, 2017, 2:38:26 PM
to elixir-l...@googlegroups.com
> Though yes, I do concede that defining these structural matches has a higher up-front, one-time cost (via PR): you have to add loaders to various syntax constructs and potentially duplicate bodies into different heads (heads may need to be duplicated at times, since you cannot match an 'or' of structures).  But that is still only a one-time cost for being able to define new guards that are *significantly* more powerful than what a comparatively simple macro could do.  It would allow things like `is_struct/1`/`is_struct/2`, `is_exception/...`, and others, including guards for very specific internal information that a library may want to expose, which the current 1.6.0 proposal is entirely incapable of.

While I agree it would be awesome to have an is_struct/1 guard, the correct solution to this problem is to contribute this feature upstream and allow map access in guards. So I agree your proposal does add new possibilities but that's not how we should go about implementing them, especially because of the complexity it would add to the compiler and the cost in the duplication of clauses (space and time).

Michał Muskała

Sep 27, 2017, 2:48:55 PM
to elixir-l...@googlegroups.com
is_struct is impossible to implement right now. The implementation in the linked library is faulty and will fail as soon as you use the or operator or multiple when clauses. For example, given an is_struct/2 guard when we use it as:

    def foo(arg) when is_struct(arg, Decimal) or is_integer(arg)

There is no way to translate this to valid Elixir syntax. Such a change would require changes to the VM itself, extending the set of guard functions with a map_get/2 or something similar. As far as I know, allowing pattern matching or case in guards was already rejected by the OTP team.

Michał.
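For context, here is a sketch of what the "map access in guards" route looks like once the VM supports it. It assumes a toolchain where `map_get/2` and `is_map_key/2` are allowed in guards (Elixir 1.10+ on Erlang/OTP 21+, which was not available when this thread was written). The module `MapAccessGuard` and the guard name `is_struct_of/2` are hypothetical, and `Date` stands in for `Decimal` so the example has no dependencies.

```elixir
defmodule MapAccessGuard do
  # A struct check written purely as a guard expression, with no pattern match.
  # Assumes Kernel.is_map_key/2 and Kernel.map_get/2 are available as guards.
  defguard is_struct_of(term, name)
           when is_map(term) and is_map_key(term, :__struct__) and
                  map_get(term, :__struct__) == name

  # Because the check is an ordinary guard, it composes with `or`.
  def classify(arg) when is_struct_of(arg, Date) or is_integer(arg), do: :date_or_integer
  def classify(_arg), do: :other
end
```

Because nothing has to be moved into the function head, no clause splitting is needed for the `or` case.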


OvermindDL1

Sep 27, 2017, 4:21:16 PM
to elixir-lang-core
On Wednesday, September 27, 2017 at 12:48:55 PM UTC-6, Michał Muskała wrote:
> is_struct is impossible to implement right now. The implementation in the linked library is faulty and will fail as soon as you use the or operator or multiple when clauses. For example, given an is_struct/2 guard when we use it as:

Yes it is, I did that on purpose along with a few other limitations in the git version to prevent people from actually using it (I don't want to release/publish anything that overwrites `def` and so forth).

On Wednesday, September 27, 2017 at 12:48:55 PM UTC-6, Michał Muskała wrote: 

>     def foo(arg) when is_struct(arg, Decimal) or is_integer(arg)
>
> There is no way to translate this to valid Elixir syntax. Such a change would require changes to the VM itself, extending the set of guard functions with a map_get/2 or something similar. As far as I know, allowing pattern matching or case in guards was already rejected by the OTP team.

Actually that would be valid, and would always fail.  Just like doing:

```elixir
def foo(arg) when hd(arg) == :something or is_integer(arg)
```
This would fail as well.  Such head splitting should happen 'in' the defined guard only; it should not magically make contradictory code work.  Any `defguard` construct should, for all intents and purposes, be considered failable like `hd` is, unless its creator adds special handling of course (which something like `is_struct/...` could not really do).  The example you gave should have those in different `when` clauses, not in the same `when` clause with an `or`.
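The `hd` behaviour referred to here can be checked with plain Elixir. In the sketch below (module and function names are hypothetical), a guard expression that raises makes its whole `when` clause fail, whereas separate `when` clauses are tried independently.

```elixir
defmodule GuardFailure do
  # One `when` joined with `or`: for a non-list, hd/1 raises inside the guard,
  # so the whole guard fails and is_integer/1 is never consulted.
  def with_or(arg) when hd(arg) == :something or is_integer(arg), do: :matched
  def with_or(_arg), do: :no_match

  # Two `when` clauses: each is evaluated on its own, so a failure in the
  # first does not prevent the second from matching.
  def with_when(arg) when hd(arg) == :something when is_integer(arg), do: :matched
  def with_when(_arg), do: :no_match
end
```

Here `GuardFailure.with_or(5)` returns `:no_match`, while `GuardFailure.with_when(5)` returns `:matched`; both return `:matched` for `[:something]`.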



On Wednesday, September 27, 2017 at 12:38:26 PM UTC-6, José Valim wrote:
> While I agree it would be awesome to have an is_struct/1 guard, the correct solution to this problem is to contribute this feature upstream and allow map access in guards. So I agree your proposal does add new possibilities but that's not how we should go about implementing them, especially because of the complexity it would add to the compiler and the cost in the duplication of clauses (space and time).


For maps specifically, yes; however, structural testing can be significantly easier and far more natural (say, going 5 levels deep into a Plug.Conn struct), in addition to things like `map_key` not existing in guards (yet?).

And do not, I'm just trying to spur discussion, not necessarily get this in.  I'm actually quite significantly in favor of *not* putting the current 1.6.0 proposal of `defguard` into Elixir, as I think its design is wrong and it can be handled by a library as it is.  That makes it even less necessary to add it to the kernel (at least until it has been played with enough in library form, with plenty of real-world usage, before being finalized).

OvermindDL1

Sep 27, 2017, 4:21:59 PM
to elixir-lang-core
On Wednesday, September 27, 2017 at 2:21:16 PM UTC-6, OvermindDL1 wrote:
...
And do not

s/not/note/  >.>

Eric Meadows-Jönsson

Sep 29, 2017, 8:17:17 AM
to elixir-l...@googlegroups.com
If is_struct is implemented with head/clause splitting, it will either lead to code duplication or you would have to move the code to a separate function, which will mess with stacktraces. I wouldn't be surprised if it had other unintended consequences as well.

We don't do this kind of code shuffling anywhere else in the language afaik, so I don't think it's worth it just to add is_struct.
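The two options mentioned above can be sketched by hand for a clause such as `def foo(arg) when is_struct(arg, Date) or is_integer(arg)`. This is only an illustration of what a splitting compiler pass might have to emit, not what any existing tool generates; the module and function names are hypothetical, and `Date` is used so the example has no dependencies.

```elixir
defmodule SplitHeads do
  # Option 1: one clause per alternative, with the body duplicated in each.
  def foo_duplicated(%{__struct__: Date} = arg), do: {:handled, arg}
  def foo_duplicated(arg) when is_integer(arg), do: {:handled, arg}

  # Option 2: one clause per alternative, with the body moved to a helper.
  # The body then runs in foo_body/1, which changes what stack traces show.
  def foo_shared(%{__struct__: Date} = arg), do: foo_body(arg)
  def foo_shared(arg) when is_integer(arg), do: foo_body(arg)

  defp foo_body(arg), do: {:handled, arg}
end
```

Both variants accept the same inputs; the difference is only in where the body lives and how many copies of it are compiled.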




--
Eric Meadows-Jönsson