Add type guards to structs


Maciej Kaszubowski

Oct 4, 2017, 5:08:17 AM
to elixir-lang-core
Hello,

Proposed feature

I'd like to propose another improvement on structs. Inspired by @enforce_keys, I'd like to propose adding @guards which can help to validate the types of the fields in the struct. 

Example usage:

defmodule MyStruct do
  @guards [name: :is_binary]

  defstruct [:name]
end


which will fail if the given condition is not satisfied:

iex(2)> %MyStruct{name: 5}
** (ArgumentError) The following fields didn't match the guards: struct MyStruct: [{:name, :is_binary, 5}]
    expanding struct: MyStruct.__struct__/1


Notes

  • As the example shows, the behaviour will be similar to @enforce_keys: the check runs only when the struct is created, not when it is updated
  • Using a module attribute keeps the feature optional and backwards compatible
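
For comparison, here is a minimal sketch of how @enforce_keys already behaves. The module names are made up for illustration, and struct!/2 is used so that the failure happens at runtime rather than while the file compiles:

```elixir
defmodule EnforceDemo.User do
  # :name must be supplied when the struct is built.
  @enforce_keys [:name]
  defstruct [:name, :age]
end

defmodule EnforceDemo do
  alias EnforceDemo.User

  # Building without :name raises ArgumentError.
  def build_without_name do
    struct!(User, age: 30)
  rescue
    e in ArgumentError -> {:error, Exception.message(e)}
  end

  # Updating is not checked: the enforced key can be set to nil afterwards.
  def clear_name do
    user = struct!(User, name: "Ada")
    %{user | name: nil}
  end
end
```

The proposed @guards would follow the same split: checked on creation, silent on update.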

Possible implementation

With the guards listed at https://hexdocs.pm/elixir/master/guards.html and Kernel.apply/3, we can modify the existing def __struct__(kv) from Kernel:


def __struct__(kv) do
  {map, errors} =
    Enum.reduce(kv, {@struct, {[], @enforce_keys}}, fn {key, val}, {map, {type_errors, key_errors}} ->
      guard = @guards[key]

      # A field without a guard always passes; otherwise the guard is
      # looked up by name in Kernel and applied to the value.
      if is_nil(guard) or apply(Kernel, guard, [val]) do
        {Map.replace!(map, key, val), {type_errors, List.delete(key_errors, key)}}
      else
        {
          Map.replace!(map, key, val),
          {[{key, guard, val} | type_errors], List.delete(key_errors, key)}
        }
      end
    end)

  case errors do
    {[], []} ->
      map

    # The clause for missing @enforce_keys is left out here.
    {types, []} ->
      raise ArgumentError, "The following fields didn't match the guards: " <>
        "struct #{inspect __MODULE__}: #{inspect types}"
  end
end



This, of course, needs style improvements (and validation of required fields which is currently removed for the sake of clarity), but this is only a proof of concept to verify that the implementation is possible and quite easy.
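
The same check can be tried today without touching Kernel. The following is only a sketch under assumptions: the Person module, its new!/1 function and the @guards attribute are hypothetical names, and the check lives in an ordinary function rather than in __struct__/1:

```elixir
defmodule GuardedStruct.Person do
  defstruct [:name, :age]

  # Hypothetical stand-in for the proposed attribute:
  # field name => name of a Kernel type-check function.
  @guards [name: :is_binary, age: :is_integer]

  # Builds the struct, then raises if any given field fails its guard.
  def new!(fields) do
    person = struct!(__MODULE__, fields)

    failures =
      for {key, value} <- fields,
          # Fields without a guard yield nil here and are skipped.
          guard = Keyword.get(@guards, key),
          not apply(Kernel, guard, [value]),
          do: {key, guard, value}

    case failures do
      [] ->
        person

      _ ->
        raise ArgumentError,
              "the following fields didn't match the guards: " <>
                "struct #{inspect(__MODULE__)}: #{inspect(failures)}"
    end
  end
end
```

As in the proposal, only the fields passed in are checked, so a field left at its default is not validated.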

Why not use @type?

While it would be cool to be able to verify the types based on typespecs, it would be harder because I think not all types can be easily validated. The suggested approach with guards will feel more familiar because we can already do this for functions. Adding guard validation for struct fields feels like a reasonable step.

What do you think?

I'd be happy to start working on this feature, but I wanted to know what you all think about it first.


Cheers,
Maciej

José Valim

Oct 4, 2017, 5:37:57 AM
to elixir-l...@googlegroups.com
Such changes are not as straight-forward because then most would expect matching on %Foo{} to also validate on those guards and that comes with its own set of problems:

1. we will need changes in the compiler to make this work

2. if we are checking the fields for %Foo{} on every pattern matching, it becomes unnecessary overhead

3. it is unclear how such features will play against other features in the language, such as defguard

Given we need to validate data in the boundaries, my proposal is to keep validating those in the boundaries instead of every time %Foo{} is used.
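
Validating at the boundary could look roughly like the sketch below. The module and function names are assumptions made for illustration; the input is imagined as decoded JSON with string keys:

```elixir
defmodule Boundary.User do
  defstruct [:name, :age]

  # Hypothetical boundary function: untrusted input is checked once, here.
  def from_params(%{"name" => name, "age" => age})
      when is_binary(name) and is_integer(age) and age >= 0 do
    {:ok, %__MODULE__{name: name, age: age}}
  end

  # Anything else is reported to the caller instead of crashing.
  def from_params(_other), do: {:error, :invalid_params}

  # Code behind the boundary matches on the struct without re-validating.
  def greeting(%__MODULE__{name: name}), do: "Hello, " <> name
end
```

Matching on %Boundary.User{} inside the system then stays a cheap tag check, which is the overhead concern raised in point 2.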




José Valim
Founder and Director of R&D

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/7fcdeb0f-dfcd-405d-bd5c-563648d5f9d3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Maciej Kaszubowski

Oct 4, 2017, 6:00:36 AM
to elixir-lang-core
Thank you for the comment José.

Sure, you're completely right that checking this every time is quite problematic and not intuitive. That's why I think the check should only be performed when creating the struct, not when matching on %Foo{}. I think that explaining it properly in the docs would solve the problem of overly high expectations.

And while I agree that having guards in the function should be used when possible, in my opinion, having that validation when creating the struct can be quite useful.

There are multiple use cases where I think this would be really nice. For example, when creating a JSON API, we could create a struct for the expected response, adding type validations for the fields. Since structs are in fact maps, we can easily convert the struct to the JSON response and have more confidence that the types are correct. This is a huge advantage compared to using plain maps for views. You could of course use guards in the functions in your view, but that would be really annoying for JSON with many fields.

Another example is using domain models in your application (not database schemas). Imagine a model with 10 fields. To create the struct, we could just write and use a Foo.new() function, but a function with 10 arguments and 10 guard clauses will be neither clear nor readable. And because there's no way to prevent creating the struct manually, we cannot be sure that everyone will use `Foo.new()`. Instead, we can just create a typed struct and have the types validated every time we create it. Someone could still end up with an invalid struct by updating the fields, but I think this is a reasonable compromise.

I am quite confident that this behaviour will be easy to understand, especially because it's quite similar to the way @enforce_keys attribute works. 


What do you think?


Cheers, 
Maciej 

José Valim

Oct 4, 2017, 6:14:00 AM
to elixir-l...@googlegroups.com
Even if we only check on %Foo{...} and %Foo{x | ...}, I would expect the behaviour to be to raise instead of ignoring the field value as you proposed.

And it still makes me think that you should probably wrap all of your struct creation and modification in a function that does this kind of validation. Guard validations are limited, you wouldn't even be able to validate that some key must be given a struct, which I would say is a fairly common use case.
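
A sketch of such a wrapping function follows, with made-up module names. It relies on a pattern match rather than a guard to require a nested struct; later Elixir releases added is_struct/1 and is_struct/2, but nothing like that was available as a guard when this thread started:

```elixir
defmodule Nested.Address do
  defstruct [:city]
end

defmodule Nested.Customer do
  alias Nested.Address
  defstruct [:name, :address]

  # The %Address{} pattern requires :address to be that struct,
  # which a type guard such as is_map/1 cannot express.
  def new(name, %Address{} = address) when is_binary(name) do
    %__MODULE__{name: name, address: address}
  end

  # Modification goes through a function with the same checks.
  def move(%__MODULE__{} = customer, %Address{} = address) do
    %{customer | address: address}
  end
end
```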



José Valim
Founder and Director of R&D


Maciej Kaszubowski

Oct 4, 2017, 6:34:01 AM
to elixir-lang-core
I think the behaviour I'm proposing is in fact to raise, rather than to ignore the field. The code I showed produces this result when the field doesn't pass the check:

iex(2)> %MyStruct{name: 5}
** (ArgumentError) The following fields didn't match the guards: struct MyStruct: [{:name, :is_binary, 5}]
   expanding struct: MyStruct.__struct__/1


Wrapping the struct creation in a function is a good approach (because it helps abstract the struct implementation), but the problem is that it doesn't prevent using the %Foo{key: value} syntax directly, which can lead to bugs. And while guards are in fact limited, this is supposed to be a minimal (and optional) check of the data format, not a full-blown validation.

And when it comes to limits, the proposed solution can actually be quite universal. Because I use Kernel.apply/3, we could allow passing arbitrary arguments inside @guards, for example:

@guards [active: {:in, [true, false]}]
@guards [age: {:>, [18]}]


Then, 

case @guards[key] do
  {fun, args} ->
    if apply(Kernel, fun, [value | args]) do
      # ok
    else
      # add error
    end
  # ...
end
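
The {fun, args} shape can be sketched as a runnable function. The module name and check table are hypothetical. One caveat, stated as a known limitation rather than part of the proposal: Kernel.in/2 is a macro, so it cannot be called through apply/3, and only Kernel functions such as >/2 or is_binary/1 work this way:

```elixir
defmodule ParamChecks do
  # Hypothetical check table: field => {Kernel function name, extra args}.
  @checks [age: {:>, [18]}, name: {:is_binary, []}]

  # Returns the {key, check, value} triples that failed.
  def failures(fields) do
    for {key, value} <- fields,
        # Fields with no entry in @checks produce no iterations.
        {fun, args} <- Keyword.get_values(@checks, key),
        not apply(Kernel, fun, [value | args]),
        do: {key, {fun, args}, value}
  end
end
```

Here age: {:>, [18]} expands to a call equivalent to age > 18.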


While the syntax can probably be improved, this is quite a flexible solution for validating the types.


If this doesn't convince you, I guess I trust that you're correct and this is the right decision :)  Thanks!



José Valim

Oct 4, 2017, 6:41:28 AM
to elixir-l...@googlegroups.com
> I think the behaviour I'm proposing is in fact to raise, rather than to ignore the field. The code I showed produces this result when the field doesn't pass the check

Yes, definitely. I was commenting in regards to JSON values where you may want to validate the input and report that accordingly instead of just crashing.

> Wrapping the struct creation in a function is a good way (because it helps abstracting the struct implementation), but the problem is that this doesn't prevent using %Foo{key: value} syntax directly which can lead to bugs. And while guards are in fact limited, this is supposed to be a minimal (and optional) check for the data format, not a full blown validation.

We can't guarantee that anyway. If you don't force your users to go through a regular API, be it your own function or a special %Foo{} syntax, they can create the struct and later use Map.put/3 or %{struct | key: "invalid"}.

> And when it comes to limits, the proposed solution can be quite universal, actually. Because I use Kernel.apply/3, we could allow to pass arbitrary arguments inside @guards, for example:

Then it is not a guard. :)




José Valim
Founder and Director of R&D


Rafał Radziszewski

Nov 6, 2017, 11:50:18 AM
to elixir-lang-core
Hi,
since I see a lot of potential in this feature, I'd like to offer a few arguments in favour of it. I will use numbered points for ease of discussion.

1. I am not going to discuss the specific implementation proposed (the @guards attribute), but rather the general ability to define invariants in structs, i.e. to raise when a struct is instantiated with invalid data.
2. No solution will be perfect: as mentioned, Map.put/3 or building the struct by hand as a map through %{__struct__: MyStruct} will always bypass it. Using those features is somewhat like addressing raw memory in C: if you do it, you should expect problems. Still, a lot can be achieved by modifying just struct/2, %MyStruct{} and %MyStruct{my_struct | key: val}, as those are prevalent in codebases.
3. Although it is possible to define a function that performs those validations (e.g. new/1 is quite common in libraries), there is no way to restrict struct creation to that function. I believe that is a good decision, since data is just data, but it makes it easy to make a mistake that is hard to pin down, since no warnings or other hints are issued.
4. While it is correct that, in general, forcing users through an API is the way to go, literal struct creation is very common and in most cases correct, which can lead to confusion. It can also encourage extensive boilerplate functions like 'new', just in case, which can be detrimental to the clarity of the language.
5. Adding invariants does not break the premise of validating data at the boundaries; it supports it. The ability to restrict field values leads to cleaner code, since it is not necessary to check input validity in every function. Raising an error on invalid instantiation actually forces validation at the boundary, in the place where the struct is instantiated. It allows for less defensive coding, since the programmer can assume that the struct is correct.
6. Matching on %Foo{} in this scenario doesn't actually cause problems, since the struct won't even be instantiated if it is wrong, unless it's manually put together from a map.

In general I believe that adding more possibilities to express ideas through types could really help the language.

Cheers,
Rafal


Francesco Lo Franco

May 30, 2019, 10:26:57 AM
to elixir-lang-core
Sorry to dig up this very old post, but I wanted to +1 this proposed feature because I reckon it would be extremely useful for a language such as Elixir, which, besides, sees DDD being applied much more than in the past. Is there any update or news about this? I'd be happy to know more or contribute to this if helpful.

Allen Madsen

May 30, 2019, 1:29:19 PM
to elixir-l...@googlegroups.com
You're probably better off using an ecto schema and changeset.


Francesco Lo Franco

May 30, 2019, 1:35:00 PM
to elixir-lang-core
Yes, that's the approach I took, in fact. The issues with this approach are:

1. I have to use a library to do what, I think, should be provided by the core language (that's my personal opinion)
2. it still does not enforce invariants in any way. I can still do invalid_struct = %MyStruct{} or %MyStruct{name: 5} even if the field in the embedded schema is defined as :string

Using Ecto and changesets to validate the state of the struct does the job in a way, but it does not ensure the struct is valid "at any time", which is what I'd like to achieve instead.



José Valim

May 30, 2019, 1:52:28 PM
to elixir-l...@googlegroups.com
Hi Francesco!

Some comments inline.

> Using Ecto and changesets to validate the state of the struct does the job in a way, but it does not ensure the struct is valid "at any time", which is what I'd like to achieve instead.

There are no guards, type system, or any other mechanism that can actually guarantee this is true generally. If you are talking about DDD, then we are talking about business rules and those may get quite complex. The best we could do is to guarantee that the struct has certain types in certain fields, but it does not necessarily say the struct *is valid*.

And even if you say, "just checking the types is fine", then we can't provide that without a full-blown type system.

From the original proposal, what makes the most sense is to restrict which modules can create and/or access struct fields so there is at least an implicit guarantee that everyone is going through an existing, pre-defined API. But even if we do add those checks, they can be easily bypassed by doing something as simple as:

def bad(user) do
  %{user | name: 5}
end

I understand the counter-argument is "some check is better than no check" but a check that is part of the language has to provide better guarantees than this. Perhaps someone can do a Credo check that guarantees a struct can only be accessed in the module that defines it and that should cover most of the cases.

José Valim
Founder and Director of R&D

Francesco Lo Franco

May 31, 2019, 9:36:39 AM
to elixir-lang-core
Hi José, thanks for your reply.

Sorry, I have done a poor job of explaining myself. What I'd love to have is:

- avoiding "public" write/read direct access to fields in a struct
- forcing users to use an API to "construct" a new struct

I'm pretty sure with these changes achieving the goal "valid struct at 'all' times" would be much easier.


I understand enforcing types for struct fields would be pretty complex. Besides, this could be worked around by wrapping even simple fields (binaries or integers) into structs (someone said value objects?).


Referring to your example:


def bad(user) do
  %{user | name: 5}
end

I don't think this is a problem. That is what in OOP I would call a "dirty public setter", which can obviously be used, but should not be. Monitoring for this kind of bad practice is easier anyway.

Also, having "private" structs would send the Elixir community a clear message: "we care about information hiding".

José Valim

May 31, 2019, 10:28:08 AM
to elixir-l...@googlegroups.com
> I don't think this is a problem. That is what in OOP I would call a "dirty public setter", which can obviously be used, but it should not. Monitoring against these kind of bad practices is easier anyway.

Except that most OOP languages allow you to forbid this altogether for certain fields while it is quite hard to provide the same guarantee here.

> Also, having "private" structs will give Elixir community a big message such as: "we care about information hiding".

Just to clarify, I think "information hiding" is the wrong end-goal here. We actually want to *avoid* hiding state (information) because trying to make sense of a system where you have a bunch of small state hidden everywhere is quite hard. A better word would be "we care about defining proper boundaries".

José Valim
Skype: jv.ptec
Founder and Director of R&D

Francesco Lo Franco

Jun 2, 2019, 6:23:18 PM
to elixir-l...@googlegroups.com
Yes, indeed, some programming languages provide stricter ways of denying modification of internal data structure details than others. But given my pre-Elixir background (PHP), I can assert it is possible to "live" with that type of flexibility, although it requires more vigilance.

On the "information hiding" bit, I was actually referring to:
"In computer science, information hiding is the principle of segregation of the design decisions in a computer program that are most likely to change, thus protecting other parts of the program from extensive modification if the design decision is changed. The protection involves providing a stable interface which protects the remainder of the program from the implementation (the details that are most likely to change).
Written another way, information hiding is the ability to prevent certain aspects of a class or software component from being accessible to its clients, using either programming language features (like private variables) or an explicit exporting policy."
(https://en.wikipedia.org/wiki/Information_hiding)

Hope this clarifies my POV.

Anyway, I'm happy to see you agree, to a certain degree, that having some way of making structs more "protected" is something Elixir should address. Happy to help or join any further discussion on this if it can help.


José Valim

Jun 3, 2019, 1:48:40 AM
to elixir-l...@googlegroups.com
Thanks for posting the link and correcting me. The definition of information hiding is more general than I expected. :)