best way to limit explicit creation of struct?

vadim

Mar 17, 2016, 6:22:42 PM
to elixir-lang-core

Sometimes I run into a situation where a struct should only be created through a `new` function. Since %T{} is the popular way to create a struct, using it becomes a quick road to hell (it produces unclear error messages in unexpected places). For example, if a field needs to be initialized with make_ref, I have found no good way to avoid `new`.

I know I could make the struct's type @opaque, but that makes the whole struct opaque. Is there a good way to make the %T{} constructor inaccessible?
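A minimal sketch of the situation described above. The module name `Session` and its fields are hypothetical, chosen only for illustration: `new/1` fills `ref` at runtime, while a bare struct literal compiles without complaint and leaves `ref` as `nil`.

```elixir
defmodule Session do
  # `ref` must be unique per struct, so it cannot be a compile-time default.
  defstruct [:ref, user: nil]

  # Intended constructor: fills `ref` at runtime.
  def new(user \\ nil) do
    %__MODULE__{ref: make_ref(), user: user}
  end

  # Nothing stops a caller from bypassing new/1 with a bare literal.
  def bypass do
    %__MODULE__{}
  end
end
```

Any code that later assumes `ref` is a reference fails far away from the place where the struct was built, which is presumably the source of the unclear messages.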

Thank you,


/vadim

José Valim

Mar 17, 2016, 6:28:44 PM
to elixir-l...@googlegroups.com
I don't believe there is such a mechanism today. I think this is a reasonable request, but I don't have an answer on how to expose it. Any thoughts on how such an API would even look? Also, how would it affect dynamically created structs, like struct(T, [])?



José Valim
Skype: jv.ptec
Founder and Director of R&D

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/e38bd60b-f8a7-4183-bb17-ca0a8def19ea%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

vadim

Mar 17, 2016, 6:40:27 PM
to elixir-lang-core, jose....@plataformatec.com.br
Thank you. I had hoped I was missing something obvious.

I will try to come up with a proposal.

/vadim

w.m.w...@student.rug.nl

Mar 24, 2016, 3:59:46 PM
to elixir-lang-core
Am I oversimplifying things, or would a variant of the defstruct macro called `defstructp`, which internally defines `__struct__/0` as a private function, be enough?

vadim

Mar 24, 2016, 11:57:32 PM
to elixir-lang-core

Nothing is ever easy. I followed your suggestion, and the result is a bit disappointing:

macro in Kernel:

  defmacro defstructp(fields) do
    quote bind_quoted: [fields: fields] do
      fields = Kernel.Utils.defstruct(__MODULE__, fields)
      @struct fields

      case Module.get_attribute(__MODULE__, :derive) do
        [] -> :ok
        derive -> Protocol.__derive__(derive, __MODULE__, __ENV__)
      end

      defp __struct__() do
        @struct
      end

      fields
    end
  end

definition of the struct:

defmodule StructTest do
  defstructp [:aaa, :bbb]

  def new do
    %__MODULE__{aaa: 1, bbb: 2}
  end
end

produces

== Compilation error on file lib/test_struct.ex ==
** (CompileError) lib/test_struct.ex:5: StructTest.__struct__/0 is undefined, cannot expand struct StructTest
    (elixir) src/elixir_map.erl:58: :elixir_map.translate_struct/4

José Valim

Mar 25, 2016, 2:51:23 AM
to elixir-l...@googlegroups.com
That's a good idea. Vadim, since you are working on a possible patch, it seems, try changing the following line in your Elixir checkout:


Just remove the "def" argument, doing a function call with 3 arguments only, and then call "make erlang". That *may* fix the error you are seeing.


vadim

Mar 25, 2016, 4:41:16 PM
to elixir-lang-core, jose....@plataformatec.com.br
Thank you.

I tried it, and it solved the compilation problem. It works mostly as expected. _Mostly._ The snag is that matching does not work :-(

Here is the test (the commented-out pieces do not compile):
```elixir
defmodule PrivateStruct do

  defstructp [:aaa, bbb: 1]

  def new() do
    %__MODULE__{aaa: 2}
  end
end

defmodule PrivateStructTest do
  use ExUnit.Case
  doctest PrivateStruct

  test "can instantiate" do
    assert %{aaa: 2, bbb: 1} = PrivateStruct.new
  end

  test "can access __struct__" do
    IO.inspect PrivateStruct.new.__struct__
  end

  test "can access members" do
    assert 2 == PrivateStruct.new.aaa
  end

  test "can update" do
    assert %{aaa: 3, bbb: 4} = %{PrivateStruct.new | aaa: 3, bbb: 4}
  end

  test "cannot instantiate directly" do
    # %PrivateStruct{}
  end

  test "cannot match struct" do
    # assert %PrivateStruct{} = PrivateStruct.new
  end

  test "cannot use struct() with alias" do
    struct(PrivateStruct, aaa: 1)
  end

  test "can use struct() with value" do
    struct(PrivateStruct.new, aaa: 1)
  end
end

```

/vadim

José Valim

Mar 26, 2016, 4:46:25 AM
to vadim, elixir-lang-core
Thanks Vadim.

I guess this raises the question of what it really means to be a public/private struct?

In your case, you want to forbid creating structs directly but I can imagine folks also wanting to forbid matching because the structure is private? Think Ecto.Query for example.


w.m.w...@student.rug.nl

Mar 27, 2016, 12:06:16 PM
to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
Yes, I believe that the second case is what Haskell does when you do not export a type: it becomes impossible to match on that type, since the name is basically not known to the outside world. To me, this would seem the proper way to make a struct private.

Maybe, to allow Vadim's use case as well, we'd need a third variant (protected? or is there a better name with less baggage and a more distinct abbreviation?) that would allow pattern matching, but disallow the %SomeStruct{foo | updated: "value"} use of the | operator and struct(name, values).

That is probably not trivial to implement.

vadim

Mar 28, 2016, 12:11:09 PM
to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br

There are a few aspects of openness (I am only listing them; I do not mean that all of them should be implemented), in order of openness:

  • direct access to struct fields;
  • creating struct (via %MyStruct{} or struct(MyStruct));
  • matching struct.

We currently have two options defined: defstruct and @opaque. (I was a bit surprised when I experimented with @opaque, since the protection seems to be limited to the type, not to the implementation. I'm not sure what value it provides.)

I am not sure why, but my reading of a match did not include instantiation of the struct. After all, a struct might include default values for fields that do not participate in the match, or bindings of variables (the latter could be syntactic sugar).

defmodule Foo do
  defstruct foo: 123,bar: 345
  def match do
    %__MODULE__{foo: 345} = struct(__MODULE__, foo: 345, bar: 123)
  end
end

From a system development standpoint, I can see the advantage of both open structs and structs closed to their module. An intermediate position is a struct that is open within its application but closed outside it. A closed struct can be manipulated only by the provided functions (e.g., one can think of the struct fields as mangled beyond recognition). Interestingly, I would think that a %Foo{} = x match (without any field specified) would still work, being equivalent to x.__struct__ == Foo.
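The claimed equivalence can be sketched with today's structs. `MatchDemo` and its function names are made up for this example: a pattern that names only the struct checks nothing but the `:__struct__` key, so it could in principle be kept even if field access were closed off.

```elixir
defmodule Foo do
  defstruct foo: 123, bar: 345
end

defmodule MatchDemo do
  # Matching on the struct name alone only checks the :__struct__ key.
  def foo?(%Foo{}), do: true
  def foo?(_other), do: false

  # The same check written against the underlying map.
  def foo_by_key?(%{__struct__: Foo}), do: true
  def foo_by_key?(_other), do: false
end
```

Both functions accept any `Foo` regardless of its field values and reject plain maps that merely carry the same keys.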

My original question (regarding explicit creation) is about a modification of normal struct creation. Syntactically, I would think more along the lines of an @explicit_new attribute for the defstruct macro than of making the struct closed. It is difficult to judge its merits. Too many options is clearly a bad choice for the language. If given a choice of what to implement, I would vote for module/application closed structs over explicit new. Even to me, the situation looks a bit fragile: we would not allow creating an empty struct, but we would allow modifying any field afterward.

I would appreciate clarification regarding Ecto.Query. I do not know what the problem with it is.

One more item: I can see situations where we do not want matching to be available. An example use case would be a function which accepts/returns several variants of data, and we do not want its clients to rely on a particular one. It looks like a strong opaque should work (but it is stronger than the current opaque).

vadim

Mar 28, 2016, 12:12:33 PM
to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
Oops:

    • creating struct (via %MyStruct{} or struct(MyStruct));
    • direct access to struct fields;
    • matching struct.

    José Valim

    Mar 28, 2016, 12:20:46 PM
    to vadim, elixir-lang-core
    @opaque is only for type annotation.

    Regarding Ecto.Query, it would be an example of a "strong opaque" data structure. But, as you said, it is still a "fragile" one. You can still access fields at runtime because they are just maps after all.
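A short sketch of why such a struct stays fragile at runtime. `Token` and `Peek` are hypothetical modules, not anything from Ecto: because a struct is a map with a `:__struct__` key, the ordinary `Map` functions can read and rewrite its fields without ever mentioning the struct name.

```elixir
defmodule Token do
  defstruct [:secret]

  def new(secret), do: %__MODULE__{secret: secret}
end

defmodule Peek do
  # No reference to %Token{} is needed to read or rewrite its fields.
  def read(token), do: Map.fetch!(token, :secret)
  def tamper(token), do: Map.put(token, :secret, :changed)
end
```

The tampered value is still tagged as a `Token`, so hiding `__struct__/0` alone would not stop this kind of access.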

    Honestly, I am not sure what is the best solution here.

    vadim

    Mar 28, 2016, 1:40:32 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br

    What if we set __struct__ to {MyStruct, :opaque}? Could we hide the fields without much overhead?

    /vadim

    José Valim

    Mar 28, 2016, 1:43:30 PM
    to vadim, elixir-lang-core
    We cannot change how __struct__ behaves because it is public API.

    vadim

    Mar 28, 2016, 4:49:51 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br

    Can we extend it? If we introduce a new feature, it does not break existing code. It could break only new code.

    I admit it is possible that we pass a new struct to some preexisting code. That is still new development, and it can be decided explicitly by the caller.

    /vadim


    Ben Wilson

    Mar 28, 2016, 11:33:13 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
    Maybe I'm missing something but particularly with the notion of mediating struct fields via functions, how is this not re-producing objects? One of the important principles in erlang and elixir is that you don't conflate data and behaviour. Functions are behaviour, data is data. Structs right now are data. If you hide them behind functions they're now Objects.

    Fortunately for me it doesn't seem particularly possible to enforce, but I'm confused about what we're aiming for here. It was one thing to sort of indicate the "internalness" of a struct by making it harder to construct outside of its own module or something, but we seem to be creeping into stranger land here.

    Peter Hamilton

    Mar 28, 2016, 11:57:59 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br

    Ben: I think the actual design question here is "How do I provide dynamic defaults for data structures?"

    There's also the caveat that these data structures are not valid without these dynamic defaults.

    I don't think that's necessarily conflating with OOP, but it is a tough thing to solve.

    vadim

    Mar 29, 2016, 10:30:57 AM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
    Ben, thank you for a great point. 

    I could argue that the concept of data encapsulation is somewhat orthogonal to objects (it goes back to 1972, to David Parnas' "On the Criteria To Be Used in Decomposing Systems into Modules"). While objects utilize information hiding, others can benefit from it as well. But that is somewhat beside the point.

    I think my concern is valid: I need to provide valid initialization for the struct (data), and `defstruct` does not allow that. If we leave `defstruct` unrestricted, someone might use it instead of the proper initialization :-( . Tests could help, but the compiler is the very first line of defense. In addition, a person who does not know about the proper initialization might not write a proper test.

    If data openness is a concept of the language, I am totally happy. Information hiding is an optional thing. I would ask, though: why do we have `opaque` types, and what is their purpose?

    /vadim

    José Valim

    Mar 29, 2016, 11:41:52 AM
    to vadim, elixir-lang-core
    > If data openness is a concept of the language, I am totally happy. Information hiding is an optional thing. I would ask, though: why do we have `opaque` types, and what is their purpose?

    The problem is that dialyzer support for maps is not complete. However, if you define a tuple to be of opaque type, if you pattern match on that tuple or use elem/2 on it, dialyzer *may* complain about it because you would be violating its opaqueness. If the type is opaque, you should be free to change its inner representation at any time because nobody is supposed to be relying on it.

    So the type is more about a contract of intent because you can't bypass the dynamic nature of the language anyway.
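The tuple case described above might look like the following sketch. `Counter` is a hypothetical module: the `@opaque` declaration only records intent in the typespecs, and whether dialyzer reports a violation depends on the analysis it can perform.

```elixir
defmodule Counter do
  # Callers are asked to treat t/0 as a black box. Nothing enforces this
  # at runtime; it is a contract that dialyzer may check.
  @opaque t :: {:counter, non_neg_integer()}

  @spec new() :: t
  def new, do: {:counter, 0}

  @spec increment(t) :: t
  def increment({:counter, n}), do: {:counter, n + 1}

  @spec value(t) :: non_neg_integer()
  def value({:counter, n}), do: n
end
```

Code outside `Counter` can still match on `{:counter, n}` or call `elem/2` on the value and it will run; the opaque type only gives tooling a reason to complain about it.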

    I assume that if dialyzer had full support for maps and you pattern matched on a map/struct that is of opaque type, dialyzer would complain as well. That's why I am personally inclined to support defstructp that forbids both struct creation and matching. My hesitation in doing so is that it is not the feature that you have asked for and I haven't heard of a use case about it.

    Forbidding only creation seems a bit weird to me, because I can still update it in an incompatible way, but I also understand you cannot easily create a struct without using %Foo{} while you can update it bypassing the %Foo{} syntax completely.

    You have probably already answered this question, so sorry if I am asking again, but why would using defstructp where all expansions are forbidden be a bad thing to you?

    vadim

    Mar 29, 2016, 11:57:29 AM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
    Well, we are using pattern matching as an additional contract declaration. (`dialyzer` is rather raw at this point, and quite Erlang-ish.) I understand that we incur a runtime cost, but we gain some clarity.

    So, we have `Foo` with a dynamic initializer, but we want to write `def bar(%Foo{} = foo)`. In fact, we do not encapsulate `Foo` at all, so we use matching to extract struct members.

    I agree, prohibiting explicit instantiation looks weird. Maybe a better way would be to modify `defstruct` to allow a `do:` part, which would become part of the `__struct__` function instead?

    I can argue for and against data encapsulation. But I would not argue that the initializer is an encapsulation issue.

    José Valim

    Mar 29, 2016, 2:20:07 PM
    to elixir-l...@googlegroups.com, vadim
    > I agree, prohibiting explicit instantiation looks weird. Maybe a better way would be to modify `defstruct` to allow a `do:` part, which would become part of the `__struct__` function instead?

    Can you provide an example of what you would do in such cases? :) Keep in mind that it would likely run at compile time rather than at runtime.

    Vadim Suvorov

    Mar 29, 2016, 2:42:47 PM
    to José Valim, elixir-l...@googlegroups.com
    I will work on it tonight.

    /Vadim

    vadim

    Mar 29, 2016, 2:50:41 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br

    The idea is, instead of making new, to allow something like:

    defstruct [:foo, :bar, :baz] do
       foo = make_ref
    end
    

    or

    defstruct [:foo, :bar, :baz], fn data ->
       %{data | foo: make_ref()}
    end
    

    potentially inlining passed function.

    Then

      defmacro defstruct(fields, fun \\ nil) do
    
        quote bind_quoted: [fields: fields] do
          fields = Kernel.Utils.defstruct(__MODULE__, fields)
          @struct fields
    
          case Module.get_attribute(__MODULE__, :derive) do
            [] -> :ok
            derive -> Protocol.__derive__(derive, __MODULE__, __ENV__)
          end
    
    
          def __struct__() do
            @struct
    
            if fun, do: fun.() ## a lot of details here
          end
    
          fields
        end
      end
    

    José Valim

    Mar 29, 2016, 3:21:48 PM
    to vadim, elixir-lang-core
    Thank you, I see it now. We could make it work, but it is not that straightforward.

    Today __struct__() returns a value and not an AST. That's why you can do Foo.__struct__() at runtime.

    If we simply allowed __struct__() to internally call make_ref(), you would get a static reference when you compile the code, which is not what you want. Instead, you want to get a new make_ref every time the code that calls %Foo{} executes at runtime. However, this is at odds with the definition above, because it would imply __struct__() needs to return an AST with make_ref() in it.
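The compile-time versus runtime distinction can be seen with existing defaults. This is only a sketch: `Stamped` is a hypothetical module, and `:erlang.unique_integer/0` is used as a stand-in for `make_ref/0` so that the default is a plain integer.

```elixir
defmodule Stamped do
  # The default expression runs once, while the module is being compiled.
  defstruct id: :erlang.unique_integer()

  # Every literal reuses the value computed at compile time.
  def literal, do: %__MODULE__{}

  # A function body runs on every call, so this id is fresh each time.
  def new, do: %__MODULE__{id: :erlang.unique_integer()}
end
```

Two calls to `Stamped.literal/0` return the same `id`, while two calls to `Stamped.new/0` return different ones, which is the behaviour an explicit `new` function is meant to guarantee.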

    Of course, we could have both modes by allowing both def/defmacro __struct__() to exist. Then you'd get different executions depending on how it is called. That, however, would require changes in the compiler.

    This is quite interesting because the second mode is how structs originally worked. The struct fields were taken as ASTs, so if you defined "foo: make_ref()", make_ref() would be injected as is into %Foo{} and invoked at runtime. We moved away from it because it was counter-intuitive, you usually expect literals to have no cost at runtime.

    I completely understand your requirements but, unfortunately, I still do not have a solution. I am not sure when/how/if a solution will come up but I am sure you will understand. :)

    Thanks for the discussion Vadim!

    vadim

    Mar 29, 2016, 3:45:29 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
    > Today __struct__() returns a value and not an AST. That's why you can do Foo.__struct__() at runtime.

    That was my understanding. So the idea is that the provided `fun` is applied to the result of __struct__. I hope that an explicitly specified lambda or do-block would make the non-static nature of the initialization more intuitive, much the same way `def foo, do: something` does not create an expectation of compile-time activity.

    Ilja Tollu

    Jul 7, 2017, 12:27:08 PM
    to elixir-lang-core, vadim....@moz.com, jose....@plataformatec.com.br
    The latest message here was more than a year ago, but the topic is still relevant.

    I think what we need here is a mandatory function that receives whatever values are passed to %Struct{}.

    That is, I need to enforce the integrity of a newly created struct, so this mandatory function gets called. Structs are immutable, so when we change one, a brand new struct gets created and, again, this function gets called.

    Maybe use a callback here? Say, the default callback just returns the struct, but when overridden, it can contain any validation logic needed.
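Without compiler support, the closest approximation is a convention. In the sketch below, `Account`, `new/1`, `update/2`, and `validate/1` are all hypothetical names: every construction and update is routed through one validation function. This does not stop anyone from writing a bare struct literal; it only shows the shape such a callback could take.

```elixir
defmodule Account do
  defstruct [:id, balance: 0]

  # Hypothetical "mandatory function": the constructor and the updater both
  # go through validate/1, so these two never return an invalid struct.
  def new(fields) do
    __MODULE__ |> struct(fields) |> validate()
  end

  def update(%__MODULE__{} = account, fields) do
    account |> struct(fields) |> validate()
  end

  defp validate(%__MODULE__{id: id, balance: balance} = account)
       when is_integer(id) and is_integer(balance) and balance >= 0,
       do: {:ok, account}

  defp validate(_invalid), do: {:error, :invalid_account}
end
```

For example, `Account.new(id: 1)` returns an `{:ok, account}` tuple, while `Account.update(account, balance: -1)` returns `{:error, :invalid_account}`.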