Defstruct vs Records - an Erlanger's cry for help

Adrian Roe

Aug 2, 2015, 9:26:11 AM
to elixir-lang-talk
Dear list,

Is there a reference anywhere I could go explore that discusses the choice between Elixir's defstruct and the (now deprecated) Record capability?  I'm a reasonably long-time Erlanger (5~6 years of commercial experience - 95% of what we do is Erlang, the other 5% is C for drivers etc.) and have been looking at Elixir with considerable interest for a while now.  There are a bunch of things that are extremely attractive (more consistent standard library, macros vs parse transforms, awesome Unicode support - seriously, it rocks! - pipelines, Enums and Streams, much better docs, the list goes on...) which prompted me recently to write my first "vaguely real" Elixir application to see if we should start to look at production Elixir code over Erlang.

That has been an interesting process and I'm trying to make sure we write idiomatic Elixir, not just Erlang transliterated into Elixir.  Typically that has been an enjoyable journey: where there are major differences between Elixir and Erlang it has been easy to follow the justification.  Any compromises in Elixir have been acknowledged and made explicit.  The reasoning behind the compromises has been clearly articulated either in the docs or in some helpful blog somewhere (the use of full stop in anonymous function calls springs to mind).

The one area I really don't grok is Elixir's seeming allergy to records.  I suspect this is just the Erlanger in me struggling to let go but I'd love to understand the thinking behind it.  With defstruct, Elixir has to jump through hoops to provide compiler errors for undefined key names.  You do get the benefit of built-in introspection (via the __struct__ member), but that introspection is easily delivered using other mechanisms (for example we have an Erlang parse transform that automatically creates type introspection functions as well as mapping functions between any of Records, JSON, BSON, Proplist and Map formats).  The record has to be fully typespec'd of course to enable this introspection and to pick appropriate data types (e.g. when mapping from BSON/JSON), but that is something that production code should be doing out of habit anyway - the benefits of explicitness and Dialyzer goodness should never be underestimated.

So my (Erlang-skewed and ignorant) perspective is that defstruct seems to offer the same functionality as records do, just in a way that is less performant and without all of the wider benefits that records bring (e.g. better integration with types and Dialyzer).  Adding easier introspection could have been achieved alongside records without the need for using a totally different (bigger and slower) data structure.  By the way, we are in no way allergic to maps - and use them heavily in Erlang - just not for the job that records do better...

I don't for one moment think that the decision with regards to Records vs Defstruct was stumbled into or made ignorant of the sorts of issues mentioned above, so I suspect I'm about to have an "oh, of course, I should have thought of that" moment, but scanning release notes and searching for articles on the topic has left me still groping for now.  Anyone care to cast light on my darkness?

Thanks for any thoughts,

Adrian
Dr Adrian Roe

José Valim

Aug 2, 2015, 10:12:41 AM
to elixir-l...@googlegroups.com
Hello Adrian,

Those are excellent questions!

Let's start with a clarification. Records in Elixir are not deprecated: the Record module is not going anywhere and it aims to provide the same feature set as records in Erlang. It is completely fine to use them and I will get to when this might be a good idea by the end of this response. To clarify: when we said that Records in Elixir were deprecated, it was Elixir's own implementation of records, which is long gone by now.

Structs offer a mixture between records and maps. Records are compile-time based, maps are runtime based. Structs aim to add record-like compile time checks on top of maps. This is actually faster and conceptually simpler than trying to add runtime features to Records.
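
A minimal sketch of that idea, assuming a hypothetical User struct and StructDemo helper module (neither is from this thread): the struct is an ordinary map tagged with its module name, and the field list is enforced when the struct is built.

```elixir
# Hypothetical struct, used only for illustration.
defmodule User do
  defstruct [:name, :email]
end

defmodule StructDemo do
  # A struct is a plain map carrying its module name under :__struct__.
  def tag(%{__struct__: module}), do: module

  # struct!/2 builds the struct at runtime and raises KeyError on unknown
  # keys; the literal %User{nickname: "x"} is rejected at compile time.
  def build(fields), do: struct!(User, fields)
end
```

Calling StructDemo.build(name: "Ada") returns a value for which is_map/1 is true, while StructDemo.build(nickname: "x") raises.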

So what can we get with structs that we can't get with records? I can't write code with records that say "match on any record that contains this field". For example, imagine this function:

    def name(%{first_name: first, last_name: last}) do
      "#{first} #{last}"
    end

It will work for any struct that contains the two fields above since we are simply relying on the underlying maps. If I have Teacher and Student structs, I can have one function that will nicely suit both as long as they have both fields (a requirement clearly specified in the function definition). With records, because those checks are structural, they cannot be shared. I would need to write two functions, one for each record.
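
To make that concrete, here is a small sketch. The extra fields (:subject, :year) and the Person module name are assumptions added for illustration; only the name/1 function comes from the text above.

```elixir
defmodule Teacher do
  defstruct [:first_name, :last_name, :subject]
end

defmodule Student do
  defstruct [:first_name, :last_name, :year]
end

defmodule Person do
  # Matches any map or struct that has both keys, whatever else it holds.
  def name(%{first_name: first, last_name: last}) do
    "#{first} #{last}"
  end
end
```

The same Person.name/1 clause accepts a Teacher, a Student, or a bare map with those two keys, and fails with a FunctionClauseError for anything else.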

In fact, there is a good amount of research to make Records in other languages flexible, exactly as above. Here is the one we used as foundation when designing structs: http://research.microsoft.com/pubs/65409/scopedlabels.pdf

Another feature that comes with structs is that they are the foundation for polymorphism in Elixir. With structs, we get custom types, and custom types can have their own protocol implementations.
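
As a rough sketch of what that looks like, assuming a hypothetical Greeting protocol and deliberately tiny Teacher and Student structs (all names here are illustrative):

```elixir
defprotocol Greeting do
  @doc "Returns a greeting for the given value."
  def greet(value)
end

defmodule Teacher do
  defstruct [:last_name]
end

defmodule Student do
  defstruct [:first_name]
end

# Dispatch is driven by the struct's module name, so each struct
# gets its own implementation.
defimpl Greeting, for: Teacher do
  def greet(%{last_name: last}), do: "Good morning, Professor #{last}"
end

defimpl Greeting, for: Student do
  def greet(%{first_name: first}), do: "Hi, #{first}"
end
```

A plain map has no implementation here, so Greeting.greet(%{first_name: "x"}) raises Protocol.UndefinedError rather than silently matching.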

Those are the main two benefits (field-based polymorphism and protocols). However, you may be wondering: couldn't we add those runtime features to Records? In fact, we could and we did! That's how records used to work in Elixir. We could define a record like:

    defrecord Student, [:first_name, :last_name]

    def name(record) do
      "#{record.first_name} #{record.last_name}"
    end

The function above would also work on any record that contains the fields first_name and last_name. Its implementation relies on the fact we can call a function on a tuple like {Student, "josé", "valim"}. So when you defined a record, we automatically defined functions for all of its fields. This added runtime behaviour to records but, because this relies on a tuple dispatch, it was actually slower than accessing or matching fields in a map!

We have also used records for doing the protocol dispatch but it had a big limitation. For example, every time you called any protocol, whenever we saw a tuple where the first element was an atom, like {Student, "josé", "valim"}, we would try to invoke the Student implementation for that protocol. The issue with this approach is that tuples where the first element is an atom are very, very, very common, so we ended up trying to call protocol implementations for a bunch of different tuples only to find out there was no implementation, that it was a false positive, and it made the whole thing slow. With structs, because the struct tag is in __struct__, it is very unlikely to have conflicts.
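
The false-positive problem can be illustrated with a toy check. This is not Elixir's actual dispatch code; DispatchSketch and both function names are made up for the example.

```elixir
defmodule DispatchSketch do
  # Tuple-based guess: any tuple whose first element is an atom
  # might be a record, so it has to be treated as a candidate.
  def maybe_record?(value) when is_tuple(value) and tuple_size(value) > 0 do
    is_atom(elem(value, 0))
  end

  def maybe_record?(_), do: false

  # Struct check: only maps carrying an atom under :__struct__ qualify.
  def struct?(%{__struct__: tag}) when is_atom(tag), do: true
  def struct?(_), do: false
end
```

Everyday values such as {:ok, 1} and {:error, :enoent} pass the tuple-based check even though they are not records, whereas the struct check rejects both them and ordinary maps.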

Our record implementation had other issues. For example, the fact that it relied on tuple dispatch made records look a lot like objects, and they were abused like that very frequently. Also, once you did "record.first_name", all of Dialyzer's features were gone too!

For all those reasons, bringing compile time checks to maps as structs is simpler and faster, and that's why it makes sense to promote structs in Elixir as defaults. However, what are the downsides?

You have already touched on the first one, which is limited Dialyzer support. However, it is just a matter of time before they make maps better citizens inside Dialyzer. And we can actually have better Dialyzer support with maps than we could with the old polymorphic records.

The second one is, inside tight loops, where you absolutely don't care about the polymorphic or runtime features of maps, records are going to be faster. This will always be true and, if that is the case, just use records. That's exactly what we do in Inspect.Algebra where we need to work with different kinds of documents. Using records is simpler and faster because we are asserting on some particular kinds of documents internally (in the linked code though, notice we use macros instead of the Record module due to bootstrapping reasons in the compiler).
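
For reference, a minimal sketch of the Record module in use. The Geometry module and its point record are hypothetical and unrelated to the Inspect.Algebra code mentioned above.

```elixir
defmodule Geometry do
  require Record

  # Generates the point/0, point/1 and point/2 macros, which work on
  # the tuple {:point, x, y}.
  Record.defrecord(:point, x: 0, y: 0)

  def new(x, y), do: point(x: x, y: y)

  # Matching and updating expand to plain tuple operations at compile time.
  def shift_right(point(x: x) = p, dx), do: point(p, x: x + dx)
end
```

Geometry.new(1, 2) returns the bare tuple {:point, 1, 2}; there is no map and no :__struct__ key involved, which is where the speed in tight loops comes from.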

I hope this clarifies the design decisions behind structs and the few cases where one should still use records in Elixir.



José Valim
Skype: jv.ptec
Founder and Director of R&D

--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-talk/1438521967674.a78e2288%40Nodemailer.
For more options, visit https://groups.google.com/d/optout.

Adrian Roe

Aug 2, 2015, 10:39:12 AM
to elixir-l...@googlegroups.com, elixir-l...@googlegroups.com
José

I won’t grace myself with “oh, of course, I should have thought of that” and will instead cling to “I don’t for one moment think that the decision with regards to Records vs Defstruct was stumbled into or made ignorant of the sorts of issues mentioned above”.  Many, many thanks for the exceptionally clear response.  I have bedtime reading for the next few days.

As is sometimes the case, after reading your email I also found https://gist.github.com/josevalim/b30c881df36801611d13 (the original proposal for maps and structs) that articulates the polymorphic struct point well.  For some reason I was totally incapable of finding it beforehand.  Perhaps I should send a petition to Kostis / Ericsson to accelerate the fuller support for maps within Dialyzer as it would clearly benefit both languages significantly :)

Once again many thanks for an exceptional response to a newcomer’s question.

Adrian





Robert Virding

Aug 2, 2015, 6:42:31 PM
to elixir-lang-talk
I think one issue with elixir structs is that they are so closely coupled to the module in which they are defined, at the global ones. You can only define one per module which seems a very strange limitation. Unless this has changed lately.

In my flavors implementation I am defining a module per flavor, two for some, so it will be perfectly straightforward to define many flavors in one module. One module is defined at compile-time containing most of the data and one at run-time for those flavors which have instances. Enough is made at compile-time so that you can distribute pre-compiled systems.

Robert

José Valim

Aug 2, 2015, 7:08:58 PM
to elixir-l...@googlegroups.com
> I think one issue with elixir structs is that they are so closely coupled to the module in which they are defined, at the global ones. You can only define one per module which seems a very strange limitation. Unless this has changed lately.

That was pretty much intentional because if you want to define a bunch of them, it is very likely you just want maps. The point of the coupling is exactly so we can use them with protocols, which requires a very explicit name so we can perform a dispatch.
 
> In my flavors implementation I am defining a module per flavor, two for some, so it will be perfectly straightforward to define many flavors in one module.

If you are defining a module per flavor, don't you have exactly the same limitation? The difference is that your module is defined behind the scenes, while Elixir would ask you to do it explicitly. I.e. it would be the same as someone defining a defstruct/2 in Elixir allowing you to:

    defmodule Foo do
      defstruct Bar, ...fields...
      defstruct Baz, ...field...
    end

It looks like many per module... but it is still one per module really.

Robert Virding

Aug 2, 2015, 7:22:29 PM
to elixir-lang-talk, jose....@plataformatec.com.br

Yes, it would be the same, but now it just seems a strange limitation that I can only define one per module and that they are so closely coupled to the module in which they are defined. You get the feeling that they can't exist on their own, which they could if you wished it. At least I get the feeling. :-)

    defmodule Bar do
      defstruct ...
    end

    defmodule Baz do
      defstruct ...
    end

It seems like overkill.

 

Peter Hamilton

Aug 2, 2015, 9:28:33 PM
to elixir-lang-talk, jose....@plataformatec.com.br

Sounds like a perfect use case for a new macro.

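    # Sketch: must live inside a module, which callers then `require`.
    defmacro mydefstruct(name, fields) do
      quote do
        defmodule unquote(name) do
          # Inject the caller's field list; a bare `fields` would be
          # an undefined variable inside the generated module.
          defstruct unquote(fields)
        end
      end
    end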

Or something like that.

