Types, Type Aliases, and Records


iain mccoy

Dec 13, 2015, 9:41:08 AM
to Elm Discuss
Hi folks,

I've just picked up Elm (coming from twin JS and Haskell backgrounds), and I have a question about the language design as it concerns types, type aliases, and records. Apologies if this is the wrong place for this message!

I understand that in 0.15, it was possible to pick up a record `{ name: String }` and use the record update syntax to produce a `{ name: String, age: Int }`. That doesn't seem to be possible anymore with 0.16? Or at least, the compiler was giving me errors when I tried to do that sort of thing.

When that was possible, it made a lot of sense for type aliases to be the main way in which people talked about record updates. The fact that a record matching one type is only one update away from matching another type means tagging the records with constructors (as with discriminated unions) is going to be more trouble than it's worth. But if two similar record types are no longer close together (in terms of being able to move a value from one type to another), the potential trouble from having records tagged with constructors is much less.

I've seen a bit of chatter in the elm-lang Slack about the difference between `type alias` and `type`. I don't know how common record-induced confusion on that matter is, but it might be a point that people get stuck on pretty often? It seems like the sticking point could be removed by tending to use `type` (rather than alias) for records, and my understanding of the language change in 0.16 is such that I think we wouldn't lose much by switching to giving records tags.

I hope that makes sense! If so, please read on to these two questions, and if not, is there anything in particular I can clarify?

1. Would it make sense for Elm's libraries and code style generally to prefer using `type` over `type alias` for records?
2. Would it make sense for records to always need to be tagged, which would enforce consistent use of `type` instead of `type alias`? This would make them more like Haskell's records, although I'm not sure that would be a good thing, given the perennial talk about how Haskell's record system needs an overhaul[1].

-Iain
[1] for anyone curious, https://ghc.haskell.org/trac/ghc/wiki/Records is the wiki page about the issue. Apparently an upcoming version of GHC will improve the situation.

Max Goldstein

Dec 13, 2015, 11:08:29 AM
to Elm Discuss
Yes, this is exactly the right place to ask these sorts of questions. Welcome to Elm!

You're correct that 0.16 removed record extension, which is adding a field to a record. (You can still update the value of a field that is already present.) If you have two "similar" records, you essentially have to make one with the fields of the other explicitly.
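To make the distinction concrete, here is a minimal sketch. The `rename` and `withAge` functions and the `PersonWithAge` alias are made up for illustration and do not come from any library. Updating a field that already exists still works; producing a record with an extra field means building the new record by hand.

```elm
type alias Person =
    { name : String }


type alias PersonWithAge =
    { name : String, age : Int }


-- Updating an existing field is still allowed.
rename : String -> Person -> Person
rename newName person =
    { person | name = newName }


-- Adding a field is not: `{ person | age = 30 }` is rejected
-- because `person` has no `age` field. Build the larger record
-- from the smaller one explicitly instead.
withAge : Int -> Person -> PersonWithAge
withAge age person =
    { name = person.name, age = age }
```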

You mentioned a Haskell background - Elm's type is Haskell's data, and Elm's type alias is Haskell's type, more or less. If that doesn't help, read the section "Type Aliases and Union Types" in this document.
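As a rough illustration of that correspondence (the `Shape`, `Point`, and `origin` names are hypothetical, and the Haskell lines appear only as comments):

```elm
-- Elm: a union type introduces a new type with its own tags.
-- Haskell equivalent: data Shape = Circle Double | Rectangle Double Double
type Shape
    = Circle Float
    | Rectangle Float Float


-- Elm: an alias is just another name for an existing type.
-- Haskell equivalent: type Point = (Double, Double)
type alias Point =
    ( Float, Float )


-- Because `Point` is only a name, a plain tuple is accepted here.
origin : Point
origin =
    ( 0, 0 )
```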

If a union tag has a lot of fields, it's reasonable to put them into a record. Here is an example from a third-party library. The Result type is defined in core; the library defines an alias for Result with specific types for each type variable. Each of the specific types is a record, and one is essentially the other with more fields. Yet, rather than do some crazy extension magic, the library defines the two records independently, and they just happen to have overlapping fields. This makes things simpler. (As a downside, there's no shorthand for getting fields which are present in both cases without doing case analysis.)

Usually we discourage libraries from exposing large types like these, since it makes them hard to change later. For an example of that, let's look at my own library, elm-animation. The Animation type is "opaque", meaning that it's a union type (created with type, no alias) whose tags are not exported. If you look at the source, you'll find that Animation has only one tag, which I've given the minimal name A so I can focus on the record which holds the actual values. Because there is only one case, I can unwrap the record in the argument list, do my dirty work on the record, rewrap it in A, and pass it back out. Note that even though I define a type alias for AnimRecord, it's not exported, so to client code, it doesn't exist. (Client = someone using a library/package.)
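The pattern described above might look roughly like the sketch below. This is not the actual elm-animation source: the record fields, the default of 750, and the `animation`, `withDuration`, and `getDuration` functions are assumptions made for illustration. Only the names `Animation`, `A`, and `AnimRecord` come from the description. The module line uses Elm 0.16 syntax.

```elm
module Animation (Animation, animation, withDuration, getDuration) where

-- `Animation` is listed without `(..)`, so its tag `A` is not exported.
-- `AnimRecord` is not listed at all, so client code cannot see it.
-- (Later Elm releases put the export list after `exposing` instead.)


type alias AnimRecord =
    { start : Float, duration : Float }


type Animation
    = A AnimRecord


-- The only way for a client to obtain an Animation.
animation : Float -> Animation
animation start =
    A { start = start, duration = 750 }


-- Unwrap in the argument list, update the record, rewrap in A.
withDuration : Float -> Animation -> Animation
withDuration newDuration (A anim) =
    A { anim | duration = newDuration }


getDuration : Animation -> Float
getDuration (A anim) =
    anim.duration
```

Because clients only ever go through the exported functions, fields could be added to `AnimRecord` without any client code needing to change.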

Having Animation be an opaque type is very useful because I can define my own abstractions like DurationOrSpeed, and the ramp field that clients never get or set directly. If I want to add more fields to the record or more tags to the union type, I can do that without doing a major version bump (so long as I update all the functions to work with the new types). Because the Animation type is used mostly for private bookkeeping, opaque (type) is the way to go. By contrast, in elm-check, having all the information about test results in front of me is very useful. I can navigate the tree and pluck out exactly the information I need using Elm's built-in record access and case expression syntax. Notice that both type-aliased records and union types are used. So the better question is not, should we prefer type to type alias, but rather, how should we choose what to export from our modules?

If you're writing an application not meant to be released as a package, feel free to export aliases and use them wherever, since the compiler will detect any bugs. If you're writing a package, it's generally best to keep types opaque so you can change them, and prevent the client from doing nefarious things. Opaque types allow you to define an interface, which is a tool for preventing error. However, if you're in the situation of elm-check where you have a lot of information to pass to the client, and you don't read these values back in, then exporting lots of types can make the library much more useful -- assuming you can commit to these types and not suddenly need to add or hide information.

Hope that helps!

Max Goldstein

Dec 13, 2015, 2:16:33 PM
to Elm Discuss
I'll add one more thing: if large types used for output are a good thing, then small types used for inputs are a bad thing. See this library, which defines four type aliases, all of which are simple and one of which duplicates core. The result is that I constantly have to worry about what these aliases are as I try to understand the library. It would have been much clearer just to use the unaliased types. I'll emphasize again that the client code is responsible for creating values of these types, not consuming them.

The best libraries give a lot for a little. The worst libraries give the impression that by the time you've understood the API and wired everything up, you could have just done it yourself. Things like type vs. type alias and exported vs. opaque are just tools to achieve that.

iain mccoy

Dec 13, 2015, 9:10:37 PM
to elm-d...@googlegroups.com

Hi Max,

First, I want to thank you for your wonderfully clear and beautifully written reply.

I'm afraid, though, that I was aiming at what I think is a deeper question: now that Elm's syntax no longer exposes the record modification operators that acted as (or implied) a sort of type-level function, does it still make sense for the usual shorthand for a record type to be a type alias? And if not, should it even be possible to talk about records that aren't part of a union type?

I'll try to unpack that first question a little - there's a bit of a jump in the middle of it.

The reason I ask is that when Elm had record extension, implying a sort of type-level function, there was a definite synergy between that feature and choosing to use aliases to refer to record types. This is because the type-level function of extending a record works like a sort of algebra that provides addition (when you add a field to a record) and equality (when your code causes the type checker to try to unify two record types), and type equality looks through type aliases in such a way that if you give names to all of the types involved in that sort of manipulation, things Just Work. I'll provide an example, with code that won't work but will hopefully explain the idea:

type alias Person = { name: String }

type alias Parent = { name: String, children: [Person] }

harry_potter = { name = "Harry Potter" }

james_potter_at_hogwarts = { name = "James Potter" } -- this has type Person

james_potter_on_harrys_first_birthday = { name = "James Potter", children = [ harry_potter ] } -- this has type Parent

have_first_child person child = { person | children = [child] } -- this is a Person -> Person -> Parent, because a `Person + { children: [Person] } = Parent` (that's a completely-imaginary type-level expression)

james_potter_on_harrys_first_birthday == have_first_child james_potter_at_hogwarts harry_potter -- this is true, despite the two having been constructed in different ways.

So far, so simple. But what's going on at the type level in have_first_child? The expression that is that function's body takes a record that has some set of fields, and adds a field `children`. So a type goes in, and the type that comes out is that type's fields plus the field `children`. This fits in nicely with type aliases because when you add `children` to the Person type, you get `Parent`. If `Person` and `Parent` were parts of tagged unions, that wouldn't work: you'd have to un-tag to remove the `Person` tag and re-tag the `Parent` tag.
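For comparison, a sketch of what the tagged version would involve. The union types and the `haveFirstChild` name are hypothetical, written only to illustrate the un-tag and re-tag step:

```elm
-- Hypothetical tagged versions of the two record types.
type Person
    = Person { name : String }


type Parent
    = Parent { name : String, children : List Person }


-- The `Person` tag is removed by pattern matching on the first argument,
-- and the result is wrapped again, this time in the `Parent` tag.
haveFirstChild : Person -> Person -> Parent
haveFirstChild (Person p) child =
    Parent { name = p.name, children = [ child ] }
```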

My guess, looking at elm tutorials and source code and stuff, is that this tidiness, this synergy between aliasing and extending records, is the main reason why records are typically addressed by a type alias in the elm world at present. But with the removal of that feature, one of the legs of the tidiness gets knocked out from underneath it. Which I think means the benefit of defaulting to type aliases to refer to records goes away.

This wouldn't mean it made sense to change anything, except that people do seem to be confused by the distinction in existing code between type for unions and (defaulting to) type alias for records, which is the confusion that sometimes makes emails like your fabulous reply necessary.

So if the thing that made it good doesn't make it good any more, and there's something that's making it bad, maybe it's time for a change. Maybe it makes sense to not default to type aliases as a way of referring to record types. And maybe it makes sense to not even allow records to be defined outside a tagged union?

I want to point out here that I am supposing a bunch of things, and I don't know how true they are:

1. In most elm code, when defining a name to provide a shorthand way of referring to a record type, the default is to type alias.

2. The main reason for this default was a synergy with record extension.

3. Enough people are confused by the type/type alias thing that it's worth talking/thinking about changing (the confusion of the transition may well not be worth it)

There are bound to be other things too, as I am pretty much a lean mean supposition machine.

Anyway! In my newness to the elm world, I felt like I might have spotted an interesting gap between the way elm code is written and what makes most sense for elm-the-language as an easily-learned and generally joyful thing. Hence this email, which I hope (probably in vain) is not quite toooooooooooo long.

-Iain



Max Goldstein

Dec 13, 2015, 11:31:55 PM
to Elm Discuss

> your wonderfully clear and beautifully written reply.


Thank you, that's quite a compliment!

Well, first off, I don't think type aliasing should be removed from the language. It's useful for other semantic distinctions (like Float/Time) that don't involve records. So type aliasing a record would have to mean something.

Here's a variation of your example that compiles:

type alias Person = { name: String }
type alias Parent = { name: String, children: List Person }
have_first_child person child = { person | children = [child] } 

What's interesting is the inferred type have_first_child : { b | children : a } -> c -> { b | children : List c }. This may be more general than you expected, since the records are extensible -- I'll come back to that. But, the first record is required to have a children field, so the annotation Person -> Person -> Parent fails in 0.16. (But not 0.15.)

So instead, let's define have_first_child person child = {name = person.name, children = [child]}. This is inferred to have type { b | name : a } -> c -> { children : List c, name : a }. Notice that whatever extra fields are on the first argument, of type b, are lost. When we annotate this version, the aliases are actually more restrictive than the inferred type, which is fine, and ensures that there isn't any extra information to be lost.

Note that you can still write have_another_child parent child = {parent| children = child::parent.children}. That is, you can still write functions that allow extra information on records, and leave it undisturbed, as long as you're not adding/removing/changing the type of fields.
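Put together with annotations, the two functions described above might read as in the sketch below. The annotation on have_another_child is one possible choice, written here as an assumption rather than taken from compiler output.

```elm
type alias Person =
    { name : String }


type alias Parent =
    { name : String, children : List Person }


-- Explicit construction: the aliases are narrower than the inferred
-- type, so no extra fields can be silently dropped.
have_first_child : Person -> Person -> Parent
have_first_child person child =
    { name = person.name, children = [ child ] }


-- Updating an existing field: `r` stands for whatever other fields
-- the record carries, and they pass through untouched.
have_another_child : { r | children : List c } -> c -> { r | children : List c }
have_another_child parent child =
    { parent | children = child :: parent.children }
```

With this annotation, have_another_child accepts a Parent as well as any larger record that has a children field holding a list.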

> Maybe it makes sense to not default to type aliases as a way of referring to record types. And maybe it makes sense to not even allow records to be defined outside a tagged union?

By "default", I assume you mean the common convention of programmers, not the way the language works, right? Okay, so what purpose would putting records as union types serve? It would allow us to make them opaque, and sometimes that's a useful thing (see previous post). But not every record type needs to be opaque. Would we not be able to have records without case analysis? And writing a union type with only one tag, and exporting the tag, is just silly.

When you say "allow records to be defined", do you mean record types (as I assumed above), or record values? In theory, one could list every type in the record after the tag, and pattern match on them. It would be tedious and error-prone to rely on N positional arguments, but it could be done. So if records only existed there, what's the point?
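A small sketch of what the positional alternative looks like. The `Parent` tag, its three fields, and the accessor names are hypothetical:

```elm
-- A hypothetical tag carrying its fields positionally instead of in a record.
type Parent
    = Parent String Int (List String)


-- Every accessor has to match on all positions, and the compiler cannot
-- catch two arguments of the same type being supplied in the wrong order.
parentName : Parent -> String
parentName (Parent name _ _) =
    name


parentAge : Parent -> Int
parentAge (Parent _ age _) =
    age
```

Adding a fourth field would mean touching every one of these patterns, which is the tedium that record syntax avoids.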

Regarding the numbered points, I disagree with #2 and #3. Record extension was rarely-used and confusing; we knew this before deciding to remove it. Because it was rarely-used, that wasn't the main reason for type aliases. Indeed, type aliases (unlike union types) could be removed from the language without losing expressive power, but only technically. To a human, type aliases are very handy. The main reason to use a type alias is to refer to a value by its meaning rather than its representation, like Model instead of Int, or FailureOptions instead of a huge record type. The secondary reason is to make annotations shorter. (It may seem that union types also allow referring to values by meaning, but that's only the values that they wrap. Type aliases refer to records themselves, not a record hanging on a tag.)
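As a small illustration of both reasons: the fields of `FailureOptions` below are invented for this sketch and are not the real elm-check definition, and `increment` and `describe` are hypothetical helpers.

```elm
-- Name a value by its meaning rather than its representation.
type alias Model =
    Int


-- Give a short name to a record type that would otherwise be repeated
-- in full in every annotation. (These fields are made up.)
type alias FailureOptions =
    { seed : Int, shrinkSteps : Int, message : String }


increment : Model -> Model
increment model =
    model + 1


-- The alias is the record itself, so fields are accessed directly
-- with no tag to unwrap first.
describe : FailureOptions -> String
describe options =
    options.message
```

Since `Model` is only another name for `Int`, `increment` accepts any plain integer without wrapping.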

It's true that not being able to make a type alias opaque without wrapping it in a union type is slightly annoying. Arguably there could be a language feature for this but I don't think it's a big deal. The other thing that could change is purely a performance optimization, and that is that when a union type has only one tag which carries one type (like my animation library), the wrapped type can be passed around unwrapped. This is already on Evan's radar.

Union types, records, type aliases, opaque or exposed.... these are tools. Use them as best fits the problem. If programmers "default" to a certain pattern, it's because it works well, and so there's not much value trying to disrupt it. If a language change made a pattern less useful, people would notice and adapt. This would start in code, not in discussions. At the very least, we'd see "look at these two approaches, see how much better the newer one is" posts, with sample code. Let these things happen organically rather than force a philosophy.