How to talk about ADTs in a post-ADT world

Evan Czaplicki

Oct 14, 2014, 6:30:03 PM
to elm-d...@googlegroups.com
We recently made the decision to go with the following syntax for declaring new types and creating type aliases:

type Maybe a = Just a | Nothing

type alias Point = { x:Float, y:Float }
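
A minimal sketch of how the two declarations differ in use. The names `Shape`, `origin`, and `area` are made up for illustration and are not taken from the thread or the Elm libraries: `type` introduces a new type whose values are built with its tags, while `type alias` only gives a shorter name to an existing record type.

```elm
-- A new type: values can only be built with the tags Circle and Square.
type Shape = Circle Float | Square Float

-- An alias: just a shorter name for an existing record type.
type alias Point = { x : Float, y : Float }

-- Any record of the right shape is already a Point; no tag is needed.
origin : Point
origin = { x = 0, y = 0 }

-- A Shape has to be taken apart by matching on its tags.
area : Shape -> Float
area shape =
    case shape of
        Circle radius -> pi * radius * radius
        Square side -> side * side
```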

You can look around for other threads discussing the various tradeoffs that led to this, but when talking with Richard the other day, the question arose, "How will we talk about ADTs when this is the syntax?" We need to call them something but have not actually decided yet. Here is what we arrived at.

Proposal: The new term could be "type union"

I created a Q&A document to experiment with what this will feel like in practice.

Please check it out and let me know what you think!


P.S. If you don't like the term "type union" try answering the Q&A questions with some other term and see if it goes more smoothly. Does the other term convey what the feature does as clearly? If I see "type unions" as a chapter heading, I have a pretty clear idea of what it might be. Before making counter-proposals, consider concerns like this!

Evan Czaplicki

Oct 15, 2014, 2:11:07 AM
to elm-d...@googlegroups.com
Laszlo just told me he prefers the term "union type". I was skeptical at first, but the internet seems to be on the same team. Perhaps the document reads better using that terminology.

Jeff Smits

Oct 15, 2014, 2:19:23 AM
to elm-discuss
I like union type.
I was leaning towards tagged union, because type union begs the question why the syntax doesn't mirror the name. The only problem for me with tagged union was that it hints at an implementation detail that shouldn't be relevant. Union type is a nice and more general term. It's also not as obscure as sum type from type theory.

Dénes Harmath

Oct 15, 2014, 7:05:16 AM
to elm-d...@googlegroups.com
I also support the new syntax and the term "union type".

Joey Eremondi

Oct 15, 2014, 7:28:21 AM
to elm-d...@googlegroups.com
Union type sounds good. I do like Tagged union, though, since it lets us refer to Constructors as Tags, which might cause less confusion for people coming from the OO world. I feel it's a more accurate description of what a constructor is actually doing.

Alexander Berntsen

Oct 15, 2014, 10:04:01 AM
to elm-d...@googlegroups.com
What's wrong with saying ADTs? It's well-defined, easy to understand,
and used a lot. I don't see a need to make up new, vague and
non-standard words for this. (I agree with renaming the keywords though.)
- --
Alexander
alex...@plaimi.net
https://secure.plaimi.net/~alexander

Mark Wong-VanHaren

Oct 15, 2014, 10:26:36 AM
to elm-d...@googlegroups.com
Me, too. I prefer this syntax over Haskell's.
-m


On Wednesday, October 15, 2014 at 13:05:16 UTC+2, Dénes Harmath wrote:

Joseph Collard

Oct 15, 2014, 10:38:19 AM
to elm-d...@googlegroups.com
My main complaint with union type is that the constructors within cannot be referred to in type signatures. Each of the constructors of T can only be referred to as a T in a type signature. When I hear union type I think of Racket's union types: http://docs.racket-lang.org/ts-guide/types.html#%28part._.Union_.Types%29
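
A small sketch of the limitation being described, using a made-up `Response` type (not from the post): a signature can name the union type as a whole, but not one of its individual tags.

```elm
type Response = Success String | Failure Int

-- Fine: the signature names the union type itself,
-- and the function must be prepared for every tag.
describe : Response -> String
describe response =
    case response of
        Success body -> "ok: " ++ body
        Failure _ -> "failed"

-- Not allowed: Success is a tag, not a type, so an annotation such as
--     onlySuccess : Success -> String
-- is rejected by the compiler.
```

In Typed Racket, by contrast, a union is formed from types that already exist on their own, so each member can also appear by itself in a signature.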

I agree with Alexander that I don't understand why ADT is not being used.

Joey Eremondi

Oct 15, 2014, 11:17:19 AM
to elm-d...@googlegroups.com
Well defined: not necessarily. Is a record the same construct as an ADT? It is in Haskell, but not in Elm. Are Product types included? You'd think that both sums and products are included in the algebra of data types.

Easy to understand: To someone seeing them the first time, the connection to algebra isn't necessarily clear, especially if that person still thinks of algebra as "solve for x," which is common for people who don't take upper-level abstract mathematics.

Used a lot: not outside of the Haskell community?

Plus there's acronym collision with Abstract Data Type, which most people learn about long before they touch functional programming.

Paul Chiusano

Oct 15, 2014, 12:27:31 PM
to elm-d...@googlegroups.com
Evan, could you clarify in what context you're talking about? Is this just in Elm standard documentation or are you saying that you'd like everyone who talks / writes about Elm to use the same terminology?

People using Elm will be coming from different contexts and writing / speaking for different audiences. If writing for an audience already familiar with FP, using the more standard terms of art like algebraic data type can be entirely appropriate. If writing for newcomers, then I can see wanting to use a different term at first. Personally, I think it is good to at least mention the commonly used terms of art, since the reader / learner is bound to encounter these terms elsewhere and it's good to be able to tie what they've learned about in Elm to the wider world.

Paul :)

Dobes Vandermeer

Oct 15, 2014, 2:00:13 PM
to elm-d...@googlegroups.com
Hi,

Common usage of "union type" generally refers to a "tagless union type" or "true union type".  These types are distinct from tagged unions in that instead of pattern matching on them, the operations available on them are limited to the ones that apply to all members of the union (ignoring C).  I don't think any language bothers with these much because they aren't as useful as tagged unions.  In languages like Ceylon and Racket you can still use casting to figure out the real type, making them tagged "in practice".  In C you are generally supposed to add your own tag, unless you're using the union to read one type as if it were another type based on a similar memory layout.

Do recall that the term ADT actually applies to type sums/unions AND products (tuples/records).  Now, it so happens that the syntax we have been using creates a sum type first (the list of tags), and a product type inside each branch of that sum (the fields of each tag).  A type with a single tag is just a product type.  So that's why this syntax is referred to as ADT syntax - it creates both sum/union and product/tuple types.

That said, I'm fine with calling this a "union type".  The untagged union isn't very useful in this kind of language.  And tagged product types as created by this syntax aren't as good as using an alias which gives a kind of untagged product type.
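
The sum-of-products point can be sketched roughly as follows. All names here (`Shape`, `Pair`, `PairRecord`, and the two helper functions) are hypothetical and chosen only for illustration.

```elm
-- A sum of products: two tags, each carrying a product of fields.
type Shape = Circle Float | Rectangle Float Float

-- A single tag: effectively just a (tagged) product type.
type Pair = Pair Int String

-- The alias route: an untagged product, here a record.
type alias PairRecord = { count : Int, label : String }

-- The tagged product must be unwrapped by pattern matching...
labelOf : Pair -> String
labelOf pair =
    case pair of
        Pair _ label -> label

-- ...while the record's fields are reachable directly.
labelOfRecord : PairRecord -> String
labelOfRecord record = record.label
```

The extra unwrapping step in `labelOf` is one way to read the remark that a tagged product is less convenient than an alias.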

Cheers,

Dobes



Sean Corfield

Oct 15, 2014, 3:04:14 PM
to elm-d...@googlegroups.com
I think "union type" works better in that Q&A, having read through it all with both terms.

Evan Czaplicki

Oct 15, 2014, 4:51:54 PM
to elm-d...@googlegroups.com
Okay, so consensus seems to be around "union type" which I think is quite nice! I'm not going to say final decision yet, but let's say this is the plan pending some unforeseen issue. It'll go final once the new type syntax gets released.

Community Standard? Yes!
In terms of context, this would be the standard term in Elm. When talking about something like Maybe or List on the list or in a blog post or anything that's addressed to "Elm users or future Elm users", we'd call them union types. If the target audience is /r/haskell specifically, then yeah, address that community specifically.

I was going to point out some Rust documentation that says "We use enums (also known as tagged unions or algebraic data types)" but it looks like they dropped that in favor of just calling them enums. I think it's fine to point out the connections, but I'd like to have a standard term within the community. I think the ideal way may be to just say "union type" and hyperlink it to the revised version of these docs which will point out the connections to ADTs and tagged unions in a tasteful way.

The point of using a new term is to make it easy to learn, so if we use a bunch of different terms within the community it undermines the primary goal.

Why respect the community standard?
Part of why I created Elm is so that I could reset the culture around ML-family languages, hopefully making something more welcoming to people who aren't in the club or didn't go to a particular university.

Part of resetting culture is how we talk about things, so I'd hope that people interested in Elm's success (or at least respectful of the community or the goal of making an ML-family language mainstream) would respect these choices and see how our experiments play out. If we don't like the results of an experiment, we can try something else! The type syntax was chosen with such an eventuality in mind. A lot of language decisions around types are made with this in mind.

I also think that breaking from tradition is not as big a deal as Haskell people like to think. F# does it quite a bit with quite good results within the .NET world. Furthermore, Elm is not for Haskell programmers, it is for programmers. I think about the fact that there are 9+ million professional Java developers and who knows how many millions of Python and JS developers. I think about the fact that there are tons of Clojure and Erlang people that have a super easy time switching to the mindset of Elm, but are skeptical of types because they "add complexity". I think about the fact that OCaml and Haskell are both 20 years old and still niche. I think about how many programmers either don't want to learn Haskell or flat out have never even heard of it.

If Elm is going to be big, the bulk of our future users and community members will not have experience with any other ML-family language. For these future users, a lower barrier for them is a huge deal. Saving five minutes or an hour or a day is the difference between a new ML-family friend and an alienated user. To lower that barrier, we have Haskell people pay the minor fee of remembering that "union type = adt" which is 4 seconds for them that probably enhances their understanding anyway.

"What about all the Haskell and OCaml docs?" The number of people who learn Python by reading Ruby resources is not very big. I don't see why it will be different for us.

For arguments like "How will they read all the academic literature?" I think it's pretty straightforward. When people reach a level where they want to learn more, they learn that "union type = algebraic data type" and can read whatever they want. Super simple! If they never reach that point, they never have to pay that cost. This is like applying the concept of laziness to learning. Only pay for what you need right now. Furthermore, I think the number of people who reach the point of "I want to read academic literature on ADTs" is going to be super small.

I don't know if it was necessary to write this, but I feel compelled! I feel really passionately about taking accessibility seriously and I would be really sad to see us lose an opportunity to experiment and maybe find a better way! Community conflict or hijacking really is one of my biggest fears, so I think I respond with too many words when something hits on fears like that :)

Joey Eremondi

Oct 15, 2014, 5:07:17 PM
to elm-d...@googlegroups.com
Evan, I feel like you should make that last message a sticky or "read first" for discussions, it summarizes the Elm position quite nicely!

Sean Corfield

Oct 15, 2014, 5:17:34 PM
to elm-d...@googlegroups.com
On Oct 15, 2014, at 1:51 PM, Evan Czaplicki <eva...@gmail.com> wrote:
Why respect the community standard?
Part of why I created Elm is so that I could reset the culture around ML-family languages, hopefully making something more welcoming to people who aren't in the club or didn't go to a particular university.

As someone who did heavy Comp Sci at university (in the early 80's), and did a PhD on functional programming language design and implementation, and who loved Haskell when it appeared and truly hoped it would go mainstream... I can only say a huge "+1" to this line of thinking! Thank you!

I think Elm is beautiful and the more people it can reach, the better - so the fewer terminology barriers there are, the better too.

Sean Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)




Richard Feldman

Oct 15, 2014, 6:28:43 PM
to elm-d...@googlegroups.com
Well said! Personally I keep coming back to the reason I know so many people who would love to write Haskell all day and instead write Java or Scala all day: pure functional programming has an adoption problem.

Solving it means we can actually find reliable work coding all day in languages we like, so I am strongly in favor of taking serious steps to solve it. I've heard many newcomers to FP cite frustration with the terminology barrier as a major pain point, so that definitely seems like an area worth improving. +1 for "union types" - here's a fork of that Gist with the updated terminology: https://gist.github.com/rtfeldman/47afd10c12985bb80418

Naturally this list is selected for people who have already climbed over the current barriers newbies face, but that doesn't mean we can't ease the process for the next generation of functional programmers--to our own benefit down the line!

Max Goldstein

Oct 15, 2014, 10:28:28 PM
to elm-d...@googlegroups.com
Love the manifesto! I agree it should be preserved somewhere.

I'm fine with union type and the syntax.

I agree there is a lot of baggage in Haskell and I don't disagree with the attempts to quarantine it. But the other side is just as bad. JavaScript sounds to me like they took every technical word from OOP, threw them in a blender, and hit liquify. Document object model? Function prototype? Functions are objects which can be used as constructors? Bind to an element? The this context? There are plenty of web-devs who have accepted "this is the way things are" but I think there are also a substantial amount who want the sanity of actual semantics, consistent terminology, an abstraction around events (signals), static typing (production teams use JShint and CodeClimate anyway), and the power of the time-travelling debugger. As opposed to a world where array equality and object empty checking are non-trivial and "  \n" == 0. (See also this.)

Bret Victor - I know I'm not the only fan here - talks about tools and platforms. A tool is what the developer (or creator, generally) controls and uses to design the message being sent. A platform is what the recipient is expected to already have, in order to translate from what is received into what the user wants. He blames the flaws in CSS (but I'd add HTML and JS) on trying to be both the tool and the platform. There's been some attempts to have abstractions around JS (CoffeeScript, TypeScript), CSS (SASS, LESS), and even HTML (HAML) but they're all syntactic wrappers, aka leaky abstractions. Elm is the first true tool for webdev, completely independent of the incidental HTML5 platform, that I've encountered. (BTW, I highly recommend setting aside two hours and reading Magic Ink in its entirety.)

It's not enough to just say "it's a union type". I think we need to agree on names for every part of the declaration. I'm happy if we decide on other terms than these, as long as each part of the declaration has a name. These wouldn't necessarily be an explicit part of a tutorial, just something to use consistently as we write them. Anyway, for example:

type Response a = Success a | Waiting | Failure Int String

Response is the type name or just type, in the same category as tokens like Float.
a (on the left) is a type parameter.
Success, Waiting, and Failure are tags. (Far better than ML's "value constructor", a mangling of OO even worse than JS!)
a (on the right), Int, and String are type arguments of a tag. When the union is created,* the terms having these types are just arguments.

* I'm trying not to say "instantiated", which would explain why they're called "constructors". Wow. (Besides Elm, I'm a Ruby programmer and therefore an OO purist.)
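
The proposed names can be laid over the declaration like this. It is only a sketch of the terminology suggested in this message; the values `loaded` and `failed` are made up for illustration.

```elm
type Response a          -- "Response" is the type name, "a" its type parameter
    = Success a          -- tag Success, with type argument a
    | Waiting            -- tag Waiting, with no type arguments
    | Failure Int String -- tag Failure, with type arguments Int and String

-- When values are created, the terms handed to a tag are just arguments.
loaded : Response String
loaded = Success "hello"            -- "hello" is the argument

failed : Response String
failed = Failure 404 "Not Found"    -- 404 and "Not Found" are the arguments
```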



Evan Czaplicki

Oct 16, 2014, 4:01:54 AM
to elm-d...@googlegroups.com
I moved that stuff over to a page on the website, though I'm not sure where it should go ultimately. Learn page? Community? New tab entirely? I also got excited about having a nice name for union types, so I finally started fixing up the documentation that I've been telling everyone I am just about to improve :P

During that process, it seemed really natural to call things like Just and Nothing tags. "You can put a bunch of types together, but you must tag them so we can still tell the difference." Connects up to the tagged union idea really nicely.

I'm less certain of the other terms though. I'm about to say some stuff, and we'll see how it sounds.

Each tag corresponds to a value or function of the same name. If the tag holds some types, it corresponds to a function that takes those types as arguments.

Just : a -> Maybe a
Nothing : Maybe a
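
A short sketch of what that correspondence allows in practice, reusing the `Response` type from earlier in the thread. The helper names `wrapAll`, `notFound`, and `pending` are hypothetical.

```elm
type Response a = Success a | Waiting | Failure Int String

-- Success : a -> Response a, so it can be passed to List.map like any function.
wrapAll : List a -> List (Response a)
wrapAll values = List.map Success values

-- Failure : Int -> String -> Response a, so it can be partially applied.
notFound : String -> Response a
notFound = Failure 404

-- Waiting : Response a, a plain value with nothing to apply.
pending : Response a
pending = Waiting
```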

Maybe the term "tag type" is useful? Each tag is associated with a particular sequence of types. Those are the tag types?

What is the name of the thing in blue here? (type Either a b = Left a | Right b) Or the green part? We are tagging something, but what is that something? Is each possibility a "case"? Maybe it's a "tagged type" or a "tagged value"?

I do not know if it is good or bad that functions take arguments and types take parameters. Maybe it's good to use a different word? I don't really know. Parameters seems nicer, but that's because I'm used to it I think.

I also wonder if we can go too far in naming each part? I'm not taking a side on this, but maybe there's such a thing as over-specifying? Or maybe it's just hard and that's a convenient excuse? I can imagine that diving into weird terminology on this would actually be less effective than just showing some examples, but that doesn't mean the names shouldn't exist. I don't know really, I think I should go to bed and try again tomorrow!

Jeff Smits

Oct 16, 2014, 5:06:34 AM
to elm-discuss
Good to know the manifesto is being preserved. I think it should go on the community page or in a dedicated tab.

Tag is a good name, although I never disliked constructor because then pattern matching is the deconstructing operation. Also calling a Success an instance of a Response sounds right to me.
Parameter vs argument: When I read parameter I think formal parameter of a function in the function definition. Using argument for use and parameter for definition sounds more reasonable to me than using the two words to distinguish type/value level.
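
One way to picture the definition/use split at both the type level and the value level. The names `Box`, `boxedInt`, `double`, and `answer` are invented for this sketch.

```elm
-- Definition site: "a" is the type parameter of Box.
type Box a = Box a

-- Use site: Int is the type argument supplied to Box.
boxedInt : Box Int
boxedInt = Box 42

-- Definition site: "x" is the parameter of double.
double : Int -> Int
double x = x * 2

-- Use site: 21 is the argument supplied to double.
answer : Int
answer = double 21
```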

Over-specifying could be a thing..
I think it would be good to have a thesaurus for terminology used in Elm. There we can say stuff like:


union type: The preferred name for types like Maybe below:
type Maybe a = Just a | Nothing
Also called Algebraic Data Type (ADT) in other functional programming communities, or sometimes called Sum Type in type theory (although they are more than that).
The right hand side capitalised words, Just and Nothing, are called Tags. The small letters a can be called arguments or parameters.

Algebraic Data Type: A name used in some other functional programming communities. In Elm please use union type.

ADT: Abbreviation for either Algebraic Data Type or Abstract Data Type. More likely the former in context of functional programming communities.

Tag: The identifying part of one Case of a union type. Used like a function to create a value of the union type in question, and as a pattern to match for that case of the union type.

...

Argument: May also be called parameter. The variable part of Functions, Tags and Types that may be supplied at the time of use.


Every occurrence of these things would of course be cross-references. Note that Argument in this example doesn't specify a preference, but every mention of union type does. Perhaps a wiki format would be nice for setting this up?

Iain Ballard

Oct 16, 2014, 10:55:50 AM
to elm-d...@googlegroups.com
I'm going to have a go at elaborating Max's naming. I agree with Evan that we don't need to strictly define names too deep, but hints could be useful (for consistency in Elm's docs if nothing else)

Treat this as a straw man and point out any bad terms or mistakes :-)

type Choose = Red | Green | Purple
    a type with three options

type Response a = Success a | Waiting | Failure Int String
    a generic type with three options, with tags Success, Waiting and Failure.
    Success has a type argument a shared with the Response type. 

type alias IntResponse = Response Int
    IntResponse is aliased as a concrete Response, whose tag Success can hold a value of type Int.

y = Success 3
    y is defined as a concrete Response, with tag Success carrying a value of type Int.

y = Failure 3 "Kaboom"
    y is defined as a concrete Response, with tag Failure carrying values 3 and 'Kaboom' in the type (Int, String) 

case maybe of 
    Just x -> 1 + x
    Nothing -> 0

Pattern matching over a Maybe (called maybe), with cases of Just x and Nothing.
Just x is an instance of Maybe with tag 'Just' and carrying an Int named 'x'
Nothing is an instance of Maybe with tag 'Nothing' with no values.

Paul Chiusano

Oct 16, 2014, 11:41:59 AM
to elm-d...@googlegroups.com
Hi Evan,

First of all, I think it is a very worthy goal to try to make sure Elm is as accessible as possible to newcomers. :)

I want to push back a bit, though -- not everyone using Elm will have a goal of evangelizing the language or driving its adoption. And even if they do, not everyone will agree on the best way to make Elm more accessible. Diversity of opinions and approaches is okay, but more than that, it's just how things are, and I think that's something to accept and work with, rather than try to fight against and control.

For instance, some people will be using Elm more as a tool. There's nothing wrong with that perspective, and when people are using a tool they often want to talk about their experiences using the tool. In doing so it is entirely appropriate that they talk about their experience in whatever way, and using whatever terminology and so on they feel is best. I actually think it is overstepping boundaries a bit to even ask that people change how they communicate in spheres that aren't "official" Elm venues. The most you can reasonably do is lead by example and argue passionately for your perspective. In doing so, you may change people's minds about what the best way is to communicate. Or you might not.

For me personally, I like blogging about software development in general, the things I learn, observe, etc, independent of the tool. In fact I don't like getting too attached to any one tool. Sometimes I might write about Haskell, sometimes Scala, other times I might want to write about Elm. The audience is not any one particular language community, it's whoever is reading my blog, and people might be coming from all sorts of perspectives. Changing how I communicate in that venue because some readers might be current or potential Elm users would be strange--I'm not acting as an Elm spokesperson on my blog. I have to communicate in ways that I find authentic. Of course, no one is obligated to read what I write, and others can certainly express their opinion that I'm not talking about things in a good way, if they feel that way.

Paul :)

Sean Corfield

Oct 16, 2014, 12:57:31 PM
to elm-d...@googlegroups.com
There's an important point here: if your audience is already familiar with ADTs, Monads, etc, then you can continue using those terms (and just note that "Elm calls ADTs union types") when talking in your own forum (blog etc).

But the intended audience for Elm - the bigger, future audience - is people who are not familiar with those terms and so Elm's choice of "union type" is easier for them.

As you note "not everyone using Elm will have a goal of evangelizing the language or driving its adoption" and indeed the vast majority of a language's (future) users fall into that category and, again, aren't going to be people with the comp sci / FP background that supports those more specialized terms.

As Elm grows, I would expect to see two mailing lists: elm-users - where "union type" etc is common - and elm-developers (or maybe this elm-discuss list will morph into that?) for people who are working on core libraries and the language itself and where precision is more important.

The Scala community talks about "levels" of Scala users. Clojure's community also has two "levels" of users - the 'dev' folks working on the language and libraries and the main users building applications etc.

I think it's a natural split (Scott Meyers used to talk about three levels of C++ user decades ago: private, protected, and public).

Sean

Dobes Vandermeer

Oct 16, 2014, 1:38:42 PM
to elm-d...@googlegroups.com

For the elements of a type union, I've seen variants, cases, and alternatives as good terms.  I prefer variant myself.

Arguments and parameters are basically synonymous, I think it's a bad idea to actually try to give these distinct meanings since they are already used interchangeably most everywhere else.  Although people seem to use the term parameters almost exclusively for types, maybe because they are called "parameterized types", functions are in other fields also considered parameterized and it should be OK to use the term there as well.

Cheers,

Dobes

Max Goldstein

Oct 16, 2014, 9:17:43 PM
to elm-d...@googlegroups.com
Arguments and parameters: I agree with Jeff, in proper usage you supply arguments at the call site and parameters are used in the definition. But in common usage, they're not treated so precisely. I think we can say that Maybe has a type .... ah, screw it. When it's a type, it's a parameter, and while the docs should be consistent we're okay with calling it an argument. (Picking terminology that requires us to be pedantic runs contrary to our goals, and can start many ... wait for it ... arguments.) I would consider a type that takes a parameter to be a parameterized type. I like this better than "generic" because it seems more specific and concrete; it describes how the type works rather than what it is.

I like Iain's word carry. I'd say that a Success carries a term of type a (in general, speaking in types) and that the Success 3 (this particular one) carries a three. I would consider a to be the type parameter and Success a to be the parameterized type. It's important to note that a tag that carries multiple terms (hmm...), such as Failure, does not store them in a tuple, it just carries multiple terms. In Success "Yay", "Yay" is the carried term. If there were more, we could say the first carried term. (Is carried too similar to curried?)
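
A sketch of the "carries several terms, not a tuple" distinction. `TupledFailure` is a hypothetical type added purely for contrast; it does not appear elsewhere in the thread.

```elm
type Response a = Success a | Waiting | Failure Int String

-- Failure carries two separate terms; the pattern names each one directly.
message : Response String -> String
message response =
    case response of
        Success text -> text
        Waiting -> "waiting"
        Failure _ reason -> reason

-- For contrast, a tag that carries a single term which happens to be a tuple.
type TupledFailure = TupledFailure ( Int, String )

tupledReason : TupledFailure -> String
tupledReason failure =
    case failure of
        TupledFailure ( _, reason ) -> reason
```

In the first case `Failure 404 "Not Found"` is built from two arguments, while in the second `TupledFailure ( 404, "Not Found" )` is built from one.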

I do like staying away from "constructor" and "instance" though. In my mind, OOP requires that you call methods on objects, rather than pass values into functions. There are no "free floating" functions; all methods are attached to an object, avoiding JS's issue with this, which can be rebound and then break things. It's not bad that Elm isn't OOP, it's just different, and we should leave OOP terms alone.